Part 2. Does Table Column Order Affect Performance?
I decided to take another look at the performance impact of selecting or updating a column based on its table column order.
I received some really good and some really bad feedback from my previous "column order" articles related to SELECT and UPDATE statements. As a result, I enhanced my experiment and ran it again. The results were the same but also different.
If you are tired of all this talk, then the summary is still this; if you can influence the decision, have the most frequently SELECTED and UPDATED columns at the beginning of a table definition.
However, if you are interested in the actual experimental results, the feedback I received and how I changed the experimental design then you will love this article.
Feedback: Good and Lame
I received an unusual amount of feedback on these posts. Some of the questions and comments were really good. Others were characteristic of the lack of depth and integrity that can be associated with a VERY senior job title. I think we can all learn something from the comments.
Hey Craig, Thanks! There results are what I have always believed and have been told yet never had a data or reference point.
Thanks! This is why I run so many experiments. As professionals, we need references, experiments and data points. It's important if you want to speak with honest authority.
The experimental UPDATE statement doesn't always update rows.
My most feared response! Good thing I can always make a change re-run the experiment!
One (and only one) very astute DBA emailed me about this. And, he is correct. Check out the below SQLPLUS session, focusing on the final statement.
SQL> drop table bogus; Table dropped. SQL> create table bogus (c1 number, c2 number); Table created. SQL> insert into bogus select level just_a_column, 2 from dual connect by level <= 50000; 50000 rows created. SQL> commit; Commit complete. SQL> update bogus set c2=1 where rownum between 1 and 1; 1 row updated. SQL> update bogus set c2=1 where rownum between 1 and 5; 5 rows updated. SQL> update bogus set c2=1 where rownum between 2 and 5; 0 rows updated. SQL> update bogus set c2=1 where rownum between 10 and 11; 0 rows updated. SQL>
Zero rows updated? Are you kidding me? I bet you just learned something too. Or, perhaps you have a new interview question?
The problem with my update statement was I used "where rownum between X and Y" and did not always set X to 1. I did not notice the problem in my testing because I always tested with X set to 1. My bad. Also, because I always gather lots of experimental samples, the "0 rows updated" statements were masked by the other samples that did update rows. This "bad" was cause enough for me to re-run the UPDATE experiment.
The UPDATEs became slower because of row chaining and physical reads.
That could be true and makes sense. The good news is, I created a relatively large buffer cache to reduce the likelihood of any physical reads. The bad news is, I did not test for this. I also did not test for row chaining. Dang!
Just to be sure this is not a problem, when I re-ran the experiment, I checked for physical reads and rows chaining. If a sample had even a single physical read or row chain access, the result was discarded. As you will see below, this occurred only a few times.
Why are you testing something that is documented and has been known for 20 years?
Interesting question and here is my short two-part response: 1) And you "know" this because the documentation says so? 2) Do you have an experimental or real-life experience to reinforce your belief?
It is important for senior Oracle professionals to go beyond the documentation and folklore. No one is perfect, but if you want to speak with authority and confidence then you need real data points. This is one reason why I run so many experiments. They give me reference points. My experiments are not meant to tell the entire story. Just one very specific point that I can always reference.
The experiment is not real.
Good observation. The beauty of experiments they do not have to be real to be extremely useful. For example, Boeing creates non-real airplane (think: models) to learn very specific things about their design. Remember those non-real experiments about gravity bending light? I'm not sure how "real" observing bending light is but I suspect it has been very useful for some people.
The beauty of experimentation is you can study an extremely complicated or dangerous system by focusing on something very specific.
Experimental Design Changes
So, with all of that, I decided to re-run both the SELECT and the UPDATE experiments. I of course changed the UPDATE statement to always update a row. And, in both experiments I checked for physical reads (direct and buffered) and chained rows. If any check failed, I did not include the sample in the resulting statistical analysis.
To reduce the likelihood that Oracle would do direct reads (process the blocks in PGA memory and not in the buffer cache), I set _small_table_threshold=999999. That usually works!
I also disabled asynchronous IO by setting filesystemio_options=none and disk_asynch_io=false. I wanted the reads to be very simple. I could have enabled async IO or run the experiment both ways... but I didn't... I've got other stuff to do...
You can download the Analysis Pack, which contains all the experimental scripts, including the numeric and statistical results, along with additional graphics HERE.
Here are some of the general experimental design details. I ran the experiment on Oracle Enterprise Linux (Linux sixcore 4.1.12-61.1.23.el6uek.x86_64 #2 SMP Tue Dec 20 16:51:41 PST 2016 x86_64 x86_64 x86_64 GNU/Linux) using Oracle Database 12c Enterprise Edition Release 22.214.171.124.0 - 64bit Production. For each of the nearly 200 columns accessed, the SQL was run 35 times resulting in 35 samples. The starting and ending wall times, the instance statistics "physical reads cache", "physical reads direct" and "table fetch continued row" were collected for each of the 35 samples for each of the nearly 200 columns accessed. For the analysis, the first 3 of the 35 samples was thrown out as well as any sample with a recorded physical read or row chain access.
Results: Bad Data... Good News
For the UPDATE experiment, there were only 21 experimental samples that failed out of the 6265 samples collected. Only one of the 21 was the result of a chained row and no physical IOs occurred. So, that's great news!
For the SELECT experiment, there were only 38 experimental samples that failed out of the 6265 samples collected. Seven of the samples touched a chained row. Interestingly, any sample that recorded a chained row access also recorded physical reads.
This means the experimental setup produced very clean data. Having removed the "offending" experimental samples, I'm really comfortable with the data used to do the statistical analysis.
Bet you can't wait to see the results, eh?
Results: SELECT Statement
Visually, numerically and statistically column order did affect the SQL SELECT statement elapsed time.
Below are the summary statistics, using the free R statistics package, for four different columns accessed. For example, when selecting column 2 (C2) the mean elapsed time was 2.096 seconds. In contract, for column 179 (c179) the mean elapsed time is 6.327 seconds. The mean elapsed time difference between selecting from column 2 and from column 179 is 4.231 seconds (6327ms-2096ms), which is a 3x elapsed time increase. Statistically comparing C2 and C5 (not c179, but c5) the p-value is 0.0000000, which means there is a real difference.
> summary(c2) Min. 1st Qu. Median Mean 3rd Qu. Max. 2089 2092 2098 2096 2100 2103 > summary(c5) Min. 1st Qu. Median Mean 3rd Qu. Max. 2225 2227 2232 2232 2237 2240 > summary(c100) Min. 1st Qu. Median Mean 3rd Qu. Max. 6446 6448 6450 6451 6454 6460 > summary(c179) Min. 1st Qu. Median Mean 3rd Qu. Max. 6313 6319 6324 6327 6336 6351
Below is a smoothed histograms chart comparing the samples related to selecting from column 2 and column 179. There is an obvious different in the below two histograms, which is reflected in both the above elapsed time and time series charts.
So, how bad is the bad? 3X bad!!
Results: UPDATE Statement
Yes, it is true that visually, numerically and statistically column order did affect the UPDATE SQL statement elapsed times. However, based on this experiment you are not going to get any "major awards" for persuading people to put heavily updated columns near the top of the create table column list.
Below are the summary statistics for four different columns accessed. For example, when updating column 2 (C2) the mean elapsed time was 3.784 seconds. In slight contrast, for column 179 (c179) the mean elapsed time is 3.793 seconds. The mean elapsed time difference between updating column 2 and column 179 is only 0.009ms. Statistically comparing C2 and C5 (not c179, but c5) the p-value is again 0.0000000, which means there is a real difference... from a statistical perspective.
> summary(c2) Min. 1st Qu. Median Mean 3rd Qu. Max. 3779 3782 3784 3784 3785 3793 > summary(c5) Min. 1st Qu. Median Mean 3rd Qu. Max. 3784 3787 3789 3789 3790 3795 > summary(c100) Min. 1st Qu. Median Mean 3rd Qu. Max. 3787 3792 3795 3796 3800 3804 > summary(c179) Min. 1st Qu. Median Mean 3rd Qu. Max. 3787 3790 3793 3793 3795 3799 >
Below is a smoothed histogram comparing the samples related to updating column 2 and column 179. You can see there is a difference in these two sets of data. But I would like to see the "tails" not overlap at all. To indicate a likely production-user-impacting system performance improvement, we want to see a lot of space between the two histograms... like we see with the SELECT experiment.
So, how bad is the bad? Not bad enough. Based on this experiment, I would not push too hard to get heavily updated columns at the top of the create table column list. But do not forget that the column order did make a difference, just not much.
What Have We Learned?
What is the lesson learned here? Table column order makes a difference. So, if you have the chance to influence column order, then influence! Especially when dealing with SELECT statements.
Let's get real honest here for a second. Most of us who have been doing Oracle work for many years expected there to be a difference. But we did not know if the impact was massive as with the SELECT statement or minimal like with the UPDATE statement experiment.
But now we all have a reference point... a data point. We have a professional response when someone asks why we should arrange columns a certain way.
Thanks for reading and all the best in your Oracle tuning work!
If you have any questions or comments, feel free to email me directly at craig at orapub.com.
|Oracle Database Locks And Latches...What A Difference!||Watch Oracle Database Elapsed Time And Wall Time With Parallel Query||Variable Load Impact On Oracle Database Hard Parsing|