Oracle Database Row versus Set Processing, Surprise!

Posted on 13-May-2012 by Craig Shallahamer, craig@orapub.com

On the back end you've got a huge Oracle Database server yet you're row processing on the Java client... Surprise!

Two weeks ago I joined a good friend on-site at one of his clients. Upon arriving, a discouraged developer wanted to know why his report was now running so slowly when previously it had run just fine. Surely it was the database, right?

Long story short, the report sucked data from a couple of database tables into a Java client front end placing this data into memory structures, and then the Java code did all sorts of (I'm sure) amazing sequential row/record processing. This worked great with smaller quantities of data, but as the developer confessed, this past run was processing a larger volume of data. The database processing only took a few seconds, while the Java portion took much longer. We're not sure just how much longer since the report never finished!

Once the database and Java processing timing situation was explained and, we think, understood, the developer was still visibly unsettled. He had been taught that the database was there simply to retrieve, insert, or change mass amounts of data, and that all the complex business rule processing should be done in client-side Java code. It was obvious to both my friend and me that the benefits of set processing compared to row processing were not something the developer had thought about, let alone understood.

We gently explained that perhaps he could do some of the processing in the database, taking advantage of set processing. After a short discussion, it was clear he still believed his solution was the way it "should be done," which meant all the business processing would stay in the Java client.

There are two things he just didn't get. First, that set processing rocks! And second, row processing is not likely to scale as data volumes increase. He was encountering both of these truths.

While all Oracle DBAs have at one time or another been told about the benefits of set processing and the scalability issues with row processing, no one has ever pointed me to a real test. And I have never personally created a test to prove (at least in a test situation) the benefits of set processing.

This posting is the results of a simple experiment comparing row to set processing to specifically see for myself:

1. Does set processing truly bring increased performance, in terms of increased throughput?

2. Does set processing scale better than row processing, in relation to increased data processing volume?

If you want to know what happened... read on! (You can also skip to the Conclusion section below.)

Experimental Design

I wanted to keep the experimental design simple and give row processing as much advantage as possible.

View the complete analysis script HERE; below I pull out just the key snippets needed for this presentation.

View the raw experimental data results from the op_results table here.

The processing is pretty straightforward. I'm inserting rows into the op_interim table containing the order count and total order amount for each customer who has a status of '1'.

Below is the set processing query that runs on the database server.


insert into op_interim
  select c.id, c.status, c.name, count(o.amount), sum(o.amount)
  from   op_customer c,
         op_order o
  where  c.status = '1'
    and  c.id = o.cust_id (+)
  group by c.id, c.status, c.name;
commit; 

Below is the row processing code, written in simple PL/SQL. The PL/SQL code was run on the database server, not from a separate machine.


order_sum_check_v := 0;

-- Outer loop: one pass per customer, fetched in sorted order.
open cust_cur for
  select id, status, name
  from   op_customer
  order by id;
loop
  fetch cust_cur into
    cust_cur_id_v, cust_cur_status_v, cust_cur_name_v ;
  exit when cust_cur%NOTFOUND;

  order_count_v      := 0 ;
  order_sum_amount_v := 0 ;

  -- Inner loop: re-scans the entire op_order table for every customer.
  open order_cur for
    select id, cust_id, amount
    from   op_order
    order by cust_id, id;
  loop
    fetch order_cur into
      order_cur_id_v, order_cur_cust_id_v, order_cur_amount_v ;
    exit when order_cur%NOTFOUND;

    if (cust_cur_status_v = '1') AND (cust_cur_id_v = order_cur_cust_id_v) then
      order_count_v      := order_count_v + 1 ;
      order_sum_amount_v := order_sum_amount_v + order_cur_amount_v ;
      order_sum_check_v  := order_sum_check_v + order_cur_amount_v ;
    end if;
  end loop;
  close order_cur;

  -- Insert only status '1' customers, matching the set-based query above.
  if cust_cur_status_v = '1' then
    insert into op_interim values
    (cust_cur_id_v, cust_cur_status_v, cust_cur_name_v, order_count_v,
       order_sum_amount_v);
  end if;
end loop;
close cust_cur;
commit; 

A big benefit given to the row processing is that I pull the data from Oracle in sorted order. This makes the row-by-row nested loop easy to code and "fast." To ensure both the row and set processing did equivalent work from a business processing perspective, I kept a running total, order_sum_check_v, so I knew the total order amount processed. To my relief, when comparing set to row processing, the totals always matched perfectly.

Just comparing the two code snippets above makes me run toward set processing. The chance of buggy code is obviously far greater with the row processing. (...reminds me a lot of my first "MIS" class on COBOL programming! Run away!!!)
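The same contrast can be sketched outside Oracle. Below is a minimal Python/sqlite3 sketch (not the experiment's PL/SQL) using the source's table and column names with a few made-up rows; it runs the set-based join/aggregate and the row-based nested loop side by side and checks they produce the same answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE op_customer (id INTEGER, status TEXT, name TEXT)")
cur.execute("CREATE TABLE op_order (id INTEGER, cust_id INTEGER, amount REAL)")
cur.executemany("INSERT INTO op_customer VALUES (?,?,?)",
                [(1, '1', 'Ann'), (2, '0', 'Bob'), (3, '1', 'Cal')])
cur.executemany("INSERT INTO op_order VALUES (?,?,?)",
                [(10, 1, 5.0), (11, 1, 7.5), (12, 3, 2.5)])

# Set-based: one SQL statement performs the outer join and aggregation.
cur.execute("CREATE TABLE op_interim AS "
            "SELECT c.id, c.status, c.name, COUNT(o.amount) AS cnt, "
            "       COALESCE(SUM(o.amount), 0) AS total "
            "FROM op_customer c LEFT JOIN op_order o ON c.id = o.cust_id "
            "WHERE c.status = '1' GROUP BY c.id, c.status, c.name")

# Row-based: fetch every customer, then scan every order for each customer.
row_results = []
for cid, status, name in cur.execute(
        "SELECT id, status, name FROM op_customer ORDER BY id").fetchall():
    if status != '1':
        continue
    cnt, total = 0, 0.0
    for oid, ocust, amt in conn.execute(
            "SELECT id, cust_id, amount FROM op_order ORDER BY cust_id, id"):
        if ocust == cid:
            cnt += 1
            total += amt
    row_results.append((cid, status, name, cnt, total))

set_results = cur.execute(
    "SELECT id, status, name, cnt, total FROM op_interim ORDER BY id").fetchall()
print(set_results == row_results)  # both strategies produce the same rows
```

The check mirrors the order_sum_check_v counter in the experiment: whatever the strategy, the business result must be identical.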

For my experiment I varied the number of rows processed from 100K to 1M in 10 sets. That is, 100K, 200K,..., 1000K. I took 12 samples for both row and set processing at each row set level. I would have liked to take more samples, but the row processing took a long time. The entire experiment ran for around 40 hours.

The detailed results were stored in the op_results table. I wrote a couple of simple queries to make analyzing the data in Mathematica easier. I ran the test on (cat /proc/version): Linux version 2.6.32-300.3.1.el6uek.x86_64 (mockbuild@ca-build44.us.oracle.com) (gcc version 4.4.4 20100726 (Red Hat 4.4.4-13) (GCC) ) #1 SMP Fri Dec 9 18:57:35 EST 2011. The Dell Precision T5500, a 6 CPU core box, has 24GB of RAM and runs Oracle 11.2.0.1.0.

Experimental Results

View the raw experimental data results here. You can download all the experimental files (scripts, results data, significance test details, throughput and scalability details in both PDF and Mathematica... everything) HERE.

The experiment was designed to answer these two questions:

1. Does set processing truly bring increased performance, in terms of increased throughput?

2. Does set processing scale better than row processing, in relation to increased data processing volume?

Let's look at each of these in more detail (not that much detail though).

1. Does set processing truly bring increased performance?

Take a quick look at the below chart. Each point represents 12 samples for a set number of rows to process. The red square points are the set processing results and the round blue points are the row processing results.

The chart above clearly shows set processing operates at a much higher throughput than row processing. In fact, with 100K rows to process, set processing throughput was 3741 times greater than row processing throughput (1526476 rows/sec vs 408 rows/sec).

Our experiment showed from 100K to 1M rows processed, set based processing's throughput was much higher compared to row based processing.

2. Does set processing scale better than row processing, in relation to increased data processing volume?

Have you ever heard someone say, "I can't understand why it's taking so much longer now. It took 10 minutes to process 10K rows, but with 50K rows it still isn't finished after four hours!" What that person does not understand is application scalability (among other things).

The misconception is believing that if 10K rows are processed in 10 minutes, then 50K rows will be processed in 50 minutes, and so on. Set processing is much more likely to scale linearly than row processing because it operates on groups, or sets, of rows. In contrast, row processing just plows through the data one row at a time, missing out on the natural grouping that set processing provides. (In this experiment the row-based code also re-scans the entire op_order table for every customer, so its work grows with the product of the two tables' row counts.)
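The misconception can be made concrete with a deliberately simplistic cost model. This is my own sketch, not part of the experiment: it just counts the comparisons each strategy performs, assuming the row-based inner loop re-scans every order per customer while a set-based join touches each row roughly once.

```python
# Toy cost model (illustrative only): units of work done by each strategy.
def row_work(customers, orders):
    return customers * orders   # inner loop re-scans all orders per customer

def set_work(customers, orders):
    return customers + orders   # each row touched roughly once

# Growing the data 5x grows row-based work 25x, but set-based work only 5x.
for n in (10_000, 50_000):
    print(f"{n:>6} rows: row_work={row_work(n, n):>13,}  set_work={set_work(n, n):>8,}")
```

Under this model, "10 minutes for 10K rows" extrapolates to hours, not 50 minutes, for 50K rows of row-based work; which is exactly the surprise the developer ran into.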

The above chart shows the time required to process a given number of rows. As in the first chart, the row processing points are the blue circles and the set processing points are the red squares. The set based results are difficult to see because the points lie virtually on the x-axis! So obviously, set based processing is faster and stays faster as the number of rows increases. Don't ever let someone convince you row based processing is faster than set based processing unless they can clearly demonstrate it using your real data.

For the statistically minded: the time difference is smallest with the fewest number of rows processed, that is, 100K. Because I'm plotting averages, we can't visually tell for sure whether there is a statistically significant difference between row and set processing. For example, the variance could be massive, in which case there really wouldn't be a difference. So I performed a statistical significance test (download link at the top of the Experimental Results section) using the 12 elapsed time samples from both the row and the set based processing. The p-value is 0.00000 (really), indicating it is highly unlikely the two sample sets came from the same population. This means they are different. OK... no surprise here.
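The kind of two-sample comparison involved can be sketched with Welch's t statistic (which does not assume equal variances). The timing samples below are hypothetical stand-ins; the real 12-sample data lives in the downloadable op_results files.

```python
from statistics import mean, variance
from math import sqrt

def welch_t(a, b):
    """Welch's two-sample t statistic (unequal variances)."""
    va, vb = variance(a), variance(b)
    return (mean(a) - mean(b)) / sqrt(va / len(a) + vb / len(b))

# Hypothetical elapsed-time samples (seconds), 12 per strategy.
row_times = [245.1, 243.8, 246.0, 244.5, 245.7, 244.9,
             245.3, 246.2, 243.9, 245.0, 244.4, 245.8]
set_times = [0.065, 0.066, 0.064, 0.065, 0.067, 0.064,
             0.066, 0.065, 0.064, 0.066, 0.065, 0.064]

t = welch_t(row_times, set_times)
print(t)  # far beyond any critical value, so the p-value is effectively zero
```

With a t statistic this large, the probability that both sample sets came from the same population is vanishingly small, which matches the reported p-value of 0.00000.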

The contrast between set and row processing scalability is evident both visually and numerically. Looking at the above graph, you can see the difference between row and set processing grows as the number of rows to process increases. Numerically, when fitting a linear trend line, the slope for the row based processing data is 0.00259 while the slope for the set processing data is 0.00000. (For set processing, the slope is actually 6.17x10^-7.)

This means that as the number of rows to processes increases, the time difference between row and set processing continues to increase.

So not only is row based processing slower as the amount of work increases, the time difference continues to increase as the amount of work increases.
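The trend-line slopes quoted above come from an ordinary least-squares fit. Below is a sketch of that calculation; the (rows, seconds) points are hypothetical, chosen only to mimic the reported order of magnitude of each slope.

```python
# Ordinary least-squares slope: the same kind of linear trend line fit above.
def slope(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

# Hypothetical (rows, seconds) points: row-based time climbs steeply,
# set-based time barely moves off the x-axis.
rows = [100_000, 200_000, 300_000, 400_000]
row_secs = [260, 520, 780, 1040]
set_secs = [0.06, 0.12, 0.19, 0.25]

print(slope(rows, row_secs))  # ~0.0026 seconds per additional row
print(slope(rows, set_secs))  # ~6.4e-7 seconds per additional row
```

A slope is seconds of elapsed time added per extra row, so a row-based slope thousands of times larger than the set-based slope is exactly the widening gap seen in the chart.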

Conclusions

If you're a DBA, you already knew in your gut what the conclusions would be. But now, for both of us, we can stand in front of people and point to experimental evidence clearly showing:

1. Set processing delivers far higher throughput than row processing.

2. Set processing scales far better than row processing as the data volume increases.

Thanks for reading!

Craig.

If you have any questions or comments, feel free to email me directly at craig at orapub.com.
