Oracle Focus: Log File Switch Checkpoint Incomplete
Users are livid and the top Oracle wait event is - log file switch checkpoint incomplete. That's how we found our users in my previous post. So, we started the performance analysis by gathering some baseline performance statistics.
In this article we will focus on the Oracle analysis by doing an Oracle time-based analysis to highlight the Oracle performance improving opportunities. As you will see, even without looking at the operating system or the application, we will make a big impact. How I did this and how massive was the impact? That's the focus of this post!
Oracle: Time-Based Analysis
An Oracle time-based analysis identifies performance improving opportunities by highlighting categories containing lots of Oracle time. Because both our analysis and the user experience is based on time, we can create a quantitative link between our analysis and the user experience. It's not going to be perfect, but there will be a relationship.
Essentially, I'm logically categorizing all the Oracle time (cpu and non-idle wait time) and then turning my focus on those categories containing lots of time. If my objective is to reduce time and especially time to process a piece of work like a buffer get, then those "big time" categories will likely present me with significant performance improving opportunities. It is that simple.
SQL> @ttpctx Remember: This report must be run twice so both the initial and final values are available. If no output, press ENTER about 11 times. DB/Inst: prod30/prod30 12-May 11:05am Report: ttpctx.sql OSM by OraPub, Inc. Page 1 Total Time Activity (359 sec interval) Avg Time Time Wait Time Component % TT % WT Waited (ms) (sec) Count(k) -------------------------------------- ------- ------- ----------- ----------- -------- CPU consumption: Oracle SP + BG procs 45.82 0.00 0.000 1164.066 0 log file switch (checkpoint incomplete 30.75 56.75 634.083 781.190 1 db file async I/O submit 7.09 13.08 323.831 180.050 1 db file scattered read 4.88 9.00 0.452 123.920 274 control file parallel write 2.73 5.04 20.151 69.400 3 log file switch completion 2.13 3.93 163.505 54.120 0 log file parallel write 1.95 3.59 29.032 49.470 2 enq: CF - contention 1.39 2.56 108.889 35.280 0 enq: CI - contention 1.07 1.97 460.000 27.140 0 db file single write 0.87 1.61 12.449 22.110 2 asynch descriptor resize 0.35 0.65 0.001 8.980 ##### log file single write 0.33 0.61 18.986 8.430 0 PX qref latch 0.10 0.18 0.001 2.500 2468 db file sequential read 0.06 0.11 0.007 1.480 221 latch: cache buffers chains 0.06 0.10 2.132 1.420 1 PX Deq: Slave Session Stats 0.05 0.09 0.177 1.270 7 buffer busy waits 0.05 0.09 3.944 1.270 0 log file sync 0.04 0.07 125.000 1.000 0 ...
Including both the CPU and around 80% of the non-idle wait time, I will categorize the time as follows.
CPU 1164 sec Wait 1154 sec IO 1154 sec Read 124 sec db file scattered read 124 sec Write 1030 sec log file switch checkpoint incl 781 sec db file async I/O submit 180 sec control file parallel write 69 sec Other 0 sec
Clearly, we have an opportunity to reduce time by focusing on both CPU consumption and non-idle wait time. Regarding the non-idle wait time, there is a clear opportunity to focus on IO write time and specifically time related to the log file switch checkpoint incomplete wait time.
Next, I will develop some very "Oracle focused" solutions to directly go after these big chunks of time!
Oracle Focused Solutions
If you have followed my work, you will know I always analyze performance and develop solutions using three different perspectives; Oracle, the application (SQL, processes, modules, etc.) and the operating system. In this article, I will focus only on the Oracle focused solutions. In my next post, I will focus on the operating system analysis and solutions. And finally, I'll focus on the application analysis and solutions along with bringing the entire analysis to a close.
Clearly, our two big opportunities are to reduce CPU consumption and the log file switch time. Remember, I am purposely excluding both operating system and application focused solutions.
Solution: Reduce CPU Consumption
Off the top of my head, I can't think of any Oracle adjustments I can make to reduce CPU consumption. I'm also not very concerned because I have a hunch the real problem is something else...
Solution: Increase Time Between Log Switches
In my previous post our baseline recorded the online redo logs switched an average of 36.33 times a minute, which is an average of one switch every 1.65 seconds! I don't think I'll get much pushback when I say, THAT IS CRAZY!
Every log switch forces a checkpoint and every checkpoint forces the DBWRs to write every dirty buffer to disk. Plus there is a fixed amount of overhead for each checkpoint. For efficiency reasons (think: batching), Oracle wants to write when it wants to, so a lot of forcing can cause unnecessary overhead and wait time.
Personally, I like redo logs to switch no more often than once every 15 minutes and 30 minutes is even better. However, because employment is important to me, recovery is more important than performance...be sure you can recover quick enough if you increase the time between log switches.
The simplest Oracle focused way to increase the time between log switches is to increase the size of each redo log. The question now becomes how big should the new redo logs be.
How Big Should The Redo Logs Be?
By increasing the redo log size, the log switch frequency will decrease.
If we know the current switch rate and the current redo log size, one low precision calculation method requires only a simple Algebraic equation. For this example, let's say our goal is one switch once every 3 minutes, not once every 30 minutes as I would do with a real production system.
First, I need to know the current switch rate. Our baseline recorded an average switch rate of 36.33 switches per minute. Second, I need to know the size of the current online redo logs. Here are their sizes:
$ ls -l /*/oradata/prod30/*log -rw-r----- 1 oracle oinstall 4194816 May 2 11:40 /home/oradata/prod30/redoA1.log -rw-r----- 1 oracle oinstall 4194816 May 2 11:40 /home/oradata/prod30/redoA2.log
Wow! Someone made what I call a, "career decision." This Oracle system has two 4MB redo logs. No wonder they are switching so fast! Let's calculate the new redo log size.
Here's how I would set up the equation.
180 secs between switches / ? MB redo log = 1.65 sec between switches / 4 MB redo log 180/x = 1.65/4 x = 180*(4/1.65) x = 436.36 MB
So, for each redo log switch to be around 180 seconds, each redo log needs to be around 437 MB. I'm not quite done yet, because having only two redo logs require 50% of all my redo log space to be checkpointed before the LGWR can switch into the next redo log.
How Many Redo Logs Should There Be?
Now I'm going to focus on the number of online redo logs. Adding more online redo logs decreases the likelihood of receiving a checkpoint incomplete wait.
Basically, I am trying to reduce the amount of checkpoint activity Oracle must complete that is associated with the log the LGWR will attempt to switch into. If the checkpoint associated with the next redo log is not finished when the LGWR attempts to switch into it, the LGWR will have to wait... for the checkpoint to complete.
To help make this clear, here's an example. Suppose the total redo log space is 200MB. With two 100MB redo logs, 50% of all my redo log space must be available before the LGWR can switch into the next redo log. However, with four 25MB redo logs, only 25% of all my redo must be be available before the LGWR can switch into the next redo log.
So, having four redo logs instead of two redo logs, I am requiring Oracle to only complete a checkpoint associated with 25MB instead of 100MB. 25MBs will surely be checkpointed faster than 100MB, thereby reducing the chances of the LGWR having to wait for the checkpoint to complete.
By combining both of the above redo size and number strategies when sizing my redo logs, I should be able to solve just about any log file switch checkpoint incomplete wait situation. This is especially true if I am also able to alter the operating system and application.
Making The Redo Log Changes
To keep this simple, I'm going to create three 440MB redo logs, for a total of 1320MB. The 440MBs will reduce the log switch rate and the third redo log will directly address my goal of reducing the likelihood of a checkpoint incomplete. Here's how I did this:
SQL> ALTER DATABASE ADD LOGFILE '/home/oradata/prod30/redo41.log' size 440m reuse; Database altered. SQL> ALTER DATABASE ADD LOGFILE '/home/oradata/prod30/redo42.log' size 440m reuse; Database altered. SQL> ALTER DATABASE ADD LOGFILE '/home/oradata/prod30/redo43.log' size 440m reuse; Database altered. SQL> !ls -l /*/oradata/prod30/*log -rw-r----- 1 oracle oinstall 461373952 May 12 11:22 /home/oradata/prod30/redo41.log -rw-r----- 1 oracle oinstall 461373952 May 12 11:21 /home/oradata/prod30/redo42.log -rw-r----- 1 oracle oinstall 461373952 May 12 11:22 /home/oradata/prod30/redo43.log -rw-r----- 1 oracle oinstall 4194816 May 12 11:21 /home/oradata/prod30/redoA1.log -rw-r----- 1 oracle oinstall 4194816 May 12 11:21 /home/oradata/prod30/redoA2.log SQL> alter database drop logfile '/home/oradata/prod30/redoA2.log'; Database altered. SQL> alter database drop logfile '/home/oradata/prod30/redoA1.log'; Database altered. SQL> select group#, member from v$logfile; GROUP# MEMBER ---------- -------------------------------- 2 /home/oradata/prod30/redo41.log 3 /home/oradata/prod30/redo42.log 5 /home/oradata/prod30/redo43.log 3 rows selected.
All done? Not really. Aren't you just a little curious if this made a difference? I know I am!
Next, I once again gathered my baseline statistics. Below are the results.
Wow, what happened? Essentially, I opened the redo flood gates (a little) and the redo just started streaming through Oracle.
Notice the log switches per minute. That decreased from over 36 per minute to 0.35 per minute, which is about 2.9 minutes per log switch. I you recall from above, my objective was to size the redo logs for one around every three minutes. Looks like my simple sizing formula worked pretty good!
Notice the blocks changed per second jumped from around 13k/sec to 21k/second. That's around a 63% increase and represents a fantastic throughput improvement. I'm very pleased with this result.
But what about the log switch wait time percentages? Well... the redo was so intense and my IO subsystem so feeble, my Oracle time-based script, ttpctx.sql did not complete within one minute. It's because my script inserts some of the interim results into a table.. I think I need to work on my script. Let's try another approach.
Another way to check if there is lots of checkpoint incomplete time is too look at the alert log file. Here is an excerpt.
... Thu May 12 12:56:09 2016 Thread 1 cannot allocate new log, sequence 345677 Checkpoint not complete Current log# 1 seq# 345676 mem# 0: /home/oradata/prod30/redo41.log Thu May 12 13:01:21 2016 Thread 1 advanced to log sequence 345677 (LGWR switch) Current log# 4 seq# 345677 mem# 0: /home/oradata/prod30/redo42.log Thu May 12 13:02:13 2016 Thread 1 cannot allocate new log, sequence 345678 Checkpoint not complete Current log# 4 seq# 345677 mem# 0: /home/oradata/prod30/redo42.log Thu May 12 13:05:30 2016 Thread 1 advanced to log sequence 345678 (LGWR switch) Current log# 5 seq# 345678 mem# 0: /home/oradata/prod30/redo43.log ...
We can see two things. First, there are still checkpoint not complete's occurring. But the log switches are occurring much less frequently; not in seconds but in minutes. That's a good thing! I suspect by adding additional redo logs of the same size, I could completely eliminate the checkpoint not complete wait time.
Wow, what a difference our changes made. Notice everything changed; the redo log switch rate, the workload rate and the timing profile (not returning counts as a change). So, is performance good enough? Should we stop?
Should We Stop?
A valid question is, with the massive increase in performance should we just stop? To be honest, I would NOT have implemented the solution to add more and larger redo logs until after I completed my analysis. I like to know all my options before I begin implementing a solution.
Remember, all I have done is focused on Oracle. I have not looked at the operating system or the application to see what is generating all the redo. So, should we stop the analysis? The short answer is NO.
In my next post I will continue this analysis by focusing on the operating system. We will look closely at the CPU, memory, IO and if necessary the network subsystem.
Until then, enjoy your work!
If you have any questions or comments, feel free to email me directly at craig at orapub.com.
|What Is Oracle DB Time, DB CPU, Wall Time and Non-Idle Wait Time||Operating System Focus: Log File Switch Checkpoint Incomplete||Impact Of Oracle Database Consistent Reads|