The Situation: Detailing Oracle Process CPU Consumption
Detailing an Oracle Database process's CPU consumption is amazing, a lot of fun, and can lead to some "ah ha" moments. To make this posting daily digestible, I'm breaking it up into a multi-part series. Today I'm posting the first part entitle, The Situation. Here we go...
The Situation: The Way It Is...
By now, most Oracle DBAs evaluate Oracle performance based on time, that is a time based analysis or simply TBA. For sure there are other methods. And many methods are in part, time based. But I digress.
The beauty of a time based analysis is it helps us understand what a user is experiencing. Because users experience time, when our analysis is time based we can quantify their experience. Completely? No, but for sure in part. And in part is better than no part at all and much better than many alternatives.
Oracle processes yearn for CPU! It is paramount to understand that an Oracle server or background process either consumes CPU or does not. Currently, there is no third alternative. The "does not consume CPU" is called wait time and the wait time can be classified as either "idle" or "non-idle". I like to say that Oracle processes want to burn CPU, they want to consume CPU, they yearn for CPU. If they can't, there must be a very good reason why they most stop and wait before consuming CPU. And it's the wait events that provide clues as to why CPU can not be consumed.
What Is In The Time?
An Oracle time based analysis includes both Oracle CPU consumption and Oracle wait time. And usually it excludes the time between the Oracle client process and the Oracle server process. This is shock to many, but capturing the time between the Oracle client and server process is not that simple. And it requires more than simply looking at Oracle trace files or v$ views.
The "big bar" figure below displays the "total time." That is, all the Oracle process CPU and wait time over a specific snapshot interval. This is one way to communicate the total CPU and total wait time.
There are a number of ways to collect the basic timing information. Most techniques use Oracle trace files or the Oracle v$ views.
Oracle's wait interface is amazing! Multiple database performance tool vendors will tell you Oracle's wait interface is far superior than DB2 and SQL Server. Below is a graphical example taken from the Stori product visually showing the top five wait events, that is, all the Oracle wait time for all Oracle processes over a specified snapshot interval grouped by the top five events.
The number of wait events and the detailed data available is amazing. As of Oracle Database 12c (12.1) there are just under 1600 wait events. If this seems overwhelming, most DBAs routinely only work with a handful at a time and probably truly understand even less.
Available CPU Consumption Detail
Well... the detail is limited. However, Gathering CPU consumption is very straightforward via trace files and the Oracle Database v$ views. From the v$ views, CPU consumption can be gathered for a session from v$sess_time_model and for an entire instance from v$sys_time_model. I have blogged numerous times about the quality of these CPU consumption values for both server processes and background processes here and here. (While I will not cover this here, the closer you look at Oracle CPU consumption, the more likely you understand even Oracle is making some CPU consumption assumptions.)
The glaring hole in Oracle time based analysis is Oracle does not provide detailed CPU consumption numbers. For an Oracle process the most detailed information we can obtain from Oracle is the total consumption.
If your boss asked you what Oracle is doing "with all that CPU" the best you can do is provide the total amount and then make inferences based on the Oracle wait events, the operating system situation, the application code and the user experience. That's a lot of analysis information and obviously allows for a very detailed and spot-on analysis.
But still, answering the "What is Oracle doing with all that CPU?" should in my opinion be much easier to answer and more clear. And compared to the wait interface detail, available CPU consumption detail is simply incomplete. I think as DBAs, we can do better.
What I want is the same level of CPU detail as wait event detail. Or at least something so I can reduce the amount of inferencing required when answering the question, "What is Oracle doing with all that CPU?" It would be nice to gather Oracle CPU consumption for any Oracle process grouped by Oracle kernel function. Below is an example of what I would like to see.
Once I know the Oracle function, then I can find out, with some good detail, what the process is doing while consuming the CPU...and answer "the question."
Is This Really Worth My Time?
Perhaps not. But consider these benefits. As I mentioned previously, I want to better answer the question, "What is Oracle doing with all that CPU?" Another benefit is to confirm the performance story. For example, if there is a parsing issue, I will expect to see some type of shared pool and/or library cache related CPU consumption. Another less obvious benefit is understanding the performance situation when a small amount of wait time is present. Without the wait time, our inferencing capability is limited. Knowing the CPU consumption details can negate the absence of wait time detail limitation. And finally, you may find an Oracle bug.
This is a good place to end this posting. In the next posting, we will explore the tools available to gather Oracle process CPU consumption down to the Oracle kernel function.
But if you can't wait until the next posting (or you have lots of time), here are some resources.
1. Online Seminar: Detailing Oracle CPU Consumption: The Missing Link. You can download the seminar PowerPoint for free. The seminar is awesome and you can watch the introduction right here!
2. Tool: Oracle wait time and kernel function CPU monitor. Download the fulltime.sh script HERE.
Thanks for reading,
If you have any questions or comments, feel free to email me directly at craig at orapub.com.
|Do Oracle Database LGWRs (10g, 11g, 12c) Always Sleep For The Full Three Seconds?||Which Is Better; Time Model Or ASH Data?||Impact Of Oracle Database Consistent Reads|