How To Tell If Similar SQL Statements Are Causing Parsing Issues

Posted on 14-Mar-2017 by Craig Shallahamer / OraPub / craig@orapub.com

In this second parse post, I focus on some diagnostic clues or tells related to parsing issues caused by similar yet unique SQL statements.

In Part 1, we knew similar statements was the problem because I was the one subjecting the system with the similar statements! Hopefully no one is doing this in your production system!

I will show you three similar SQL problem clues. The first clue is a performance ratio, the second are wait events and the third is creatively using the force matching signature.

What are these clues and how do we effectively use them? That's the focus of this post... and the video!

Part 1 Review

In the first part of this series I focused on demonstrating how lots of similar yet unique SQL statements caused performance issues.

I did this by flooding Oracle with thousands of carefully crafted statements. The result was increased CPU consumption and wait time without a throughput increase. The throughput was stymied because of the increased memory management activities, which include cursor allocation and deallocation and the associated shared pool latch and library cache related mutex acquisition wait time.

Throughout this article, the Oracle instance parameter cursor_sharing is set to its default value of "exact". I will specifically address cursor_sharing in my next parsing post.

Something Smells Fishy

I need to write, that before you take your time to actually go down this path, you need an indicator, that is a clue, that parsing is contributing or about to contribute to a performance incident.

Sometimes a parsing problem may be developing but we can't see it yet in the wait interface. The potential issue could still be hidden within Oracle process CPU consumption.

This may sound strange, but it is centered around the fact that while Oracle does a great job instrumenting wait time, Oracle simply does NOT instrument CPU consumption. This means a problem could be brewing by increased CPU consumption, yet is hidden from us!

I blogged about how to get Oracle CPU consumption details HERE which includes a tool and an associated video.

So, before the problem shows itself in a time based analysis, you will notice more CPU time is spent on parsing. By "more" I mean, more than when performance is OK.

For example, look at one of your AWR reports, specifically in the Instance Efficiency Percentages section. On the lower right, you will see a the statistic named, "% Non-Parse CPU:". The best case scenario is 100%, which is technically not possible. The LOWER this ratio, the MORE CPU time Oracle processes are using to parse. So, we want this ratio to be higher, rather than lower. (Yeah... I agree, it seems kind of backward.)

What's a good number? It depends and that is why Ratio Analysis sucks. If you suspect a performance issue, look at this parameter now and also look when performance is good. If you notice a drop of more than a couple percentage points, then it warrants some attention. Not that there is a problem, just that you need to take a closer look.

In my experience, this number is usually greater than 95%. But every system is different. Remember ratios are indicators, they do not tell you specifically what is wrong or what to do about it... just an indicator.

The "% Non-Parse CPU:" ratio is kind of a backdoor way to get around Oracle lack of CPU instrumentation.

Something Smells Really Fishy

When a parsing problem begins to manifest, you can tell by the top wait event. It will be something like, "latch: shared pool", "library cache: mutex X" or perhaps "cursor: pin S".

When you see wait events like this, it is time to consider the problem could be that lots of similar SQL statements are flooding into Oracle and overwhelming it. The problem could easily be something else, but it is related to Oracle's library cache.

On a related note, usually your CPU subsystem is highly utilized... but not always.

Something Is Fishy

I'm sure there are lots of ways to figure this out. But I'm going to show you my three favorite, which are based on the same concept. The first references on v$sql, the second v$active_session_history and the third dba_hist_active_sess_history.

To begin, check out the below SQL, especially the final statement referencing v$sql. There should be something that "jumps out" at you. Do you see it?

SQL> alter session set cursor_sharing=exact;

Session altered.

SQL> alter system flush shared_pool;

System altered.

SQL> select dummy from dual where dummy = '-';

no rows selected

SQL> select dummy from dual where dummy = 'S';

no rows selected

SQL> select dummy from dual where dummy = 'M';

no rows selected

SQL> select dummy from dual where dummy = 'L';

no rows selected

SQL> select dummy from dual where dummy = 'X';

D
-
X

SQL> select dummy from dual where dummy = '+';

no rows selected

SQL> set tab off
SQL> set linesize 300
SQL> col sql_text format a50
SQL> col fms format 99999999999999999999

SQL> def yourSQL='dummy from dual'

SQL> select force_matching_signature fms, sql_id, sql_text  
  2  from   v$sql 
  3  where  lower(sql_text) like '%&yourSQL%' 
  4    and  lower(sql_text) not like '%v$sql%';
old   3: where	lower(sql_text) like '%&yourSQL%'
new   3: where	lower(sql_text) like '%dummy from dual%'

                  FMS SQL_ID        SQL_TEXT
--------------------- ------------- --------------------------------------------------
  8392995687108182889 fs970tvybwqks select dummy from dual where dummy = '+'
  8392995687108182889 f1z17mqqdx37t select dummy from dual where dummy = 'L'
  8392995687108182889 b5k69vyp3jdnx select dummy from dual where dummy = 'S'
  8392995687108182889 fd5a3zsgdu0gx select dummy from dual where dummy = 'X'
  8392995687108182889 bv4x5rft7um64 select dummy from dual where dummy = '-'
  8392995687108182889 0vnnj9yhdzqqh select dummy from dual where dummy = 'M'

6 rows selected.

SQL>

Notice that each SQL statement is textually unique. This is why their associated SQL_IDs are also unique. Each SQL_ID is associated with an Oracle parent cursor (and other very cool stuff) inside the Oracle Library Cache, which is located within the Shared Pool, which is located within the System Global Area (SGA).

But what is really interesting is we can see that Oracle knows these statements are similar because their force_matching_signature (FMS) is the same!

This Fish Stinks

If we have a lot of similar statements, they will each have their own unique SQL_ID but they are likely to have the same force matching signature. (Remember that cursor_sharing is set the default value of "exact".) We can use this understanding to create a very simple script to alert us of a possible similar statement situation.

It's as simple as this:

def yourSQL='dummy from dual'
def threshold=1

select force_matching_signature fms, count(distinct(sql_id))
from   v$sql
where  lower(sql_text) like '%&yourSQL%'
  and  lower(sql_text) not like '%v$sql%' 
group by force_matching_signature
having count(distinct(sql_id)) >= &threshold;

Here is the output from my prior example above.

SQL> select force_matching_signature fms, count(distinct(sql_id))
  2  from   v$sql
  3  where  lower(sql_text) like '%&yourSQL%'
  4    and  lower(sql_text) not like '%v$sql%'
  5  group by force_matching_signature
  6  having count(distinct(sql_id)) >= &threshold;
old   3: where  lower(sql_text) like '%&yourSQL%'
new   3: where  lower(sql_text) like '%dummy from dual%'
old   6: having count(distinct(sql_id)) >= &threshold
new   6: having count(distinct(sql_id)) >= 1

                  FMS COUNT(DISTINCT(SQL_ID))
--------------------- -----------------------
  8392995687108182889                       6

Of course in a production system with a similar statement problem, you won't have six distinct SQL_IDs related to one force matching signature (FMS), you will have hundreds, thousands or even millions!

This Fish Really Stinks

We can take this a step further and really nail it! While having millions of similar statements related to a single FMS is cause for alarm, what if each statement is only executed once? Remember, the first execution is a hard parse, which requires significantly more memory management (see my post HERE and HERE).

Here's how we can figure out if each of these above six statements (or millions in a production situation) is being hard parsed, that is executed only once.

SQL> def threshold=1
SQL> def yourSQL='dummy from dual'
SQL> select force_matching_signature fms, count(distinct(sql_id)), sum(executions)
  2  from   v$sql
  3  where  lower(sql_text) like '%&yourSQL%'
  4    and  lower(sql_text) not like '%v$sql%'
  5  group by force_matching_signature
  6  having count(distinct(sql_id)) >= &threshold;
old   3: where	lower(sql_text) like '%&yourSQL%'
new   3: where	lower(sql_text) like '%dummy from dual%'
old   6: having count(distinct(sql_id)) >= &threshold
new   6: having count(distinct(sql_id)) >= 1

                  FMS COUNT(DISTINCT(SQL_ID)) SUM(EXECUTIONS)
--------------------- ----------------------- ---------------
  8392995687108182889                       6               6

It's so simple is ridiculous! There are six distinct SQL_IDs and six total executions, which means only one execution per SQL_ID... and we know the first execution is a hard parse! Therefore, in the above situation each of the six SQL_IDs has been executed only one time! Six is no big deal, but thousands or millions? That is a very big deal and will literally eat up your CPU capacity until there is nothing left.

By the way, if your cloud provider charges you based on CPU consumption, you will likely receive a VIP invitation to their next event ;)

From a practical standpoint, I would not run the above script until either my CPU parse ratio looks relatively bad or if the Oracle timing profile indicates there are potential parsing issues. By "Oracle timing profile" I mean, the top wait event is parsing related.

Let's Watch The Fish Decay

While reading the above is amazingly wonderful and insightful, I'm with most of you in that, I like to see live action! So I created a video. I will eventually be integrating the above scripts into my free OSM toolkit.

Just in case you are wondering... I am not the decaying fish!

Using ASH and AWR ASH

I did promise at the beginning of this post, the SQL so you can do the same thing above using ASH data and it's AWR brother. So, below is the SQL and the based on the "select dummy from dual" SQL statements I ran above.

Here is an ASH example. Just copy and paste the SQL into your system... after review the SQL of course.

set linesize 300
col fms format 9999999999999999999999999

select force_matching_signature fms, count(distinct(sql_id))
from   v$active_session_history
where  sample_time >= current_timestamp - interval '300' minute
group by force_matching_signature
having count(distinct(sql_id)) > 1
order by 2 desc
fetch first 10 rows only
/

Here is the result of a few runs from production systems.

SQL> /

                  FMS COUNT(DISTINCT(SQL_ID))
--------------------- -----------------------
  7740087316336291928                    5973
  7456868002725994366                      65
                    0                      44

Here is an AWR example. Just reset the dbid, instnum, snap_start and snap_end defines. If you have an AWR report close by, all the key numbers are located near the top.

DEFINE DBID            = "4243212345" (CHAR)
DEFINE INSTNUM         = "3" (CHAR)
DEFINE SNAP_START      = "31092" (CHAR)
DEFINE SNAP_END        = "31099" (CHAR)

set linesize 300
col fms format 99999999999999999999
select force_matching_signature fms, count(distinct(sql_id)),
       sum(executions_delta), max(version_count)
from   DBA_HIST_SQLSTAT
where  dbid = &dbid
  and  instance_number = &instnum
  and  snap_id between &snap_start and &snap_end
group by force_matching_signature
having count(distinct(sql_id)) > 1
order by 2 desc
fetch first 10 rows only
/

Here is the result from a production system. (Confession: Actually I combined the results from a few different systems.)

FMS                   COUNT(DISTINCT(SQL_ID)) SUM(EXECUTIONS_DELTA) MAX(VERSION_COUNT)
--------------------- ----------------------- --------------------- ------------------
                   0                     127              56979630                603
10559245208183986822                       3                116240                  1
13070442437085537404                       2                  2371                 31
 1015971764210959111                       5                     2                  1
 1943202972861263874                       5                     5                  1
 2958112992603642846                       4                     4                  1
 4898127320101163080                       3                     3                  1
10559245208183986822                       3                116240                  1
13070442437085537404                       2                  2371                 31
  440198342091707221                       2                     2                  1
 2088505473859546747                       2                     4                  1
 3988590000534298404                       2                   766                  2
14052807162544669559                       2                     2                  1
 2401367505072438446                       2                     8                  2

Getting Value From A Stinky Fish

At this point, you may be wondering. Gee Craig, thanks for showing us how to diagnose the problem. That's great, but how do we solve the problem? And how do we prevent it from happening in the future.

As you might have guessed, that's the focus on my next post!

I hope you enjoy the video and thanks for reading!

Craig.

If you have any questions or comments, feel free to email me directly at craig at orapub.com.

Is It OK To Be An Oracle Database Tuning Heretic? Oracle Database SQL Arrival Patterns And Impact How To Tell If Similar SQL Statements Are Causing Parsing Issues