Why my execution plan has not been shared

As mentioned in the previous three articles of the series there are sixty-four reasons for which the Oracle Optimizer might decide to hard parse a new execution plan instead of sharing the existing one. Out of these sixty-four reasons, we have already examined and explained six: load_optimizer_stats and hash_match_failed in part I, roll_invalid_mismatch and top_level_rpi_cursor in part II, and bind_mismatch and stb_obj_mismatch in part III. The current article outlines two new supplementary reasons: pq_slave_mismatch and optimizer_mismatch. The former is not very popular in real-life applications. It occurs when a parallel slave(PX) of a parallelized query refuses to share the execution plan of its Query Coordinator (QC). The latter is, however, very popular. It kicks in whenever an optimizer property,stored in the child cursor metadata,is modified. Oracle has foreseen thirteen sub-reasons for an optimizer parameter mismatch, which we will try to examine in this article.

PQ_SLAVE_MISMATCH

Oracle defines this reason as follows:

(Y|N) Top-level slave decides not to share cursor

To say the least, Oracle has not made a big effort to define this reason with a great precision. As always, the best way to make its proper opinion about this reason for non-sharing is to figure out how to force it. Here’s how I have managed to produce this reason in Oracle version 12.1.0.1.0; first the model and then the query:

create table t1
as
with generator as (
     select --+ materialize
        rownum id
     from dual
     connect by
        level <= 1000
)
select
    trunc((rownum-1)/2) product_id
   ,mod(rownum-1,2)     flag1
   ,mod(rownum-1,3)     flag2
   ,rownum              n1
   ,lpad(rownum,30)     v1
from
   generator v1
  ,generator v2
where
   rownum <= 1e6;
  
create table t2
as select
     level id
    ,date '2012-06-07' + mod((level-1)*5,10) + interval '5' minute start_date
    ,rpad('xx',10) padding
from dual
connect by level <= 1e6;
 
begin
   dbms_stats.gather_table_stats(user, 't1');
   dbms_stats.gather_table_stats(user, 't2');
end;
/

The above script creates two simple heap tables and gathers their statistics without histogram. Let’s run the following query, get its execution plan, and check its execution statistics:

with got_my_dates as
 (select /*+ materialize */
      id
     ,padding
     ,start_date
  from t2
  where start_date > to_date('07/06/2012','dd/mm/yyyy')
  and  start_date <= to_date('10/06/2012','dd/mm/yyyy')
  )
select
  /*+ parallel(4) */
   t1.*
  ,cte.padding
  ,cte.start_date
from
   t1
  ,got_my_dates cte
where
    t1.product_id = cte.id
and t1.product_id <= 1e3;
 
SQL_ID  c5y75njp34rmf, child number 0
-----------------------------------------------------------------------------------------
| Id  | Operation                  | Name                       | Rows  |    TQ  |IN-OUT|
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |                            |       |        |    |
|   1 |  TEMP TABLE TRANSFORMATION |                            |       |        |    |
|   2 |   PX COORDINATOR           |                            |       |        |    |
|   3 |    PX SEND QC (RANDOM)     | :TQ10000                   |   998K|  Q1,00 | P->S   |
|   4 |     LOAD AS SELECT         |                            |       |  Q1,00 | PCWP       |
|   5 |      PX BLOCK ITERATOR     |                            |   998K|  Q1,00 | PCWC   |
|*  6 |       TABLE ACCESS FULL    | T2                         |   998K|  Q1,00 | PCWP   |
|   7 |   PX COORDINATOR           |                            |       |        |    |
|   8 |    PX SEND QC (RANDOM)     | :TQ20001                   |  1723 |  Q2,01 | P->S  |
|*  9 |     HASH JOIN              |                            |  1723 |  Q2,01 | PCWP  |
|  10 |      JOIN FILTER CREATE    | :BF0000                    |  1723 |  Q2,01 | PCWP  |
|  11 |       PX RECEIVE           |                            |  1723 |  Q2,01 | PCWP  |
|  12 |        PX SEND BROADCAST   | :TQ20000                   |  1723 |  Q2,00 | P->P  |
|  13 |         PX BLOCK ITERATOR  |                            |  1723 |  Q2,00 | PCWC  |
|* 14 |          TABLE ACCESS FULL | T1                         |  1723 |  Q2,00 | PCWP  |
|* 15 |      VIEW                  |                            |   998K|  Q2,01 | PCWP   |
|  16 |       JOIN FILTER USE      | :BF0000                    |   998K|  Q2,01 | PCWP   |
|  17 |        PX BLOCK ITERATOR   |                            |   998K|  Q2,01 | PCWC   |
|* 18 |         TABLE ACCESS FULL  | SYS_TEMP_0FD9D6628_1E139EE |   998K|  Q2,01 | PCWP   |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   6 - access(:Z>=:Z AND :Z<=:Z)
       filter(("START_DATE">TO_DATE(' 2012-06-07 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              AND "START_DATE"<=TO_DATE(' 2012-06-1000:00:00', 'syyyy-mm-dd hh24:mi:ss')))
   9 - access("T1"."PRODUCT_ID"="CTE"."ID")
  14 - access(:Z>=:Z AND :Z<=:Z)
       filter("T1"."PRODUCT_ID"<=1000)
  15 - filter("CTE"."ID"<=1000)
  18 - access(:Z>=:Z AND :Z<=:Z)
       filter(SYS_OP_BLOOM_FILTER(:BF0000,"C0"))
Note
-----

– Degree of Parallelism is 4 because of hint

As you can see, this query has been run parallely with a Degree of Parallelism of 4 as indicated by the Note at the bottom of its execution plan. But interestingly, it produced two execution plans (child number 0 and 1) at its very first execution, as shown below:

SQL> @gv$sql
Enter value for sql_id: c5y75njp34rmf
SQL_ID        CHILD_NUMBER FIRST_LOAD_TIME     LAST_LOAD_TIME      EXECUTIONS  END_FETCH
------------- ------------ ------------------- ------------------- ---------- ----------
c5y75njp34rmf            0 2017-04-17/10:10:55 2017-04-17/10:10:55          1          1
c5y75njp34rmf            1 2017-04-17/10:10:55 2017-04-17/10:10:55          0          0

According to the end_of_fetch_count column, it is the child number 0 that has been effectively used to run this query. Let’s see now why Oracle has decided to compile two execution plans, using Tanel Poder’s nonshared script.

SQL> @nonshared c5y75njp34rmf

Show why existing SQL child cursors were not reused (V$SQL_SHARED_CURSOR)…

SQL_ID                        : c5y75njp34rmf
ADDRESS                       : 00007FFC9157FAF0
CHILD_ADDRESS                 : 00007FFC9157E640
CHILD_NUMBER                  : 0
REASON                           : <ChildNode><ChildNumber>0</ChildNumber><ID>9</ID>
                               <reason>PQSlavemismatch(5)</reason><size>2x4</size>
                               <ctxpq_StmtId_pqctx>3455170</ctxpq_StmtId_pqctx>
                               <fxu_kxfxutqidb>3389067</fxu_kxfxutqidb></ChildNode>
CON_ID                        : 1
-----------------
SQL_ID                        : c5y75njp34rmf
ADDRESS                       : 00007FFC9157FAF0
CHILD_ADDRESS                 : 00007FFC9152B2C8
CHILD_NUMBER                  : 1
PQ_SLAVE_MISMATCH             : Y
REASON                        : <ChildNode><ChildNumber>1</ChildNumber><ID>9</ID>
                               <reason>PQSlavemismatch(5)</reason><size>2x4</size>
                               <ctxpq_StmtId_pqctx>3389067</ctxpq_StmtId_pqctx>
                               <fxu_kxfxutqidb>3455170</fxu_kxfxutqidb></ChildNode>
CON_ID                        : 1
-----------------

As you can see, we have finally simulated this reason. Simply put, we can say that it happens inevitably with a parallel query and during which one of the parallel server refuses to share the execution plan of its query coordinator. In the model that I have engineered in this article it seems that this non-sharing reason is due to the materialization of the CTE (Common Table Expression, aka the WITH clause) which I have forced with the hint /*+ materialize */. Bear in mind, however, that I have been playing with the same model into different Oracle releases without always being able to reproduce this pq_slave_mismatch reason.

OPTIMIZER_MISMATCH

Oracle defines this reason as follows:

(Y|N) The optimizer environment does not match the existing child cursor

When Oracle compiles a new execution plan (child cursor) for a parent cursor it will store information that identifies as fully as possible the environment characterizing the compilation and the execution of this execution plan. The optimizer parameters represent an important part of this stored cursor metadata. When the same parent cursor is launched again, its existing child cursor will be shared and reused provided all of its meta-ata optimizer parameters remain intact. If, however, Oracle realizes that one or many values of those stored optimizer parameters change, it will then invalidate the existing child cursor and hard parse a new one. It seems, too, that Oracle has implemented thirteen optimizer mismatch situations:

Optimizermismatch(1)
Optimizermismatch(2)
Optimizermismatch(3)
Optimizermismatch(4) → points to a px_mismatch for serial plan (not clear)
Optimizermismatch(5)
Optimizermismatch(6)
Optimizermismatch(7)
Optimizermismatch(8)
Optimizermismatch(9)
Optimizermismatch(10) → optimizer mode (first_rows, all_rows) mismatch
Optimizermismatch(11)
Optimizermismatch(12) → optimizer parameter mismatch
Optimizermismatch(13) → cardinality feedback

The best and easiest way to decipher the meaning of those 13 situations is to run the following query in different running systems and get details of the corresponding non-sharing reason using a Tanel Poder script:

SQL> select distinct sql_id
    from gv$sql_shared_cursor
    where reason like '%Optimizer mismatch%'
    order by sql_id;
 
SQL_ID
-------------
00ajuu5902429
00zqy3yd0r3p3
04kug40zbu4dm
0b0wj2ykgnnzg
0fr8zhn4ymu3v
0gx3b09qrx9f5
0m78skf1mudnb
0v3dvmc22qnam
10s3r3f17ccu3
SQL> @nonshared <sql_id>

Proceeding as such I realized that Optimizermismatch(13) is linked with cardinality feedback, 12 with a difference in the optimizer parameters, 10 with a difference in the optimizer mode, 4 with a not yet very clear px_mismatch linked with serial queries, and so on, as shown below:

SQL> @nonshared 00ajuu5902429

Show why existing SQL child cursors were not reused (V$SQL_SHARED_CURSOR)…

SQL_ID                        : 00ajuu5902429
ADDRESS                       : 00007FFC9F7DC5D8
CHILD_ADDRESS                 : 00007FFC9F7DB128
CHILD_NUMBER                  : 0
REASON                    : <ChildNode><ChildNumber>0</ChildNumber><ID>3</ID>                                            
                                <reason>Optimizermismatch(10)</reason>
                                <size>3x4</size><optimizer_mode_hinted_cursor>0
                                </optimizer_mode_hinted_cursor> <optimizer_mode_cursor>4
                                </optimizer_mode_cursor>                                                 
                                <optimizer_mode_current>1</optimizer_mode_current>
                                </ChildNode>
CON_ID                        : 1
-----------------
SQL_ID                        : 00ajuu5902429
ADDRESS                       : 00007FFC9F7DC5D8
CHILD_ADDRESS                 : 00007FFC97D79000
CHILD_NUMBER                  : 1
OPTIMIZER_MODE_MISMATCH       : Y
REASON                        : <ChildNode><ChildNumber>1</ChildNumber><ID>3</ID>                                        
                                <reason>Optimizermismatch(10)</reason>
                                <size>3x4</size> <optimizer_mode_hinted_cursor>0
                                </optimizer_mode_hinted_cursor>
                                <optimizer_mode_cursor>1</optimizer_mode_cursor                               
                                <optimizer_mode_current>4</optimizer_mode_current>
                                </ChildNode>
CON_ID                        : 1
-----------------
SQL> @nonshared gd90ygn1j4026

Show why existing SQL child cursors were not reused (V$SQL_SHARED_CURSOR)…

SQL_ID                        : gd90ygn1j4026
ADDRESS                       : 00007FFCA113DC38
CHILD_ADDRESS                 : 00007FFCA113C788
CHILD_NUMBER                  : 0
REASON                        : <ChildNode><ChildNumber>0</ChildNumber><ID>3</ID>
                                <reason>Optimizermismatch(12)</reason>
                                <size>2x312</size>
                                <_parallel_syspls_obey_force> true  false
                                </_parallel_syspls_obey_force></ChildNode>
CON_ID                        : 1
-----------------
SQL_ID                        : gd90ygn1j4026
ADDRESS                       : 00007FFCA113DC38
CHILD_ADDRESS                 : 00007FFC9AF3DDD0
CHILD_NUMBER                  : 1
OPTIMIZER_MISMATCH            : Y
REASON                        : <ChildNode><ChildNumber>1</ChildNumber><ID>3</ID>
                                <reason>Optimizermismatch(12)</reason>
                                <size>2x312</size>
                                <_parallel_syspls_obey_force> false  true
                                </_parallel_syspls_obey_force></ChildNode>
CON_ID                        : 1
-----------------

What we should essentially remember here is that whatever the value x in Optimizermismatch(x), it inevitably indicates that a mismatch has been found in the optimizer parameters stored in the existing child cursor metadata. This optimizer parameter difference will almost always force Oracle to create a new child cursor with new corresponding metadata and store the non-sharing reason into the dedicated gv$sql_shared_cursor view.

SUMMARY

In this article we examined two new reasons from gv$sql_shared_cursor view: pq_slave_mismatch and optimizer_mismatch. Generally, watch out for the former when you are running a parallel query. We have shown an example of this non-sharing reason occurring when a parallel slave refuses to share the execution plan of its query coordinator because of the materialization of a subquery factoring. The latter is a very common reason in real-life applications. It happens whenever Oracle optimizer fails to find a hundred percent matching metadata of the current child cursor with that of the existing one(s).

PQ_SLAVE_MISMATCH

OPTIMIZER_MISMATCH

SUMMARY

Start the discussion at forums.toadworld.com