As mentioned in the previous three articles of the series there are sixty-four reasons for which the Oracle Optimizer might decide to hard parse a new execution plan instead of sharing the existing one. Out of these sixty-four reasons, we have already examined and explained six: load_optimizer_stats and hash_match_failed in part I, roll_invalid_mismatch and top_level_rpi_cursor in part II, and bind_mismatch and stb_obj_mismatch in part III. The current article outlines two new supplementary reasons: pq_slave_mismatch and optimizer_mismatch. The former is not very popular in real-life applications. It occurs when a parallel slave(PX) of a parallelized query refuses to share the execution plan of its Query Coordinator (QC). The latter is, however, very popular. It kicks in whenever an optimizer property,stored in the child cursor metadata,is modified. Oracle has foreseen thirteen sub-reasons for an optimizer parameter mismatch, which we will try to examine in this article.
PQ_SLAVE_MISMATCH
Oracle defines this reason as follows:
(Y|N) Top-level slave decides not to share cursor
To say the least, Oracle has not made a big effort to define this reason with a great precision. As always, the best way to make its proper opinion about this reason for non-sharing is to figure out how to force it. Here’s how I have managed to produce this reason in Oracle version 12.1.0.1.0; first the model and then the query:
create table t1 as with generator as ( select --+ materialize rownum id from dual connect by level <= 1000 ) select trunc((rownum-1)/2) product_id ,mod(rownum-1,2) flag1 ,mod(rownum-1,3) flag2 ,rownum n1 ,lpad(rownum,30) v1 from generator v1 ,generator v2 where rownum <= 1e6; create table t2 as select level id ,date '2012-06-07' + mod((level-1)*5,10) + interval '5' minute start_date ,rpad('xx',10) padding from dual connect by level <= 1e6; begin dbms_stats.gather_table_stats(user, 't1'); dbms_stats.gather_table_stats(user, 't2'); end; /
The above script creates two simple heap tables and gathers their statistics without histogram. Let’s run the following query, get its execution plan, and check its execution statistics:
with got_my_dates as
(select /*+ materialize */
id
,padding
,start_date
from t2
where start_date > to_date('07/06/2012','dd/mm/yyyy')
and start_date <= to_date('10/06/2012','dd/mm/yyyy')
)
select
/*+ parallel(4) */
t1.*
,cte.padding
,cte.start_date
from
t1
,got_my_dates cte
where
t1.product_id = cte.id
and t1.product_id <= 1e3;
SQL_ID c5y75njp34rmf, child number 0
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | TQ |IN-OUT|
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | |
| 1 | TEMP TABLE TRANSFORMATION | | | | |
| 2 | PX COORDINATOR | | | | |
| 3 | PX SEND QC (RANDOM) | :TQ10000 | 998K| Q1,00 | P->S |
| 4 | LOAD AS SELECT | | | Q1,00 | PCWP |
| 5 | PX BLOCK ITERATOR | | 998K| Q1,00 | PCWC |
|* 6 | TABLE ACCESS FULL | T2 | 998K| Q1,00 | PCWP |
| 7 | PX COORDINATOR | | | | |
| 8 | PX SEND QC (RANDOM) | :TQ20001 | 1723 | Q2,01 | P->S |
|* 9 | HASH JOIN | | 1723 | Q2,01 | PCWP |
| 10 | JOIN FILTER CREATE | :BF0000 | 1723 | Q2,01 | PCWP |
| 11 | PX RECEIVE | | 1723 | Q2,01 | PCWP |
| 12 | PX SEND BROADCAST | :TQ20000 | 1723 | Q2,00 | P->P |
| 13 | PX BLOCK ITERATOR | | 1723 | Q2,00 | PCWC |
|* 14 | TABLE ACCESS FULL | T1 | 1723 | Q2,00 | PCWP |
|* 15 | VIEW | | 998K| Q2,01 | PCWP |
| 16 | JOIN FILTER USE | :BF0000 | 998K| Q2,01 | PCWP |
| 17 | PX BLOCK ITERATOR | | 998K| Q2,01 | PCWC |
|* 18 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6628_1E139EE | 998K| Q2,01 | PCWP |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
6 - access(:Z>=:Z AND :Z<=:Z)
filter(("START_DATE">TO_DATE(' 2012-06-07 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
AND "START_DATE"<=TO_DATE(' 2012-06-1000:00:00', 'syyyy-mm-dd hh24:mi:ss')))
9 - access("T1"."PRODUCT_ID"="CTE"."ID")
14 - access(:Z>=:Z AND :Z<=:Z)
filter("T1"."PRODUCT_ID"<=1000)
15 - filter("CTE"."ID"<=1000)
18 - access(:Z>=:Z AND :Z<=:Z)
filter(SYS_OP_BLOOM_FILTER(:BF0000,"C0"))
Note
-----
– Degree of Parallelism is 4 because of hint
As you can see, this query has been run parallely with a Degree of Parallelism of 4 as indicated by the Note at the bottom of its execution plan. But interestingly, it produced two execution plans (child number 0 and 1) at its very first execution, as shown below:
SQL> @gv$sql Enter value for sql_id: c5y75njp34rmf SQL_ID CHILD_NUMBER FIRST_LOAD_TIME LAST_LOAD_TIME EXECUTIONS END_FETCH ------------- ------------ ------------------- ------------------- ---------- ---------- c5y75njp34rmf 0 2017-04-17/10:10:55 2017-04-17/10:10:55 1 1 c5y75njp34rmf 1 2017-04-17/10:10:55 2017-04-17/10:10:55 0 0
According to the end_of_fetch_count column, it is the child number 0 that has been effectively used to run this query. Let’s see now why Oracle has decided to compile two execution plans, using Tanel Poder’s nonshared script.
SQL> @nonshared c5y75njp34rmf
Show why existing SQL child cursors were not reused (V$SQL_SHARED_CURSOR)…
SQL_ID : c5y75njp34rmf ADDRESS : 00007FFC9157FAF0 CHILD_ADDRESS : 00007FFC9157E640 CHILD_NUMBER : 0 REASON : <ChildNode><ChildNumber>0</ChildNumber><ID>9</ID> <reason>PQSlavemismatch(5)</reason><size>2x4</size> <ctxpq_StmtId_pqctx>3455170</ctxpq_StmtId_pqctx> <fxu_kxfxutqidb>3389067</fxu_kxfxutqidb></ChildNode> CON_ID : 1 ----------------- SQL_ID : c5y75njp34rmf ADDRESS : 00007FFC9157FAF0 CHILD_ADDRESS : 00007FFC9152B2C8 CHILD_NUMBER : 1 PQ_SLAVE_MISMATCH : Y REASON : <ChildNode><ChildNumber>1</ChildNumber><ID>9</ID> <reason>PQSlavemismatch(5)</reason><size>2x4</size> <ctxpq_StmtId_pqctx>3389067</ctxpq_StmtId_pqctx> <fxu_kxfxutqidb>3455170</fxu_kxfxutqidb></ChildNode> CON_ID : 1 -----------------
As you can see, we have finally simulated this reason. Simply put, we can say that it happens inevitably with a parallel query and during which one of the parallel server refuses to share the execution plan of its query coordinator. In the model that I have engineered in this article it seems that this non-sharing reason is due to the materialization of the CTE (Common Table Expression, aka the WITH clause) which I have forced with the hint /*+ materialize */. Bear in mind, however, that I have been playing with the same model into different Oracle releases without always being able to reproduce this pq_slave_mismatch reason.
OPTIMIZER_MISMATCH
Oracle defines this reason as follows:
(Y|N) The optimizer environment does not match the existing child cursor
When Oracle compiles a new execution plan (child cursor) for a parent cursor it will store information that identifies as fully as possible the environment characterizing the compilation and the execution of this execution plan. The optimizer parameters represent an important part of this stored cursor metadata. When the same parent cursor is launched again, its existing child cursor will be shared and reused provided all of its meta-ata optimizer parameters remain intact. If, however, Oracle realizes that one or many values of those stored optimizer parameters change, it will then invalidate the existing child cursor and hard parse a new one. It seems, too, that Oracle has implemented thirteen optimizer mismatch situations:
- Optimizermismatch(1)
- Optimizermismatch(2)
- Optimizermismatch(3)
- Optimizermismatch(4) → points to a px_mismatch for serial plan (not clear)
- Optimizermismatch(5)
- Optimizermismatch(6)
- Optimizermismatch(7)
- Optimizermismatch(8)
- Optimizermismatch(9)
- Optimizermismatch(10) → optimizer mode (first_rows, all_rows) mismatch
- Optimizermismatch(11)
- Optimizermismatch(12) → optimizer parameter mismatch
- Optimizermismatch(13) → cardinality feedback
The best and easiest way to decipher the meaning of those 13 situations is to run the following query in different running systems and get details of the corresponding non-sharing reason using a Tanel Poder script:
SQL> select distinct sql_id from gv$sql_shared_cursor where reason like '%Optimizer mismatch%' order by sql_id; SQL_ID ------------- 00ajuu5902429 00zqy3yd0r3p3 04kug40zbu4dm 0b0wj2ykgnnzg 0fr8zhn4ymu3v 0gx3b09qrx9f5 0m78skf1mudnb 0v3dvmc22qnam 10s3r3f17ccu3 SQL> @nonshared <sql_id>
Proceeding as such I realized that Optimizermismatch(13) is linked with cardinality feedback, 12 with a difference in the optimizer parameters, 10 with a difference in the optimizer mode, 4 with a not yet very clear px_mismatch linked with serial queries, and so on, as shown below:
SQL> @nonshared 00ajuu5902429
Show why existing SQL child cursors were not reused (V$SQL_SHARED_CURSOR)…
SQL_ID : 00ajuu5902429 ADDRESS : 00007FFC9F7DC5D8 CHILD_ADDRESS : 00007FFC9F7DB128 CHILD_NUMBER : 0 REASON : <ChildNode><ChildNumber>0</ChildNumber><ID>3</ID> <reason>Optimizermismatch(10)</reason> <size>3x4</size><optimizer_mode_hinted_cursor>0 </optimizer_mode_hinted_cursor> <optimizer_mode_cursor>4 </optimizer_mode_cursor> <optimizer_mode_current>1</optimizer_mode_current> </ChildNode> CON_ID : 1 ----------------- SQL_ID : 00ajuu5902429 ADDRESS : 00007FFC9F7DC5D8 CHILD_ADDRESS : 00007FFC97D79000 CHILD_NUMBER : 1 OPTIMIZER_MODE_MISMATCH : Y REASON : <ChildNode><ChildNumber>1</ChildNumber><ID>3</ID> <reason>Optimizermismatch(10)</reason> <size>3x4</size> <optimizer_mode_hinted_cursor>0 </optimizer_mode_hinted_cursor> <optimizer_mode_cursor>1</optimizer_mode_cursor <optimizer_mode_current>4</optimizer_mode_current> </ChildNode> CON_ID : 1 ----------------- SQL> @nonshared gd90ygn1j4026
Show why existing SQL child cursors were not reused (V$SQL_SHARED_CURSOR)…
SQL_ID : gd90ygn1j4026 ADDRESS : 00007FFCA113DC38 CHILD_ADDRESS : 00007FFCA113C788 CHILD_NUMBER : 0 REASON : <ChildNode><ChildNumber>0</ChildNumber><ID>3</ID> <reason>Optimizermismatch(12)</reason> <size>2x312</size> <_parallel_syspls_obey_force> true false </_parallel_syspls_obey_force></ChildNode> CON_ID : 1 ----------------- SQL_ID : gd90ygn1j4026 ADDRESS : 00007FFCA113DC38 CHILD_ADDRESS : 00007FFC9AF3DDD0 CHILD_NUMBER : 1 OPTIMIZER_MISMATCH : Y REASON : <ChildNode><ChildNumber>1</ChildNumber><ID>3</ID> <reason>Optimizermismatch(12)</reason> <size>2x312</size> <_parallel_syspls_obey_force> false true </_parallel_syspls_obey_force></ChildNode> CON_ID : 1 -----------------
What we should essentially remember here is that whatever the value x in Optimizermismatch(x), it inevitably indicates that a mismatch has been found in the optimizer parameters stored in the existing child cursor metadata. This optimizer parameter difference will almost always force Oracle to create a new child cursor with new corresponding metadata and store the non-sharing reason into the dedicated gv$sql_shared_cursor view.
SUMMARY
In this article we examined two new reasons from gv$sql_shared_cursor view: pq_slave_mismatch and optimizer_mismatch. Generally, watch out for the former when you are running a parallel query. We have shown an example of this non-sharing reason occurring when a parallel slave refuses to share the execution plan of its query coordinator because of the materialization of a subquery factoring. The latter is a very common reason in real-life applications. It happens whenever Oracle optimizer fails to find a hundred percent matching metadata of the current child cursor with that of the existing one(s).
Start the discussion at forums.toadworld.com