Cost Based Optimization ! Query Performance Issue after creating a new Index…
March 27, 2010 14 Comments
Creating an Index to optimize performance of a Query is usually an adopted strategy by most of the Database Administrators or Application Vendors. While it may optimize the performance of the query for which it has been created, but can have an impact on other queries, as these may start picking up the newly created index and this new index may not be optimal for these queries. This is especially true where Indexing Strategy is not up to the mark. My first AIOUG Session at Hyderabad, I presented something on Proactive and Reactive Tuning. If Optimization is ignored during Application Development Life cycle, then these efforts are done in an reactive mode and this is when the database administrators or developers end up creating unwanted or more number of Indexes. With these redundant indexes in place, there is always a threat of performance impact after creating a new index and this is what transpired at one of my customer place.
Recently, the application team at one of my customer site introduced a new Batch Process that runs 2-3 times a day. After Implementation, they found the performance of this process not up-to-the mark and therefore created a new Index to optimize one of the resource intensive query of that process. After creating this new Index, the performance of the batch process improved, but few other queries started using this newly created index, with the sessions waiting on “read by other session”. Now, the customer was in a big dilemma. If they drop the new index, the batch process is impacted and if it is created, 2-3 other queries are impacted. When this was escalated to the Application Vendor, they washed off their hands stating this to be an Oracle Issue. In my opinion, this seems to be an application issue and as you read further you will come to know the reason for this, but the issue can also be attributed to Oracle Optimizer. Why the Optimizer chose a newly created index, when the performance of the query was optimal with the existing index ? Let us walk through a detailed issue and the analysis.
After Implementing a new Batch process, Query No.1 below was doing a Full Table Scan of a 80 GB table. To optimize this query and improve the completion time of the Batch process, an Index BLINE_INDX was created. There was already an Index N_T_ADVICE_DTL_IDX3 on the CASEID column as a leading column, but the optimizer chose not to use this Index, may be since this column has been used as a Range Predicate. Instead of Investigation the reason for not using this Index, the Application Vendor reverted with a new index BLINE_INDX and therefore this index was created. Once this new Index was created, the execution plan of Query No.2, which was optimally using N_T_ADVICE_DTL_IDX3 ceased to use this index and started using the New Index BLINE_INDX. While there were several more indexes on this table, I have considered only two indexes that are relevant for this posting.
Query No.1 (BLINE PROCESS)
--------------------------
SELECT ROWID,CASEID,TO_CHAR(ADVICEDATE,'YYYYMMDD'),TXNADVICEID,CHARGEID,TXNID,
TXNTYPE,ADVICETYPE,ADVICEAMT,ORIGNALAMT,STATUS,DESCRIPTION,REMARKS,BPTYPE
FROM flea.N_T_ADVICE_DTL
WHERE caseid BETWEEN :1 AND :2
AND ADVICEDATE BETWEEN :3 AND :4
AND status='A'
ORDER BY CASEID, ADVICEDATE, TXNADVICEID;
Query No.2
----------
SELECT SUM(txnadjustedamt)
FROM flea.n_t_advice_dtl
WHERE caseid = :b1
AND chargeid =93
AND status ='A';
Index Details
-------------
INDEX_NAME COLUMN_NAME COLUMN_POSITION
------------------------------ ------------------------------ ---------------
BLINE_INDX CASEID 1 <== Local
ADVICEDATE 2
STATUS 3
N_T_ADVICE_DTL_IDX3 CASEID 1 <== Local
CHARGEID 2
My services in this issue was sought when the Customer wanted the analysis on why Query 2 started picking up a new Index, while, both the columns of IDX3 index are present in Query 2 and therefore should be an obvious choice for the optimizer. This seems to be true and therefore, I started my analysis on this issue with a query to v$sql_bind_capture to get the actual production runtime value passed to the bind :1 by the application user. I executed this query and passed the bind value to a bind variable b1. The runtime plan of the query shows BLINE_INDX index being used and this matches the production issue.
SQL>@bind_capture
new 2: where sql_id='b7nkg6m692jgq'
NAME POSITION DATATYPE_STRING VALUE_STRING
------------------------------ ---------- -------------------- ------------------------------
:1 1 NUMBER 2568494
SQL>variable b1 number;
SQL>exec :b1:=2568494;
SELECT /*+ VIVEK_2568494 */ SUM(txnadjustedamt)
FROM flea.n_t_advice_dtl
WHERE caseid = :b1
AND chargeid =93
AND status ='A';
Plan Table
============
------------------------------------------------------------------+-----------------------------------+---------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time | Pstart| Pstop |
------------------------------------------------------------------+-----------------------------------+---------------+
| 0 | SELECT STATEMENT | | | | 4 | | | |
| 1 | SORT AGGREGATE | | 1 | 16 | | | | |
| 2 | PARTITION RANGE SINGLE | | 1 | 16 | 4 | 00:00:01 | KEY | KEY |
| 3 | TABLE ACCESS BY LOCAL INDEX ROWID | N_T_ADVICE_DTL | 1 | 16 | 4 | 00:00:01 | KEY | KEY |
| 4 | INDEX RANGE SCAN | BLINE_INDX | 47 | | 1 | 00:00:01 | KEY | KEY |
------------------------------------------------------------------+-----------------------------------+---------------+
Predicate Information:
----------------------
3 - filter("CHARGEID"=:SYS_B_0)
4 - access("CASEID"=:B1 AND "STATUS"=:SYS_B_1)
4 - filter("STATUS"=:SYS_B_1)
This being a partitioned table, the value 2568494 falls under the high_value of Partition 3 of the table. Therefore, the statistics for this partition was referred for the calculation and analysis. Shown below are the Optimizer Statistics for Partition 3 and the calculation that optimizer does to derive a cost of an Index Scan.
## Calculation for Partition 3 ------------------------------- TABLE_OWNER PARTITION_NAME HIGH_VALUE NUM_ROWS BLOCKS PARTITION_POSITION ------------ ------------------------ ------------- ---------- ---------- ------------------ FLEA N_T_ADVICE_DTL_FO_P3 3000000 90588615 2019809 3 Partition Column Stats ----------------------- ## Stats for Partition 3 based on Bind Capture : 2568494 COLUMN_NAME NUM_DISTINCT NUM_NULLS DENSITY HISTOGRAM --------------- ------------ ---------- ---------- --------------- CASEID 954287 0 1.0479E-06 NONE CHARGEID 105 0 .00952381 NONE STATUS 2 0 .5 NONE Partition Index Stats --------------------- INDEX_NAME PARTITION_NAME BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR AVDPK ALBKP ------------------------- ------------------------------ ---------- ----------- ------------- ----------------- ------- ---------- BLINE_INDX N_T_ADVICE_DTL_FO_P3 3 335600 8818906 46069060 5 1 N_T_ADVICE_DTL_IDX3 N_T_ADVICE_DTL_FO_P3 3 293115 1638635 48089155 29 1
Once the data was collected, it was time to compute the cost of both the Indexes. The formula for computing the cost of an Index Scan and then Table Access by Index step is :
Cost of Index Scan (A) = BLEVEL + CEIL(LEAF_BLOCKS * INDEX SELECTIVITY)
Cost of Table Scan via Index (B) = CEIL(CLUSTERING_FACOR * INDEX SELECTIVITY WITH FILTERS)
Total Cost = A + B
Analysis & Cost Calculation
In case of BLINE_INDX Index, for the selectivity, only CASEID column is to be considered. Revisit the Execution Plan above and check for the Predicate Information. While CASEID and STATUS columns exists in this Index, Access_Predicate Information shows that only CASEID column is used for Index Scan and further STATUS column is used as a Filter_predicate. Therefore, Num_Distinct of CASEID has to be used to derive Index Selectivity and product of Num_Distinct of CASEID and STATUS has to be used to compute the Index Selectivity with Filters.
Similary, for N_T_ADVICE_DTL_IDX3, since both the columns present in the Index are used in the Query, the Selectivity (Index and with Filters) will have to be computed on the product of the two columns present i.e.CASEID and CHARGEID.
One point also to be noted here is that the parameter OPTIMIZER_INDEX_COST_ADJ has been set to 15 and this is also to be taken into consideration when computing the Final Cost.
Calculation For bline Index --------------------------- select 3+ceil(335600*1/954287), ceil(46069060*1/954287*1/2) from dual; Index Cost : 4 Table Access : 4+25 = 29 OICA = 29*15/100 = 4.35 Rounded Up to 4 ---------------- For IDX3 Index -------------- select 3+ceil(293115*1/954287*1/105), ceil(48089155*1/954287*1/105) from dual; Index Cost : 4 Table Access : 4+1 = 5 OICA : 5*15/100 = 0.75 Rounded Up to 1 ---------------
Based on this calculation, the cost of IDX3 seems to be cheaper and should have been an obvious choice. Why is Optimizer computing BLINE_INDX to be cheaper than IDX3 ? There’s another better place to compare our analysis with the Optimizer Computation and this is 10053 trace file. For this, the query needs to be hard parsed and therefore, with a minor change in the query, it was reexecuted for the same bind value and a trace file was generated.
Once the 10053 trace was generated, it was easier to get into the rationale behind Optimizer Choice of BLINE_INDX index for this query. Pasted below is the relevant content of the 10053 trace file. I have also underlined the relevant value that will be used for analysis and comparison.
Table Stats::
Table: N_T_ADVICE_DTL Alias: N_T_ADVICE_DTL Partition [2]
#Rows: 90588615 #Blks: 2019809 AvgRowLen: 164.00
Index Stats::
Index: BLINE_INDX Col#: 28 27 17 PARTITION [2]
LVLS: 3 #LB: 335600 #DK: 8818906 LB/K: 1.00 DB/K: 5.00 CLUF: 46069060.00
Index: N_T_ADVICE_DTL_IDX3 Col#: 28 3 PARTITION [2]
LVLS: 3 #LB: 293115 #DK: 1638635 LB/K: 1.00 DB/K: 29.00 CLUF: 48089155.00
***************************************
SINGLE TABLE ACCESS PATH
Column (#28): CASEID(NUMBER) Part#: 2
AvgLen: 6.00 NDV: 954287 Nulls: 0 Density: 1.0479e-06 Min: 2000000 Max: 2999998
Column (#3): CHARGEID(NUMBER) Part#: 2
AvgLen: 4.00 NDV: 105 Nulls: 0 Density: 0.0095238 Min: 2 Max: 1000018
Column (#17): STATUS(VARCHAR2) Part#: 2
AvgLen: 2.00 NDV: 2 Nulls: 0 Density: 0.5
Table: N_T_ADVICE_DTL Alias: N_T_ADVICE_DTL
Card: Original: 90588615 Rounded: 1 Computed: 0.45 Non Adjusted: 0.45
Access Path: TableScan
Cost: 441312.32 Resp: 441312.32 Degree: 0
Cost_io: 425225.00 Cost_cpu: 81419528497
Resp_io: 425225.00 Resp_cpu: 81419528497
kkofmx: index filter:"N_T_ADVICE_DTL"."STATUS"=:B1
kkofmx: index filter:"N_T_ADVICE_DTL"."CHARGEID"=:B1 AND "N_T_ADVICE_DTL"."STATUS"=:B2
kkofmx: index filter:"N_T_ADVICE_DTL"."CASEID"=:B1 AND "N_T_ADVICE_DTL"."CHARGEID"=:B2 AND "N_T_ADVICE_DTL"."STATUS"=:B3
Access Path: index (RangeScan)
Index: BLINE_INDX
resc_io: 29.00 resc_cpu: 195479
ix_sel: 1.0703e-06 ix_sel_with_filters: 5.3516e-07
Cost: 4.36 Resp: 4.36 Degree: 1
Access Path: index (AllEqRange)
Index: N_T_ADVICE_DTL_IDX3
resc_io: 34.00 resc_cpu: 216124
ix_sel: 6.1026e-07 ix_sel_with_filters: 6.1026e-07
Cost: 5.11 Resp: 5.11 Degree: 1
Best:: AccessPath: IndexRange Index: BLINE_INDX
Cost: 4.36 Degree: 1 Resp: 4.36 Card: 0.45 Bytes: 0
The Cost of BLINE_INDX (resc_io) matches our calculation. RESC_IO for BLINE_INDX is 29 and with OICA set to 15, the final cost is 4.36, rounded off to 4. Surprisingly, the Cost of IDX3 (resc_io), as per Optimizer is 34, as against our calculation of 5. Optimizer always chose a least costed plan and in this case, 29 against 34 and 4.36 agains 5.11 is cheaper, therefore Optimizer opted for BLINE_INDX Index. Now, the action plan was to check for the Calculation made by the Optimizer that derives the cost of IDX3 index to be 34. BLEVEL and CLUSTERING_FACTOR of the IDX3 index is unchanged and therefore, I doubted on the Selectivity part. The ix_sel (index_selectivity) and ix_sel_with_filters (Index Selectivity with Filters) for IDX3 index is 6.1026e-07. The Selectivity, as derived by me, was 1/NDV*1/NDV (for two columns). I decided to do some reverse engineering and therefore, calculated the value of 1/6.1026e-07 which is 1638645.82. Then I looked for a value closer to 1638645.92 and observed that the DISTINCT_KEYS of IDX3 is 1638635, which is very much close. Now, if you check the Index and Index with Filters Selectivity based on 1/1638635, then it exactly matches the ix_sel and ix_sel_with_filters value of 10053 trace. Further, note the line Access Path: index (AllEqRange) above Index: N_T_ADVICE_DTL_IDX3 in the trace file. AllEqRange means, all the columns of an Index has been used as an Equality predicate and therefore, instead of using individual selectivity of each columns, optimizer has considered the computed DISTINCT_KEYS of an Index and this has flawed the Optimizer Computation.
By this time, it was clear and evident that the flaw in this calculation is due to AllEqRange, it was time to implement a change that would resolve the issue. There were two solutions for this issue. These are :
- Modify the Clustering Factor for the problematic partitions.
- Add a new Column to IDX3 Index that is not used in the Problematic Query (query no.2)
I computed the cost for each of the partitions and observed that the clustering_factor for almost 70% of the index partitions will have to be modified and this means, re-calculating this everytime the stats are computed. Locking of Stats after modifying the Clustering_factor would have helped but then, Optimizer Behaviour Change and the risk of performance impact with other queries was on top of the mind and therefore, this option was not implemented. I tested this for one partition, where after modifying the Clustering_factor, the optimizer started picking up IDX3 index.
The other option was to modify the index by way of adding a new column. Again, in order to eliminate any potential risk after adding a third column, the best option implemented thought of and implemented was adding a column with single distinct value. Therefore, a column CURRENCYID with a single distinct value was appended to this index and was tested on UAT. Before making this change on the UAT, first we made sure that the issue is reproduced. Once we were sure that the similar issue exists on the UAT enviroment as well, we dropped and recreated the index on (CASEID, CHARGEID, CURRENCYID) and the change had a dramatic impact. With this change, the AllEqRange changed to RangeScan and the optimizer computed the cost based on Individual Selectivities of the two columns rather than DISTINCT_KEYS from dba_indexes. Pasted below is the relevant portion of 10053 trace generated on UAT post-index change.
Table Stats::
Table: N_T_ADVICE_DTL Alias: N_T_ADVICE_DTL Partition [2]
#Rows: 25494530 #Blks: 10874 AvgRowLen: 165.00
Index Stats::
Index: BLINE_INDX Col#: 28 27 17 PARTITION [2]
LVLS: 2 #LB: 92720 #DK: 8843108 LB/K: 1.00 DB/K: 1.00 CLUF: 11966280.00
Index: N_T_ADVICE_DTL_IDX3 Col#: 28 3 18 PARTITION [2]
LVLS: 2 #LB: 86050 #DK: 2764430 LB/K: 1.00 DB/K: 4.00 CLUF: 12732735.00
***************************************
SINGLE TABLE ACCESS PATH
Column (#28): CASEID(NUMBER)
AvgLen: 6.00 NDV: 406027 Nulls: 240 Density: 2.4629e-06 Min: 0 Max: 51734139
Column (#3): CHARGEID(NUMBER)
AvgLen: 4.00 NDV: 165 Nulls: 0 Density: 0.0060606 Min: 2 Max: 10000048
Column (#17): STATUS(VARCHAR2)
AvgLen: 2.00 NDV: 3 Nulls: 0 Density: 0.33333
Table: N_T_ADVICE_DTL Alias: N_T_ADVICE_DTL
Card: Original: 25494530 Rounded: 1 Computed: 0.13 Non Adjusted: 0.13
Access Path: TableScan
Cost: 2544.46 Resp: 2544.46 Degree: 0
Cost_io: 2380.00 Cost_cpu: 414356148
Resp_io: 2380.00 Resp_cpu: 414356148
kkofmx: index filter:"N_T_ADVICE_DTL"."STATUS"=:B1
kkofmx: index filter:"N_T_ADVICE_DTL"."CHARGEID"=:B1 AND "N_T_ADVICE_DTL"."STATUS"=:B2
kkofmx: index filter:"N_T_ADVICE_DTL"."CASEID"=:B1 AND "N_T_ADVICE_DTL"."CHARGEID"=:B2 AND "N_T_ADVICE_DTL"."STATUS"=:B3
Access Path: index (RangeScan)
Index: BLINE_INDX
resc_io: 14.00 resc_cpu: 127837
ix_sel: 2.5335e-06 ix_sel_with_filters: 8.4450e-07
Cost: 2.11 Resp: 2.11 Degree: 1
Access Path: index (RangeScan)
Index: N_T_ADVICE_DTL_IDX3
resc_io: 4.00 resc_cpu: 29395
ix_sel: 1.4926e-08 ix_sel_with_filters: 1.4926e-08
Cost: 1.00 Resp: 1.00 Degree: 1
Best:: AccessPath: IndexRange Index: N_T_ADVICE_DTL_IDX3
Cost: 1.00 Degree: 1 Resp: 1.00 Card: 0.13 Bytes: 0
============
Plan Table
============
-----------------------------------------------------------------------+-----------------------------------+---------------+
| Id | Operation | Name | Rows | Bytes | Cost | Time | Pstart| Pstop |
-----------------------------------------------------------------------+-----------------------------------+---------------+
| 0 | SELECT STATEMENT | | | | 1 | | | |
| 1 | SORT AGGREGATE | | 1 | 15 | | | | |
| 2 | PARTITION RANGE SINGLE | | 1 | 15 | 1 | 00:00:01 | KEY | KEY |
| 3 | TABLE ACCESS BY LOCAL INDEX ROWID | N_T_ADVICE_DTL | 1 | 15 | 1 | 00:00:01 | KEY | KEY |
| 4 | INDEX RANGE SCAN | N_T_ADVICE_DTL_IDX3 | 1 | | 1 | 00:00:01 | KEY | KEY |
-----------------------------------------------------------------------+-----------------------------------+---------------+
Predicate Information:
----------------------
3 - filter("STATUS"=:SYS_B_1)
4 - access("CASEID"=:B1 AND "CHARGEID"=:SYS_B_0)
The change is still pending for Implementation on production and will be done post month end. With the analysis, it seems that the issue should resolve post this change. If you have come across any such issues, please do share the findings. For the other 2 queries, I do have an interesting fact about Cost Based Optimizer, but unfortunately, have not collected artifacts and data to post on the blog. While the change in BLINE_INDX Index suggested by me has resolved these issues as well, for posting purpose, I am doing further study on this and will share the findings as and when I am ready with it.