This afternoon a friend of mine ran into an odd case. He told me that the following SQL statement
SELECT WORKITEMID
FROM WFWIPARTICIPANT
WHERE PARTICIPANT IN ('771', '99999', '41', '146', '李錦');
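uses a full table scan once statistics have been gathered with DBMS_STATS, whereas with
analyze table WFWIPARTICIPANT compute statistics ---statistics gathered this way, the query uses the index
analyze table WFWIPARTICIPANT delete statistics ----with statistics deleted it also uses the index, because Oracle falls back to dynamic sampling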
The table structure is as follows:
SQL> desc WFWIPARTICIPANT;
Name            Type          Nullable Default Comments
--------------- ------------- -------- ------- --------
WIPARTICID      NUMBER
WORKITEMID      NUMBER        Y
PARTICIPANTTYPE VARCHAR2(20)  Y
PARTICIPANT     VARCHAR2(256) Y
PARTICIPANT2    VARCHAR2(64)  Y
WORKITEMSTATE   NUMBER(2)     Y
PARTIINTYPE     VARCHAR2(20)  Y
EXTEND1         VARCHAR2(64)  Y
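There is also an index, WF_IDX_PARTICIPANT, on the PARTICIPANT column (it is the index that serves the PARTICIPANT predicates in the plans below).
I asked him to export the table with exp, then imported it on my own machine to test. The test went as follows:
First, gather statistics with DBMS_STATS: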
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'TSSA',TABNAME=>'WFWIPARTICIPANT',ESTIMATE_PERCENT=>100,CASCADE=>TRUE);
PL/SQL procedure successfully completed
Run the SQL:
SQL> SELECT WORKITEMID
2 FROM WFWIPARTICIPANT
3 WHERE PARTICIPANT IN ('771', '99999', '41', '146', '李錦');
60 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1708170390
-------------------------------------------------------------------------------------
| Id  | Operation         | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                 |   530 |  7420 |   103   (5)| 00:00:02 |
|*  1 |  TABLE ACCESS FULL| WFWIPARTICIPANT |   530 |  7420 |   103   (5)| 00:00:02 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("PARTICIPANT"='146' OR "PARTICIPANT"='41' OR
              "PARTICIPANT"='771' OR "PARTICIPANT"='99999' OR "PARTICIPANT"='李錦')
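It does indeed use a full table scan (estimated 530 rows, 60 actually returned), so I switched to gathering statistics with ANALYZE: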
SQL> analyze table WFWIPARTICIPANT delete statistics;
Table analyzed.
SQL> analyze table WFWIPARTICIPANT compute statistics;
Table analyzed.
SQL> SELECT WORKITEMID
2 FROM WFWIPARTICIPANT
3 WHERE PARTICIPANT IN ('771', '99999', '41', '146', '李錦');
60 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1217134846
---------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                    |    71 |   852 |    64   (0)| 00:00:01 |
|   1 |  INLIST ITERATOR             |                    |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| WFWIPARTICIPANT    |    71 |   852 |    64   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | WF_IDX_PARTICIPANT |    71 |       |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("PARTICIPANT"='146' OR "PARTICIPANT"='41' OR "PARTICIPANT"='771' OR
              "PARTICIPANT"='99999' OR "PARTICIPANT"='李錦')
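After analyzing with the ANALYZE command, the query does indeed use an index scan.
How to approach the problem: when you run into this kind of issue, the first thing to check is the selectivity of the index: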
SQL> select a.owner,a.index_name,a.index_type,partitioned,b.num_rows,b.distinct_keys,b.num_rows/b.distinct_keys avg_row_per_key
2 ,b.distinct_keys/b.num_rows SELECTIVITY,b.last_analyzed,b.stale_stats from dba_indexes a,dba_ind_statistics b
3 where a.owner=b.owner and a.index_name=b.index_name and a.index_name='WF_IDX_PARTICIPANT';
OWNER INDEX_NAME           INDEX_TYPE PARTITIONED   NUM_ROWS DISTINCT_KEYS AVG_ROW_PER_KEY SELECTIVITY LAST_ANALYZED STALE_STATS
----- -------------------- ---------- ----------- ---------- ------------- --------------- ----------- ------------- -----------
TSSA  WF_IDX_PARTICIPANT   NORMAL     NO               57820          4073 14.195924380063 0.070442753 2010-4-15 17: NO
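Note that the table has 57820 rows in total, but the indexed column has only 4073 distinct values. In other words the index selectivity is about 7%, or roughly 14 rows per key on average, which suggests that the data in this column is unevenly distributed. So I guessed that the problem was related to histograms. Now check whether the column has a histogram: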
SQL> select owner,table_name,column_name,num_distinct,histogram,num_buckets from dba_tab_col_statistics
2 where table_name='WFWIPARTICIPANT' and column_name='PARTICIPANT';
OWNER TABLE_NAME                     COLUMN_NAME                    NUM_DISTINCT HISTOGRAM       NUM_BUCKETS
----- ------------------------------ ------------------------------ ------------ --------------- -----------
TSSA  WFWIPARTICIPANT                PARTICIPANT                            4073 NONE                      1
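The claim that PARTICIPANT is unevenly distributed can be checked directly instead of being inferred from the average. A minimal sketch, assuming the query is run as the table owner (otherwise prefix the table with TSSA.); the cutoff of 20 values is an arbitrary choice for illustration:

```sql
-- Rows per PARTICIPANT value, most frequent first.
-- ROWNUM is applied outside the ordered subquery so that it limits the sorted result.
SELECT participant, cnt
FROM (SELECT participant, COUNT(*) AS cnt
      FROM wfwiparticipant
      GROUP BY participant
      ORDER BY cnt DESC)
WHERE ROWNUM <= 20;
```

If a handful of values account for most of the rows while the rest appear only a few times, the column is skewed, and the average of about 14 rows per key says little about any particular value.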
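NUM_BUCKETS = 1 means the column has no histogram: the statistics gathered by the ANALYZE command contain no histogram information. So I switched back to
DBMS_STATS to gather statistics and see whether histogram statistics appear: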
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'TSSA',TABNAME=>'WFWIPARTICIPANT',ESTIMATE_PERCENT=>100,CASCADE=>TRUE);
PL/SQL procedure successfully completed
SQL> select owner,table_name,column_name,num_distinct,histogram,num_buckets from dba_tab_col_statistics
2 where table_name='WFWIPARTICIPANT' and column_name='PARTICIPANT';
OWNER TABLE_NAME                     COLUMN_NAME                    NUM_DISTINCT HISTOGRAM       NUM_BUCKETS
----- ------------------------------ ------------------------------ ------------ --------------- -----------
TSSA  WFWIPARTICIPANT                PARTICIPANT                            4073 HEIGHT BALANCED         254
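While the histogram is in place, its bucket endpoints can be inspected through DBA_TAB_HISTOGRAMS. This is only a sketch of where to look, not part of the original test. For a character column ENDPOINT_VALUE is a numeric encoding of the leading characters, and ENDPOINT_ACTUAL_VALUE may be empty depending on the data and version:

```sql
-- One row per stored endpoint of the histogram on PARTICIPANT.
-- In a height-balanced histogram, a jump of more than 1 in ENDPOINT_NUMBER
-- between consecutive rows marks a value that spans several buckets (a popular value).
SELECT endpoint_number, endpoint_value, endpoint_actual_value
FROM dba_tab_histograms
WHERE owner = 'TSSA'
  AND table_name = 'WFWIPARTICIPANT'
  AND column_name = 'PARTICIPANT'
ORDER BY endpoint_number;
```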
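NUM_BUCKETS is now 254 and HISTOGRAM is HEIGHT BALANCED: Oracle gathered a histogram on this column automatically. The DBMS_STATS call did not specify method_opt, and since 10g the default is FOR ALL COLUMNS SIZE AUTO, which lets Oracle decide which columns get histograms. I therefore suspected that the histogram was what changed the execution plan,
so I removed it by regathering statistics with SIZE 1 for this column: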
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'TSSA',TABNAME=>'WFWIPARTICIPANT',ESTIMATE_PERCENT=>100,DEGREE=>16,method_opt=>'for columns size 1 PARTICIPANT',CASCADE=>TRUE);
PL/SQL procedure successfully completed
SQL> select owner,table_name,column_name,num_distinct,histogram,num_buckets from dba_tab_col_statistics
2 where table_name='WFWIPARTICIPANT' and column_name='PARTICIPANT';
OWNER TABLE_NAME                     COLUMN_NAME                    NUM_DISTINCT HISTOGRAM       NUM_BUCKETS
----- ------------------------------ ------------------------------ ------------ --------------- -----------
TSSA  WFWIPARTICIPANT                PARTICIPANT                            4073 NONE                      1
Run the query again:
SQL> SELECT WORKITEMID
2 FROM WFWIPARTICIPANT
3 WHERE PARTICIPANT IN ('771', '99999', '41', '146', '李錦');
60 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1217134846
---------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                    |    71 |   994 |    64   (0)| 00:00:01 |
|   1 |  INLIST ITERATOR             |                    |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| WFWIPARTICIPANT    |    71 |   994 |    64   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | WF_IDX_PARTICIPANT |    71 |       |     6   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("PARTICIPANT"='146' OR "PARTICIPANT"='41' OR "PARTICIPANT"='771' OR
              "PARTICIPANT"='99999' OR "PARTICIPANT"='李錦')
Once the histogram statistics were removed, the optimizer chose the access path we expected.
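The Rows estimate of 71 in the index plan is consistent with the optimizer assuming a uniform distribution when there is no histogram: each value is expected to match NUM_ROWS / NUM_DISTINCT rows, and the IN-list contains five values. A rough check using the figures from the index statistics, assuming PARTICIPANT contains no NULLs and ignoring how the optimizer rounds internally:

```sql
-- 57820 rows, 4073 distinct values, 5 values in the IN-list.
SELECT ROUND(5 * 57820 / 4073) AS est_rows FROM dual;
```

This evaluates to 71. With the histogram the estimate was 530, against 60 rows actually returned, so in this case the histogram made the cardinality estimate worse rather than better.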
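This case is a reminder to spell out the parameters when gathering statistics on a table. When tuning an access path, the first things to check are the selectivity of the index, the type of the index, and whether the column has histogram statistics. A histogram does not always help, and of course it does not always hurt either; each case has to be judged on its own. If Oracle gathers histogram statistics automatically without our knowing, performance diagnosis becomes harder.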
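As a sketch of what spelling out the parameters could look like: the first statement shows the METHOD_OPT default currently in effect, and the second gathers statistics with histograms explicitly disabled regardless of that default. This assumes Oracle 11g or later for DBMS_STATS.GET_PREFS (on 10g the equivalent call is DBMS_STATS.GET_PARAM('METHOD_OPT')). Using SIZE 1 for all columns is only an illustration; other columns may well benefit from a histogram.

```sql
-- Show the default METHOD_OPT (typically FOR ALL COLUMNS SIZE AUTO).
SELECT DBMS_STATS.GET_PREFS('METHOD_OPT') AS method_opt FROM dual;

-- Gather table and index statistics without histograms on any column.
EXEC DBMS_STATS.GATHER_TABLE_STATS(OWNNAME=>'TSSA',TABNAME=>'WFWIPARTICIPANT',ESTIMATE_PERCENT=>100,METHOD_OPT=>'FOR ALL COLUMNS SIZE 1',CASCADE=>TRUE);
```

After either variant, the HISTOGRAM and NUM_BUCKETS columns of DBA_TAB_COL_STATISTICS show what was actually gathered.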