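To begin with, being able to read an execution plan is a great help when tuning SQL.
Two concepts come first: RBO and CBO. RBO is the rule-based optimizer that dates from early Oracle releases (the era before Oracle8i). Its fixed rules are too rigid, and from Oracle 10g onward it has been "abandoned" for good and is no longer supported.
CBO is the cost-based optimizer. Its approach is to have Oracle consider the candidate execution plans, estimate the cost of each one, and pick the plan with the lowest Cost as the final execution plan.
For the CBO, the most important input is Cardinality, the estimated number of rows an operation returns. If the Cardinality the CBO obtains is inaccurate (or out of date), the CBO may produce a wrong execution plan.
The following shows what happens when the CBO cannot obtain an accurate Cardinality.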
SQL> create table t as select 1 id,object_name from dba_objects;
Table created.
SQL> update t set id=99 where rownum=1;
1 row updated.
SQL> commit;
Commit complete.
SQL> create index t_ind on t (id);
Index created.
SQL> select count(*) from t;
  COUNT(*)
----------
     50320
SQL> set autotrace trace exp;
SQL> select * from t where id=1;
50319 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 56425 |  4353K|    53   (4)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    | 56425 |  4353K|    53   (4)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("ID"=1)
Note
-----
   - dynamic sampling used for this statement
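-- Note: we have not gathered statistics on table T, so Oracle collected the data it needed on its own through dynamic sampling. The CBO estimated 56425 rows (Card), which is reasonably close to the real row count of table T, 50320. The CBO chose a full table scan, which is the right choice here.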
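Whether the optimizer had stored statistics to work with can be checked in the data dictionary. The following is a minimal sketch, assuming table T from above lives in the current schema. On the release used in this post, NUM_ROWS and LAST_ANALYZED of a table stay NULL until statistics are gathered, which is the situation in which the dynamic sampling note appears. Newer releases may gather statistics during CREATE TABLE AS SELECT, so the result can differ there.

```sql
-- Stored table statistics: NULL num_rows / last_analyzed means "never analyzed".
select table_name, num_rows, blocks, last_analyzed
  from user_tables
 where table_name = 'T';

-- Stored index statistics for the same table.
select index_name, num_rows, distinct_keys, last_analyzed
  from user_indexes
 where table_name = 'T';
```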
Now let us gather statistics.
SQL> exec dbms_stats.gather_table_stats(user,'T',cascade=>true);
PL/SQL procedure successfully completed.
SQL> select * from t where id=1;
50319 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 50311 |  1326K|    52   (2)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    | 50311 |  1326K|    52   (2)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("ID"=1)
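Now the estimate is quite accurate (50311 estimated against 50319 actual rows), and the full table scan chosen by the CBO is correct. Let us see what the CBO chooses when id=99.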
SQL> select * from t where id=99;
Execution Plan
----------------------------------------------------------
Plan hash value: 1376202287
-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    27 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |     1 |    27 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T_IND |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("ID"=99)
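In fact, when we gather statistics on the table with cascade=>true, the CBO gets information not only about table T but also about its index. So when we run 'where id = 99', the CBO clearly chooses an index range scan.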
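What lets the optimizer tell id=1 (almost every row) from id=99 (a single row) is the column-level statistics on ID. The sketch below only shows where to look. It assumes that the default gathering options produced a histogram on ID, which the original transcript does not show. The commented-out call is one way to request a histogram explicitly if none is present.

```sql
-- Column statistics for T.ID; the HISTOGRAM column shows whether a histogram exists.
select column_name, num_distinct, density, num_buckets, histogram
  from user_tab_col_statistics
 where table_name = 'T'
   and column_name = 'ID';

-- Optional: ask for a histogram on ID explicitly (SQL*Plus command, one line).
-- exec dbms_stats.gather_table_stats(user, 'T', cascade => true, method_opt => 'for columns id size 254');
```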
Next comes the second case: the CBO produces a wrong execution plan because statistics were not refreshed in time (stale statistics).
SQL> update t set id=99;
50320 rows updated.
SQL> commit;
Commit complete.
SQL> select * from t where id=99;
50320 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1376202287
-------------------------------------------------------------------------------------
| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |       |     1 |    27 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T     |     1 |    27 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T_IND |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("ID"=99)
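The problem is clear: the statistics were not refreshed. The CBO still works from the old statistics, so even though we have updated every row and the whole table now holds id=99, the CBO still believes that only one row has id=99 and wrongly chooses the index scan.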
SQL> set autotrace trace exp;
SQL> exec dbms_stats.gather_table_stats(user,'T',cascade=>true);
PL/SQL procedure successfully completed.
SQL> select * from t where id=99;
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      | 50315 |  1326K|    52   (2)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    | 50315 |  1326K|    52   (2)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("ID"=99)
Once the statistics are gathered again, the CBO arrives at the correct execution plan.
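A direct way to spot a cardinality problem is to compare the optimizer's estimate with the rows actually returned. The sketch below uses the gather_plan_statistics hint together with dbms_xplan.display_cursor, which are available from Oracle 10g. It assumes the session may read the V$ views that display_cursor needs, and that autotrace is switched off so the statement really executes.

```sql
-- display_cursor(null, ...) reports on the last statement of the session;
-- serveroutput must be off, otherwise SQL*Plus slips in a statement of its own.
set autotrace off
set serveroutput off
set linesize 200

-- Run the statement for real while collecting row-source statistics.
select /*+ gather_plan_statistics */ count(object_name) from t where id = 99;

-- E-Rows is the estimate, A-Rows is what each step actually produced.
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```

A large gap between E-Rows and A-Rows on a step is the usual sign that the statistics behind that step are missing or stale.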
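Summary: the first time the SQL ran, the CBO found that the table had never been analyzed, so it used dynamic sampling to estimate the data; we then analyzed the table. The second time the SQL ran, the CBO found that the table had been analyzed, so it skipped dynamic sampling and used the gathered statistics directly.
This leads to two situations:
1. If the table has never been analyzed, the CBO can obtain the data it needs through dynamic sampling and can still arrive at a correct execution plan;
2. If the table has been analyzed but the statistics are too old, the CBO does not use dynamic sampling and relies on the old statistics instead, which leads to a wrong execution plan.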
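The second situation can be detected before it hurts. The following is a sketch only, assuming a 10g or later database with table monitoring active (the default there) and a user who has the privilege to flush the monitoring data. It shows how much DML has hit T since the last gather and whether Oracle itself flags the statistics as stale.

```sql
-- Push in-memory DML monitoring data to the dictionary so the views are current.
exec dbms_stats.flush_database_monitoring_info;

-- DML recorded against T since statistics were last gathered.
select table_name, inserts, updates, deletes, timestamp
  from user_tab_modifications
 where table_name = 'T';

-- STALE_STATS = 'YES' marks statistics Oracle considers out of date.
select table_name, num_rows, last_analyzed, stale_stats
  from user_tab_statistics
 where table_name = 'T';
```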
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Now let me explain in more detail how Card affects the CBO.
I will use a subquery and a join as examples.
SQL> create table t1 (id int,name varchar2(100));
Table created.
SQL> create table t2 (id int,name varchar2(100));
Table created.
SQL> create index ind_t1 on t1(id);
Index created.
SQL> create index ind_t2 on t2(id);
Index created.
SQL> create index ind_t2_name on t2(name);
Index created.
SQL> insert into t1 select object_id ,object_name from dba_objects;
50321 rows created.
SQL> exec dbms_stats.gather_table_stats(user,'T1',cascade=>true,method_opt=>'for all indexed columns');
PL/SQL procedure successfully completed.
Here I start to fake the cardinality of T2.
SQL> select * from t1 where id in (select /*+ dynamic_sampling(t2 0) cardinality(t2 10000) */ id from t2 where name ='AA');
Execution Plan
----------------------------------------------------------
Plan hash value: 2727019379
-----------------------------------------------------------------------------------------------
| Id  | Operation                       | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |             |     1 |    41 |    54   (4)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID    | T1          |     1 |    28 |     2   (0)| 00:00:01 |
|   2 |   NESTED LOOPS                  |             |     1 |    41 |    54   (4)| 00:00:01 |
|   3 |    VIEW                         | VW_NSO_1    | 10000 |   126K|     1   (0)| 00:00:01 |
|   4 |     HASH UNIQUE                 |             |     1 |   634K|            |          |
|   5 |      TABLE ACCESS BY INDEX ROWID| T2          | 10000 |   634K|     1   (0)| 00:00:01 |
|*  6 |       INDEX RANGE SCAN          | IND_T2_NAME |     1 |       |     1   (0)| 00:00:01 |
|*  7 |    INDEX RANGE SCAN             | IND_T1      |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   6 - access("NAME"='AA')
   7 - access("ID"="$nso_col_1")
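We issue a SQL statement that contains a subquery and use hints at the same time.
cardinality(t2 10000) tells the CBO to assume that 10000 rows are fetched from table T2.
dynamic_sampling(t2 0) disables dynamic sampling for T2.
In this way we simulate the number of rows the subquery returns, and to make the CBO rely entirely on that figure when it builds the execution plan, dynamic sampling for the subquery is turned off.
With T2 assumed to be this large, the plan uses HASH UNIQUE, which removes duplicate ID values from the subquery result.
Now let us set the hint to cardinality(t2 1) and see the effect.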
SQL> select * from t1 where id in (select /*+ dynamic_sampling(t2 0) cardinality(t2 1) */ id from t2 where name ='AA');
Execution Plan
----------------------------------------------------------
Plan hash value: 2727019379
-----------------------------------------------------------------------------------------------
| Id  | Operation                       | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |             |     1 |    41 |     4  (25)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID    | T1          |     1 |    28 |     2   (0)| 00:00:01 |
|   2 |   NESTED LOOPS                  |             |     1 |    41 |     4  (25)| 00:00:01 |
|   3 |    VIEW                         | VW_NSO_1    |     1 |    13 |     1   (0)| 00:00:01 |
|   4 |     HASH UNIQUE                 |             |     1 |    65 |            |          |
|   5 |      TABLE ACCESS BY INDEX ROWID| T2          |     1 |    65 |     1   (0)| 00:00:01 |
|*  6 |       INDEX RANGE SCAN          | IND_T2_NAME |     1 |       |     1   (0)| 00:00:01 |
|*  7 |    INDEX RANGE SCAN             | IND_T1      |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   6 - access("NAME"='AA')
   7 - access("ID"="$nso_col_1")
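(The plan shape and plan hash value are unchanged; only the estimated cost drops from 54 to 4. Does the Cardinality value have no effect on the CBO's choice here?)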
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Now let us look at a join between the two tables.
SQL> select /*+ dynamic_sampling(t2 0) cardinality(t2 1000) */ * from t1,t2 where t1.id=t2.id;
Execution Plan
----------------------------------------------------------
Plan hash value: 2959412835
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |  1000 | 93000 |    58   (4)| 00:00:01 |
|*  1 |  HASH JOIN         |      |  1000 | 93000 |    58   (4)| 00:00:01 |
|   2 |   TABLE ACCESS FULL| T2   |  1000 | 65000 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| T1   | 50321 |  1375K|    55   (2)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("T1"."ID"="T2"."ID")
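In this example the CBO believes that T2 contributes enough rows to the join and that T1 is large as well, so in this situation HASH JOIN is the right choice.
Here is another example with cardinality(t2 1); compare the difference.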
SQL> select /*+ dynamic_sampling(t2 0) cardinality(t2 1) */ * from t1,t2 where t1.id=t2.id;
Execution Plan
----------------------------------------------------------
Plan hash value: 828990364
--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |     1 |    93 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1     |     1 |    28 |     2   (0)| 00:00:01 |
|   2 |   NESTED LOOPS              |        |     1 |    93 |     4   (0)| 00:00:01 |
|   3 |    TABLE ACCESS FULL        | T2     |     1 |    65 |     2   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN         | IND_T1 |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - access("T1"."ID"="T2"."ID")
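T1 is large, T2 is assumed to be tiny, and T1 has an index on the join column (IND_T1, as the plan shows), so NESTED LOOPS is the appropriate choice.
So we can see that Cardinality is a very important input to the execution plan the CBO produces.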
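The experiments above fake the row count of T2 with a hint. Another way to run the same kind of experiment is to set the table statistics by hand with dbms_stats.set_table_stats and then ask for the plan. This is a sketch under assumptions: the numbers 10000 and 100 are made up for illustration, and the plan you get depends on your release and data, so no expected output is shown.

```sql
set linesize 200

-- Pretend T2 holds 10000 rows in 100 blocks (made-up values for the experiment).
exec dbms_stats.set_table_stats(ownname => user, tabname => 'T2', numrows => 10000, numblks => 100);

explain plan for
select * from t1, t2 where t1.id = t2.id;

select * from table(dbms_xplan.display);

-- Remove the fake statistics again when the experiment is over.
exec dbms_stats.delete_table_stats(ownname => user, tabname => 'T2');
```

Repeating the run with numrows => 1 gives a second plan to compare, in the same spirit as the two hinted queries.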
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Now let me show how to read an execution plan.
SQL> select t1.* from t1,t2 where t1.id=t2.id and t1.id=5 and t2.name='A';
Execution Plan
----------------------------------------------------------
Plan hash value: 828990364
--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |     1 |    93 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1     |     1 |    28 |     2   (0)| 00:00:01 |
|   2 |   NESTED LOOPS              |        |     1 |    93 |     4   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL        | T2     |     1 |    65 |     2   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN         | IND_T1 |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - filter("T2"."NAME"='A' AND "T2"."ID"=5)
   4 - access("T1"."ID"=5)
Note
-----
   - dynamic sampling used for this statement
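Read it like this: the most deeply indented step executes first, and among steps at the same indentation the one listed higher executes first. So the first step to run is
|*  3 |    TABLE ACCESS FULL        | T2     |     1 |    65 |     2   (0)| 00:00:01 |
It is followed by the step at the same depth,
|*  4 |    INDEX RANGE SCAN         | IND_T1 |     1 |       |     1   (0)| 00:00:01 |
Next comes
|   2 |   NESTED LOOPS              |        |     1 |    93 |     4   (0)| 00:00:01 |
then
|   1 |  TABLE ACCESS BY INDEX ROWID| T1     |     1 |    28 |     2   (0)| 00:00:01 |
and finally
|   0 | SELECT STATEMENT            |        |     1 |    93 |     4   (0)| 00:00:01 |
That is the execution order of the whole plan:
ID=3 -> ID=4 -> ID=2 -> ID=1 -> ID=0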
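Put into words, it goes roughly as follows.
Scan table T2 in full and read the rows that satisfy T2.id=5 and T2.name='A'. For each such row, look up the value 5 in index IND_T1, and keep going until T2 has been read completely. Then use the ROWIDs found in the index to visit table T1, fetch the matching rows, and output them.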
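The parent and child relationships behind the indentation can also be read straight from PLAN_TABLE, which helps when the formatted output is wrapped or has lost its indentation. The following is a sketch, assuming a PLAN_TABLE is available to the session (it is by default from 10g on); the statement id 'demo' is an arbitrary label chosen for this example.

```sql
set linesize 200

-- Clear earlier rows with the same label so the example can be re-run.
delete from plan_table where statement_id = 'demo';

explain plan set statement_id = 'demo' for
select t1.* from t1, t2 where t1.id = t2.id and t1.id = 5 and t2.name = 'A';

-- Walk the plan tree from the root (id = 0). By the usual reading rule,
-- children come before their parent, and siblings run in POSITION order.
select id, parent_id, position,
       lpad(' ', 2 * (level - 1)) || operation || ' ' || options as step,
       object_name
  from plan_table
 start with id = 0 and statement_id = 'demo'
connect by prior id = parent_id and statement_id = 'demo'
 order siblings by position;
```

In the plan above, steps 3 and 4 should both report PARENT_ID 2, which is why they run before the NESTED LOOPS step that consumes them.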
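The predicate section at the end uses two keywords, filter and access.
They can be read like this: if the plan shows access, the predicate's value determines the data access path (table or index; here it is the index). In this plan it means that the join with T2 is carried out by probing the index. filter means that the predicate's value does not affect the access path and only filters out rows after they have been read.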
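The two kinds of predicate are stored in separate columns of PLAN_TABLE, so they can be listed side by side for each step. The following is a minimal sketch, again assuming an available PLAN_TABLE; the statement id 'pred_demo' is a hypothetical label used only here.

```sql
set linesize 200

-- Clear earlier rows with the same label so the example can be re-run.
delete from plan_table where statement_id = 'pred_demo';

explain plan set statement_id = 'pred_demo' for
select t1.* from t1, t2 where t1.id = t2.id and t1.id = 5 and t2.name = 'A';

-- ACCESS_PREDICATES drive how rows are located; FILTER_PREDICATES only discard rows.
select id, operation, options, object_name,
       access_predicates, filter_predicates
  from plan_table
 where statement_id = 'pred_demo'
 order by id;
```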
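Reference: 《讓Oracle跑得更快》 by Mr. 譚懷遠 (Tan Huaiyuan).
This is only a study note and contains some mistakes; corrections from experts are welcome. Please be gentle, thank you.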