create table t as select level as id ,level||'a' as a,level||level||'b' as b from dual connect by level<100;
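Here the value of column A fully determines the value of column B (for example, A = '1a' always goes with B = '11b'), so the two columns are correlated.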
insert into t select * from t;
...... (keep repeating the insert to grow the table)
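The repeated insert can be scripted instead of typed by hand. The following is a minimal sketch, assuming it is run right after the CREATE TABLE and in place of the manual inserts: starting from the original 99 rows, 15 doublings give 99 * 2^15 = 3244032 rows, which matches the count reported below. The loop bound of 15 is derived from that count, not taken from the original post.

```sql
-- Sketch: run instead of the manual inserts, not in addition to them.
-- Each iteration doubles the table: 99 rows * 2^15 = 3244032 rows.
BEGIN
  FOR i IN 1 .. 15 LOOP
    INSERT INTO t SELECT * FROM t;
  END LOOP;
  COMMIT;
END;
/
```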
SQL> select count(*) from t;
COUNT(*)
----------
3244032
create index idx1 on t(a);
create index idx2 on t(a,b);
BEGIN
DBMS_STATS.GATHER_TABLE_STATS(ownname => 'SCOTT',
tabname => 'T',
estimate_percent => 100,
method_opt => 'for all columns size skewonly',
no_invalidate => FALSE,
degree => 8,
cascade => TRUE);
END;
/
SQL> select * from t where a='1a' and b='11b';
32768 rows selected.
Elapsed: 00:00:03.98
Execution Plan
----------------------------------------------------------
Plan hash value: 2303463401
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 331 | 3972 | 84 (0)| 00:00:02 |
| 1 | TABLE ACCESS BY INDEX ROWID| T | 331 | 3972 | 84 (0)| 00:00:02 |
|* 2 | INDEX RANGE SCAN | IDX2 | 331 | | 3 (0)| 00:00:01 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("A"='1a' AND "B"='11b')
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
11838 consistent gets
7943 physical reads
0 redo size
441749 bytes sent via SQL*Net to client
24424 bytes received via SQL*Net from client
2186 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
32768 rows processed
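Because the CBO does not know that A and B are related, it treats the two predicates as independent and estimates a cardinality of 331:
SQL> select 1/99/99*3244032 from dual; -- i.e. selectivity(A) * selectivity(B) * row count = (1/99) * (1/99) * 3244032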
1/99/99*3244032
---------------
330.989899
In reality, however, the query returns 32768 rows.
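To see where the 1/99 selectivities come from, and to confirm that A and B really move together, the column statistics can be compared with the data itself. This is a sketch, assuming it is run as the table owner after the statistics gathering above. For the table built here, all three counts in the second query should be 99: there are only 99 distinct (A, B) pairs, not 99 * 99.

```sql
-- Column-level statistics the optimizer works from.
SELECT column_name, num_distinct, density, histogram
  FROM user_tab_col_statistics
 WHERE table_name = 'T'
   AND column_name IN ('A', 'B');

-- What the data actually contains: distinct values per column
-- versus distinct (A, B) combinations.
SELECT COUNT(DISTINCT a)             AS distinct_a,
       COUNT(DISTINCT b)             AS distinct_b,
       COUNT(DISTINCT a || '|' || b) AS distinct_pairs
  FROM t;
```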
SQL> select * from t where a='1a';
32768 rows selected.
Elapsed: 00:00:01.38
Execution Plan
----------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 32768 | 384K| 1874 (8)| 00:00:23 |
|* 1 | TABLE ACCESS FULL| T | 32768 | 384K| 1874 (8)| 00:00:23 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("A"='1a')
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
10120 consistent gets
6312 physical reads
0 redo size
441749 bytes sent via SQL*Net to client
24424 bytes received via SQL*Net from client
2186 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
32768 rows processed
If the WHERE clause contains only where a='1a', the CBO gets the cardinality right. It is computed like this:
SQL> select 3244032/99 from dual;
3244032/99
----------
32768
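Clearly, the execution plan for select * from t where a='1a' and b='11b' is wrong. It should use a full table scan, but because the cardinality estimate is off, the CBO chooses the index IDX2 instead.
Oracle offers two ways to deal with correlated columns: dynamic sampling, and, starting with Oracle 11g, extended statistics gathered on the correlated columns.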
SQL> ALTER SESSION SET optimizer_dynamic_sampling=6;
Session altered.
SQL> set lines 200
SQL> set pages 200
SQL> set timi on
SQL> explain plan for select * from t where a='1a' and b='11b';
Explained.
Elapsed: 00:00:00.86
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 32776 | 384K| 1885 (8)| 00:00:23 |
|* 1 | TABLE ACCESS FULL| T | 32776 | 384K| 1885 (8)| 00:00:23 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("A"='1a' AND "B"='11b')
Note
-----
- dynamic sampling used for this statement
17 rows selected.
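The session-level setting used above can also be scoped to a single statement with the DYNAMIC_SAMPLING hint. This is a sketch that was not run in the original post. Level 6 simply mirrors the session setting, and the resulting plan is expected, but not verified, to match the one shown above.

```sql
-- Statement-level alternative to ALTER SESSION SET optimizer_dynamic_sampling=6.
EXPLAIN PLAN FOR
SELECT /*+ dynamic_sampling(6) */ *
  FROM t
 WHERE a = '1a'
   AND b = '11b';

SELECT * FROM TABLE(dbms_xplan.display);
```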
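With dynamic sampling enabled, Oracle's cardinality estimate is essentially correct. I won't demonstrate 11g extended statistics here; if you are interested, try it yourself.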
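For readers who want a starting point, the extended-statistics route might look roughly like the following. This is a sketch under assumptions: it was not run in the original post, it requires Oracle 11g or later, and it reuses the SCOTT schema and the gathering options from earlier. With a column group on (A, B) the optimizer can see the number of distinct pairs, so the estimate for the two-column predicate would be expected to move close to 32768, though that result is not verified here.

```sql
-- 1. Define a column group on (A, B).
--    The function returns the system-generated extension name.
SELECT DBMS_STATS.CREATE_EXTENDED_STATS('SCOTT', 'T', '(A,B)') AS extension_name
  FROM dual;

-- 2. Re-gather statistics so the column group gets statistics of its own.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname          => 'SCOTT',
                                tabname          => 'T',
                                estimate_percent => 100,
                                method_opt       => 'for all columns size skewonly',
                                no_invalidate    => FALSE,
                                cascade          => TRUE);
END;
/

-- 3. Confirm that the extension exists.
SELECT extension_name, extension
  FROM user_stat_extensions
 WHERE table_name = 'T';
```

After that, re-running explain plan for the two-column query shows whether the Rows estimate has changed.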
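My advice for correlated columns is to ask whether the value can be built in the application instead. If A determines B, then do not create column B when designing the database; generate the value of B from the value of A in the application. This also saves storage space in the DB.
If you really must keep column B in the DB, do not put both columns in the predicate when you write SQL. In other words, do not write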
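The same idea can be pictured in SQL rather than application code. This is a sketch only: t_slim and t_with_b are hypothetical names that do not appear in the original post, and the expression relies on how this particular test table was built (A is the number followed by 'a', B is the number twice followed by 'b').

```sql
-- Hypothetical objects, for illustration only.
-- Store A without B.
CREATE TABLE t_slim AS
SELECT id, a FROM t;

-- Derive B from A on the fly: '1a' -> '1' || '1' || 'b' = '11b'.
CREATE VIEW t_with_b AS
SELECT id,
       a,
       RTRIM(a, 'a') || RTRIM(a, 'a') || 'b' AS b
  FROM t_slim;

-- Callers filter on A only; B is computed, not stored.
SELECT * FROM t_with_b WHERE a = '1a';
```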
select * from t where a='1a' and b='11b';
Instead, write just
select * from t where a='1a';
or
select * from t where b='11b';
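This goes a long way toward keeping the CBO from miscalculating the cardinality. If this table is joined to several other tables, a wrong cardinality here inevitably throws off the execution plan of the whole statement and degrades its performance.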
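Dynamic sampling and extended statistics do solve the problem, but what if the product has to care about compatibility? My product has to support Oracle, DB2, and SQL Server at the same time, and possibly even the Chinese-made database Dameng (達夢) in the future. What do I do if they have no dynamic sampling?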