14. Dealing with the impact of histograms on execution plans


2013-08-09 Friday


------------Dealing with the impact of histograms on execution plans-----------------


SQL> create table tt as select 1 id,object_name from all_objects;


Table created.


SQL> update tt set id=99 where rownum=1;  --make the distribution of the ID column extremely uneven, heavily skewed


1 row updated.


SQL> commit;


Commit complete.


SQL> create index ind_tt on tt(id);


Index created.


SQL> exec dbms_stats.gather_table_stats(user,'tt',estimate_percent=>100,method_opt=>'for all columns size skewonly')


PL/SQL procedure successfully completed.


SQL> select endpoint_value,endpoint_number from user_tab_histograms where table_name='TT' and column_name='ID' order by endpoint_number;


ENDPOINT_VALUE ENDPOINT_NUMBER

-------------- ---------------

             1           40944

            99           40945   --two buckets


select * from user_tab_columns where table_name='TT'  --ID gets a FREQUENCY histogram, OBJECT_NAME a HEIGHT BALANCED one


SQL> conn hr/hr

Connected.

SQL> explain plan set statement_id='1' for select * from tt where id=1;


Explained.


SQL> explain plan set statement_id='99' for select * from tt where id=99;


Explained.


SQL> select statement_id,cardinality from plan_table where id=0 order by statement_id;


STATEMENT_ID                   CARDINALITY

------------------------------ -----------

1                                    40944

99                                       1



SQL> select * from tt where id=99;



Execution Plan

----------------------------------------------------------

Plan hash value: 3656862534


--------------------------------------------------------------------------------------

| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT            |        |     1 |    28 |     2   (0)| 00:00:01 |

|   1 |  TABLE ACCESS BY INDEX ROWID| TT     |     1 |    28 |     2   (0)| 00:00:01 |

|*  2 |   INDEX RANGE SCAN          | IND_TT |     1 |       |     1   (0)| 00:00:01 |

--------------------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  2 - access("ID"=99)


SQL> select * from tt where id=1;


40944 rows selected.



Execution Plan

----------------------------------------------------------

Plan hash value: 264906180


--------------------------------------------------------------------------

| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------

|   0 | SELECT STATEMENT  |      | 40944 |  1119K|    47   (3)| 00:00:01 |

|*  1 |  TABLE ACCESS FULL| TT   | 40944 |  1119K|    47   (3)| 00:00:01 |

--------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  1 - filter("ID"=1)


Note: what happens if no histogram information is generated


SQL> exec dbms_stats.gather_table_stats(user,'tt',estimate_percent=>100,method_opt=>'for all columns size 1');


PL/SQL procedure successfully completed.


select * from user_tab_columns where table_name='TT' --confirm that HISTOGRAM is NONE


SQL> select endpoint_value,endpoint_number from user_tab_histograms where table_name='TT' and column_name='ID' order by endpoint_number;


ENDPOINT_VALUE ENDPOINT_NUMBER

-------------- ---------------

             1               0

            99               1


SQL> conn hr/hr

Connected.

SQL> explain plan set statement_id='1' for select * from tt where id=1;


Explained.


SQL> explain plan set statement_id='99' for select * from tt where id=99;


Explained.


SQL> select statement_id,cardinality from plan_table where id=0 order by statement_id;


STATEMENT_ID                   CARDINALITY

------------------------------ -----------

1                                    20473

99                                   20473


SQL> select * from tt where id=99;



Execution Plan

----------------------------------------------------------

Plan hash value: 264906180


--------------------------------------------------------------------------

| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------

|   0 | SELECT STATEMENT  |      | 20473 |   559K|    47   (3)| 00:00:01 |

|*  1 |  TABLE ACCESS FULL| TT   | 20473 |   559K|    47   (3)| 00:00:01 |

--------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  1 - filter("ID"=99)



SQL> select * from tt where id=1;


40944 rows selected.



Execution Plan

----------------------------------------------------------

Plan hash value: 264906180


--------------------------------------------------------------------------

| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------

|   0 | SELECT STATEMENT  |      | 20473 |   559K|    47   (3)| 00:00:01 |

|*  1 |  TABLE ACCESS FULL| TT   | 20473 |   559K|    47   (3)| 00:00:01 |

--------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  1 - filter("ID"=1)


Wrong cardinality, wrong plan.


Summary: when the column values are distributed unevenly, gathering histograms with the AUTO size option may not generate a histogram at all, and the execution plan is then incorrect; manual intervention is required.


A. A height-balanced histogram introduces error into the execution plan; the error shrinks as the number of buckets grows and grows as the data volume increases.

B. If the histogram type is NONE, plans for queries on unevenly distributed columns can go seriously wrong.

C. The best case is a frequency histogram, but once count(distinct id) > 254 a frequency histogram is no longer possible; only a height-balanced histogram can be built, so raise the bucket count to 254 to keep the error as small as possible.

D. Histogram collection is wrapped inside the dbms_stats package, and the packaged behaviour is not always precise because one strategy is applied to everything; that is why the method_opt parameter is exposed for us to tune.

E. Collecting histograms on large tables has a performance cost, so consider collecting them only for indexed columns: FOR ALL INDEXED COLUMNS [size_clause], and decide which columns really need histograms from the actual SQL workload. Focus above all on the columns whose values are distributed unevenly! (See the sketch after this list.)
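A minimal sketch of the two method_opt variants mentioned in C and E above, reusing the TT table from the earlier test (adjust the names to your own objects):

-- height-balanced histogram with the maximum of 254 buckets on every column
exec dbms_stats.gather_table_stats(user,'TT',estimate_percent=>100,method_opt=>'for all columns size 254');

-- histograms only on indexed columns, letting Oracle choose the bucket count
exec dbms_stats.gather_table_stats(user,'TT',estimate_percent=>100,method_opt=>'for all indexed columns size auto');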


------------------------A strategy (approach) for collecting optimizer statistics in an enterprise database-------------


1. Build a configuration table — p_config


owner    table_name     estimate_percent    cascade    method_opt                    granularity           analyze_time

------------------------------------------------------------------------------------------------

SCOTT       TT              100              true     for all columns size skewonly     GLOBAL AND PARTITION

 HR         T1               30              true     for INDEXED columns size skewonly  GLOBAL

 ...


 -------------------------------------------------------------------


2. Create a stored procedure (SP)


for idx in (select * from p_config) loop

  dbms_stats.gather_table_stats(idx.owner,idx.table_name,estimate_percent=>idx.estimate_percent,cascade=>idx.cascade,method_opt=>idx.method_opt...);

end loop;


3. Wrap the SP in a background job that runs on a schedule to gather the statistics. (A hedged sketch of steps 2 and 3 follows.)
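A minimal sketch of steps 2 and 3, assuming the p_config columns shown above; the procedure name sp_gather_stats, the job name and the schedule are illustrative and not from the original notes:

create or replace procedure sp_gather_stats is
begin
  for idx in (select * from p_config) loop
    dbms_stats.gather_table_stats(
      ownname          => idx.owner,
      tabname          => idx.table_name,
      estimate_percent => idx.estimate_percent,
      cascade          => (idx.cascade = 'true'),  -- p_config stores the flag as text
      method_opt       => idx.method_opt,
      granularity      => idx.granularity);
  end loop;
end;
/

-- run it from a background job, e.g. every night at 02:00
begin
  dbms_scheduler.create_job(
    job_name        => 'JOB_GATHER_STATS',
    job_type        => 'STORED_PROCEDURE',
    job_action      => 'SP_GATHER_STATS',
    repeat_interval => 'FREQ=DAILY;BYHOUR=2',
    enabled         => true);
end;
/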

------------------------------------------------------------------------


The table and index statistics are present but the column statistics are gone — how does the plan change?


1. Delete the column statistics

SQL> exec dbms_stats.delete_column_stats(user,'tt','id');


PL/SQL procedure successfully completed.


select * from user_tab_col_statistics where table_name='TT' and column_name='ID'  --returns no rows


Confirm that the table and index statistics are still there:

SQL> select num_rows,avg_row_len,blocks,last_analyzed from user_tables where table_name='TT';


 NUM_ROWS AVG_ROW_LEN     BLOCKS LAST_ANALYZED

---------- ----------- ---------- -------------------

     40945          28        202 2013-08-08 14:49:51


SQL> select blevel,leaf_blocks,distinct_keys,last_analyzed from user_indexes where table_name='TT';


   BLEVEL LEAF_BLOCKS DISTINCT_KEYS LAST_ANALYZED

---------- ----------- ------------- -------------------

         1          80             2 2013-08-08 14:49:51


SQL> select * from tt where id=1;


40944 rows selected.



Execution Plan

----------------------------------------------------------

Plan hash value: 3656862534


--------------------------------------------------------------------------------------

| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT            |        |   409 | 11452 |    41   (0)| 00:00:01 |

|   1 |  TABLE ACCESS BY INDEX ROWID| TT     |   409 | 11452 |    41   (0)| 00:00:01 |

|*  2 |   INDEX RANGE SCAN          | IND_TT |   164 |       |    40   (0)| 00:00:01 |

--------------------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  2 - access("ID"=1)


SQL> select count(1) from tt where id=1;


 COUNT(1)

----------

    40944


SQL> select * from tt where id=99;  --at the current sampling level (2), missing column statistics do not trigger dynamic sampling; at level 3 they would.



Execution Plan

----------------------------------------------------------

Plan hash value: 3656862534


--------------------------------------------------------------------------------------

| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT            |        |   409 | 11452 |    41   (0)| 00:00:01 |

|   1 |  TABLE ACCESS BY INDEX ROWID| TT     |   409 | 11452 |    41   (0)| 00:00:01 |

|*  2 |   INDEX RANGE SCAN          | IND_TT |   164 |       |    40   (0)| 00:00:01 |

--------------------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  2 - access("ID"=99)



SQL> select count(1) from tt where id=99;


 COUNT(1)

----------

           1


Although the table and index statistics are still there, the column statistics are gone (and so, of course, is the histogram). The CBO cannot know how the column values are distributed and cannot produce a correct execution plan.


Change the sampling level to 3:

SQL> show parameter dynamic


NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

optimizer_dynamic_sampling           integer     2

SQL> alter system set optimizer_dynamic_sampling=3 scope=both;


System altered.


SQL> show parameter dynamic


NAME                                 TYPE        VALUE

------------------------------------ ----------- ------------------------------

optimizer_dynamic_sampling           integer     3


SQL> select * from tt where id=99;



Execution Plan

----------------------------------------------------------

Plan hash value: 3656862534


--------------------------------------------------------------------------------------

| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT            |        |     7 |   196 |     2   (0)| 00:00:01 |

|   1 |  TABLE ACCESS BY INDEX ROWID| TT     |     7 |   196 |     2   (0)| 00:00:01 |

|*  2 |   INDEX RANGE SCAN          | IND_TT |     7 |       |     1   (0)| 00:00:01 |

--------------------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  2 - access("ID"=99)


Note

-----

  - dynamic sampling used for this statement


SQL> select * from tt where id=1;


40944 rows selected.



Execution Plan

----------------------------------------------------------

Plan hash value: 264906180


--------------------------------------------------------------------------

| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------

|   0 | SELECT STATEMENT  |      | 40938 |  1119K|    47   (3)| 00:00:01 |

|*  1 |  TABLE ACCESS FULL| TT   | 40938 |  1119K|    47   (3)| 00:00:01 |

--------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  1 - filter("ID"=1)


Note

-----

  - dynamic sampling used for this statement


Conclusion:

with sampling level >= 3, missing column statistics trigger dynamic sampling;

with sampling level <= 2, missing column statistics do not trigger dynamic sampling.
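If changing optimizer_dynamic_sampling for the whole system is not desirable, the same effect can be requested for a single statement with the dynamic_sampling hint; a sketch, with level 3 mirroring the test above:

select /*+ dynamic_sampling(tt 3) */ * from tt where id=99;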


------------------------The clustering factor--------------------------------------


The clustering factor is a property of an index.


select index_name,clustering_factor from user_indexes  --the clustering factor of each index


SQL> select index_name,clustering_factor from user_indexes where index_name='T6_IND';


INDEX_NAME                     CLUSTERING_FACTOR

------------------------------ -----------------

T6_IND                                       376


Definition — the clustering factor describes the relationship between the order of keys in an index and the physical placement of the corresponding rows in the table. When the index keys point at the table blocks in a concentrated way (the factor comes out close to the number of blocks the data occupies), the factor is small and index access is favoured; conversely, when consecutive index keys point at many different blocks because the row order in the table differs greatly from the index order, the factor becomes large.


This metric has a big influence on execution plans. When a plan misbehaves and the cardinality alone does not explain the bad choice, a problematic clustering factor, or a histogram over an unevenly distributed column, is often the cause.
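A common rule of thumb, not from the original notes but consistent with the definition above: compare the clustering factor with the table's block count and row count.

select i.index_name,
       i.clustering_factor,
       t.blocks,        -- factor close to BLOCKS   => rows well clustered for this index
       t.num_rows       -- factor close to NUM_ROWS => rows scattered, index less attractive
from   user_indexes i
join   user_tables  t on t.table_name = i.table_name
where  i.table_name = 'T6';   -- table name reused from the example above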


How is the clustering factor calculated? An index is built on a column, and the clustering factor belongs to the index and therefore also to that column.


The clustering factor is the number of times the scan has to switch to a different data block when the table rows are visited in the order of the index keys.


create or replace function clustering_factor(p_owner in varchar2,p_table_name in varchar2,p_column_name in varchar2) return number is

 l_cursor SYS_REFCURSOR;

 l_clustering_factor integer:=0;

 l_block_nr integer:=0;

 l_file_nr integer:=0;

 l_previous_block_nr integer:=0;

 l_previous_file_nr integer:=0;

begin

 open l_cursor for 'select dbms_rowid.rowid_block_number(rowid) block_nr,'||' dbms_rowid.rowid_to_absolute_fno(rowid,'''||p_owner||''','''||p_table_name||''') file_nr '||' from '||p_owner||'.'||p_table_name||' '||' where '||p_column_name||' is not null '||'order by '||p_column_name;

 loop

   fetch l_cursor into l_block_nr,l_file_nr;

   exit when (l_cursor%notfound);

   if l_block_nr<>l_previous_block_nr or l_file_nr<>l_previous_file_nr then

     l_clustering_factor:=l_clustering_factor+1;

   else

     null;

   end if;

   l_previous_block_nr:=l_block_nr;

   l_previous_file_nr:=l_file_nr;

 end loop;

 close l_cursor;

 return l_clustering_factor;

end;


Verification:


SQL> select index_name,clustering_factor from user_indexes where index_name='T6_IND';


INDEX_NAME                     CLUSTERING_FACTOR

------------------------------ -----------------

T6_IND                                       376


select * from user_ind_columns  --mapping between indexes, tables and columns


SQL> select clustering_factor('HR','T6','OBJECT_ID') from dual;


CLUSTERING_FACTOR('HR','T6','OBJECT_ID')

----------------------------------------

                                    376


SQL> select index_name,clustering_factor from user_indexes where index_name='T3_IND';


INDEX_NAME                     CLUSTERING_FACTOR

------------------------------ -----------------

T3_IND                                       353



SQL> select clustering_factor('HR','T5','OBJECT_ID') from dual;


CLUSTERING_FACTOR('HR','T5','OBJECT_ID')

----------------------------------------

                                    353


Case study: how the clustering factor influences the execution plan.


SQL> drop table t;


Table dropped.


SQL> create table t as select object_id,object_name from all_objects;


Table created.


SQL> create index t_ind on t(object_id);


Index created.


SQL> drop table t1;


Table dropped.


SQL> create table t1 as select * from t where rownum=1;


Table created.


SQL> alter table t1 minimize records_per_block;  --change the table so that each block stores as few rows as possible.


Table altered.


SQL> insert into t1 select * from t;


40944 rows created.


SQL> commit;


Commit complete.


SQL> create index t1_ind on t1(object_id);


Index created.


SQL> exec dbms_stats.gather_table_stats(user,'t',cascade=>true);


PL/SQL procedure successfully completed.


SQL> exec dbms_stats.gather_table_stats(user,'t1',cascade=>true);


PL/SQL procedure successfully completed.


SQL> select table_name,num_rows,blocks from user_tables where table_name in('T','T1');


TABLE_NAME                       NUM_ROWS     BLOCKS

------------------------------ ---------- ----------

T                                   40944        213

T1                                  40945      21320


SQL> select table_name,index_name,num_rows,leaf_blocks,clustering_factor from user_indexes where table_name in('T','T1');


TABLE_NAME                     INDEX_NAME                       NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR

------------------------------ ------------------------------ ---------- ----------- -----------------

T                              T_IND                               40944          91               353

T1                             T1_IND                              40945          91             33469


Conclusion: the more data blocks the rows are spread across, the larger the index clustering factor.


SQL> select * from t where object_id<1000;


64 rows selected.



Execution Plan

----------------------------------------------------------

Plan hash value: 1376202287


-------------------------------------------------------------------------------------

| Id  | Operation                   | Name  | Rows  | Bytes | Cost (%CPU)| Time     |

-------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT            |       |   540 | 16200 |     8   (0)| 00:00:01 |

|   1 |  TABLE ACCESS BY INDEX ROWID| T     |   540 | 16200 |     8   (0)| 00:00:01 |

|*  2 |   INDEX RANGE SCAN          | T_IND |   540 |       |     3   (0)| 00:00:01 |

-------------------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  2 - access("OBJECT_ID"<1000)



Statistics

----------------------------------------------------------

         0  recursive calls

         0  db block gets

        13  consistent gets

         0  physical reads

         0  redo size

      2603  bytes sent via SQL*Net to client

       444  bytes received via SQL*Net from client

         6  SQL*Net roundtrips to/from client

         0  sorts (memory)

         0  sorts (disk)

        64  rows processed


SQL> select * from t1 where object_id<1000;


65 rows selected.



Execution Plan

----------------------------------------------------------

Plan hash value: 2059591622


--------------------------------------------------------------------------------------

| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT            |        |   540 | 16200 |   445   (0)| 00:00:06 |

|   1 |  TABLE ACCESS BY INDEX ROWID| T1     |   540 | 16200 |   445   (0)| 00:00:06 |

|*  2 |   INDEX RANGE SCAN          | T1_IND |   540 |       |     3   (0)| 00:00:01 |

--------------------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  2 - access("OBJECT_ID"<1000)



Statistics

----------------------------------------------------------

         0  recursive calls

         0  db block gets

        42  consistent gets

         0  physical reads

         0  redo size

      2608  bytes sent via SQL*Net to client

       444  bytes received via SQL*Net from client

         6  SQL*Net roundtrips to/from client

         0  sorts (memory)

         0  sorts (disk)

        65  rows processed


Conclusion: the access path is the same, but the index with the larger clustering factor caused far more consistent gets, and its COST is also higher.


If the table does not have to be visited by rowid (the index alone answers the query), the clustering factor no longer matters.


SQL> select object_id from t where object_id<1000;


64 rows selected.



Execution Plan

----------------------------------------------------------

Plan hash value: 422821423


--------------------------------------------------------------------------

| Id  | Operation        | Name  | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------

|   0 | SELECT STATEMENT |       |   540 |  2700 |     3   (0)| 00:00:01 |

|*  1 |  INDEX RANGE SCAN| T_IND |   540 |  2700 |     3   (0)| 00:00:01 |

--------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  1 - access("OBJECT_ID"<1000)



Statistics

----------------------------------------------------------

         1  recursive calls

         0  db block gets

         7  consistent gets

         0  physical reads

         0  redo size

      1339  bytes sent via SQL*Net to client

       444  bytes received via SQL*Net from client

         6  SQL*Net roundtrips to/from client

         0  sorts (memory)

         0  sorts (disk)

        64  rows processed


SQL> select object_id from t1 where object_id<1000;


65 rows selected.



Execution Plan

----------------------------------------------------------

Plan hash value: 2474755989


---------------------------------------------------------------------------

| Id  | Operation        | Name   | Rows  | Bytes | Cost (%CPU)| Time     |

---------------------------------------------------------------------------

|   0 | SELECT STATEMENT |        |   540 |  2700 |     3   (0)| 00:00:01 |

|*  1 |  INDEX RANGE SCAN| T1_IND |   540 |  2700 |     3   (0)| 00:00:01 |

---------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  1 - access("OBJECT_ID"<1000)



Statistics

----------------------------------------------------------

         1  recursive calls

         0  db block gets

         7  consistent gets

         0  physical reads

         0  redo size

      1340  bytes sent via SQL*Net to client

       444  bytes received via SQL*Net from client

         6  SQL*Net roundtrips to/from client

         0  sorts (memory)

         0  sorts (disk)

        65  rows processed



The lower the row density per block, the more readily the optimizer uses the index, but that does not necessarily give the best performance.



----Effect on the clustering factor and the execution plan when the stored order of the column values does or does not match the index key order-----------


SQL> conn hr/hr

Connected.

SQL> create table t_colocated(id number,col2 varchar2(100));


Table created.


SQL> begin

 2  for i in 1..100000 loop

 3  insert into t_colocated values(i,rpad(dbms_random.random,95,'*'));   --the ID values are stored in sorted order

 4  end loop;

 5  end;

 6  /


PL/SQL procedure successfully completed.


SQL> commit;


Commit complete.


SQL> alter table t_colocated add constraint pk_t_colocated primary key(id);


Table altered.


SQL> create table t_disorganized as select id,col2 from t_colocated order by col2; --the ID values end up scattered.


Table created.  --t_disorganized is loaded by cloning the table, ordered by col2


SQL> alter table t_disorganized add constraint pk_t_disorganized primary key(id);


Table altered.


SQL> select table_name,index_name,num_rows,leaf_blocks,clustering_factor from user_indexes where index_name in(upper('pk_t_colocated'),upper('pk_t_disorganized'));


TABLE_NAME                     INDEX_NAME                       NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR

------------------------------ ------------------------------ ---------- ----------- -----------------

T_COLOCATED                    PK_T_COLOCATED                     100000         208              1469

T_DISORGANIZED                 PK_T_DISORGANIZED                  100000         208             99935



SQL> select clustering_factor('HR',upper('t_disorganized'),'ID') from dual;


CLUSTERING_FACTOR('HR',UPPER('T_DISORGANIZED'),'ID')

----------------------------------------------------

                                              99935


SQL> select clustering_factor('HR',upper('t_colocated'),'ID') from dual;


CLUSTERING_FACTOR('HR',UPPER('T_COLOCATED'),'ID')

-------------------------------------------------

                                            1469


With the ID values stored out of order, consecutive key values no longer sit in consecutive blocks; the clustering factor changes, and with it the execution plan.


SQL> set autotrace trace exp

SQL> set linesize 1000

SQL> select * from t_disorganized where id<100;


Execution Plan

----------------------------------------------------------

Plan hash value: 290015569


-------------------------------------------------------------------------------------------------

| Id  | Operation                   | Name              | Rows  | Bytes | Cost (%CPU)| Time     |

-------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT            |                   |    99 |  6435 |   127   (0)| 00:00:02 |

|   1 |  TABLE ACCESS BY INDEX ROWID| T_DISORGANIZED    |    99 |  6435 |   127   (0)| 00:00:02 |

|*  2 |   INDEX RANGE SCAN          | PK_T_DISORGANIZED |    99 |       |     2   (0)| 00:00:01 |

-------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  2 - access("ID"<100)


Note

-----

  - dynamic sampling used for this statement


SQL> select * from t_colocated where id<100;


Execution Plan

----------------------------------------------------------

Plan hash value: 4204525375


----------------------------------------------------------------------------------------------

| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |

----------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT            |                |    99 |  6435 |     4   (0)| 00:00:01 |

|   1 |  TABLE ACCESS BY INDEX ROWID| T_COLOCATED    |    99 |  6435 |     4   (0)| 00:00:01 |

|*  2 |   INDEX RANGE SCAN          | PK_T_COLOCATED |    99 |       |     2   (0)| 00:00:01 |

----------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  2 - access("ID"<100)


Note

-----

  - dynamic sampling used for this statement


-------------------------The dbms_stats package in detail------------------------------------

DBMS_STATS.GATHER_TABLE_STATS (

  ownname          VARCHAR2,

  tabname          VARCHAR2,

  partname         VARCHAR2 DEFAULT NULL,

  estimate_percent NUMBER   DEFAULT to_estimate_percent_type

                                               (get_param('ESTIMATE_PERCENT')),

  block_sample     BOOLEAN  DEFAULT FALSE,

  method_opt       VARCHAR2 DEFAULT get_param('METHOD_OPT'),

  degree           NUMBER   DEFAULT to_degree_type(get_param('DEGREE')),

  granularity      VARCHAR2 DEFAULT GET_PARAM('GRANULARITY'),

  cascade          BOOLEAN  DEFAULT to_cascade_type(get_param('CASCADE')),

  stattab          VARCHAR2 DEFAULT NULL,

  statid           VARCHAR2 DEFAULT NULL,

  statown          VARCHAR2 DEFAULT NULL,

  no_invalidate    BOOLEAN  DEFAULT  to_no_invalidate_type (

                                    get_param('NO_INVALIDATE')),

  force            BOOLEAN DEFAULT FALSE);


degree — the degree of parallelism used for the analysis


Parallelism — several processes complete the task together.


SQL> exec dbms_stats.gather_table_stats(user,'t_disorganized',cascade=>true,degree=>5);  --5 slave processes do the analysis together


PL/SQL procedure successfully completed.


The object itself also carries a degree attribute:

SQL> select table_name,degree from user_tables;


TABLE_NAME                     DEGREE

------------------------------ --------------------

XX2                                     1

TX                                      1

T_COLOCATED                             1

T2                                      1

T3                                      1

T4                                      1

T5                                      1

T6                                      1

EMP                                     1

T7                                      1

T8                                      1

TEST1                                   1

TEST                                    1

T_DISORGANIZED                          1

T1                                      1

TT                                      1

T                                       1

COUNTRIES                               1

JOBS                                    1

LOCATIONS                               1

REGIONS                                 1

JOB_HISTORY                             1

EMPLOYEES                               1

DEPARTMENTS                             1


24 rows selected.


By default the degree on a table is 1, i.e. no parallelism, so the degree parameter has to be specified explicitly when analyzing.


SQL> alter table t_disorganized parallel 5;


Table altered.


SQL> select table_name,degree from user_tables where table_name='T_DISORGANIZED';


TABLE_NAME                     DEGREE

------------------------------ --------------------

T_DISORGANIZED                          5


SQL> exec dbms_stats.gather_table_stats(user,'t_disorganized',cascade=>true);


PL/SQL procedure successfully completed. --no need to specify degree now; the gather runs with the table's default degree of 5.


granularity — possible values:

'ALL' - analyze everything: global, partition and subpartition level.

'AUTO' - the default; Oracle picks the granularity based on the partitioning type.

'DEFAULT' - deprecated in 10g.

'GLOBAL' - global statistics only, no partitions or subpartitions.

'GLOBAL AND PARTITION' - global plus partition statistics, no subpartitions.

'PARTITION' - partition statistics only.

'SUBPARTITION' - subpartition statistics only.


Case study:

SQL> create table ttx(id int) partition by range(id)

 2  (partition p1 values less than(5),

 3  partition p2 values less than(10),

 4  partition p3 values less than(15)

 5  );


Table created.


SQL> select segment_name,partition_name,segment_type from user_segments where segment_name='TTX';


SEGMENT_NAME         PARTITION_NAME                 SEGMENT_TYPE

--------------------------------------------------------------------------------- ------------------------------ ------

TTX                  P1                             TABLE PARTITION

TTX                  P2                             TABLE PARTITION

TTX                  P3                             TABLE PARTITION


Physically the table consists of three segments; the ID value decides which segment a row is stored in.


How the data is stored:

SQL> insert into ttx values(1);


1 row created.


SQL> insert into ttx values(6);


1 row created.


SQL> insert into ttx values(11);


1 row created.


SQL> commit;


Commit complete.


SQL> select * from ttx;


       ID

----------

        1

        6

       11


SQL> select * from ttx partition(p1);


       ID

----------

        1


SQL> select * from ttx partition(p2);


       ID

----------

        6


SQL> select * from ttx partition(p3);


       ID

----------

       11


How is an index stored on a partitioned table?
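(The transcript below queries IND_TTX before showing how it was created; presumably an ordinary, non-partitioned index was built first, along these lines — the statement is assumed, it is not in the original notes:)

create index ind_ttx on ttx(id);   -- global (non-partitioned) index on the partitioned table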


SQL> select segment_name,partition_name,segment_type from user_segments where segment_name='IND_TTX';   --global index


SEGMENT_NAME        PARTITION_NAME                 SEGMENT_TYPE

--------------------------------------------------------------------------------- --------------------------

IND_TTX                                            INDEX



SQL> drop index ind_ttx;


Index dropped.


SQL> create index ind_ttx on ttx(id) local;


Index created.


SQL> select segment_name,partition_name,segment_type from user_segments where segment_name='IND_TTX';


SEGMENT_NAME      PARTITION_NAME                 SEGMENT_TYPE

--------------------------------------------------------------------------------- ----------------------------

IND_TTX           P1                             INDEX PARTITION

IND_TTX           P2                             INDEX PARTITION

IND_TTX           P3                             INDEX PARTITION


A partitioned (local) index: the index is partitioned the same way as the table, so there are 3 segments.


Analyze the table:

SQL> exec dbms_stats.gather_table_stats(user,'ttx',cascade=>true);


PL/SQL procedure successfully completed.


Confirm that both the table-level and the partition-level statistics exist:


Global statistics of the table:

SQL> select num_rows,avg_row_len,blocks,last_analyzed from user_tables where table_name='TTX';


 NUM_ROWS AVG_ROW_LEN     BLOCKS LAST_ANALYZED

---------- ----------- ---------- -------------------

        3           3         15 2013-08-09 10:07:59


Partition-level statistics of the table:

SQL> select partition_name,num_rows,avg_row_len,blocks,last_analyzed from user_tab_partitions where table_name='TTX';


PARTITION_NAME                   NUM_ROWS AVG_ROW_LEN     BLOCKS LAST_ANALYZED

------------------------------ ---------- ----------- ---------- -------------------

P1                                      1           3          5 2013-08-09 10:07:59

P2                                      1           3          5 2013-08-09 10:07:59

P3                                      1           3          5 2013-08-09 10:07:59


Each partition holds 1 row; each partition segment has one extent, and each extent has 8 blocks:

select * from user_extents where partition_name='P1'

So why do the statistics report 5 blocks? The first three blocks of the segment are bitmap (space-management) blocks.


Confirm that the index and index-partition statistics exist as well.

SQL> select blevel,leaf_blocks,distinct_keys,last_analyzed from user_indexes where table_name='TTX';


   BLEVEL LEAF_BLOCKS DISTINCT_KEYS LAST_ANALYZED

---------- ----------- ------------- -------------------

        0           3             3 2013-08-09 10:07:59



SQL> select partition_name,num_rows,blevel,leaf_blocks,last_analyzed from user_ind_partitions where index_name='IND_TTX';


PARTITION_NAME                   NUM_ROWS     BLEVEL LEAF_BLOCKS LAST_ANALYZED

------------------------------ ---------- ---------- ----------- -------------------

P1                                      1          0           1 2013-08-09 10:07:59

P2                                      1          0           1 2013-08-09 10:07:59

P3                                      1          0           1 2013-08-09 10:07:59


Add a partition to the table

SQL> alter table ttx add partition pm values less than(maxvalue);


Table altered.


Load data into the new partition


SQL> begin

 2  for i in 1..10000 loop

 3  insert into ttx values(16);

 4  end loop;

 5  end;

 6  /


PL/SQL procedure successfully completed.


SQL> commit;


Commit complete.


SQL> select count(1) from ttx partition(pm);


 COUNT(1)

----------

    10000


Look at the statistics of the partitioned index:

SQL> select partition_name,num_rows,blevel,leaf_blocks,last_analyzed from user_ind_partitions where index_name='IND_TTX';


PARTITION_NAME                   NUM_ROWS     BLEVEL LEAF_BLOCKS LAST_ANALYZED

------------------------------ ---------- ---------- ----------- -------------------

P1                                      1          0           1 2013-08-09 10:07:59

P2                                      1          0           1 2013-08-09 10:07:59

P3                                      1          0           1 2013-08-09 10:07:59

PM   --the newly added partition has no statistics


Look at the table's global statistics

SQL> select num_rows,avg_row_len,blocks,last_analyzed from user_tables where table_name='TTX';


 NUM_ROWS AVG_ROW_LEN     BLOCKS LAST_ANALYZED

---------- ----------- ---------- -------------------

        3           3         15 2013-08-09 10:07:59    --the global statistics have not changed


Impact on the execution plan:

SQL> select * from ttx where id=16;


Execution Plan

----------------------------------------------------------

Plan hash value: 3737425109


--------------------------------------------------------------------------------------------------

| Id  | Operation              | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |

--------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT       |         |     1 |     3 |     1   (0)| 00:00:01 |       |       |

|   1 |  PARTITION RANGE SINGLE|         |     1 |     3 |     1   (0)| 00:00:01 |     4 |     4 |

|*  2 |   INDEX RANGE SCAN     | IND_TTX |     1 |     3 |     1   (0)| 00:00:01 |     4 |     4 |

--------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  2 - access("ID"=16)


Pstart — the first partition the query reads

Pstop — the last partition the query reads


The ROWS estimate is badly wrong: the newly added partition was never analyzed, so the CBO made a wrong judgement.


A. Analyze only the partition, not the global level

SQL> exec dbms_stats.gather_table_stats(user,'ttx',partname=>'pm',estimate_percent=>100,granularity=>'partition');


PL/SQL procedure successfully completed.


Check the partition statistics:


SQL> select partition_name,num_rows,blevel,leaf_blocks,last_analyzed from user_ind_partitions where index_name='IND_TTX';


PARTITION_NAME                   NUM_ROWS     BLEVEL LEAF_BLOCKS LAST_ANALYZED

------------------------------ ---------- ---------- ----------- -------------------

P1                                      1          0           1 2013-08-09 10:07:59

P2                                      1          0           1 2013-08-09 10:07:59

P3                                      1          0           1 2013-08-09 10:07:59

PM                                  10000          1          27 2013-08-09 10:23:32


SQL> select partition_name,num_rows,avg_row_len,blocks,last_analyzed from user_tab_partitions where table_name='TTX';


PARTITION_NAME                   NUM_ROWS AVG_ROW_LEN     BLOCKS LAST_ANALYZED

------------------------------ ---------- ----------- ---------- -------------------

P1                                      1           3          5 2013-08-09 10:07:59

P2                                      1           3          5 2013-08-09 10:07:59

P3                                      1           3          5 2013-08-09 10:07:59

PM                                  10000           3         20 2013-08-09 10:23:32


Confirm that the global statistics were not refreshed:

SQL> select num_rows,avg_row_len,blocks,last_analyzed from user_tables where table_name='TTX';


 NUM_ROWS AVG_ROW_LEN     BLOCKS LAST_ANALYZED

---------- ----------- ---------- -------------------

        3           3         15 2013-08-09 10:07:59


SQL> select blevel,leaf_blocks,distinct_keys,last_analyzed from user_indexes where table_name='TTX';


   BLEVEL LEAF_BLOCKS DISTINCT_KEYS LAST_ANALYZED

---------- ----------- ------------- -------------------

        0           3             3 2013-08-09 10:07:59


SQL> select * from ttx where id=16;  --the execution plan is correct now


Execution Plan

----------------------------------------------------------

Plan hash value: 701592076


-----------------------------------------------------------------------------------------------

| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |

-----------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT       |      |  9999 | 29997 |     6   (0)| 00:00:01 |       |       |

|   1 |  PARTITION RANGE SINGLE|      |  9999 | 29997 |     6   (0)| 00:00:01 |     4 |     4 |

|*  2 |   TABLE ACCESS FULL    | TTX  |  9999 | 29997 |     6   (0)| 00:00:01 |     4 |     4 |

-----------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  2 - filter("ID"=16)


If the global level is never analyzed, under what circumstances does the execution plan go wrong?


Make a change that affects the data across the whole table:

SQL> alter table ttx add object_name varchar2(20);


Table altered.


SQL> update ttx set object_name='AAAA';


10003 rows updated.


SQL> commit;


Commit complete.


SQL> create index ind2_ttx on ttx(object_name);


Index created.


SQL> exec dbms_stats.gather_table_stats(user,'ttx',partname=>'pm',estimate_percent=>100,granularity=>'partition');


PL/SQL procedure successfully completed. --only the partition is analyzed, not the global level


SQL> select * from ttx where id=16;  --a query on the ID column is still fine


Execution Plan

----------------------------------------------------------

Plan hash value: 701592076


-----------------------------------------------------------------------------------------------

| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |

-----------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT       |      |  9999 | 79992 |    11   (0)| 00:00:01 |       |       |

|   1 |  PARTITION RANGE SINGLE|      |  9999 | 79992 |    11   (0)| 00:00:01 |     4 |     4 |

|*  2 |   TABLE ACCESS FULL    | TTX  |  9999 | 79992 |    11   (0)| 00:00:01 |     4 |     4 |

-----------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  2 - filter("ID"=16)


SQL> select * from ttx where object_name='AAAA';  --but a query on the OBJECT_NAME column goes wrong.


Execution Plan

----------------------------------------------------------

Plan hash value: 1144724227


--------------------------------------------------------------------------------------------

| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |

--------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT    |      |     3 |     9 |     5   (0)| 00:00:01 |       |       |

|   1 |  PARTITION RANGE ALL|      |     3 |     9 |     5   (0)| 00:00:01 |     1 |     4 | --scans partitions 1 through 4

|*  2 |   TABLE ACCESS FULL | TTX  |     3 |     9 |     5   (0)| 00:00:01 |     1 |     4 |

--------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  2 - filter("OBJECT_NAME"='AAAA')


Note

-----

  - dynamic sampling used for this statement


When only the partition is analyzed and the global level is not, the plan is fine as long as the query touches just that partition; once the query spans the whole table, the plan is wrong.


SQL> exec dbms_stats.gather_table_stats(user,'ttx',estimate_percent=>100,granularity=>'global');  --global analysis


PL/SQL procedure successfully completed.


SQL> select num_rows,avg_row_len,blocks,last_analyzed from user_tables where table_name='TTX';


 NUM_ROWS AVG_ROW_LEN     BLOCKS LAST_ANALYZED

---------- ----------- ---------- -------------------

    10003           8         58 2013-08-09 10:31:14


SQL> select blevel,leaf_blocks,distinct_keys,last_analyzed from user_indexes where table_name='TTX';


   BLEVEL LEAF_BLOCKS DISTINCT_KEYS LAST_ANALYZED

---------- ----------- ------------- -------------------

        1          30             5 2013-08-09 10:31:14

        1          28             1 2013-08-09 10:31:14


SQL> select * from ttx where object_name='AAAA';


Execution Plan

----------------------------------------------------------

Plan hash value: 1144724227


--------------------------------------------------------------------------------------------

| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |

--------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT    |      | 10003 | 80024 |    14   (0)| 00:00:01 |       |       |

|   1 |  PARTITION RANGE ALL|      | 10003 | 80024 |    14   (0)| 00:00:01 |     1 |     4 |

|*  2 |   TABLE ACCESS FULL | TTX  | 10003 | 80024 |    14   (0)| 00:00:01 |     1 |     4 |

--------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  2 - filter("OBJECT_NAME"='AAAA')


Summary: even when partition-level statistics exist, if the global level has not been analyzed and a query touches global data whose content or structure has changed since the last analysis, the execution plan is still wrong.


B. Analyze only the global level, not the partitions


Delete the data in the last partition

SQL> delete from ttx where id>15;


10000 rows deleted.


SQL> commit;


Commit complete.


First analyze both the partitions and the global level


SQL> exec dbms_stats.gather_table_stats(user,'ttx',estimate_percent=>100,cascade=>true);


PL/SQL procedure successfully completed.


SQL> select num_rows,avg_row_len,blocks,last_analyzed from user_tables where table_name='TTX';


 NUM_ROWS AVG_ROW_LEN     BLOCKS LAST_ANALYZED

---------- ----------- ---------- -------------------

        3           8         58 2013-08-09 10:35:43


SQL> select partition_name,num_rows,avg_row_len,blocks,last_analyzed from user_tab_partitions where table_name='TTX';


PARTITION_NAME                   NUM_ROWS AVG_ROW_LEN     BLOCKS LAST_ANALYZED

------------------------------ ---------- ----------- ---------- -------------------

P1                                      1           8          5 2013-08-09 10:35:43

P2                                      1           8          5 2013-08-09 10:35:43

P3                                      1           8          5 2013-08-09 10:35:43

PM                                      0           0         43 2013-08-09 10:35:43


Load data into the fourth partition


SQL> begin

 2  for i in 1..10000 loop

 3  insert into ttx values(16,'');

 4  end loop;

 5  end;

 6  /


PL/SQL procedure successfully completed.


SQL> commit;


Commit complete.


Make the data heavily skewed

SQL> update ttx set id=1000 where id=16 and rownum=1;


1 row updated.


SQL> commit;


Commit complete.


Analyze only the global level, not the partitions, and without histograms


SQL> exec dbms_stats.gather_table_stats(user,'ttx',estimate_percent=>100,granularity=>'global',method_opt=>'for all columns size 1',cascade=>true);


PL/SQL procedure successfully completed.


Confirm that the global statistics are fresh while the partitions were not re-analyzed:


SQL> select num_rows,avg_row_len,blocks,last_analyzed from user_tables where table_name='TTX';


 NUM_ROWS AVG_ROW_LEN     BLOCKS LAST_ANALYZED

---------- ----------- ---------- -------------------

    10003           4         58 2013-08-09 10:41:08


SQL> select blevel,leaf_blocks,distinct_keys,last_analyzed from user_indexes where table_name='TTX';


   BLEVEL LEAF_BLOCKS DISTINCT_KEYS LAST_ANALYZED

---------- ----------- ------------- -------------------

        1          31             5 2013-08-09 10:41:08

        1           1             1 2013-08-09 10:41:08


SQL>  select partition_name,num_rows,avg_row_len,blocks,last_analyzed from user_tab_partitions where table_name='TTX';


PARTITION_NAME                   NUM_ROWS AVG_ROW_LEN     BLOCKS LAST_ANALYZED

------------------------------ ---------- ----------- ---------- -------------------

P1                                      1           8          5 2013-08-09 10:35:43

P2                                      1           8          5 2013-08-09 10:35:43

P3                                      1           8          5 2013-08-09 10:35:43

PM                                      0           0         43 2013-08-09 10:35:43


SQL> select partition_name,num_rows,blevel,leaf_blocks,last_analyzed from user_ind_partitions where index_name='IND_TTX';


PARTITION_NAME                   NUM_ROWS     BLEVEL LEAF_BLOCKS LAST_ANALYZED

------------------------------ ---------- ---------- ----------- -------------------

P1                                      1          0           1 2013-08-09 10:35:43

P2                                      1          0           1 2013-08-09 10:35:43

P3                                      1          0           1 2013-08-09 10:35:43

PM                                      0          1           0 2013-08-09 10:35:43


SQL> select * from ttx where id=16;


Execution Plan

----------------------------------------------------------

Plan hash value: 2572159449


--------------------------------------------------------------------------------------------------------------

| Id  | Operation                          | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |

--------------------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT                   |         |     1 |    25 |     1   (0)| 00:00:01 |       |       |

|   1 |  PARTITION RANGE SINGLE            |         |     1 |    25 |     1   (0)| 00:00:01 |     4 |     4 |

|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID| TTX     |     1 |    25 |     1   (0)| 00:00:01 |     4 |     4 |

|*  3 |    INDEX RANGE SCAN                | IND_TTX |     1 |       |     1   (0)| 00:00:01 |     4 |     4 |

--------------------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  3 - access("ID"=16)


Now add histogram information.


SQL> exec dbms_stats.gather_table_stats(user,'ttx',estimate_percent=>100,granularity=>'global',method_opt=>'for all columns size skewonly',cascade=>true);


PL/SQL procedure successfully completed.


select * from user_tab_col_statistics where table_name='TTX'  --confirm the histograms are now there.


SQL> select * from ttx where id=16;


Execution Plan

----------------------------------------------------------

Plan hash value: 2572159449


--------------------------------------------------------------------------------------------------------------

| Id  | Operation                          | Name    | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |

--------------------------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT                   |         |     1 |    25 |     1   (0)| 00:00:01 |       |       |

|   1 |  PARTITION RANGE SINGLE            |         |     1 |    25 |     1   (0)| 00:00:01 |     4 |     4 |

|   2 |   TABLE ACCESS BY LOCAL INDEX ROWID| TTX     |     1 |    25 |     1   (0)| 00:00:01 |     4 |     4 |

|*  3 |    INDEX RANGE SCAN                | IND_TTX |     1 |       |     1   (0)| 00:00:01 |     4 |     4 |

--------------------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  3 - access("ID"=16)


The execution plan is still wrong, which shows the histogram is not the main cause; the real cause is the missing partition statistics.


SQL> exec dbms_stats.gather_table_stats(user,'ttx',estimate_percent=>100,granularity=>'partition',method_opt=>'for all columns size 1',cascade=>true);


PL/SQL procedure successfully completed.  --analyze the partitions but do not gather histograms.


SQL> select partition_name,num_rows,avg_row_len,blocks,last_analyzed from user_tab_partitions where table_name='TTX';


PARTITION_NAME                   NUM_ROWS AVG_ROW_LEN     BLOCKS LAST_ANALYZED

------------------------------ ---------- ----------- ---------- -------------------

P1                                      1           8          5 2013-08-09 10:46:33

P2                                      1           8          5 2013-08-09 10:46:33

P3                                      1           8          5 2013-08-09 10:46:33

PM                                  10000           3         43 2013-08-09 10:46:33


SQL> select partition_name,num_rows,blevel,leaf_blocks,last_analyzed from user_ind_partitions where index_name='IND_TTX';


PARTITION_NAME                   NUM_ROWS     BLEVEL LEAF_BLOCKS LAST_ANALYZED

------------------------------ ---------- ---------- ----------- -------------------

P1                                      1          0           1 2013-08-09 10:46:33

P2                                      1          0           1 2013-08-09 10:46:33

P3                                      1          0           1 2013-08-09 10:46:33

PM                                  10000          1          28 2013-08-09 10:46:33


SQL> select * from ttx where id=16;


Execution Plan

----------------------------------------------------------

Plan hash value: 701592076


-----------------------------------------------------------------------------------------------

| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |

-----------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT       |      |  5000 | 15000 |    11   (0)| 00:00:01 |       |       |

|   1 |  PARTITION RANGE SINGLE|      |  5000 | 15000 |    11   (0)| 00:00:01 |     4 |     4 |

|*  2 |   TABLE ACCESS FULL    | TTX  |  5000 | 15000 |    11   (0)| 00:00:01 |     4 |     4 |

-----------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  2 - filter("ID"=16)


With only one bucket the CBO sees a partition with 10000 rows, a minimum of 16, a maximum of 1000 and two distinct values; assuming a uniform distribution it estimates 10000 * 1/2 = 5000 rows for each value.


SQL> exec dbms_stats.gather_table_stats(user,'ttx',estimate_percent=>100,granularity=>'partition',method_opt=>'for all columns size skewonly',cascade=>true);  --re-gather, this time with histograms


PL/SQL procedure successfully completed.


SQL> select * from ttx where id=16;


Execution Plan

----------------------------------------------------------

Plan hash value: 701592076


-----------------------------------------------------------------------------------------------

| Id  | Operation              | Name | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |

-----------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT       |      |  9999 | 29997 |    11   (0)| 00:00:01 |       |       |

|   1 |  PARTITION RANGE SINGLE|      |  9999 | 29997 |    11   (0)| 00:00:01 |     4 |     4 |

|*  2 |   TABLE ACCESS FULL    | TTX  |  9999 | 29997 |    11   (0)| 00:00:01 |     4 |     4 |

-----------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):

---------------------------------------------------


  2 - filter("ID"=16)


Both the global level and the partitions must be analyzed to get an accurate execution plan. The example above is deliberately extreme and magnifies every possible problem, so the effect of the tuning is very visible.


In a real environment, however, system load may not allow a global analysis of large tables to run all the time, so what strategy should be adopted?



Suggested strategy:

1. If all SQL statements operate within single partitions, there is no need to analyze the global level; analyzing the newly added partition is enough. If some SQL crosses partitions, look at how large a share of the data the new partition holds: if it is only a small fraction, the global analysis can be skipped; if the new partition is a large fraction, both a global analysis and a partition analysis are needed.

2. If the database is not large to begin with, doing both global and partition analysis is perfectly acceptable.

3. If the column values are evenly distributed, consider dropping histogram collection to reduce the cost of the global analysis.

------------------------------------------------------------------------

Ways of saving statistics with dbms_stats:


A. Create a table to hold the statistics


SQL> exec dbms_stats.create_stat_table(user,'stat_tab','mytbs3');


PL/SQL procedure successfully completed.


SQL> set linesize 100

SQL> desc stat_tab  --this table is just a transfer/dump target for statistics; the meaning of its columns does not matter

Name                                                  Null?    Type

----------------------------------------------------- -------- ------------------------------------

STATID                                                         VARCHAR2(30)

TYPE                                                           CHAR(1)

VERSION                                                        NUMBER

FLAGS                                                          NUMBER

C1                                                             VARCHAR2(30)

C2                                                             VARCHAR2(30)

C3                                                             VARCHAR2(30)

C4                                                             VARCHAR2(30)

C5                                                             VARCHAR2(30)

N1                                                             NUMBER

N2                                                             NUMBER

N3                                                             NUMBER

N4                                                             NUMBER

N5                                                             NUMBER

N6                                                             NUMBER

N7                                                             NUMBER

N8                                                             NUMBER

N9                                                             NUMBER

N10                                                            NUMBER

N11                                                            NUMBER

N12                                                            NUMBER

D1                                                             DATE

R1                                                             RAW(32)

R2                                                             RAW(32)

CH1                                                            VARCHAR2(1000)


B. Dropping that table.


SQL> exec dbms_stats.drop_stat_table(user,'stat_tab');


PL/SQL procedure successfully completed.


SQL> desc stat_tab

ERROR:

ORA-04043: object stat_tab does not exist


SQL> exec dbms_stats.create_stat_table(user,'stat_tab','mytbs3');


PL/SQL procedure successfully completed.


C. Saving the data — dump the statistics into this table at the same time as they are gathered.


Scenario A: dump the statistics into the stat table while gathering them.


SQL> exec dbms_stats.gather_table_stats(user,'ttx',stattab=>'stat_tab',cascade=>true);


PL/SQL procedure successfully completed.


SQL> select count(1) from stat_tab;


 COUNT(1)

----------

       26


Scenario B: the data dictionary already holds statistics and re-gathering is not possible right now; the existing statistics can be exported into this (already created) table.


SQL> truncate table stat_tab;


Table truncated.


SQL> exec dbms_stats.export_table_stats(user,'ttx',stattab=>'stat_tab',cascade=>true);


PL/SQL procedure successfully completed.


SQL> select count(1) from stat_tab;


 COUNT(1)

----------

       26


One stattab per schema is enough; the statistics of every table, index and column can all be stored in this one table (a statid sketch follows).
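If several snapshots of the same object need to coexist in that one table, the statid parameter (visible in the GATHER_TABLE_STATS spec above and stored in the STATID column of stat_tab) can label each set; a sketch, where the label 'BEFORE_RELEASE' is purely illustrative:

exec dbms_stats.export_table_stats(user,'ttx',stattab=>'stat_tab',statid=>'BEFORE_RELEASE',cascade=>true);

exec dbms_stats.import_table_stats(user,'ttx',stattab=>'stat_tab',statid=>'BEFORE_RELEASE',cascade=>true);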


Scenario C: the statistics of table TTX were deleted by mistake; restore them from the data saved in this table


SQL> exec dbms_stats.delete_table_stats(user,'ttx');


PL/SQL procedure successfully completed.


SQL> select num_rows,avg_row_len,blocks,last_analyzed from user_tables where table_name='TTX';


 NUM_ROWS AVG_ROW_LEN     BLOCKS LAST_ANALYZED

---------- ----------- ---------- -------------------


Restore the statistics

SQL> exec dbms_stats.import_table_stats(user,'ttx',stattab=>'stat_tab',cascade=>true);


PL/SQL procedure successfully completed.


SQL> select num_rows,avg_row_len,blocks,last_analyzed from user_tables where table_name='TTX';


 NUM_ROWS AVG_ROW_LEN     BLOCKS LAST_ANALYZED

---------- ----------- ---------- -------------------

    10003           4         58 2013-08-09 11:25:43


Handling of the optimizer statistics during data migration or data splitting (a sketch follows).
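A minimal sketch of carrying statistics along with a migration, assuming STAT_TAB itself is moved to the target database by ordinary export/import (exp/imp or Data Pump); the table names are reused from the examples above:

-- on the source database: dump the dictionary statistics into the transport table
exec dbms_stats.export_table_stats(user,'ttx',stattab=>'stat_tab',cascade=>true);

-- move STAT_TAB to the target database, then on the target:
exec dbms_stats.import_table_stats(user,'ttx',stattab=>'stat_tab',cascade=>true);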


--------------------------------

Locking statistics

Once an execution plan has been tuned and is stable, you may want the statistics preserved and protected from being changed by other users; the statistics can be locked.


SQL> exec dbms_stats.lock_table_stats(user,'ttx');


PL/SQL procedure successfully completed.


Trying to analyze the table again now raises an error.


SQL> exec dbms_stats.gather_table_stats(user,'ttx',cascade=>true);

BEGIN dbms_stats.gather_table_stats(user,'ttx',cascade=>true); END;


*

ERROR at line 1:

ORA-20005: object statistics are locked (stattype = ALL)

ORA-06512: at "SYS.DBMS_STATS", line 13437

ORA-06512: at "SYS.DBMS_STATS", line 13457

ORA-06512: at line 1


Ways around it:

1. Unlock the statistics

SQL> exec dbms_stats.unlock_table_stats(user,'ttx');


PL/SQL procedure successfully completed.


SQL> exec dbms_stats.gather_table_stats(user,'ttx',cascade=>true);


PL/SQL procedure successfully completed.


2. Force an overwrite

SQL> exec dbms_stats.gather_table_stats(user,'ttx',cascade=>true,force=>true);


PL/SQL procedure successfully completed.

----------------------------------------------

Setting statistics manually


When to use it:

1. When the relevant statistics are wrong and ruin the execution plan, or when gathering statistics is not allowed at the moment, the statistics can be set by hand.

2. In a test or development environment, this is a way to mimic production statistics while debugging execution plans.


Purpose: debugging execution plans.


Setting table statistics

DBMS_STATS.SET_TABLE_STATS (

  ownname       VARCHAR2,

  tabname       VARCHAR2,

  partname      VARCHAR2 DEFAULT NULL,

  stattab       VARCHAR2 DEFAULT NULL,

  statid        VARCHAR2 DEFAULT NULL,

  numrows       NUMBER   DEFAULT NULL,

  numblks       NUMBER   DEFAULT NULL,

  avgrlen       NUMBER   DEFAULT NULL,

  flags         NUMBER   DEFAULT NULL,

  statown       VARCHAR2 DEFAULT NULL,

  no_invalidate BOOLEAN  DEFAULT to_no_invalidate_type (

                                    get_param('NO_INVALIDATE')),

  cachedblk     NUMBER    DEFAULT NULL,

  cachehit      NUMBER    DEFAULT NULL,

  force         BOOLEAN   DEFAULT FALSE);


Setting column statistics

DBMS_STATS.SET_COLUMN_STATS (

  ownname       VARCHAR2,

  tabname       VARCHAR2,

  colname       VARCHAR2,

  partname      VARCHAR2 DEFAULT NULL,

  stattab       VARCHAR2 DEFAULT NULL,

  statid        VARCHAR2 DEFAULT NULL,

  distcnt       NUMBER DEFAULT NULL,

  density       NUMBER DEFAULT NULL,

  nullcnt       NUMBER DEFAULT NULL,

  srec          StatRec DEFAULT NULL,

  avgclen       NUMBER DEFAULT NULL,

  flags         NUMBER DEFAULT NULL,

  statown       VARCHAR2 DEFAULT NULL,

  no_invalidate BOOLEAN DEFAULT to_no_invalidate_type(

                                   get_param('NO_INVALIDATE')),

  force         BOOLEAN DEFAULT FALSE);


Setting index statistics

DBMS_STATS.SET_INDEX_STATS (

  ownname       VARCHAR2,

  indname       VARCHAR2,

  partname      VARCHAR2  DEFAULT NULL,

  stattab       VARCHAR2  DEFAULT NULL,

  statid        VARCHAR2  DEFAULT NULL,

  numrows       NUMBER    DEFAULT NULL,

  numlblks      NUMBER    DEFAULT NULL,

  numdist       NUMBER    DEFAULT NULL,

  avglblk       NUMBER    DEFAULT NULL,

  avgdblk       NUMBER    DEFAULT NULL,

  clstfct       NUMBER    DEFAULT NULL,

  indlevel      NUMBER    DEFAULT NULL,

  flags         NUMBER    DEFAULT NULL,

  statown       VARCHAR2  DEFAULT NULL,

  no_invalidate BOOLEAN   DEFAULT to_no_invalidate_type(

                                   get_param('NO_INVALIDATE')),

  guessq        NUMBER    DEFAULT NULL,

  cachedblk     NUMBER    DEFAULT NULL,

  cachehit      NUMBER    DEFAULT NULL,

  force         BOOLEAN   DEFAULT FALSE);
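The two specs above are used the same way as the set_table_stats call demonstrated below; a minimal sketch with purely illustrative numbers (the index name t_ind is reused from the earlier clustering-factor example and assumed to exist):

-- pretend the OBJECT_ID column of T has 1,000,000 distinct values and no NULLs
exec dbms_stats.set_column_stats(user,'t','object_id',distcnt=>1000000,density=>1/1000000,nullcnt=>0);

-- pretend T_IND has 10,000,000 rows, 5000 leaf blocks, blevel 2 and a very high clustering factor
exec dbms_stats.set_index_stats(user,'t_ind',numrows=>10000000,numlblks=>5000,numdist=>1000000,clstfct=>9000000,indlevel=>2);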



SQL> drop table t;


Table dropped.


SQL> create table t as select * from all_objects;


Table created.


SQL> exec dbms_stats.gather_table_stats(user,'t');


PL/SQL procedure successfully completed.


SQL> select * from t;  --the test environment


Execution Plan

----------------------------------------------------------

Plan hash value: 1601196873


--------------------------------------------------------------------------

| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------

|   0 | SELECT STATEMENT  |      | 40961 |  3800K|   134   (3)| 00:00:02 |

|   1 |  TABLE ACCESS FULL| T    | 40961 |  3800K|   134   (3)| 00:00:02 |

--------------------------------------------------------------------------


But on the production machine this table has 10 million rows.


SQL> exec dbms_stats.set_table_stats(user,'t',numrows=>10000000,numblks=>1000000,avgrlen=>178);


PL/SQL procedure successfully completed.


SQL> select * from t;


Execution Plan

----------------------------------------------------------

Plan hash value: 1601196873


--------------------------------------------------------------------------

| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |

--------------------------------------------------------------------------

|   0 | SELECT STATEMENT  |      |    10M|   953M|   220K  (1)| 00:44:04 |

|   1 |  TABLE ACCESS FULL| T    |    10M|   953M|   220K  (1)| 00:44:04 |

--------------------------------------------------------------------------


-----------------------------------------------

------------------------

Fundamentals of table joins













