Create a table c1 with 10000 rows:
create table c1 as select * from dba_objects where rownum <10001;
exec dbms_stats.gather_table_stats(user,'c1',estimate_percent=> 1);
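Intuitively, sample_size should be about 10000 * 1% = 100 rows.
In practice:
select sample_size from user_tables where table_name ='C1';
The result usually falls somewhere around 4400-5900.
For a table with 500000 rows, the sample size is also around 5000.
Why is that?
Turn on SQL_TRACE and gather the statistics again: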
alter session set sql_trace=true;
exec dbms_stats.gather_table_stats(user,'c1',estimate_percent=> 1);
alter session set sql_trace=false;
select sample_size from user_tables where table_name ='C1';
4725
select value from v$diag_info where name='Default Trace File';
E:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\orcl_ora_23220.trc
This is where the trace file is written. Format it with tkprof:
cd E:\APP\ADMINISTRATOR\diag\rdbms\orcl\orcl\trace\
e:
tkprof orcl_ora_23220.trc a.trc
Open a.trc and find the following section:
select /*+ no_parallel(t) no_parallel_index(t) dbms_stats
cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
no_substrb_pad */count(*), count("OBJECT_ID"), count(distinct "OBJECT_ID"),
sum(sys_op_opnsize("OBJECT_ID")), substrb(dump(min("OBJECT_ID"),16,0,32),1,
120), substrb(dump(max("OBJECT_ID"),16,0,32),1,120)
from
"TEST"."C1" sample ( 1.0000000000) t
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.01 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.00 0.00 10 12 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.01 0.01 10 12 0 1
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 90 (recursive depth: 1)
Number of plan statistics captured: 1
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
1 1 1 SORT AGGREGATE (cr=12 pr=10 pw=0 time=5504 us)
64 64 64 VIEW VW_DAG_0 (cr=12 pr=10 pw=0 time=4775 us cost=5 size=4680 card=60)
64 64 64 HASH GROUP BY (cr=12 pr=10 pw=0 time=4702 us cost=5 size=240 card=60)
64 64 64 TABLE ACCESS SAMPLE C1 (cr=12 pr=10 pw=0 time=1031 us cost=4 size=240 card=60)
********************************************************************************
SQL ID: 50zv2cg4980b7 Plan Hash: 2789923169
select /*+ no_parallel(t) no_parallel_index(t) dbms_stats
cursor_sharing_exact use_weak_name_resl dynamic_sampling(0) no_monitoring
no_substrb_pad */count(*), count("OBJECT_ID"), count(distinct "OBJECT_ID"),
sum(sys_op_opnsize("OBJECT_ID")), substrb(dump(min("OBJECT_ID"),16,0,32),1,
120), substrb(dump(max("OBJECT_ID"),16,0,32),1,120)
from
"TEST"."C1" sample ( 78.1250000000) t
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 2 0.04 0.03 0 12 0 1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 4 0.04 0.04 0 12 0 1
Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 90 (recursive depth: 1)
Number of plan statistics captured: 1
Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
1 1 1 SORT AGGREGATE (cr=12 pr=0 pw=0 time=36494 us)
4725 4725 4725 VIEW VW_DAG_0 (cr=12 pr=0 pw=0 time=47839 us cost=5 size=365040 card=4680)
4725 4725 4725 HASH GROUP BY (cr=12 pr=0 pw=0 time=29064 us cost=5 size=18720 card=4680)
4725 4725 4725 TABLE ACCESS SAMPLE C1 (cr=12 pr=0 pw=0 time=9405 us cost=4 size=18720 card=4680)
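The 4725 in the last row source line is the sample size obtained by this gather.
We can see that Oracle first samples at the requested 1% and counts the rows, finding 64.
It then samples again at 78.125%, which returns exactly 4725 rows, the sample_size we saw earlier.
So why 78.125?
64*78.125=5000
The answer is that Oracle divides 5000 by the row count returned by the first 1% pass: 5000/64 = 78.125.
Does this mean Oracle simply has 5000 built in as a target?
Repeating the test with a different table and a different percentage, the same relationship still holds.
In other words, if Oracle finds that the requested percentage yields fewer than 5000 rows, it automatically raises the sampling percentage so that the sample comes to roughly 5000 rows.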
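
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the relationship inferred from the trace, not Oracle's actual implementation: the function name adjusted_sample_percent and the constant MIN_SAMPLE_ROWS are hypothetical, the 5000-row floor is a value read off the trace rather than a documented constant, and the 100% cap is an assumption.

```python
# Hypothetical sketch of the resampling rule inferred from the trace.
# MIN_SAMPLE_ROWS is the floor observed in this test, not a documented constant.
MIN_SAMPLE_ROWS = 5000


def adjusted_sample_percent(requested_pct, rows_in_first_pass,
                            min_rows=MIN_SAMPLE_ROWS):
    """Return the percentage a second sampling pass would need to use."""
    if requested_pct <= 0 or rows_in_first_pass <= 0:
        raise ValueError("percentage and row count must be positive")
    # The first pass already returned enough rows: keep the requested percentage.
    if rows_in_first_pass >= min_rows:
        return requested_pct
    # Estimate the table size from the first pass, e.g. 64 rows at 1% -> 6400.
    estimated_total = rows_in_first_pass * 100.0 / requested_pct
    # Percentage needed to reach min_rows, capped at 100 (assumed cap).
    return min(100.0, min_rows * 100.0 / estimated_total)


if __name__ == "__main__":
    # The case from the trace: 64 rows came back from the 1% pass.
    print(adjusted_sample_percent(1.0, 64))
```

For a 1% request this reduces to 5000 divided by the first-pass row count, which matches the 78.125 seen in the second recursive query.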