A Study of Index Compression

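When the columns of a single-column or composite index contain many duplicate values, index compression is worth considering. Compression reduces the space the index occupies and the I/O needed to scan it, which can improve query performance.
Syntax: create index index_name on table_name(col1, col2, ..., coln) compress n;  (n > 0)
If n is omitted, a non-unique index compresses all of its key columns, while a unique index compresses all but the last one.
The first n columns of the index are compressed; together they are called the prefix.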
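The idea of a prefix can be illustrated outside the database. The following Python sketch is only a conceptual model, not Oracle's storage format: the function name `compress_prefix` and the sample rows are assumptions made for illustration. It splits each key into an n-column prefix and a suffix, stores each distinct prefix once, and lets every entry point to its prefix by number.

```python
def compress_prefix(rows, n):
    """Split each index key into an n-column prefix and the remaining suffix.

    Returns (prefixes, entries): prefixes lists each distinct prefix once,
    entries holds one (prefix_number, suffix) pair per row.
    Conceptual model only; it does not reproduce Oracle's block layout.
    """
    prefix_ids = {}
    prefixes = []
    entries = []
    for row in sorted(rows):  # index entries are kept in key order
        prefix, suffix = tuple(row[:n]), tuple(row[n:])
        if prefix not in prefix_ids:
            prefix_ids[prefix] = len(prefixes)
            prefixes.append(prefix)
        entries.append((prefix_ids[prefix], suffix))
    return prefixes, entries


# Hypothetical sample keys (col1, col2, col3), compressed with n = 2.
rows = [("aa", "bb", "11"), ("aa", "bb", "22"),
        ("bb", "aa", "11"), ("bb", "aa", "2")]
prefixes, entries = compress_prefix(rows, 2)
print(prefixes)  # [('aa', 'bb'), ('bb', 'aa')]
print(entries)   # [(0, ('11',)), (0, ('22',)), (1, ('11',)), (1, ('2',))]
```

Four keys share two prefixes, so the prefix columns are stored twice instead of four times.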

Environment

SQL> conn / as sysdba
Connected.
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle8i Enterprise Edition Release 8.1.6.0.0 - Production
PL/SQL Release 8.1.6.0.0 - Production
CORE    8.1.6.0.0       Production
TNS for 32-bit Windows: Version 8.1.6.0.0 - Production
NLSRTL Version 3.4.1.0.0 - Production

Experiment 1: How compression changes the space an index occupies

SQL> drop table test10;
Table dropped
SQL> create table test10
  2  (id1 varchar2(20),
  3  id2 varchar2(20),
  4  id3 varchar2(20))
  5  tablespace test;
Table created
SQL> insert into test10 select 'aaaaaaaaaaaaaaaaaaaa','bbbbbbbbbbbbbbbbbbbb',rownum from dba_objects where rownum<10000;
9999 rows inserted
SQL> insert into test10 select 'bbbbbbbbbbbbbbbbbbbb','aaaaaaaaaaaaaaaaaaaa',rownum from dba_objects where rownum<10000;
9999 rows inserted
SQL> commit;
Commit complete
SQL> insert into test10 select 'aaaaaaaaaaaaaaaaaaaa','bbbbbbbbbbbbbbbbbbbb',rownum from dba_objects where rownum<10000;
9999 rows inserted
SQL> insert into test10 select 'bbbbbbbbbbbbbbbbbbbb','aaaaaaaaaaaaaaaaaaaa',rownum from dba_objects where rownum<10000;
9999 rows inserted
SQL> commit;
Commit complete
SQL> create table test11 as select * from test10;
Table created
SQL> create index ind_test10 on test10(id1,id2,id3) tablespace test;
Index created
SQL> create index ind_test11 on test11(id1,id2,id3) compress 2 tablespace test;
Index created
SQL> set serveroutput on 
SQL> exec show_space('IND_TEST10','I','MYTEST');
Free Blocks.............................0
Total Blocks............................195
Total Bytes.............................1597440
Unused Blocks...........................29
Unused Bytes............................237568
Last Used Ext FileId....................8
Last Used Ext BlockId...................2127
Last Used Block.........................36
PL/SQL procedure successfully completed

SQL> exec show_space('IND_TEST11','I','MYTEST');
Free Blocks.............................0
Total Blocks............................55
Total Bytes.............................450560
Unused Blocks...........................8
Unused Bytes............................65536
Last Used Ext FileId....................8
Last Used Ext BlockId...................2227
Last Used Block.........................12
PL/SQL procedure successfully completed

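Space used by the normal index IND_TEST10: 195 - 29 = 166 blocks
Space used by the compressed index IND_TEST11: 55 - 8 = 47 blocks
Space saved: (166 - 47) / 166 = 71.7%
In practice, the percentage a compressed index saves depends on how large a share of the total key length the prefix columns account for. The larger that share, the more space is saved and the more visible the effect; the smaller the share, the less is saved.
The experiment above covers the case where the prefix columns contain many duplicates, which is where a compressed index does well. Now consider the opposite extreme, where the prefix columns contain no duplicates at all.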
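The arithmetic above can be reproduced with a few lines of Python. This is a small sketch using the Total Blocks and Unused Blocks figures reported by show_space; the helper names are made up for illustration. The `prefix_share` figure assumes id1 and id2 are 20 bytes each and id3 is at most 4 digits, and it ignores the ROWID and per-entry overhead, so it is only a rough indicator.

```python
def used_blocks(total_blocks, unused_blocks):
    """Blocks actually in use, from the show_space output."""
    return total_blocks - unused_blocks


def percent_saved(normal_used, compressed_used):
    """Space saved by the compressed index, as a percentage of the normal one."""
    return 100.0 * (normal_used - compressed_used) / normal_used


def prefix_share(prefix_len, suffix_len):
    """Rough share of the key length taken by the prefix columns (assumption:
    ROWID and per-entry overhead are ignored)."""
    return 100.0 * prefix_len / (prefix_len + suffix_len)


normal = used_blocks(195, 29)      # IND_TEST10
compressed = used_blocks(55, 8)    # IND_TEST11
print(normal, compressed)                            # 166 47
print(round(percent_saved(normal, compressed), 1))   # 71.7
print(round(prefix_share(20 + 20, 4), 1))            # 90.9
```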

SQL> create table test12 
  2  (id number)
  3  tablespace test;
Table created
SQL> insert into test12 select rownum from dba_objects;
23725 rows inserted
SQL> commit;
Commit complete
SQL> create table test13 as select * from test12;
Table created
SQL> create index ind_test12 on test12(id) tablespace test;
Index created
SQL> create index ind_test13 on test13(id) compress tablespace test;
Index created
SQL> exec show_space('IND_TEST12','I','MYTEST');
Free Blocks.............................0
Total Blocks............................55
Total Bytes.............................450560
Unused Blocks...........................1
Unused Bytes............................8192
Last Used Ext FileId....................8
Last Used Ext BlockId...................2392
Last Used Block.........................19
PL/SQL procedure successfully completed

SQL> exec show_space('IND_TEST13','I','MYTEST');
Free Blocks.............................0
Total Blocks............................85
Total Bytes.............................696320
Unused Blocks...........................11
Unused Bytes............................90112
Last Used Ext FileId....................8
Last Used Ext BlockId...................2467
Last Used Block.........................19
PL/SQL procedure successfully completed

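Space used by the normal index IND_TEST12: 55 - 1 = 54 blocks
Space used by the compressed index IND_TEST13: 85 - 11 = 74 blocks
Space wasted: (74 - 54) / 54 = 37%
So a compressed index only pays off when the prefix columns contain many duplicate values. Without duplicates the index grows instead of shrinking, and performance gets worse rather than better.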
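Why compression can cost space is easy to see in a toy size model. The Python sketch below is an assumption-laden illustration: the overhead constants (`ROW_OVERHEAD`, `PREFIX_OVERHEAD`, `PREFIX_REF`) and the 4-byte length assumed for the test12 key are invented, so the absolute numbers should not be compared with the show_space measurements. It only shows the direction of the effect: every distinct prefix costs its own entry, and every row pays a little extra to reference it.

```python
ROWID_LEN = 6        # bytes of ROWID stored with each non-unique index entry
ROW_OVERHEAD = 3     # hypothetical per-row overhead in bytes
PREFIX_OVERHEAD = 4  # hypothetical per-prefix-entry overhead in bytes
PREFIX_REF = 1       # hypothetical cost of a row's reference to its prefix


def plain_size(rows, prefix_len, suffix_len):
    """Estimated bytes for an uncompressed index: every row stores the full key."""
    return rows * (prefix_len + suffix_len + ROWID_LEN + ROW_OVERHEAD)


def compressed_size(rows, distinct_prefixes, prefix_len, suffix_len):
    """Estimated bytes for a compressed index: prefixes stored once each,
    rows store only the suffix plus a reference to their prefix."""
    prefix_bytes = distinct_prefixes * (prefix_len + PREFIX_OVERHEAD)
    row_bytes = rows * (suffix_len + ROWID_LEN + ROW_OVERHEAD + PREFIX_REF)
    return prefix_bytes + row_bytes


# Many duplicates: 39996 rows, 2 distinct 40-byte prefixes, ~4-byte suffix.
dup_plain = plain_size(39996, 40, 4)
dup_comp = compressed_size(39996, 2, 40, 4)

# No duplicates: 23725 rows, each with its own (assumed 4-byte) prefix, no suffix.
uniq_plain = plain_size(23725, 4, 0)
uniq_comp = compressed_size(23725, 23725, 4, 0)

print(dup_comp < dup_plain)    # True: compression shrinks the index
print(uniq_comp > uniq_plain)  # True: compression enlarges the index
```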


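Experiment 2: Internal structure of a compressed index BLOCK
Using a small table as an example, let's look at the internal structure of a compressed index BLOCK.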

SQL> create table test14 as select * from test10 where 1=0;
Table created
SQL> insert into test14 values ('aa','bb','11');
1 row inserted
SQL> insert into test14 values ('aa','bb','22');
1 row inserted
SQL> insert into test14 values ('bb','aa','11');
1 row inserted
SQL> insert into test14 values ('bb','aa','2');
1 row inserted
SQL> commit;
Commit complete
SQL> select * from test14;
ID1                  ID2                  ID3
-------------------- -------------------- --------------------
aa                   bb                   11
aa                   bb                   22
bb                   aa                   11
bb                   aa                   2

SQL> create index ind_test14 on test14(id1,id2,id3) compress 2;
Index created
SQL> exec show_space('IND_TEST14','I','MYTEST');
Free Blocks.............................0
Total Blocks............................5
Total Bytes.............................40960
Unused Blocks...........................3
Unused Bytes............................24576
Last Used Ext FileId....................8
Last Used Ext BlockId...................2502
Last Used Block.........................2
PL/SQL procedure successfully completed

SQL> alter system dump datafile 8 block 2503;
System altered

Start dump data blocks tsn: 7 file#: 8 minblk 2503 maxblk 2503
buffer tsn: 7 rdba: 0x020009c7 (8/2503)
scn: 0x0000.24a47653 seq: 0x01 flg: 0x00 tail: 0x76530601
frmt: 0x02 chkval: 0x0000 type: 0x06=trans data
 
Block header dump:  0x020009c7
 Object id on Block? Y
 seg/obj: 0x5fba  csc: 0x00.24a47651  itc: 2  flg: -  typ: 2 - INDEX
     fsl: 0  fnx: 0x0 ver: 0x01
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   xid:  0x0000.000.00000000    uba: 0x00000000.0000.00  ----    0  fsc 0x0000.00000000
0x02   xid:  0x0006.005.000000b9    uba: 0x00000000.0000.00  ----    0  fsc 0x0000.00000000
 
Leaf block dump
===============
header address 365784156=0x15cd6c5c
kdxcolev 0
kdxcolok 0
kdxcoopc 0xa0: opcode=0: iot flags=-C- is converted=Y
kdxconco 4
kdxcosdc 0
kdxconro 4
kdxcofbo 56=0x38
kdxcofeo 7973=0x1f25
kdxcoavs 7917
kdxlespl 0
kdxlende 0
kdxlenxt 0=0x0
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
kdxlepnro 2
kdxlepnco 2
prefix row#0[8028] flag: -P---, lock: 0    //i.e. 'aa','bb'
col 0; len 2; (2):  61 61
col 1; len 2; (2):  62 62
prc 2   //number of rows in this BLOCK with prefix 'aa','bb': 2
prefix row#1[7996] flag: -P---, lock: 0	//i.e. 'bb','aa'
col 0; len 2; (2):  62 62
col 1; len 2; (2):  61 61
prc 2   //number of rows in this BLOCK with prefix 'bb','aa': 2
row#0[8016] flag: -----, lock: 0
col 0; len 2; (2):  31 31
col 1; len 6; (6):  02 00 09 c2 00 00
psno 0  //this row's prefix number is 0, i.e. 'aa','bb'
row#1[8004] flag: -----, lock: 0
col 0; len 2; (2):  32 32
col 1; len 6; (6):  02 00 09 c2 00 01
psno 0
row#2[7984] flag: -----, lock: 0
col 0; len 2; (2):  31 31
col 1; len 6; (6):  02 00 09 c2 00 02
psno 1	//this row's prefix number is 1, i.e. 'bb','aa'
row#3[7973] flag: -----, lock: 0
col 0; len 1; (1):  32
col 1; len 6; (6):  02 00 09 c2 00 03
psno 1
----- end of leaf block dump -----
End dump data blocks tsn: 7 file#: 8 minblk 2503 maxblk 2503
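
The hex values in the dump can be decoded by hand, or with a short script. The Python sketch below rebuilds the four index keys from the prefix rows and the psno of each row. It assumes that the varchar2 bytes are plain ASCII and that the trailing 6-byte column is a ROWID made of a 4-byte data block address (10-bit relative file number, 22-bit block number) followed by a 2-byte row number. That split is consistent with the dump header, which prints rdba 0x020009c7 as (8/2503). The function names are illustrative.

```python
def decode_text(hex_bytes):
    """Decode a dumped varchar2 column such as '61 61' (assumed ASCII)."""
    return bytes.fromhex(hex_bytes.replace(" ", "")).decode("ascii")


def decode_dba(dba):
    """Split a 32-bit data block address into (relative file#, block#)."""
    return dba >> 22, dba & 0x3FFFFF


def decode_rowid(hex_bytes):
    """Decode the 6-byte ROWID column into (file#, block#, row#)."""
    raw = bytes.fromhex(hex_bytes.replace(" ", ""))
    file_no, block_no = decode_dba(int.from_bytes(raw[:4], "big"))
    return file_no, block_no, int.from_bytes(raw[4:6], "big")


# Values copied from the leaf block dump.
prefix_rows = [("61 61", "62 62"), ("62 62", "61 61")]
rows = [  # (col 0, ROWID, psno)
    ("31 31", "02 00 09 c2 00 00", 0),
    ("32 32", "02 00 09 c2 00 01", 0),
    ("31 31", "02 00 09 c2 00 02", 1),
    ("32",    "02 00 09 c2 00 03", 1),
]

keys = []
for suffix, rowid, psno in rows:
    prefix = tuple(decode_text(col) for col in prefix_rows[psno])
    keys.append(prefix + (decode_text(suffix),) + (decode_rowid(rowid),))

for key in keys:
    print(key)
# ('aa', 'bb', '11', (8, 2498, 0))
# ('aa', 'bb', '22', (8, 2498, 1))
# ('bb', 'aa', '11', (8, 2498, 2))
# ('bb', 'aa', '2', (8, 2498, 3))
```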

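The dump makes it clear why a compressed index saves space. Further experiments also showed the following:
1. A prefix entry appears in a BLOCK only if that BLOCK holds rows that use the prefix.
2. When the index grows enough to cause leaf splits, rows with the same prefix are kept in the same BLOCK as far as possible, i.e. the rows in the BLOCK are reorganized.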
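
The first observation can be modelled with a small Python sketch. It is a simplification under stated assumptions: blocks hold a fixed number of rows (`rows_per_block` is a hypothetical capacity), keys are packed in sorted order, and the reorganization on leaf split from the second observation is not modelled. Each block builds its own prefix table from the rows it holds, so a prefix shows up only where it is used.

```python
def pack_into_blocks(rows, n, rows_per_block):
    """Pack sorted keys into fixed-size blocks, each with its own prefix table."""
    ordered = sorted(rows)
    blocks = []
    for start in range(0, len(ordered), rows_per_block):
        prefixes = []  # prefix entries local to this block
        entries = []   # (prefix number within this block, suffix)
        for row in ordered[start:start + rows_per_block]:
            prefix = tuple(row[:n])
            if prefix not in prefixes:
                prefixes.append(prefix)
            entries.append((prefixes.index(prefix), tuple(row[n:])))
        blocks.append({"prefixes": prefixes, "entries": entries})
    return blocks


# Hypothetical data: three rows per prefix, four rows per block.
sample = [("aa", "bb", "1"), ("aa", "bb", "2"), ("aa", "bb", "3"),
          ("bb", "aa", "1"), ("bb", "aa", "2"), ("bb", "aa", "3")]
blocks = pack_into_blocks(sample, 2, 4)
for number, block in enumerate(blocks):
    print(number, block["prefixes"])
# 0 [('aa', 'bb'), ('bb', 'aa')]
# 1 [('bb', 'aa')]
```

The second block never stores the ('aa', 'bb') prefix because none of its rows use it, and its rows number their prefixes from 0 again.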


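Experiment 3: Effect of a compressed index on query performance
In general, because a compressed index occupies less space, an index scan has to visit fewer index blocks, which improves query performance.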

SQL> run
  1* select * from test10 where id1='aaaaaaaaaaaaaaaaaaaa'
9999 rows selected.
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   INDEX (RANGE SCAN) OF 'IND_TEST10' (NON-UNIQUE)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        744  consistent gets
          0  physical reads
          0  redo size
     484574  bytes sent via SQL*Net to client
      74350  bytes received via SQL*Net from client
        668  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       9999  rows processed

SQL> run
  1* select * from test11 where id1='aaaaaaaaaaaaaaaaaaaa'
9999 rows selected.
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   INDEX (RANGE SCAN) OF 'IND_TEST11' (NON-UNIQUE)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        691  consistent gets
          0  physical reads
          0  redo size
     484574  bytes sent via SQL*Net to client
      74350  bytes received via SQL*Net from client
        668  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
       9999  rows processed
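
The two sets of statistics can be compared with a short Python calculation. The figures are taken from the autotrace output above; the helper name is illustrative. The second number is a tentative reading that the original does not state: if each of the 668 fetch round trips costs roughly one consistent get regardless of compression, the remainder is a rough indication of the extra index blocks visited by each scan.

```python
def percent_reduction(before, after):
    """Relative drop from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


normal_gets = 744      # consistent gets, IND_TEST10
compressed_gets = 691  # consistent gets, IND_TEST11
roundtrips = 668       # SQL*Net roundtrips, identical in both runs

print(round(percent_reduction(normal_gets, compressed_gets), 1))  # 7.1

# Assumption: about one consistent get per round trip in both runs.
print(normal_gets - roundtrips, compressed_gets - roundtrips)     # 76 23
```

Under that assumption the overall saving looks small (about 7%) mainly because the per-fetch cost dominates both runs.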

