index clustering factor

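The index clustering factor describes how closely the order of rows in the table matches the order of entries in the index. The closer the match, the lower the clustering factor; the weaker the match, the higher the clustering factor.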

Reference documentation:

https://docs.oracle.com/database/121/CNCPT/indexiot.htm#CNCPT89180

Index Clustering Factor

The index clustering factor measures row order in relation to an indexed value such as employee last name. As the degree of order increases, the clustering factor decreases.
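The relationship between row order and clustering factor can be illustrated with a small sketch. It is a simplified model of the commonly described calculation (walk the index entries in key order and count every time the table block changes), not Oracle's actual implementation. The table size, block capacity, and all names below are hypothetical.

```python
def clustering_factor(blocks_in_key_order):
    """Count how often the table block changes while walking
    the index entries in key order (simplified model)."""
    count = 0
    previous = None
    for block in blocks_in_key_order:
        if block != previous:
            count += 1
            previous = block
    return count


NUM_BLOCKS = 10       # hypothetical number of table blocks
ROWS_PER_BLOCK = 10   # hypothetical rows per block
NUM_ROWS = NUM_BLOCKS * ROWS_PER_BLOCK

# Table stored in key order: keys 0..9 in block 0, 10..19 in block 1, ...
ordered = [key // ROWS_PER_BLOCK for key in range(NUM_ROWS)]

# Same keys dealt round-robin across the blocks: neighbouring keys
# never share a block.
scattered = [key % NUM_BLOCKS for key in range(NUM_ROWS)]

cf_ordered = clustering_factor(ordered)
cf_scattered = clustering_factor(scattered)
print(cf_ordered, cf_scattered)  # 10 100
```

In this model the perfectly ordered table yields a value equal to the number of blocks, and the fully scattered table yields a value equal to the number of rows, which matches the usual rule of thumb that a good clustering factor is close to the block count and a poor one is close to the row count.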

The clustering factor is useful as a rough measure of the number of I/Os required to read an entire table using an index:

  • If the clustering factor is high, then Oracle Database performs a relatively high number of I/Os during a large index range scan. The index entries point to random table blocks, so the database may have to read and reread the same blocks over and over again to retrieve the data pointed to by the index.

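       -- Note on the statement above: a high clustering factor means the row order in the table differs greatly from the index order, so an index scan costs more I/O because the database keeps going back to the table blocks through the index.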

  • If the clustering factor is low, then Oracle Database performs a relatively low number of I/Os during a large index range scan. The index keys in a range tend to point to the same data block, so the database does not have to read and reread the same blocks over and over.

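      -- Note on the statement above: a low clustering factor means the row order in the table is similar to the index order, so an index scan needs relatively little I/O. The table blocks still have to be visited, but consecutive index entries tend to point to the same block, so each block rarely has to be read again. (This is my own understanding and may not be fully accurate.)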
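The I/O difference described in the two bullets can be sketched with a toy simulation. It assumes a tiny LRU cache standing in for the buffer cache, which is far simpler than Oracle's real buffer cache. The cache size, table layout, and function names are hypothetical.

```python
from collections import OrderedDict


def table_block_reads(blocks_in_key_order, cache_size):
    """Count table block reads for an index range scan,
    using a tiny LRU cache as a stand-in for the buffer cache."""
    cache = OrderedDict()
    reads = 0
    for block in blocks_in_key_order:
        if block in cache:
            cache.move_to_end(block)     # hit: mark as most recently used
            continue
        reads += 1                       # miss: the block must be read
        cache[block] = True
        if len(cache) > cache_size:
            cache.popitem(last=False)    # evict the least recently used block
    return reads


NUM_BLOCKS = 10       # hypothetical number of table blocks
ROWS_PER_BLOCK = 10   # hypothetical rows per block
NUM_ROWS = NUM_BLOCKS * ROWS_PER_BLOCK

# Low clustering factor: neighbouring index entries share a table block.
clustered = [key // ROWS_PER_BLOCK for key in range(NUM_ROWS)]
# High clustering factor: neighbouring index entries hit different blocks.
scattered = [key % NUM_BLOCKS for key in range(NUM_ROWS)]

reads_clustered = table_block_reads(clustered, cache_size=2)
reads_scattered = table_block_reads(scattered, cache_size=2)
print(reads_clustered, reads_scattered)  # 10 100
```

With a cache smaller than the table, the scattered layout rereads the same blocks over and over, while the clustered layout reads each block once. If the cache is large enough to hold every block, both layouts need the same number of reads in this model, so the penalty of a high clustering factor shows up mainly on large scans.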

The clustering factor is relevant for index scans because it can show:

  • Whether the database will use an index for large range scans

  • The degree of table organization in relation to the index key

  • Whether you should consider using an index-organized table, partitioning, or table cluster if rows must be ordered by the index key

Example 3-4 Clustering Factor

Assume that the employees table fits into two data blocks. Table 3-1 depicts the rows in the two data blocks (the ellipses indicate data that is not shown).

Table 3-1 Contents of Two Data Blocks in the Employees Table

Data Block 1
100 Steven    King      SKING    ...
156 Janette   King      JKING    ...
115 Alexander Khoo      AKHOO    ...
.
.
.
116 Shelli    Baida     SBAIDA   ...
204 Hermann   Baer      HBAER    ...
105 David     Austin    DAUSTIN  ...
130 Mozhe     Atkinson  MATKINSO ...
166 Sundar    Ande      SANDE    ...
174 Ellen     Abel      EABEL    ...

Data Block 2
149 Eleni     Zlotkey   EZLOTKEY ...
200 Jennifer  Whalen    JWHALEN  ...
.
.
.
137 Renske    Ladwig    RLADWIG  ...
173 Sundita   Kumar     SKUMAR   ...
101 Neena     Kochar    NKOCHHAR ...

Rows are stored in the blocks in order of last name (the third column). For example, the bottom row in data block 1 describes Abel, the next row up describes Ande, and so on alphabetically until the top row in block 1 for Steven King. The bottom row in block 2 describes Kochar, the next row up describes Kumar, and so on alphabetically until the top row in the block for Zlotkey.

-- Note on the statement above: in block 1 and block 2 the index key is the last name, and the last names are in ascending order from bottom to top.

Assume that an index exists on the last name column. Each name entry corresponds to a rowid. Conceptually, the index entries would look as follows:   -- The index entries are laid out as follows, sorted by last name.

Abel,block1row1
Ande,block1row2
Atkinson,block1row3
Austin,block1row4
Baer,block1row5
.
.
.

Assume that a separate index exists on the employee ID column. Conceptually, the index entries might look as follows, with employee IDs distributed in almost random locations throughout the two blocks:  -- A second index exists on the employee ID column; its entries, sorted by employee ID, are laid out as follows:

100,block1row50
101,block2row1
102,block1row9
103,block2row19
104,block2row39
105,block1row4
.
.
.

The following statement queries the ALL_INDEXES view for the clustering factor for these two indexes:

-- Query the clustering factor of the indexes

SQL> SELECT INDEX_NAME, CLUSTERING_FACTOR 
  2  FROM ALL_INDEXES 
  3  WHERE INDEX_NAME IN ('EMP_NAME_IX','EMP_EMP_ID_PK');
 
INDEX_NAME           CLUSTERING_FACTOR
-------------------- -----------------
EMP_EMP_ID_PK                       19
EMP_NAME_IX                          2

The clustering factor for EMP_NAME_IX is low, which means that adjacent index entries in a single leaf block tend to point to rows in the same data blocks. The clustering factor for EMP_EMP_ID_PK is high, which means that adjacent index entries in the same leaf block are much less likely to point to rows in the same data blocks.

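As the output shows, the clustering factor of index EMP_NAME_IX is very low, which means the row order in the table closely matches the order of the entries in the index. This can be seen in the table above.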

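The clustering factor of index EMP_EMP_ID_PK is high, which means the row order in the table is very different from the order of the entries in the index. As the listings above show, the entries of this index are ordered quite differently from the rows in the table.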
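As a rough check, the block-change count can be applied to just the index entries that are visible in the two conceptual listings above. This is a sketch under assumptions: the conceptual rowid strings such as block1row50 are parsed with a hypothetical helper, and the rows hidden behind the ellipses are not included, so the results are not expected to match the values 2 and 19 returned by the query.

```python
import re

# Only the entries shown in the listings above; elided rows are omitted.
name_index = [
    ("Abel", "block1row1"),
    ("Ande", "block1row2"),
    ("Atkinson", "block1row3"),
    ("Austin", "block1row4"),
    ("Baer", "block1row5"),
]
emp_id_index = [
    (100, "block1row50"),
    (101, "block2row1"),
    (102, "block1row9"),
    (103, "block2row19"),
    (104, "block2row39"),
    (105, "block1row4"),
]


def block_of(rowid):
    """Extract the block number from a conceptual rowid (hypothetical helper)."""
    match = re.fullmatch(r"block(\d+)row(\d+)", rowid)
    return int(match.group(1))


def block_changes(entries):
    """Count table block changes while walking entries in key order."""
    count = 0
    previous = None
    for _key, rowid in sorted(entries):
        block = block_of(rowid)
        if block != previous:
            count += 1
            previous = block
    return count


name_changes = block_changes(name_index)
emp_id_changes = block_changes(emp_id_index)
print(name_changes, emp_id_changes)  # 1 5
```

Even on this small excerpt the pattern is the same as in the query output: the five visible last-name entries all point to block 1, while the six visible employee-ID entries switch blocks almost every step.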

-- The official documentation is reproduced here first; test cases will be added later.

END