索引結構
B-tree索引的索引行存在page中,在葉子page中,存儲這索引鍵和指向錶行的TID信息。
B-tree索引的重要特性:
- B樹是平衡的,也就是說,每個葉子頁面從根節點由相同數量的內部頁面分隔。因此,搜索任何值都花費相同的時間。
- B樹有多個分支,即每個頁面(通常爲8 KB)包含許多(數百個)TID。所以,B樹的深度很小,只有非常大的表,纔會高達4–5的深度。
- 索引中的數據以升序排序(頁面之間和每個頁面內部),並且同一級別的頁面通過雙向列表相互連接。因此,我們可以僅通過列表的一個方向或另一個方向獲得有序數據集,而不必每次都返回到根。
下圖是一個簡單的整數爲索引鍵的例子
索引的第一頁是meta頁,它和索引的根節點關聯。內部節點位於根下方,葉子頁位於最底行。葉子節點向下的箭頭表示從葉節點到錶行(TID)的引用。
等值搜索
通過條件“索引字段=表達式”在樹中搜索值。比如,搜索鍵值爲49,過程如下
首先從根節點開始搜索,確定要降到下一層子節點。我們知道根節點(4、32、64)中的鍵,因此我們找出了子節點中的值範圍。由於32≤49 <64,因此我們需要下降到第二個子節點。接下來,重複相同的過程,直到我們到達葉子節點,可以從該葉節點獲得所需的TID。
非等值搜索
當通過條件“ indexed-field≤expression”(或“ indexed-field≥expression”)進行搜索時,我們首先通過相等條件“ indexed-field = expression”在索引中找到一個值(如果有),然後按對應的方向遍歷葉子頁。
如下索引鍵n<=35的處理過程
範圍搜索
如:23 ≤ n ≤ 64 處理過程見下圖
實際應用舉例:
如下有9行飛行航班數據
demo=# select * from aircrafts;
aircraft_code | model | range
---------------+---------------------+-------
773 | Boeing 777-300 | 11100
763 | Boeing 767-300 | 7900
SU9 | Sukhoi SuperJet-100 | 3000
320 | Airbus A320-200 | 5700
321 | Airbus A321-200 | 5600
319 | Airbus A319-100 | 6700
733 | Boeing 737-300 | 4200
CN1 | Cessna 208 Caravan | 1200
CR2 | Bombardier CRJ-200 | 2700
(9 rows)
在range創建B-tree索引
demo=# create index on aircrafts(range);
關閉全表掃描
demo=# set enable_seqscan = off;
等值查詢執行計劃
demo=# explain(costs off) select * from aircrafts where range = 3000;
QUERY PLAN
---------------------------------------------------
Index Scan using aircrafts_range_idx on aircrafts
Index Cond: (range = 3000)
(2 rows)
非等值查詢的執行計劃
demo=# explain(costs off) select * from aircrafts where range < 3000;
QUERY PLAN
---------------------------------------------------
Index Scan using aircrafts_range_idx on aircrafts
Index Cond: (range < 3000)
(2 rows)
範圍查詢的執行計劃
demo=# explain(costs off) select * from aircrafts
where range between 3000 and 5000;
QUERY PLAN
-----------------------------------------------------
Index Scan using aircrafts_range_idx on aircrafts
Index Cond: ((range >= 3000) AND (range <= 5000))
(2 rows)
排序使用索引
創建索引時,我們可以指定排序順序。例如,我們可以通過這種方式特別按照航班範圍創建索引:
demo=# create index on aircrafts(range desc);
在這種情況下,較大的值將出現在左側的樹中,而較小的值將出現在右側。
表達式索引使用
demo=# create view aircrafts_v as
select model,
case
when range < 4000 then 1
when range < 10000 then 2
else 3
end as class
from aircrafts;
demo=# select * from aircrafts_v;
model | class
---------------------+-------
Boeing 777-300 | 3
Boeing 767-300 | 2
Sukhoi SuperJet-100 | 1
Airbus A320-200 | 2
Airbus A321-200 | 2
Airbus A319-100 | 2
Boeing 737-300 | 2
Cessna 208 Caravan | 1
Bombardier CRJ-200 | 1
(9 rows)
創建表達式索引
demo=# create index on aircrafts(
(case when range < 4000 then 1 when range < 10000 then 2 else 3 end),
model);
demo=# select class, model from aircrafts_v order by class, model;
class | model
-------+---------------------
1 | Bombardier CRJ-200
1 | Cessna 208 Caravan
1 | Sukhoi SuperJet-100
2 | Airbus A319-100
2 | Airbus A320-200
2 | Airbus A321-200
2 | Boeing 737-300
2 | Boeing 767-300
3 | Boeing 777-300
(9 rows)
demo=# explain(costs off)
select class, model from aircrafts_v order by class, model;
QUERY PLAN
--------------------------------------------------------
Index Scan using aircrafts_case_model_idx on aircrafts
(1 row)
可以降序使用索引,如下
demo=# select class, model from aircrafts_v order by class desc, model desc;
class | model
-------+---------------------
3 | Boeing 777-300
2 | Boeing 767-300
2 | Boeing 737-300
2 | Airbus A321-200
2 | Airbus A320-200
2 | Airbus A319-100
1 | Sukhoi SuperJet-100
1 | Cessna 208 Caravan
1 | Bombardier CRJ-200
(9 rows)
demo=# explain(costs off)
select class, model from aircrafts_v order by class desc, model desc;
QUERY PLAN
-----------------------------------------------------------------
Index Scan BACKWARD using aircrafts_case_model_idx on aircrafts
(1 row)
但是我們查詢中如果一列是升序,一列是降序,那麼無法使用該索引
demo=# explain(costs off)
select class, model from aircrafts_v order by class ASC, model DESC;
QUERY PLAN
-------------------------------------------------
Sort
Sort Key: (CASE ... END), aircrafts.model DESC
-> Seq Scan on aircrafts
(3 rows)
創建對應的升序和降序的索引,纔可以使用索引
demo=# create index aircrafts_case_asc_model_desc_idx on aircrafts(
(case
when range < 4000 then 1
when range < 10000 then 2
else 3
end) ASC,
model DESC);
demo=# explain(costs off)
select class, model from aircrafts_v order by class ASC, model DESC;
QUERY PLAN
-----------------------------------------------------------------
Index Scan using aircrafts_case_asc_model_desc_idx on aircrafts
(1 row)
列的排序
使用多列索引時出現的另一個問題是索引中列出列的順序。對於B樹,此順序非常重要:頁面內的數據將按第一個字段排序,然後按第二個字段排序,依此類推。
如下圖:
從該圖可以清楚地看出,按謂詞進行搜索,例如“ class = 3”(按第一個字段進行搜索)或“ class = 3 and model =‘Boeing 777-300’”(按兩個字段進行搜索)將非常有效。
但是我們查詢條件中如果只有"model = 'Boeing 777-300"的時候,也是從根節點開始掃描,但是不知道需要下降到下一層的那個節點,所以下面的節點都需要掃描。
創建以下索引
demo=# create index on aircrafts(
model,
(case when range < 4000 then 1 when range < 10000 then 2 else 3 end));
通過條件"model = ‘Boeing 777-300’"查詢將會使用該索引,"class = 3"將不會使用,以上描述的就是前導列可以使用索引,沒有前導列不能使用索引。
NULLs處理
B-tree索引支持IS NULL和IS NOT NULL,如下查詢
demo=# create index on flights(actual_arrival);
demo=# explain(costs off) select * from flights where actual_arrival is null;
QUERY PLAN
-------------------------------------------------------
Bitmap Heap Scan on flights
Recheck Cond: (actual_arrival IS NULL)
-> Bitmap Index Scan on flights_actual_arrival_idx
Index Cond: (actual_arrival IS NULL)
(4 rows)
NULL位於索引的某一端,具體取決於創建索引的方式(NULLS FIRST或NULLS LAST)。如果SELECT命令在其ORDER BY子句中指定的NULL順序與爲構建索引指定的順序相同(NULLS FIRST或NULLS LAST),則可以使用索引。
demo=# explain(costs off)
select * from flights order by actual_arrival NULLS LAST;
QUERY PLAN
--------------------------------------------------------
Index Scan using flights_actual_arrival_idx on flights
(1 row)
如果順序不同,則無法使用索引
demo=# explain(costs off)
select * from flights order by actual_arrival NULLS FIRST;
QUERY PLAN
----------------------------------------
Sort
Sort Key: actual_arrival NULLS FIRST
-> Seq Scan on flights
(3 rows)
要想使用,則必須創建順序一致的索引
demo=# create index flights_nulls_first_idx on flights(actual_arrival NULLS FIRST);
demo=# explain(costs off)
select * from flights order by actual_arrival NULLS FIRST;
QUERY PLAN
-----------------------------------------------------
Index Scan using flights_nulls_first_idx on flights
(1 row)
原因是因爲null和任何值比較,都是未定義的
demo=# \pset null NULL
demo=# select null < 42;
?column?
----------
NULL
(1 row)
B-tree的一些屬性如下:
amname | name | pg_indexam_has_property
--------+---------------+-------------------------
btree | can_order | t
btree | can_unique | t
btree | can_multi_col | t
btree | can_exclude | t
name | pg_index_has_property
---------------+-----------------------
clusterable | t
index_scan | t
bitmap_scan | t
backward_scan | t
name | pg_index_column_has_property
--------------------+------------------------------
asc | t
desc | f
nulls_first | f
nulls_last | t
orderable | t
distance_orderable | f
returnable | t
search_array | t
search_nulls | t
索引列之外的行可以使用index only scan
如下表結構,book_ref是主鍵也是外鍵
demo=# \d bookings
Table "bookings.bookings"
Column | Type | Modifiers
--------------+--------------------------+-----------
book_ref | character(6) | not null
book_date | timestamp with time zone | not null
total_amount | numeric(10,2) | not null
Indexes:
"bookings_pkey" PRIMARY KEY, btree (book_ref)
Referenced by:
TABLE "tickets" CONSTRAINT "tickets_book_ref_fkey" FOREIGN KEY (book_ref) REFERENCES bookings(book_ref)
創建一個include book_date的索引
demo=# create unique index bookings_pkey2 on bookings(book_ref) INCLUDE (book_date);
替換原來的索引
demo=# begin;
demo=# alter table bookings drop constraint bookings_pkey cascade;
demo=# alter table bookings add primary key using index bookings_pkey2;
demo=# alter table tickets add foreign key (book_ref) references bookings (book_ref);
demo=# commit;
demo=# \d bookings
Table "bookings.bookings"
Column | Type | Modifiers
--------------+--------------------------+-----------
book_ref | character(6) | not null
book_date | timestamp with time zone | not null
total_amount | numeric(10,2) | not null
Indexes:
"bookings_pkey2" PRIMARY KEY, btree (book_ref) INCLUDE (book_date)
Referenced by:
TABLE "tickets" CONSTRAINT "tickets_book_ref_fkey" FOREIGN KEY (book_ref) REFERENCES bookings(book_ref)
可見select字段裏面包含book_ref, book_date兩個字段,但是依然可以走index only scan
demo=# explain(costs off)
select book_ref, book_date from bookings where book_ref = '059FC4';
QUERY PLAN
--------------------------------------------------
Index Only Scan using bookings_pkey2 on bookings
Index Cond: (book_ref = '059FC4'::bpchar)
(2 rows)
注意:只有B-tree支持include,include不支持表達式。
B-tree可以包含哪些操作
例如:布爾型和整型包含以下操作
postgres=# select amop.amopopr::regoperator as opfamily_operator,
amop.amopstrategy
from pg_am am,
pg_opfamily opf,
pg_amop amop
where opf.opfmethod = am.oid
and amop.amopfamily = opf.oid
and am.amname = 'btree'
and opf.opfname = 'bool_ops'
order by amopstrategy;
opfamily_operator | amopstrategy
---------------------+--------------
<(boolean,boolean) | 1
<=(boolean,boolean) | 2
=(boolean,boolean) | 3
>=(boolean,boolean) | 4
>(boolean,boolean) | 5
(5 rows)
postgres=# select amop.amopopr::regoperator as opfamily_operator
from pg_am am,
pg_opfamily opf,
pg_amop amop
where opf.opfmethod = am.oid
and amop.amopfamily = opf.oid
and am.amname = 'btree'
and opf.opfname = 'integer_ops'
and amop.amopstrategy = 1
order by opfamily_operator;
opfamily_operator
----------------------
<(integer,bigint)
<(smallint,smallint)
<(integer,integer)
<(bigint,bigint)
<(bigint,integer)
<(smallint,integer)
<(integer,smallint)
<(smallint,bigint)
<(bigint,smallint)
(9 rows)
新數據類型支持B-tree
創建複合類型和相關表,插入數據
postgres=# create type complex as (re float, im float);
postgres=# create table numbers(x complex);
postgres=# insert into numbers values ((0.0, 10.0)), ((1.0, 3.0)), ((1.0, 1.0));
postgres=# select * from numbers order by x;
x
--------
(0,10)
(1,1)
(1,3)
(3 rows)
默認情況下,對於複合類型,排序是按組件進行的:首先比較第一個字段,然後比較第二個字段,依此類推,與逐個字符比較文本字符串的方式大致相同。但是我們可以定義不同的順序。例如,複合類型可被視爲向量,並按係數(長度)排序,該係數被計算爲座標平方和的平方根。爲了定義這樣的順序,讓我們創建一個函數來計算:
postgres=# create function modulus(a complex) returns float as $$
select sqrt(a.re*a.re + a.im*a.im);
$$ immutable language sql;
postgres=# create function complex_lt(a complex, b complex) returns boolean as $$
select modulus(a) < modulus(b);
$$ immutable language sql;
postgres=# create function complex_le(a complex, b complex) returns boolean as $$
select modulus(a) <= modulus(b);
$$ immutable language sql;
postgres=# create function complex_eq(a complex, b complex) returns boolean as $$
select modulus(a) = modulus(b);
$$ immutable language sql;
postgres=# create function complex_ge(a complex, b complex) returns boolean as $$
select modulus(a) >= modulus(b);
$$ immutable language sql;
postgres=# create function complex_gt(a complex, b complex) returns boolean as $$
select modulus(a) > modulus(b);
$$ immutable language sql;
創建運算符:
postgres=# create operator #<#(leftarg=complex, rightarg=complex, procedure=complex_lt);
postgres=# create operator #<=#(leftarg=complex, rightarg=complex, procedure=complex_le);
postgres=# create operator #=#(leftarg=complex, rightarg=complex, procedure=complex_eq);
postgres=# create operator #>=#(leftarg=complex, rightarg=complex, procedure=complex_ge);
postgres=# create operator #>#(leftarg=complex, rightarg=complex, procedure=complex_gt);
進行數值比較
postgres=# select (1.0,1.0)::complex #<# (1.0,3.0)::complex;
?column?
----------
t
(1 row)
postgres=# create function complex_cmp(a complex, b complex) returns integer as $$
select case when modulus(a) < modulus(b) then -1
when modulus(a) > modulus(b) then 1
else 0
end;
$$ language sql;
postgres=# create operator class complex_ops
default for type complex
using btree as
operator 1 #<#,
operator 2 #<=#,
operator 3 #=#,
operator 4 #>=#,
operator 5 #>#,
function 1 complex_cmp(complex,complex);
postgres=# select * from numbers order by x;
x
--------
(1,1)
(1,3)
(0,10)
(3 rows)
postgres=# select amp.amprocnum,
amp.amproc,
amp.amproclefttype::regtype,
amp.amprocrighttype::regtype
from pg_opfamily opf,
pg_am am,
pg_amproc amp
where opf.opfname = 'complex_ops'
and opf.opfmethod = am.oid
and am.amname = 'btree'
and amp.amprocfamily = opf.oid;
amprocnum | amproc | amproclefttype | amprocrighttype
-----------+-------------+----------------+-----------------
1 | complex_cmp | complex | complex
(1 row)
可以使用pageinspect查看B-tree索引的內部結構
可以看到有2個level,不包含root節點
demo=# create extension pageinspect;
Index metapage:
demo=# select * from bt_metap('ticket_flights_pkey');
magic | version | root | level | fastroot | fastlevel
--------+---------+------+-------+----------+-----------
340322 | 2 | 164 | 2 | 164 | 2
(1 row)
塊164的統計信息
demo=# select type, live_items, dead_items, avg_item_size, page_size, free_size
from bt_page_stats('ticket_flights_pkey',164);
type | live_items | dead_items | avg_item_size | page_size | free_size
------+------------+------------+---------------+-----------+-----------
r | 33 | 0 | 31 | 8192 | 6984
(1 row)
塊中數據查看
demo=# select itemoffset, ctid, itemlen, left(data,56) as data
from bt_page_items('ticket_flights_pkey',164) limit 5;
itemoffset | ctid | itemlen | data
------------+---------+---------+----------------------------------------------------------
1 | (3,1) | 8 |
2 | (163,1) | 32 | 1d 30 30 30 35 34 33 32 33 30 35 37 37 31 00 00 ff 5f 00
3 | (323,1) | 32 | 1d 30 30 30 35 34 33 32 34 32 33 36 36 32 00 00 4f 78 00
4 | (482,1) | 32 | 1d 30 30 30 35 34 33 32 35 33 30 38 39 33 00 00 4d 1e 00
5 | (641,1) | 32 | 1d 30 30 30 35 34 33 32 36 35 35 37 38 35 00 00 2b 09 00
(5 rows)