The performance impact of vacuum in PostgreSQL/LightDB, and a thorough look at table bloat

First, the test results:
zjh@postgres=# create table big_table(id int,v text);
CREATE TABLE
zjh@postgres=# ALTER TABLE big_table SET (autovacuum_enabled = off);   # disable autovacuum
ALTER TABLE
zjh@postgres=# insert into big_table select id, rpad('x',64,'x') from generate_series(1,1000000) id;
INSERT 0 1000000
zjh@postgres=# 
zjh@postgres=# \timing on
Timing is on.
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 76.603 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 60.682 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 60.963 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 60.632 ms
### stable execution time

zjh@postgres=# update big_table set v = v, id = id;   # generates 50% dead tuples
UPDATE 1000000
Time: 1006.034 ms (00:01.006)
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 307.994 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 75.222 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 115.800 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 109.309 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 76.994 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 76.219 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 75.804 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 75.834 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 76.684 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 76.299 ms
## stable execution time with dead tuples: more than 25% extra latency

zjh@postgres=# update big_table set v = v, id = id;
UPDATE 1000000
Time: 1923.425 ms (00:01.923)
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 352.238 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 103.585 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 211.861 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 144.573 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 99.129 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 100.284 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 99.148 ms
## after two updates: 65% extra overhead

zjh@postgres=# vacuum big_table ;
VACUUM
Time: 214.800 ms
zjh@postgres=# 
zjh@postgres=# vacuum big_table ;
VACUUM
Time: 11.348 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 88.478 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 85.893 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 87.403 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 85.340 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 85.990 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 86.514 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 85.684 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 84.336 ms
# after vacuum, about 40% extra overhead still remains

zjh@postgres=# vacuum full big_table ;   # full vacuum
VACUUM
Time: 416.220 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 66.535 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 63.514 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 63.039 ms
## after the table is reorganized, the overhead is essentially gone; a little remains, but it is negligible

zjh@postgres=# update big_table set v = v, id = id;
UPDATE 1000000
Time: 1116.814 ms (00:01.117)
zjh@postgres=# update big_table set v = v, id = id;
UPDATE 1000000
Time: 2250.193 ms (00:02.250)
zjh@postgres=# update big_table set v = v, id = id;
UPDATE 1000000
Time: 1264.835 ms (00:01.265)
zjh@postgres=# update big_table set v = v, id = id;
UPDATE 1000000
Time: 1266.069 ms (00:01.266)
zjh@postgres=# update big_table set v = v, id = id;
UPDATE 1000000
Time: 2000.205 ms (00:02.000)
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 384.475 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 174.367 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 364.749 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 178.216 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 176.623 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 170.568 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 170.408 ms
zjh@postgres=# 
zjh@postgres=# 
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 172.460 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 171.467 ms
## after five updates

zjh@postgres=# update big_table set v = v, id = id;
UPDATE 1000000
Time: 1114.980 ms (00:01.115)
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 455.640 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 197.581 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 335.761 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 212.413 ms
zjh@postgres=# select count(1) from big_table ;
  count
---------
 1000000
(1 row)
Time: 195.236 ms
## after six updates
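
The overhead percentages quoted in the annotations can be re-derived from the timings. The following is a minimal sketch, assuming one representative "stable" timing per phase; those picks are my reading of the transcript, not values the author singled out, and the `overhead_pct` helper is hypothetical.

```python
# Representative stable timings in milliseconds, read off the transcript.
# Which sample counts as "stable" is an assumption made for illustration.
baseline_ms = 60.7  # freshly loaded table, no dead tuples

timings_ms = {
    "after 1 update": 76.3,
    "after 2 updates": 99.1,
    "after plain VACUUM": 85.7,
    "after VACUUM FULL": 63.0,
    "after 5 further updates": 171.0,
    "after 6 further updates": 195.2,
}


def overhead_pct(t_ms, base_ms=baseline_ms):
    """Extra latency relative to the baseline, in percent."""
    return (t_ms / base_ms - 1.0) * 100.0


# Round to whole percent for comparison with the annotations.
overheads = {label: round(overhead_pct(t)) for label, t in timings_ms.items()}

for label, pct in overheads.items():
    print(f"{label}: +{pct}%")
```

With these picks the first three phases come out near the 25%, 65%, and 40% figures in the annotations, and the scan after six further updates takes roughly three times as long as the baseline.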

zjh@postgres=# select pg_size_pretty(pg_relation_size('big_table'));
pg_size_pretty
----------------
 675 MB    -- the bloat is clearly severe
(1 row)

zjh@postgres=# vacuum big_table ;
VACUUM
Time: 662.148 ms
zjh@postgres=# select pg_size_pretty(pg_relation_size('big_table'));
 pg_size_pretty 
----------------
 675 MB     # space is not returned to the OS, but new DML can reuse it
(1 row)

Time: 0.406 ms
zjh@postgres=# vacuum full big_table ;
VACUUM
Time: 568.773 ms
zjh@postgres=# select pg_size_pretty(pg_relation_size('big_table'));
 pg_size_pretty 
----------------
 96 MB            # space is reclaimed; this is effectively a table reorganization
(1 row)
zjh@postgres=# vacuum big_table ;
VACUUM
Time: 40.442 ms
zjh@postgres=# 
zjh@postgres=# select pg_size_pretty(pg_relation_size('big_table'));
 pg_size_pretty 
----------------
 204 MB
(1 row)

Time: 0.331 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 98.444 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 105.852 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 104.016 ms
zjh@postgres=# select pg_size_pretty(pg_relation_size('big_table'));
 pg_size_pretty 
----------------
 204 MB
(1 row)

Time: 0.273 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 106.985 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 106.612 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 106.837 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 107.296 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 105.588 ms
zjh@postgres=# select pg_size_pretty(pg_relation_size('big_table'));
 pg_size_pretty 
----------------
 204 MB
(1 row)

Time: 0.347 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 103.695 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 104.665 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 106.016 ms
zjh@postgres=# select pg_size_pretty(pg_relation_size('big_table'));
 pg_size_pretty 
----------------
 204 MB
(1 row)

Time: 0.310 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 104.338 ms    # at this point, all the space reclaimed from dead tuples has been used up
zjh@postgres=# select pg_size_pretty(pg_relation_size('big_table'));
 pg_size_pretty 
----------------
 212 MB    # the large number of dead tuples still makes the table grow
(1 row)

Time: 0.446 ms
zjh@postgres=# update big_table set v = v, id = id limit 100000;
UPDATE 100000
Time: 109.384 ms
zjh@postgres=# select pg_size_pretty(pg_relation_size('big_table'));
 pg_size_pretty 
----------------
 222 MB    # the large number of dead tuples still makes the table grow
(1 row)

Time: 0.266 ms
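
The 96 MB and 675 MB sizes reported above can be approximated with some back-of-the-envelope arithmetic. The sketch below assumes a typical 64-bit build with 8 kB pages, a 24-byte page header, a 4-byte line pointer per tuple, a 24-byte aligned tuple header, and a 1-byte length header for the 64-character text value; these layout constants are assumptions rather than figures from the test, and the helper names are hypothetical.

```python
import math

# Assumed on-disk layout constants (default 8 kB pages, 64-bit build).
BLOCK_SIZE = 8192
PAGE_HEADER = 24
LINE_POINTER = 4
TUPLE_HEADER = 24   # 23-byte tuple header padded to 8-byte alignment
MAXALIGN = 8


def maxalign(n):
    """Round n up to the next multiple of MAXALIGN."""
    return (n + MAXALIGN - 1) // MAXALIGN * MAXALIGN


def tuples_per_page(data_bytes):
    """How many tuples with data_bytes of column data fit on one page."""
    tuple_bytes = maxalign(TUPLE_HEADER + data_bytes)
    return (BLOCK_SIZE - PAGE_HEADER) // (tuple_bytes + LINE_POINTER)


def table_bytes(n_tuples, data_bytes):
    """Heap size needed to store n_tuples tuple versions."""
    return math.ceil(n_tuples / tuples_per_page(data_bytes)) * BLOCK_SIZE


row_data = 4 + (1 + 64)   # int4 column + short text value with 1-byte header
rows = 1_000_000

base_mb = table_bytes(rows, row_data) / 2**20
# Six full-table updates without reuse leave 7 versions of every row.
bloated_mb = table_bytes(rows * 7, row_data) / 2**20

print(f"compact table: {base_mb:.1f} MB")
print(f"after 6 full updates, nothing reused: {bloated_mb:.1f} MB")
```

Under these assumptions one page holds 81 rows, and the two estimates land close to the 96 MB and 675 MB that pg_relation_size reports: each full-table update adds roughly one more copy of the table.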

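  As the tests show, dead tuples that have not been cleaned up take extra space and also keep pages from being marked all-visible in the visibility map (VM). A scan therefore incurs more I/O, occupies more buffers, and has to run additional MVCC visibility checks, so the performance impact is considerable. A read-only table feature guards against accidental operations that update the table and thereby produce unnecessary dead tuples, extra vacuum activity, and needless disturbance of the VM.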

  So, after a large volume of UPDATE/DELETE on a big table (if TRUNCATE or INSERT ... SELECT can do the job, avoid UPDATE/DELETE), it is best to run VACUUM FULL xxx once to release the space.
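
The batches of 100,000-row updates in the transcript can be checked against a simple free-slot model: a plain vacuum leaves the file at 204 MB but makes the dead space reusable, and the table only grows once that space runs out. This is a rough sketch under stated assumptions: 81 tuples per 8 kB page (an estimate for this row shape), a starting size of exactly 204 MB (the reported value is rounded), and no reuse of the dead versions created by the batches themselves. The function name is hypothetical.

```python
import math

TUPLES_PER_PAGE = 81   # assumed capacity for an (int, 64-char text) row
PAGE_BYTES = 8192
MB = 2**20


def sizes_after_batches(start_mb, live_rows, batch_rows, n_batches):
    """Table size in MB after each update batch, with no vacuum in between."""
    pages = round(start_mb * MB / PAGE_BYTES)
    free_slots = pages * TUPLES_PER_PAGE - live_rows
    sizes = []
    for _ in range(n_batches):
        # New row versions go into free slots first.
        reused = min(free_slots, batch_rows)
        free_slots -= reused
        # Whatever does not fit extends the table by whole pages.
        overflow = batch_rows - reused
        new_pages = math.ceil(overflow / TUPLES_PER_PAGE)
        pages += new_pages
        free_slots += new_pages * TUPLES_PER_PAGE - overflow
        sizes.append(pages * PAGE_BYTES / MB)
    return sizes


sizes = sizes_after_batches(start_mb=204, live_rows=1_000_000,
                            batch_rows=100_000, n_batches=13)
for i, s in enumerate(sizes, start=1):
    print(f"batch {i:2d}: {s:.1f} MB")
```

In this model the first eleven batches fit in the reclaimed space and the size stays at 204 MB; the twelfth and thirteenth push it to roughly 212 MB and 222 MB, which is the pattern the transcript shows.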

The physical implementation of UPDATE in PostgreSQL

  Version 8.3 introduced the HOT (Heap-Only Tuple) feature, which greatly improved the performance of updates on indexed tables.

 

 

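   From 8.3 onward, HOT is supported: when an update changes no indexed column and the new version fits on the same page, no new index entry is inserted, and the old version simply links to the new one.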

  

 

   The page is also defragmented from time to time, which removes the dead versions in the chain.

  

 

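   So in newer versions (10+), the CPU cost is at most two extra pointer accesses plus some wasted space; after a while, only one extra pointer access remains, with no wasted space. Compared with that single pointer access, locating the key inside an index leaf page already takes several comparisons, so the overhead is almost negligible.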
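
The pointer-access argument above can be illustrated with a toy model of a single heap page. This is a simplified sketch, not PostgreSQL's actual data structures: the `Page`, `HeapTuple`, and `Redirect` classes and their methods are hypothetical names, and "extra accesses" counts items touched beyond the baseline of one line pointer plus one tuple.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union


@dataclass
class HeapTuple:
    """A line pointer that owns tuple storage on the page."""
    value: str
    dead: bool = False
    next_lp: Optional[int] = None  # newer version on the same page


@dataclass
class Redirect:
    """A line pointer with no tuple storage; it only forwards."""
    target: int


class Page:
    def __init__(self) -> None:
        self.items: Dict[int, Union[HeapTuple, Redirect]] = {}
        self._next_lp = 1

    def insert(self, value: str) -> int:
        lp = self._next_lp
        self._next_lp += 1
        self.items[lp] = HeapTuple(value)
        return lp

    def _walk(self, root_lp: int) -> Tuple[int, int]:
        """Return (line pointer of the live version, items touched)."""
        lp, touched = root_lp, 0
        while True:
            item = self.items[lp]
            touched += 1                  # the line pointer itself
            if isinstance(item, Redirect):
                lp = item.target
                continue
            touched += 1                  # the tuple it points to
            if not item.dead:
                return lp, touched
            lp = item.next_lp             # follow the chain to the newer version

    def fetch(self, root_lp: int) -> Tuple[str, int]:
        """Value seen through the index entry, plus extra accesses."""
        live_lp, touched = self._walk(root_lp)
        return self.items[live_lp].value, touched - 2

    def hot_update(self, root_lp: int, value: str) -> None:
        """Add a new version on the same page; the index entry is untouched."""
        live_lp, _ = self._walk(root_lp)
        new_lp = self.insert(value)
        old = self.items[live_lp]
        old.dead, old.next_lp = True, new_lp

    def prune(self, root_lp: int) -> None:
        """Drop dead versions and turn the root into a redirect."""
        live_lp, _ = self._walk(root_lp)
        lp = root_lp
        while lp != live_lp:
            item = self.items.pop(lp)
            lp = item.target if isinstance(item, Redirect) else item.next_lp
        if live_lp != root_lp:
            self.items[root_lp] = Redirect(live_lp)

    def dead_tuples(self) -> int:
        return sum(isinstance(i, HeapTuple) and i.dead for i in self.items.values())


page = Page()
root = page.insert("v1")            # the index entry points at `root`
page.hot_update(root, "v2")
unpruned = page.fetch(root)         # walks root -> dead v1 -> v2
dead_before = page.dead_tuples()
page.prune(root)
pruned = page.fetch(root)           # walks redirect -> v2
dead_after = page.dead_tuples()
```

In this model a lookup right after the update touches two extra items and one dead version still occupies space; after pruning, only the redirect remains as a single extra access and no dead version is stored.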

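  Finally, besides cleaning up dead tuples to eliminate table bloat, vacuum is also responsible for freezing transaction IDs, which affects transaction ID wraparound; see "postgresql中的事務回捲原理及預防措施" (Transaction ID wraparound in PostgreSQL: principles and prevention).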

  For details, see https://zhuanlan.zhihu.com/p/379706959
