Cassandra Tuning Summary

{"type":"doc","content":[{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Hardware Selection","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Cassandra's throughput improves with more CPU cores, more RAM, and faster disks. Cassandra can run on small servers for testing or development, but a production environment ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"needs at least 2 cores and at least 8GB of RAM","attrs":{}},{"type":"text","text":". A typical production server has 8 or more cores and at least 32GB of RAM.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Configuration Tuning","attrs":{}}]},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Cassandra heap should be no smaller than 2GB and no larger than 50% of system RAM; heap size is typically between ¼ and ½ of system memory. Do not give all memory to the heap, because memory is also needed for off-heap caches and the file system cache.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For heaps smaller than 12GB, consider ParNew/ConcurrentMarkSweep garbage collection.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For heaps larger than 12GB, consider G1GC, because ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"on larger heaps G1 outperforms CMS","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Always enable GC logging when tuning GC. Cassandra's GCInspector class logs information about any garbage collection that takes longer than 200 milliseconds. If these entries appear frequently, the JVM is under excessive garbage collection pressure. Besides adjusting garbage collection options, other remedies include ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"adding nodes and reducing cache sizes","attrs":{}},{"type":"text","text":".","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For performance reasons, ","attrs":{}},{"type":"text","marks":[{"type":"italic","attrs":{}}],"text":"Xmx","attrs":{}},{"type":"text","text":" and ","attrs":{}},{"type":"text","marks":[{"type":"italic","attrs":{}}],"text":"Xms","attrs":{}},{"type":"text","text":" should be set to the same value.","attrs":{}}]}]}],"attrs":{}},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Workload-Specific Tuning","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Generally speaking, Cassandra workloads fall into three types: ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"write-heavy, read-heavy, and mixed","attrs":{}},{"type":"text","text":". Most real-world workloads are ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"write-heavy or read-heavy","attrs":{}},{"type":"text","text":".","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Before turning to workload-specific tuning, here are a few conclusions about GC that make the later techniques easier to follow. GC generally involves clearing garbage and promoting objects, and the following hold:","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Clearing garbage is fast.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Promoting objects is slow compared with clearing garbage, because it involves copying the objects.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The more objects are promoted, the longer it takes. A larger heap allows more objects to be promoted, but that also means more time is spent on promotion.","attrs":{}}]}]}],"attrs":{}},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Write-Heavy Workloads","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Cassandra generally offers high write performance. Its write path is shown in the figure below:","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/10/10a2b6964718018cc4eca97fba86ee3a.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":8,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(Image source: the Internet)","attrs":{}}]},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Record the data in the commit log","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Write the data to the memtable","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Flush the data from the memtable","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Store the data on disk in SSTables","attrs":{}}]}]}],"attrs":{}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After a write, compaction follows later, so when tuning a write-heavy workload that triggers frequent GC, ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"pay particular attention to the parts that allocate object memory: the memtable and the compaction process","attrs":{}},{"type":"text","text":".","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The space occupied by memtables is configurable. Memtable-related objects are allocated in memory and stay alive for some time. Use the ","attrs":{}},{"type":"text","marks":[{"type":"italic","attrs":{}},{"type":"strong","attrs":{}}],"text":"memtable_heap_space_in_mb","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":" parameter to set the memtable size","attrs":{}},{"type":"text","text":". If it is not set, it defaults to 1/4 of the heap. You can then also adjust the size of the young generation to reduce how often GC pauses occur.","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For compaction, ","attrs":{}},{"type":"text","marks":[{"type":"italic","attrs":{}},{"type":"strong","attrs":{}}],"text":"compaction_throughput_mb_per_sec","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":" configures the throughput of compaction","attrs":{}},{"type":"text","text":". Compaction can also produce a large amount of garbage and many short-lived objects, so choose a compaction strategy that fits your workload:","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Size Tiered Compaction Strategy (STCS)","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":"none"},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The default compaction strategy, recommended for write-intensive workloads. It is also a useful fallback when other strategies do not fit the workload, or when LCS causes excessive I/O.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Leveled Compaction Strategy (LCS)","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":"none"},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Recommended for read-intensive workloads. LCS generates more I/O than the other strategies; if I/O is already your bottleneck, the extra I/O cost of switching to LCS may cancel out its benefits.","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"Time Window Compaction Strategy (TWCS)","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":"none"},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Suited to time-series data or any table whose data carries a TTL (time to live). For write-heavy workloads on such data, LeveledCompactionStrategy not only generates more I/O and consumes a lot of CPU compared with TimeWindowCompactionStrategy, it also produces far more in-memory garbage.","attrs":{}}]}]}],"attrs":{}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For the detailed pros and cons, see:","attrs":{}}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/e1/e1b1d678b3d7506d3f41d5a45d516b90.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Read-Heavy Workloads","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A read-heavy workload has little to do with the memtable. Instead, because reads pull data from disk and create temporary objects, a read-heavy workload generates many short-lived objects. These objects usually live less than a second, sometimes only a few milliseconds. ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"In this mode, therefore, the sizes of the young generation and the heap deserve the most attention","attrs":{}},{"type":"text","text":", because:","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Object promotion is slow","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Promoting many objects can easily fill up the old generation","attrs":{}}]}]}],"attrs":{}},
{"type":"paragraph","attrs":{"indent":
0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"So under a read-heavy workload, increase the heap and young generation sizes; otherwise more GCs and pauses are likely. In addition, ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"increasing the ","attrs":{}},{"type":"text","marks":[{"type":"italic","attrs":{}},{"type":"strong","attrs":{}}],"text":"-XX:MaxTenuringThreshold","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":" value","attrs":{}},{"type":"text","text":" also keeps objects in the young generation instead of promoting them (promotion can fill most of the old generation and then trigger a full GC). You can also ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"set -XX:SurvivorRatio=4 instead of its default of 8 to get larger survivor spaces. Note that adjusting the old generation size yields little benefit here.","attrs":{}}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Other Tuning","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The tuning above only summarizes some common techniques; how well they work depends on the specific workload. Many other things can be adjusted, for example:","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Adjusting cache sizes","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Adjusting the number of write threads and read threads","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Disabling read repair","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Adjusting gc_grace_seconds when deleting data so that unwanted data is removed sooner","attrs":{}}]}]},{"type":"listitem","attrs":
{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Adjusting the data consistency level and similar settings to fit the workload","attrs":{}}]}]}],"attrs":{}},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Summary","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As a newcomer you may not be able to tune effectively right away. What you can do is ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"follow this approach hands-on: experiment, run appropriate load tests, and keep adjusting parameters until you find the configuration that suits your workload","attrs":{}},{"type":"text","text":". Do not expect to get there in one step. This article only scratches the surface of Cassandra; I wrote it down partly for my own future reference when problems come up and partly to share with others. To learn more, see:","attrs":{}}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Official Cassandra documentation: ","attrs":{}},{"type":"link","attrs":{"href":"https://cassandra.apache.org/doc/latest/getting_started/index.html","title":"","type":null},"content":[{"type":"text","text":"https://cassandra.apache.org/doc/latest/getting_started/index.html","attrs":{}}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Third-party tutorial: ","attrs":{}},{"type":"link","attrs":{"href":"https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/cassandraAbout.html","title":"","type":null},"content":[{"type":"text","text":"https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/cassandraAbout.html","attrs":{}}]}]}]}
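
The heap-sizing rules of thumb from the configuration section of this article can be written down as a small helper. This is only a sketch: the function name `suggest_jvm_settings`, the choice to start at the lower end of the recommended heap range, and the treatment of a heap of exactly 12GB as "large" are assumptions made for illustration, not something the article or Cassandra prescribes. The CMS flags shown are only accepted by JVMs that still ship CMS.

```python
def suggest_jvm_settings(system_ram_gb: float) -> dict:
    """Apply the article's heap rules of thumb to a node with the given RAM (hypothetical helper)."""
    if system_ram_gb < 8:
        raise ValueError("the article recommends at least 8GB of RAM in production")

    # Heap: no smaller than 2GB, typically 1/4 to 1/2 of RAM, never above 50%.
    min_heap_gb = max(2.0, system_ram_gb / 4)
    max_heap_gb = system_ram_gb / 2

    # Assumption: start at the lower bound and grow only if GC logs justify it.
    heap_mb = int(min_heap_gb * 1024)

    # Assumption: a heap of exactly 12GB counts as "large" and gets G1.
    if heap_mb >= 12 * 1024:
        collector, gc_flags = "G1", ["-XX:+UseG1GC"]
    else:
        collector, gc_flags = "ParNew/CMS", ["-XX:+UseParNewGC", "-XX:+UseConcMarkSweepGC"]

    return {
        "heap_range_gb": (min_heap_gb, max_heap_gb),
        "heap_mb": heap_mb,
        "collector": collector,
        # Xms and Xmx are set to the same value, as the article advises.
        "jvm_flags": [f"-Xms{heap_mb}M", f"-Xmx{heap_mb}M", *gc_flags],
        # Default memtable heap space when unset: 1/4 of the heap.
        "memtable_heap_space_in_mb": heap_mb // 4,
    }


if __name__ == "__main__":
    for ram_gb in (8, 32, 64):
        print(ram_gb, suggest_jvm_settings(ram_gb))
```

Treat the result as a starting point for load testing rather than a final configuration; the article's own advice is to keep adjusting parameters against the real workload.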