Prometheus遠程寫入InfluxDB,遇到OOM引發的錯誤怎麼解?

{"type":"doc","content":[{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"一、問題場景"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"InfluxDB作爲時序數據庫被廣泛應用於監控領域的數據持久化工作,在數據持久化的運行過程中,往往會遇到各種各樣的錯誤,MAX-VALUE-PER-TAG是在InfluxDB寫入時爲避免OOM問題所引發的保護性錯誤。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"本文通過Prometheus遠程寫入InfluxDB進行業務系統監控數據持久化的視角,來介紹一下max-value-per-tag錯誤(如下圖)產生的原因及解決措施,以期能爲大家在使用InfluxDB時提供一些經驗和參考。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/4c\/2f\/4c43f6209d147f6ef5144fdeb604b82f.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"二、成因分析"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在Prometheus遠程寫入InfluxDB的場景中,導致此問題產生的主要原因是由於通過Prometheus遠程寫入InfluxDB的監控指標中包含了多個具有較高離散度的標籤數據所引起。舉例如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"設置Exporter暴露的監控指標,使監控指標中包含具有高離散度的標籤要素,此類標籤不用多,一兩個即可,例如:全局系統跟蹤號(每個指標自增1)、交易時間(每個指標實時抓取系統時間)等;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"設置完監控指標後,我們通過Prometheus將採集的指標持續不斷的寫入InfluxDB。(注:Prometheus會使用snappy壓縮算法將含有預先定義的如下數據結構信息流使用protobuf協議傳輸至InfluxDB);"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/56\/3b\/56d2397c51ca86204662154a72a2993b.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"InfluxDB收到數據流後會將監控數據指標名作爲表名(measurement)、所有標籤作爲索引字段(tags)、樣本值作爲常規字段(field)進行解析和存儲:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/03\/19\/0370c81b34e663a562466ca9d0963219.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/a9\/36\/a978f39d91f57ee70d3a1990cae50236.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在存儲時,InfluxDB會將由標籤組成的索引字段(tags)默認保存在內存中,目的是爲了避免索引字段取值過多而導致程序申請內存過大而引起的操作系統運行異常,即OOM問題;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"爲了避免OOM問題的發生,InfluxDB提供了max-values-per-tag保護參數,該參數的初始閾值爲100000,即當某一索引字段(tags)取值數量超過該閾值時,會引發該條時序記錄寫入失敗:"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/f4\/4f\/f4953cf4a945067edee46d9b3c03834f.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/da\/14\/dac638f106c3109e1fa3b5eb1ca2ee14.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"綜上所述,當Prometheus持續寫入數據至InfluxDB一段時間後,由這些高離散度的標籤形成的索引字段(tags)數量便會達到max-values-per-tag閾值設定限制,從而導致後續監控數據寫入失敗。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"三、解決方案"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"解決方案1"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"1)方案簡述"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"由上一章節描述可知,InfluxDB將索引數據默認保存在內存中主要是爲了解決OOM問題,但如果我們的使用場景只是爲了做監控數據持久化,即基本上只會做寫入操作,是否可以將這些索引數據保存在磁盤中呢?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章