Zuoyebang's PB-Scale, Low-Cost Log Retrieval Service
{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Logs are the primary way to observe a service. We rely on them to understand a service's current and historical state, and when an error occurs we rely on them to reconstruct what happened and locate the problem. Logs are critical to engineers, and as microservices have become popular, deployments have become increasingly distributed, so we need a log service to collect, transport, and search logs."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This need gave rise to open-source log services, of which ELK is the best-known example."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Requirements"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In our scenario, peak write pressure is heavy (tens of millions of log entries per second). Latency requirements are strict: the time from collection to searchability is normally under 1 s (3 s at peak). Cost pressure is enormous: logs must be retained for half a year and remain queryable (on the order of 100 PB)."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Shortcomings of Elasticsearch"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"ELK"},{"type":"text","text":" is built around "},{"type":"text","marks":[{"type":"strong"}],"text":"Elasticsearch"},{"type":"text","text":", which stores and indexes logs and serves queries. "},{"type":"text","marks":[{"type":"strong"}],"text":"Elasticsearch"},{"type":"text","text":" is a search engine that relies on "},{"type":"text","marks":[{"type":"strong"}],"text":"Lucene's inverted index"},{"type":"text","text":" for retrieval and uses "},{"type":"text","marks":[{"type":"strong"}],"text":"shards"},{"type":"text","text":" to partition data, overcoming the storage and processing limits of a single machine."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"• Write performance"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When writing data, Elasticsearch must update the inverted index for every indexed log field so that the newest logs become searchable. Write performance can be improved by batching commits, deferring indexing, reducing the refresh frequency, and so on, but the index still has to be built. Under very heavy log traffic (20 GB of data and tens of millions of log entries per second) the bottleneck is obvious and falls far short of our goal of near-real-time writes."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"• Operating cost"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Elasticsearch has to maintain its indexes, shards, and query caches on an ongoing basis, which consumes a lot of CPU and memory. Log data is stored on local machine disks, so retaining a large volume of logs for a long time requires an enormous amount of disk space. Indexing also inflates the data size, which raises costs further."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"• Poor support for unstructured logs"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"ELK has to parse logs in order to index their fields, so unstructured logs need extra processing logic to adapt them. Many of our business logs do not follow a standard format, and converging them on one is difficult."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In summary, log retrieval is a "},{"type":"text","marks":[{"type":"strong"}],"text":"write-heavy, read-light"},{"type":"text","text":" workload, and in our view maintaining a large, complex index for such a workload offers poor value for the cost. By our estimate, an Elasticsearch-based solution would need a cluster of tens of thousands of cores and still could not guarantee write and query performance, while wasting a great deal of resources."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Log Retrieval Design"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Given this, we looked at log retrieval from a different angle and adopted a design better suited to the workload. The new design has three main parts:"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1. Log chunking"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We still need to collect logs, but when processing them we do not parse or index the raw text. Instead, we split logs into chunks by metadata such as log time, owning instance, log type, and log level. As a result, the retrieval system "},{"type":"text","marks":[{"type":"strong"}],"text":"places no requirements on log format"},{"type":"text","text":", and because the expensive parsing and indexing steps are gone, write speed is limited only by disk I/O."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/03\/21\/03df897547bb9a63ab8f10980a53f821.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Put simply, logs of the same type produced by one instance are written to a file in time order, and the file is split along the time dimension. Log chunks are spread across multiple machines (we generally shard the storage machines by dimensions such as instance and log type), so the chunks can be processed concurrently on many machines. This approach scales horizontally: if the existing machines cannot keep up, we add more."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How do we search the data inside a log chunk? Because we store the raw log text, we can search a chunk directly with grep and related commands. grep is the command developers know best, and it is flexible enough to cover their log search needs. Because we append directly to log chunks and do not wait for an index to be built, a log entry is searchable as soon as it is flushed to its chunk, which keeps search results "},{"type":"text","marks":[{"type":"strong"}],"text":"real-time"},{"type":"text","text":"."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2. Metadata index"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Next, let's look at how to search across such a large number of log chunks."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When a log chunk is created, we build an index entry from its metadata, such as service name, log time, owning instance, and log type, and store the chunk's storage location as the value. With this metadata index, when we need to search a given type of log for a given service over a given time range, we can quickly find the locations of the relevant chunks and process them concurrently."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/16\/06\/16fd67096b1b93e91ed5b4eb7914c706.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The index structure can be built as needed: you can put whatever metadata you care about into the index to narrow down the required chunks quickly. Because we index only chunk metadata rather than every log entry, the cost is extremely low, and chunks can still be located quickly enough."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"3. Log lifecycle and data tiering"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Viewed along the time dimension, log data is a kind of time-series data: the more recent a log is, the more valuable it is and the more likely it is to be queried, which produces a hot/cold split. Cold data is not worthless, though. Developers do sometimes need to look back at logs from several months ago, so logs must remain queryable throughout their lifecycle."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Keeping every log chunk on local disk for its whole lifecycle would demand a huge amount of machine capacity. We address this storage requirement with compression and tiering."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/75\/16\/75f67436650eedd2191c05ae2311d216.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Put simply, we divide log chunk storage into three tiers: local storage (disk), remote storage (object storage), and archive storage. Local storage serves real-time and short-term queries (a few hours to a day), remote storage serves queries over a medium period (one to several weeks), and archive storage serves queries over the entire log lifecycle."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here is how a log chunk moves between tiers over its lifecycle. The chunk is first created on local disk and the corresponding log data is written to it. Once complete, it stays on local disk for a period that depends on disk pressure. After that, it is "},{"type":"text","marks":[{"type":"strong"}],"text":"compressed"},{"type":"text","text":" and uploaded to remote storage (typically the standard storage class of an object store). After a further period, the chunk is migrated to archive storage (typically the archive storage class of an object store)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"What are the benefits of this storage design? As the multi-tier storage diagram below shows, the lower the tier, the more data it holds and the cheaper the storage medium, with each tier costing roughly 1\/3 of the tier above. Data is also compressed before it is stored, and logs typically reach a compression ratio of "},{"type":"text","marks":[{"type":"strong"}],"text":"10:1"},{"type":"text","text":". Together (1\/3 × 1\/3 × 1\/10 ≈ 1\/90), this puts the cost of archive storage at about "},{"type":"text","marks":[{"type":"strong"}],"text":"1%"},{"type":"text","text":" of local storage, and the gap is even larger if SSDs are used for local storage."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Price reference:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"embedcomp","attrs":{"type":"table","data":{"content":"