Cloud Native at JD | Monitoring in the Cloud-Native Era: How to Collect Metrics in a Cloud-Native Way?

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/af/af9f6637b50b09be60b00a42f3812d5e.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Yunmei's introduction:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"From IDC to the cloud, and from elastic computing to container technology, the environment in which software runs has changed dramatically, and the objects and metrics being monitored have shifted along with it: the primary subject used to be the host and is now the container and the service. Expectations of monitoring have also moved from simply “seeing metrics” toward the “observability” of the monitored object. This shift is especially visible in container management, with Kubernetes as its representative."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Ever since Kubernetes became the de facto standard for container management, being cloud native has in practice meant being Kubernetes native. In the cloud, the underlying hardware is largely abstracted away, and hard failures need manual intervention less and less often. Features such as health checks, self-healing on failure, and load balancing have also made simple, catastrophic failures rarer. But as services are split up and modules are stacked on top of one another, vague, hard-to-describe, seemingly inexplicable failures have become more frequent than before."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"“Seeing metrics” is only a plain presentation of data, and in today's cloud environments it does not help us locate problems efficiently. “Observability”, by contrast, is about reprocessing the data to uncover the information hidden behind it: rather than stopping at display, the data is parsed and reorganized so that its context becomes visible."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Achieving “observability” requires "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"more standardized and concise metric data, more convenient collection methods, stronger and richer semantic expressiveness, and faster, more efficient storage."},{"type":"text","text":" This article focuses on the collection structure and collection methods for time-series metrics, and “data” here means time-series data. Storage structure and other forms of monitoring such as tracing, logs, and events are outside its scope."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/df/df4ab1867a6762b3bde6736a8bee6bc7.webp","alt":null,"title":"","style":[{"key":"width","value":"50%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When it comes to time-series data, let us first look at several time-series databases commonly used by monitoring systems today: "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"opentsdb, influxdb, prometheus "},{"type":"text","text":"and others."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"There is broad agreement on the basic structure of classic time-series data:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"blockquote","content":[{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"A unique series identifier, i.e. the metric name;"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"The metric's label set, which describes the metric's dimensions in detail;"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"Timestamp-value pairs, which describe the metric's value at a given point in time."}]}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The basic structure is therefore metric name + a label set of multiple key-value pairs + timestamp + value, but the individual systems differ in the details."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/09/0965fa5182d869fd10199f45620c49cc.webp","alt":null,"title":"","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/7b/7bd45d70a5f9f6dadac0cf0fa957050a.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/f3/f39292cc71b30f925e8f190c8fba9252.webp","alt":null,"title":"","style":[{"key":"width","value":"50%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"opentsdb uses the familiar JSON format, which is probably the first structured time-series representation that comes to a user's mind. Anyone who knows the basic structure of time-series data can tell at a glance what each field means."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/95/95f2f16ef91c11ef73c98965a33b8407.webp","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"<measurement>[,<tag_key>=<tag_value>[,<tag_key>=<tag_value>]] <field_key>=<field_value>[,<field_key>=<field_value>] [<timestamp>]\nExample:\ncpu_load_short,host=server01,region=us-west value=0.64 
1434055562000000000"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/56/560a7cf68bc8e22cfacfa9ab38c6b60f.webp","alt":null,"title":"","style":[{"key":"width","value":"50%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/e1/e1c4b7945fcbeeed787078d5b53d2f58.webp","alt":null,"title":"","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"metric_name [\n\"{\" label_name \"=\" `\"` label_value `\"` { \",\" label_name \"=\" `\"` label_value `\"` } [ \",\" ] \"}\"\n] value [ timestamp ]\nExample:\nhttp_requests_total{method=\"post\",code=\"200\"} 1027 1395066363000"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/3e/3e15ca0a07fb2bad88b40406f5199a0c.webp","alt":null,"title":"","style":[{"key":"width","value":"50%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Both influxdb and prometheus describe time-series data in a custom text format. A fixed syntax flattens the tree-like hierarchy of JSON without losing any semantics, and the line-oriented form is easier to read."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/0c/0c0bc855985610704060b9ab62f53720.webp","alt":null,"title":"","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Advantages of the text format"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"○ Closer to how people naturally read"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"○ The line-oriented structure is friendlier to memory optimization when reading files"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"○ Less nesting"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Disadvantages of the text format"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"○ Higher parsing cost"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"○ 
Validation is comparatively more cumbersome"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/a2/a2204ee98ef2cf112c47e68632fe7fbb.webp","alt":null,"title":"","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Anyone who has used Prometheus may have noticed that its exposition format is not strictly one line per sample: each metric family is usually accompanied by a few comment lines. The two main kinds are HELP and TYPE, which give the metric's description and its type respectively. The format looks roughly like this:"}]},
{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"# Anything you want to say\n# HELP http_requests_total The total number of HTTP requests.\n# TYPE http_requests_total counter\nhttp_requests_total{method=\"post\",code=\"200\"} 1027 1395066363000\nhttp_requests_total{method=\"post\",code=\"400\"} 3 1395066363000"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Prometheus 
supports four main metric types:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"blockquote","content":[{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"Counter:"},{"type":"text","text":" a counter that only increases."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"Gauge:"},{"type":"text","text":" a value that can go up or down."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"Histogram:"},{"type":"text","text":" a histogram that samples observations into buckets."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"Summary:"},{"type":"text","text":" a data summary that samples quantiles."}]}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Counter and Gauge are easy to understand, while Histogram and Summary can be confusing at first. In fact, both exist to deal with the long-tail problem, each from a different angle."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For example, the average of my net worth and that of the richest person in the country says nothing meaningful about my own net worth. Bucketing or quantiles describe the real distribution of the data far more accurately."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The main difference between Histogram and Summary lies in how quantiles are computed. A Histogram only does the bucketing on the client, so quantiles can be computed on the server across the whole dataset. A Summary computes quantiles in the client, and so gives up the ability to compute quantiles over the overall view. The official documentation also describes the differences between Histogram and Summary:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/bf/bfbfa5df1a31fa654d90ff1caa08dc03.webp","alt":null,"title":"","style":[{"key":"width","value":"50%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Note that as of Prometheus 2.20.1, the current release at the time of writing, these metric types are used only in the client libraries and the wire protocol; the server does not yet record this information. So applying histogram_quantile(0.9,xxx) to a Gauge metric is allowed; it simply produces no value because no xxx_bucket series exists."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/70/70a82030ce14883774372f86591936b8.webp","alt":null,"title":"","style":[{"key":"width","value":"50%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"From the monitoring side, there are only two ways to obtain time-series monitoring data, pull and push, and the choice of collection method also determines how the system is deployed."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We again use opentsdb and prometheus as examples; the clustered version of influxdb is a commercial offering, so it is not discussed here."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/0e/0ea1743135ff9a6fe3990bd12bdb7fab.webp","alt":null,"title":"","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/3c/3c2bedfe2f8ffe7216973ed56dc0cbd2.webp","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The figure above shows the opentsdb architecture, in which:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"Servers:"},{"type":"text","text":" the services whose data is collected. C denotes the metric collection tool, which can be thought of as the opentsdb agent; the servers push data through C to the downstream TSD."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"TSD:"},{"type":"text","text":" the opentsdb component that serves as the receiving layer; its actual process name is TSDMain. Each TSD is independent, with no master/slave distinction and no shared state."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"HBase:"},{"type":"text","text":" where opentsdb ultimately stores its data."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As the architecture diagram shows, if the volume of pushed data grows substantially, throughput can be scaled up fairly easily by adding intermediate tiers of components or by scaling out the stateless receiving layer."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/9d/9d02bc2d9963d198fe538b8099df9249.webp","alt":null,"title":"","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/bd/bd2c9477f3f69e004cbf93e6a48a89e1.webp","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The figure above shows the prometheus architecture. The parts to focus on are:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},
{"type":"blockquote","content":[{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"Prometheus 
Server:"},{"type":"text","text":" scrapes and stores time-series data."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"Exporters:"},{"type":"text","text":" components that expose metrics for the server to pull."}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"Pushgateway:"},{"type":"text","text":" an intermediate gateway that receives pushed metrics and exposes them for the server to pull."}]}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With pull, the monitoring side periodically pulls metrics from the Exporter of each configured monitored target. This design reduces the coupling between the two sides: the monitored side does not need to know that the monitoring side exists, and the burden of delivering metrics shifts from the monitored side to the monitoring side."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Let us compare the respective strengths and weaknesses of pull and push:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},
{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Advantages of pull"}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},
{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Disadvantages of pull"}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Having briefly compared pull and push: prometheus is the current standard for time-series monitoring in cloud-native environments, and the official FAQ explains why it chose pull (https://prometheus.io/docs/introduction/faq/#why-do-you-pull-rather-than-push)."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The above looked at pull and push from the monitoring side. From the monitored side, there are many more ways to obtain data, which can generally be grouped into the following types:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},
{"type":"blockquote","content":[{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Default collection"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Probe collection"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"Component collection"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"Instrumentation-based collection"}]}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Each is illustrated below."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/12/12260d25ebbc3c427863045a383281c9.webp","alt":null,"title":"","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Default collection covers the basic metrics that practically everyone needs to watch and that usually have no strong tie to the business, such as CPU, memory, disk, network, and other hardware or system metrics. Monitoring systems normally ship a dedicated agent that always collects them; in a cloud-native setup it is very convenient to use "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"node_exporter, cAdvisor, and process-exporter"},{"type":"text","text":" for basic monitoring of node machines, containers, and processes respectively."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/7c/7c546fde54072688f14dbb4709b6ff29.webp","alt":null,"title":"","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Probe collection means "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"collecting data from the outside."},{"type":"text","text":" Domain monitoring, site monitoring, and port monitoring all belong to this category. It is non-intrusive to the system, but because it depends heavily on the network, several probe points are usually deployed to reduce false alarms caused by network problems. One thing needs particular care: always evaluate the probing frequency, otherwise it can easily put request pressure on the probed target."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/49/499499e97459bac8333140872065acbe.webp","alt":null,"title":"","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Component collection generally means that "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"a ready-made collection solution already exists, and detailed metrics can be collected with only simple operations or configuration,"},{"type":"text","text":" for example monitoring for mysql or redis. This style is common in cloud-native environments: thanks to the growth of prometheus, exporters for common components keep appearing, including the various exporters listed by the prometheus project. For more specialized or custom requirements, you can also write your own exporter that follows the /metrics endpoint convention."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/0c/0c238e9d98a5bcf590aaae129eb4dce5.webp","alt":null,"title":"","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For a system's key metrics, its own developers know best, and instrumentation lets them "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"obtain the relevant metrics precisely."},{"type":"text","text":" In the prometheus ecosystem, the github.com/prometheus/client_* libraries make instrumentation-based collection very convenient."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/6e/6e7b6b2b03e07ecb77943afec1b63afd.webp","alt":null,"title":"","style":[{"key":"width","value":"50%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This article gave a brief introduction to “collection”, the first stage of a monitoring system, from two angles: "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"“collection structure”"},{"type":"text","text":" and "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"“collection method”"},{"type":"text","text":". Compared with the past, services in a cloud-native environment are split at a finer granularity and iterate faster, and the path from development to production forms a faster feedback loop, which in turn requires the monitoring system to surface anomalies more quickly. Collection structure and collection method are not the core of a monitoring system, but a concise structure and a convenient collection method lay the groundwork for achieving “observability” later. In today's cloud-native environments prometheus makes monitoring quick and easy to set up. Much work remains, such as clustering and durable storage, but with the arrival of solutions such as Thanos, prometheus is steadily filling out."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"Click ["},{"type":"link","attrs":{"href":"https://developer.jdcloud.com/technical/cloud-native?utm_source=PMM_infoQ&utm_medium=Readmoreutm_campaign=ReadMoreutm_term=NA","title":""},"content":[{"type":"text","text":"京東智聯雲"}]},{"type":"text","text":"] to learn about the developer community"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":""},{"type":"text","marks":[{"type":"strong"}],"text":"More technical practice write-ups and exclusive in-depth analysis"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":""},{"type":"text","marks":[{"type":"strong"}],"text":"Follow the [京東智聯雲開發者] official account"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":""}]},
{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/0e/0e9d235592cebe258ab4197e98206044.jpeg?x-oss-process=image/resize,p_80/auto-orient,1","alt":null,"title":"","style":[{"key":"width","value":"50%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":""}]}]}
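The Histogram versus Summary distinction discussed in the article can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the function names `merge_buckets` and `estimate_quantile`, the bucket bounds, and the sample counts are all hypothetical, and the linear interpolation is a simplified imitation of a server-side quantile calculation, not Prometheus's actual implementation. It shows why cumulative buckets from several instances can be summed and then queried as a whole, which quantiles precomputed on each client do not allow.

```python
import math


def merge_buckets(instances):
    """Sum cumulative bucket counts from several instances.

    Each instance is a dict mapping a bucket upper bound ("le") to the
    cumulative count of observations less than or equal to that bound.
    All instances are assumed (for this sketch) to share the same bounds.
    """
    merged = {}
    for buckets in instances:
        for le, count in buckets.items():
            merged[le] = merged.get(le, 0) + count
    # Return the buckets ordered by upper bound, with +Inf last.
    return dict(sorted(merged.items()))


def estimate_quantile(q, buckets):
    """Estimate quantile q (0 < q <= 1) from ordered cumulative buckets.

    Assumes observations are spread evenly inside each bucket and
    interpolates linearly; the lowest bucket is assumed to start at 0.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[math.inf]
    if total == 0:
        return math.nan
    rank = q * total
    prev_le, prev_count = 0.0, 0
    for le, count in buckets.items():
        if count >= rank:
            if le == math.inf:
                # The quantile lies beyond the last finite bound;
                # the best available answer is that bound itself.
                return prev_le
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_le + (le - prev_le) * fraction
        prev_le, prev_count = le, count
    return prev_le


if __name__ == "__main__":
    # Hypothetical request-latency buckets (seconds) from two instances.
    instance_a = {0.1: 50, 0.5: 90, 1.0: 100, math.inf: 100}
    instance_b = {0.1: 10, 0.5: 20, 1.0: 80, math.inf: 100}
    overall = merge_buckets([instance_a, instance_b])
    print("merged buckets:", overall)
    print("estimated p90 over both instances:", estimate_quantile(0.9, overall))
```

Averaging the two instances' individually computed 90th percentiles would generally not give the same result, which is the practical reason bucketed data stays aggregatable on the server side.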