Sharpening the Knife Just Before 618: Taking Fewer Detours in Flink Containerization and Platformization

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"自2017年起,爲保障內部業務在平時和大促期間的平穩運行,我們唯品會就開始基於Kubernetes深入打造高性能、穩定、可靠、易用的實時計算平臺,我們現在的平臺支持Flink、Spark、Storm等主流框架。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"本文將分爲五個方面,分享唯品會Flink的容器化實踐應用以及產品化經驗:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/48\/48f64a5212d1d8117d156b6b97d3a1ad.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"一、發展概覽"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1、集羣規模"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/29\/296941c074e16b1925a8de5d28469a2b.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在集羣規模方面,我們有2000+的物理機,主要部署Kubernetes異地雙活的集羣利用Kubernetes的namespaces、labels和taints等實現業務隔離以及初步的計算負載隔離。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Flink任務數、Flink SQL任務數、Storm任務數、Spark任務數,這些線上實時應用加起來有1000多個,目前我們主要在支持Flink SQL這一塊,SQL化是一個趨勢,所以我們要支持SQL任務的上線平臺。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2、平臺架構"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/25\/25ffd6a9a25c45d234b58edc1f14c0cb.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們從下往上進行解析實時計算平臺的整體架構:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"資源調度層(最底層)"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"實際上是用deployment的模式運行Kubernetes上,平臺雖然是支持yarn調度,但是yarn調度是與批任務共享資源,所以主流任務還是運行在Kubernetes上的。並且,yarn調度這一層主要是跟離線部署的一套yarn集羣,在2017年的時候,我們自研了Flink on Kubernetes的一套方案,因爲底層調度分了兩層,所以在大促資源緊張的時候,實時跟離線就可以做一個資源的借調。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":" 
存儲層"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"主要用來支持公司內部基於Kafka實時數據vms,基於binlog的vdp數據和原生Kafka作爲消息總線,狀態存儲在hdfs上,數據主要存入Redis,MySQL,HBase,Kudu,HDFS,ClickHouse等。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"計算引擎層"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"主要是Flink、Storm、Spark,目前主推的是Flink這一塊,每個框架會都會支持幾個版本的鏡像以滿足不同的業務需求。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"實時平臺層"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"主要提供作業配置、調度、版本管理、容器監控、job監控、告警、日誌等功能,提供多租戶的資源管理(quota,label管理),提供Kafka監控。資源配置也分爲大促日和平常日,大促的資源和平常的資源是不一樣的,資源的權限管控也是不一樣的。在Flink 1.11版本之前,平臺自建元數據管理系統爲Flink SQL管理schema,1.11版本開始,通過hive metastore與公司元數據管理系統融合。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"應用層"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"主要是支持實時大屏、推薦、實驗平臺、實時監控和實時數據清洗的一些場景。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"二、Flink容器化實踐"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"1、容器化方案"}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/f6\/f6090cde3c4920b9eee444df1fa5001b.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"上面是實時平臺Flink容器化的架構圖。Flink容器化其實是基於standalone模式部署的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們的部署模式共有client、job manager、task 
Users upload job jars, configuration, and so on through the platform, and they are stored on HDFS. Configuration and dependencies maintained by the platform are also stored on HDFS, and a pod pulls them as part of its initialization when it starts.

The main process in the client is an agent written in Go. When the client starts, it first checks the cluster status; once the cluster is ready, it pulls the jar from HDFS and submits the job to the cluster. The client's main job is fault tolerance, and it also monitors job status, triggers savepoints, and so on.

A smart-agent deployed on every physical machine collects container metrics and writes them to M3, while metrics exposed through Flink's interfaces are written to Prometheus and displayed with Grafana. Likewise, vfilebeat on every physical machine collects the mounted logs and writes them to ES, and logs can be searched in Dragonfly.

**1) Flink platformization**

In practice, platformization must be driven by concrete scenarios and ease of use.

![image](https://static001.geekbang.org/infoq/68/685a1aa68f12534c158a6108e8cf8e47.png)

**2) Flink stability**

Exceptions are unavoidable when applications are deployed and running, so the platform needs strategies that keep jobs stable after something goes wrong.

**Pod health and availability:**

- Detected with livenessProbe and readinessProbe; the pod restart policy is also specified, so Kubernetes itself can restart pods.

**When a Flink job fails:**

- Flink has its own restart strategies and failover mechanism, which is the first layer of protection.
- The client periodically monitors Flink's status, keeps the latest checkpoint address in its own cache, reports it to the platform, and persists it to MySQL. When Flink can no longer restart the job, the client resubmits it from the latest successful checkpoint. This is the second layer of protection. Once checkpoints are persisted to MySQL at this layer, Flink's HA mechanism is no longer used, which removes the dependency on the ZooKeeper component.
- When the first two layers cannot restart the job, or the cluster itself fails, the platform automatically brings up a new cluster from the latest checkpoint persisted in MySQL and resubmits the job. This is the third layer of protection.

**Data-center disaster recovery:**

- User jars and checkpoints are stored on dual HDFS clusters in two data centers.
- Dual clusters in dual data centers.

### 2. Kafka monitoring solution

Kafka monitoring is a very important part of our job monitoring. The overall flow is as follows:

![image](https://static001.geekbang.org/infoq/49/49ec18834fb5ebf511d77dd19f3e5055.png)

The platform provides Kafka lag monitoring. In the UI, users configure their own Kafka monitoring: which cluster, which consumer group, and other consumption details. The platform reads these monitoring configurations from MySQL, monitors Kafka through JMX, and writes the collected information into a downstream Kafka topic, where another Flink job does real-time monitoring and alerting. The same data is also written into ClickHouse to feed results back to users (Prometheus would also work here, but ClickHouse suits us better), and finally Grafana displays it to users.

## III. Flink SQL platformization

With the Flink containerization solution in place, the next step was Flink SQL platformization. As everyone knows, developing with the streaming API still carries a certain cost. Flink is certainly faster than Storm, and relatively more stable and easier to work with, but for some users, especially colleagues who develop in Java, there is still a barrier to entry.

The Kubernetes-based Flink containerization made it easy to release Flink API applications, but it was still not convenient enough for Flink SQL jobs. So the platform provides a more convenient one-stop development experience: online editing and publishing, SQL management, and so on.

### 1. Flink SQL solution

![image](https://static001.geekbang.org/infoq/77/779505bd7ddf0b95de8539ca5b8146fc.png)

The platform's Flink SQL solution is shown above; the job publishing system and the metadata management system are completely decoupled.
**1) Platformizing Flink SQL job publishing**

In practice, ease of use has to be considered when doing the platformization work. The main operation interface covers:

- Flink SQL version management, syntax validation, topology-graph management, and so on;
- UDF management at both the global and the job level, with support for user-defined UDFs;
- A parameterized configuration interface that makes it easy for users to put jobs online.

Below is an example of a user interface configuration:

![image](https://static001.geekbang.org/infoq/27/271948a80f9eace12077c50ec7441430.png)

And an example of a cluster configuration:

![image](https://static001.geekbang.org/infoq/92/921dbd93ef40605f8ff96b384234e8b9.png)

**2) Metadata management**

Before 1.11, the platform implemented metadata management with its own system, UDM: MySQL stores the schemas of Kafka, Redis, and other sources, and a custom catalog connects Flink with UDM.

From 1.11 on, Flink's Hive integration matured, so the platform refactored its Flink SQL framework and deployed a SQL gateway service that calls a self-maintained sql-client jar underneath. This connects real-time metadata with offline metadata, unifying the two and laying the groundwork for later stream-batch unification.

The interface for creating a Flink table in the metadata management system is shown below: the table metadata is created and persisted into Hive, and when a Flink SQL job starts it reads the corresponding table schema from Hive.

![image](https://static001.geekbang.org/infoq/f2/f29dc6cdfa8aa97338bfbccb7929c60b.png)
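For readers unfamiliar with the Hive integration, here is a minimal sketch of how a Flink SQL (1.11+) job can register a HiveCatalog and resolve table schemas from the Hive metastore. The catalog name, database, and configuration path are illustrative assumptions, not our production values.

```sql
-- Minimal sketch: register a HiveCatalog so Flink SQL reads table schemas from the Hive metastore.
-- 'hive_catalog' and '/opt/hive-conf' are assumed names/paths for illustration only.
CREATE CATALOG hive_catalog WITH (
  'type' = 'hive',
  'hive-conf-dir' = '/opt/hive-conf'
);

USE CATALOG hive_catalog;
USE ods;        -- assumed database

-- From here on, table schemas come straight from the metastore.
SHOW TABLES;
```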
### 2. Flink SQL practice

The platform integrates and develops connectors that the community does not support natively (or only partially supports), and decouples the images from connectors, formats, and other dependencies, so they can be updated and iterated quickly.

**1) Flink SQL capabilities by layer**

Our Flink SQL work falls into the following three layers:

**Connector layer**

- A VDP connector as a source;
- Redis string, hash, and other data types for sinks and dimension-table joins;
- A Kudu connector, catalog, and dimension-table join support;
- A protobuf format for parsing real-time cleansed data;
- A VMS connector as a source;
- A ClickHouse connector for high-TPS writes to distributed and local tables;
- Hive connector support for 數坊's Watermark Commit Policy partition-commit strategy and for complex data types such as array and decimal.

**Runtime layer**

- Support for modifying the topology execution plan;
- keyBy cache optimization for dimension-table joins to improve lookup performance;
- Delayed join for dimension-table joins.

**Platform layer**

- Hive UDFs;
- JSON and HLL-related processing functions;
- Support for Flink runtime parameters such as minibatch and aggregation-optimization settings (see the sketch after this list);
- Upgrading Flink to Hadoop 3.
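As an illustration of the runtime parameters mentioned above, the settings below enable minibatch and two-phase aggregation. This is a sketch with illustrative values, assuming a client that forwards these options to the TableConfig (the exact SET syntax depends on the Flink version; quotes around key and value are required from 1.13 on).

```sql
-- Illustrative values only; tune per job.
SET table.exec.mini-batch.enabled=true;              -- buffer input and fire periodically
SET table.exec.mini-batch.allow-latency=2s;          -- maximum buffering latency
SET table.exec.mini-batch.size=5000;                 -- maximum buffered records per batch
SET table.optimizer.agg-phase-strategy=TWO_PHASE;    -- local/global aggregation
SET table.optimizer.distinct-agg.split.enabled=true; -- split skewed COUNT DISTINCT
```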
**2) Modifying the topology execution plan**

Because the parallelism of the stream graph generated from SQL currently cannot be changed (among other limitations), the platform provides an editable topology preview for adjusting these parameters. The platform hands the user the execution-plan JSON parsed from the Flink SQL, uses uids to keep operators uniquely identified, and lets the user change each operator's parallelism, chaining strategy, and so on; this also gives users a way to tackle backpressure. For example, for the "small concurrency, large batches" pattern of a ClickHouse sink, we support changing the ClickHouse sink parallelism (source parallelism = 72, sink parallelism = 24) to raise the ClickHouse sink TPS.

![image](https://static001.geekbang.org/infoq/a5/a5409b867b3a034656b3525928a0a836.png)
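For context, vanilla Flink SQL only exposes job-level knobs such as the ones sketched below; per-operator parallelism and chaining overrides are what the platform's plan editor adds on top. This is a hedged sketch, not our production configuration, and whether SET accepts these keys depends on the client and Flink version.

```sql
-- Job-level knobs that exist out of the box; per-operator overrides come from the platform's plan editor.
SET parallelism.default=72;                       -- default operator parallelism
SET table.exec.resource.default-parallelism=72;   -- default parallelism for SQL operators
SET pipeline.operator-chaining=false;             -- disable chaining globally to expose operators
```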
**3) keyBy cache optimization for dimension-table joins**

For dimension-table joins, to cut the number of I/O requests and the read pressure on the dimension-table database, and thereby reduce latency and increase throughput, we took the following three measures:

![image](https://static001.geekbang.org/infoq/c5/c5930b3d809b03444a8ac279b0abadc3.png)

The diagram below shows the keyBy cache optimization for dimension-table joins:

![image](https://static001.geekbang.org/infoq/8c/8c8e83511f26daee77f79ef7b03aea2e.png)

Before the optimization, the dimension-table LookupJoin operator was chained together with the normal operators. After the optimization, the LookupJoin operator is no longer chained with them, and the join key is used as the key of the hash strategy.

After this optimization, take a 30-million-row dimension table with 10 TM nodes as an example: previously every node had to cache all 30 million rows, 300 million rows in total. With the keyBy optimization, each TM node only needs to cache 30 million / 10 = 3 million rows, so the total cached volume is just 30 million rows, which reduces the cache size dramatically.

**4) Delayed join for dimension-table joins**

In many dimension-table join scenarios, records in the main stream arrive and get joined before the corresponding rows have been added to the dimension table, so the join misses. To keep the data correct, the unmatched records are cached and joined later with a delay.

The simplest approach is to configure a retry count and retry interval in the dimension-table join function. This increases the latency of the whole stream, but it solves the problem when the main stream's QPS is not high.

The other approach is to add a dedicated delayed-join operator: when a dimension-table join misses, the record is cached first and then re-joined according to the configured retry count and retry interval.
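For reference, all of the dimension-table features above build on Flink SQL's standard lookup-join syntax, sketched below. The table and topic names are assumptions; the cache, the keyBy hashing, and the delayed retry are platform-side extensions, and dim_goods is assumed to be registered through a lookup-capable connector (Redis or Kudu in our case).

```sql
-- Assumed Kafka source with a processing-time attribute; names are illustrative.
CREATE TABLE orders (
  order_id  STRING,
  goods_id  STRING,
  amount    DECIMAL(10, 2),
  proc_time AS PROCTIME()
) WITH (
  'connector' = 'kafka',
  'topic' = 'orders',
  'properties.bootstrap.servers' = 'kafka:9092',
  'properties.group.id' = 'orders_join_demo',
  'format' = 'json',
  'scan.startup.mode' = 'group-offsets'
);

-- Standard lookup (dimension-table) join; cache, keyBy hashing, and delayed retry
-- are added by the platform's connectors, not by this syntax itself.
SELECT o.order_id, o.amount, g.brand_name
FROM orders AS o
JOIN dim_goods FOR SYSTEM_TIME AS OF o.proc_time AS g
  ON o.goods_id = g.goods_id;
```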
## IV. Application cases

### 1. Real-time data warehouse

**1) Real-time data ingestion**

![image](https://static001.geekbang.org/infoq/25/256dec4f2d97a9842a4d9741eff70d0a.png)

Real-time ingestion into the warehouse follows three main paths:

- Traffic data from the first-level Kafka is cleansed in real time and written to the second-level cleansed Kafka, mostly in protobuf format, and then written into Hive 5-minute tables through Flink SQL. This supports the subsequent near-real-time ETL and shortens the preparation time of the ODS-layer data sources.
- Data from MySQL business databases is parsed by VDP into a binlog CDC message stream and written into Hive 5-minute tables through Flink SQL; partitions are committed with a custom policy, the partition status is reported to a service interface, and the offline scheduler then takes over.
- Business systems produce Kafka message streams through the VMS API; these are parsed by Flink SQL and written into Hive 5-minute tables. String, json, csv, and other message formats are supported.

Using Flink SQL for streaming ingestion is very convenient, and since 1.12 automatic merging of small files is supported, which solves the small-file problem, a very common pain point in the big-data layer.

We customized the partition-commit policy: when the current partition is ready, it calls the real-time platform's partition-commit API, and the offline scheduler periodically checks through this API whether the partition is ready.
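To make this concrete, below is a minimal sketch of the Kafka-to-Hive 5-minute ingestion using the partition-commit options available since Flink 1.11. The table, topic, and column names are assumptions, the policy class name is hypothetical, and our production jobs use the protobuf format plus the in-house commit policy rather than what is shown here.

```sql
-- Hive-dialect DDL: a 5-minute partitioned table with partition-commit options.
SET table.sql-dialect=hive;
CREATE TABLE ods.traffic_5min (
  log_id STRING,
  mid    STRING
) PARTITIONED BY (dt STRING, hm STRING) STORED AS ORC TBLPROPERTIES (
  'partition.time-extractor.timestamp-pattern' = '$dt $hm:00',
  'sink.partition-commit.trigger' = 'partition-time',
  'sink.partition-commit.delay' = '5 min',
  -- 'custom' + policy.class is where a policy like ours can call the platform's partition-ready API;
  -- the class name below is a placeholder.
  'sink.partition-commit.policy.kind' = 'metastore,success-file,custom',
  'sink.partition-commit.policy.class' = 'com.example.ReportPartitionCommitPolicy'
);

-- Back to the default dialect: stream the cleansed Kafka topic into the 5-minute partitions.
-- cleaned_traffic is an assumed Kafka source table with an event_time attribute and watermark;
-- rounding hm down to 5-minute boundaries is omitted for brevity.
SET table.sql-dialect=default;
INSERT INTO ods.traffic_5min
SELECT log_id, mid,
       DATE_FORMAT(event_time, 'yyyy-MM-dd'),
       DATE_FORMAT(event_time, 'HH:mm')
FROM cleaned_traffic;
```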
After adopting this unified Flink SQL ingestion solution, we achieved the following:

- First, we not only fixed the instability of the previous Flume-based solution, but also enabled self-service ingestion for users, which greatly reduces the maintenance cost of ingestion jobs while keeping them stable.
- Second, we improved the timeliness of the offline warehouse, moving from hourly ingestion down to 5-minute granularity.

**2) Real-time metric computation**

![image](https://static001.geekbang.org/infoq/2f/2f1f01a0f6fe760f9f99940de78e811d.png)

- Real-time applications consume the cleansed Kafka, join it through Redis dimension tables, APIs, and so on, incrementally compute UVs with Flink windows, and persist the results into HBase.
- Real-time applications consume the VDP message stream, join it through Redis dimension tables, APIs, and so on, compute sales and other metrics with Flink SQL, and incrementally upsert them into Kudu so that they can be batch-queried by range partition; a data service then serves the real-time dashboards (a minimal SQL sketch of this path follows this list).
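The sketch below shows roughly what the second path looks like in Flink SQL: an unbounded aggregation whose update stream is applied as upserts by the sink. The source and sink table names are assumptions, and the Kudu sink is the in-house connector mentioned earlier, so its options are not shown.

```sql
-- Assumed tables; kudu_sales_rt is keyed by (dt, brand_id) and backed by the in-house Kudu connector.
INSERT INTO kudu_sales_rt
SELECT
  DATE_FORMAT(order_time, 'yyyy-MM-dd') AS dt,
  brand_id,
  SUM(pay_amount)         AS gmv,
  COUNT(DISTINCT user_id) AS buyers
FROM cleaned_orders
GROUP BY DATE_FORMAT(order_time, 'yyyy-MM-dd'), brand_id;
-- The continuously updated aggregate is emitted as an upsert stream and written to Kudu by primary key.
```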
Metric computation used to be done with Storm, which meant customized development against the API. With the Flink approach we achieved the following:

- Moving the computation logic onto Flink SQL eases the problems of fast-changing metric definitions and slow release cycles for changes;
- Switching to Flink SQL allows quick modification and quick release, lowering maintenance costs.

**3) Unified real-time/offline ETL data integration**

The concrete flow is shown below:

![image](https://static001.geekbang.org/infoq/63/63bfe84d190a3b77602a07d70d234c76.png)

Flink SQL has kept strengthening its dimension-table join capability in recent versions: it can join not only dimension data in databases in real time, but also dimension data in Hive and Kafka, which flexibly covers different workloads and timeliness requirements.

Building on Flink's powerful streaming ETL capability, we can do data ingestion and data transformation uniformly in the real-time layer and then flow the detail-layer data back into the offline warehouse.

By bringing the HyperLogLog (HLL) implementation used inside Presto into a Spark UDAF, we made HLL objects interoperable between Spark SQL and the Presto engine: an HLL object produced by the Spark SQL prepare function can be merge-queried not only in Spark SQL but also in Presto. The concrete flow is as follows:

![image](https://static001.geekbang.org/infoq/9a/9a7ce55625468155a5f356d0dca8141b.png)

Example of approximate UV computation:

![image](https://static001.geekbang.org/infoq/8c/8c5cd5e5527601a611946daace2ea65e.png)
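To make the merge-query half of this interoperability concrete, here is a hedged sketch using Presto's built-in HLL functions. The table and column names are assumptions, and the Spark-side UDAF described above is internal and therefore not shown.

```sql
-- Presto side. uv_sketches.hll_sketch is an assumed varbinary column holding HLL states
-- produced upstream (for example by the Spark UDAF described above).
SELECT
  dt,
  cardinality(merge(CAST(hll_sketch AS HyperLogLog))) AS approx_uv
FROM uv_sketches
GROUP BY dt;
```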
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/8c\/8c5cd5e5527601a611946daace2ea65e.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"所以基於實時離線一體化ETL數據集成的架構,我們可獲得以下成果:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"2、實驗平臺(Flink實時數據入OLAP)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"唯品會實驗平臺是通過配置多維度分析和下鑽分析,提供海量數據的A\/B-test實驗效果分析的一體化平臺。一個實驗是由一股流量(比如用戶請求)和在這股流量上進行的相對對比實驗的修改組成。實驗平臺對於海量數據查詢有着低延遲、低響應、超大規模數據(百億級)的需求。整體數據架構如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/b0\/b0493f57009b8cde76deb5ad97a5aaf4.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"離線數據是通過waterdrop導入到ClickHouse裏面去;"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"實時數據通過Flink SQL將Kafka裏的數據清洗解析展開等操作之後,通過Redis維表關聯商品屬性,通過分佈式表寫入到ClickHouse,然後通過數據服務adhoc查詢,通過數據服務提供對外的接口。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"業務數據流如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/34\/34087e831355ed28a79bffdffc02a5c0.png","alt":"圖片","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"業務數據流可以給大家簡單介紹一下,我們的實驗平臺有一個很重要的ES場景,我們上線一個應用場景,我想看效果如何,上線產生的曝光、點擊、加購、收藏是怎樣的。我們需要把每一個數據的明細,比如說分流的一些數據,需要根據場景分區,寫到ck裏面去。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"我們通過Flink SQL Redis connector,支持Redis的sink 、source維表關聯等操作,可以很方便地讀寫Redis,實現維表關聯,維表關聯內可配置cache ,極大提高應用的TPS。通過Flink SQL 實現實時數據流的pipeline,最終將大寬表sink到CK 裏,並按照某個字段粒度做murmurHash3_64 存儲,保證相同用戶的數據都存在同一shard 節點組內,從而使得ck大表之間的join 變成 
## V. Future plans

### 1. Improve Flink SQL usability

First, we will improve Flink's usability, mainly because Flink SQL still feels somewhat different to Hive users; both Hive and Spark SQL are batch-processing scenarios.

Debugging Flink SQL is still inconvenient in many ways, and for offline Hive users there is still a barrier, for example manually configuring Kafka monitoring or load-testing and tuning a job. How to reduce the barrier to a minimum, so that users only need to know SQL or know their business, with Flink SQL concepts hidden from them and the workflow simplified, is a fairly big challenge.

In the future we are considering intelligent monitoring that tells users what is wrong with their current jobs, so that they do not need to learn too much; we want to automate as much as possible and give users optimization suggestions.

### 2. Landing a data lake CDC analysis solution

On one hand, our data lake work mainly targets the scenario of real-time binlog updates. Today our VDP binlog message stream is written into the Hive ODS layer through Flink SQL to shorten the preparation time of the ODS-layer data sources, but it produces a large volume of duplicate messages that have to be deduplicated and merged. We will look at a Flink + data lake CDC ingestion solution for incremental ingestion.

On the other hand, we hope to use a data lake to replace Kudu. Part of our important business runs on Kudu; although Kudu is not heavily used, its operation is far more complex than that of ordinary databases and it is fairly niche, and scenarios such as the Kafka message streams produced after order widening, as well as aggregated results, need very strong real-time upsert capability. So we started researching a CDC + data lake solution, using its incremental upsert capability to replace Kudu's incremental upsert scenarios.

## Q&A

**Q1: Does the VDP connector read the MySQL binlog? Is it the same kind of tool as Canal?**

**A1:** VDP is the company's binlog synchronization component; it parses the binlog and sends it to Kafka. It is a secondary development based on Canal. We defined a CDC format that connects to the company's VDP Kafka data sources, somewhat similar to the Canal CDC format. It is not open source at the moment; it is the binlog synchronization solution our company uses.

**Q2: UV data goes to HBase and sales data to Kudu. Why write to different stores?**

**A2:** Kudu's application scenarios are not as broad as HBase's. Real-time UV writes have a relatively high TPS, and HBase suits single-row queries: writes get high throughput with low latency, and small-range queries are fast. Kudu has some OLAP characteristics: it can store order-style detail data, its columnar storage speeds up scans, and it works with Spark, Presto, and so on for OLAP analysis.

**Q3: How do you handle ClickHouse data updates, for example updating metrics?**

**A3:** ClickHouse updates are asynchronous merges, and they can only merge asynchronously within the same partition on the same node of the same shard, so the consistency is weak. ClickHouse is not really recommended for metric-update scenarios. If you have a strong need for updates in ClickHouse, you can try the AggregatingMergeTree approach: replace update with insert and do field-level merges.
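As a minimal sketch of the AggregatingMergeTree pattern mentioned in A3 (all table and column names are assumptions):

```sql
-- "Update" becomes insert-only: store aggregate states instead of mutable rows.
CREATE TABLE metrics_agg_local
(
    dt      Date,
    item_id UInt64,
    uv      AggregateFunction(uniq, UInt64),
    gmv     SimpleAggregateFunction(sum, Float64)
)
ENGINE = AggregatingMergeTree()
PARTITION BY dt
ORDER BY (dt, item_id);

-- Write partial aggregate states on every batch of new detail rows...
INSERT INTO metrics_agg_local
SELECT dt, item_id, uniqState(user_id), sum(amount)
FROM order_detail
GROUP BY dt, item_id;

-- ...and merge them at query time (background merges also collapse the states over time).
SELECT dt, item_id, uniqMerge(uv) AS uv, sum(gmv) AS gmv
FROM metrics_agg_local
GROUP BY dt, item_id;
```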
**Q4: How do you guarantee deduplication and consistency for binlog writes?**

**A4:** We do not write binlog into ClickHouse yet; that approach does not look mature. We do not recommend it; a CDC + data lake solution can be used instead.

**Q5: If writes are unbalanced across ClickHouse nodes, how do you monitor and fix it? How do you inspect data skew?**

**A5:** You can use ClickHouse's system.parts local table to monitor, per machine, the number of rows and the size written for every table and partition. That shows the data distribution and lets you locate a specific table, machine, and partition.
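A5 refers to a query along these lines; it is a sketch to run on each node (and compare across nodes) rather than a production monitoring job.

```sql
-- Per-table, per-partition write volume on one node; run on each node to compare shards.
SELECT
    database,
    table,
    partition,
    sum(rows)                              AS total_rows,
    formatReadableSize(sum(bytes_on_disk)) AS size_on_disk
FROM system.parts
WHERE active
GROUP BY database, table, partition
ORDER BY total_rows DESC
LIMIT 20;
```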
**Q6: How does your real-time platform do job monitoring and health checks, and how does it recover automatically after failures? Are you using yarn-application mode now? Do you have one YARN application corresponding to multiple Flink jobs?**

**A6:** For Flink 1.12+, the PrometheusReporter can expose Flink metrics such as operator watermarks and checkpoint-related indicators like size, duration, and failure count; we collect and store these key metrics for job monitoring and alerting.

Flink's native restart strategies and failover mechanism are the first layer of protection.

The client periodically monitors Flink's status, keeps the latest checkpoint address in its cache, reports it to the platform, and persists it to MySQL. When Flink can no longer restart the job, the client resubmits it from the latest successful checkpoint; this is the second layer of protection. Once checkpoints are persisted to MySQL at this layer, Flink's HA mechanism is no longer used, which removes the ZooKeeper dependency.

When the first two layers cannot restart the job, or the cluster itself fails, the platform automatically brings up a new cluster from the latest checkpoint persisted in MySQL and resubmits the job; this is the third layer of protection.

We support yarn-per-job mode, but we mainly deploy standalone clusters based on Flink on Kubernetes.

**Q7: Are all the components on your big data platform containerized, or is it a mix?**

**A7:** At the moment, the real-time computing frameworks (Flink, Spark, Storm, Presto, and so on) are containerized; see the platform architecture in section I.2 above.

**Q8: Kudu doesn't run on Kubernetes, right?**

**A8:** Correct, Kudu does not run on Kubernetes; there is no particularly mature solution for that yet. Besides, Kudu is operated with Cloudera Manager, so there is no need to move it onto Kubernetes.

**Q9: Is it reasonable to store the real-time warehouse's dimension tables in ClickHouse and then query ClickHouse for them?**

**A9:** Yes, that is worth trying. Both fact tables and dimension tables can be stored there, and you can hash by a chosen field (such as user_id) to get the local-join effect.

**About the speaker:**

**Wang Kang**, senior development engineer, Data Platform, Vipshop. Years of experience in big-data real-time computing; responsible for the design and development of the Flink SQL platform, committed to providing the company with a large-scale, efficient, and stable real-time SQL development platform. Previously at JD Logistics, responsible for platformizing single tables and wide tables on the real-time data platform.

This article is reposted from: dbaplus社羣 (ID: dbaplus社羣)

Original link: [618臨陣磨刀:Flink容器化與平臺化建設少走彎路](https://mp.weixin.qq.com/s/Cw7FVvuDYckMEA6tzeXk_w)