A Summary of Impala-Based Query Optimization for NetEase Youshu BI

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The article “"},{"type":"link","attrs":{"href":"https:\/\/www.infoq.cn\/article\/L9PWP96XuBYDdYVvDKHz","title":"xxx","type":null},"content":[{"type":"text","text":"The Road to Building the NetEase Cloud Music Data Warehouse in 2020"}]},{"type":"text","text":"” noted how important Impala performance optimization was to building the music data warehouse. This article summarizes the latest query optimization experience with Impala in Youshu (有數) BI, a product of NetEase Shufan, and discusses directions for further optimization. It first outlines the challenges that Youshu BI + Impala ran into in businesses such as NetEase Cloud Music, then introduces an important tool for Youshu query optimization, the NetEase Impala management server, and finally discusses concrete optimization methods and next steps in the context of real business problems."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"I. Youshu BI + Impala Meets Slow Queries"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Within NetEase Shufan's full-pipeline data productivity platform, the Youshu business intelligence (BI) product provides services such as data dashboards, Youshu Reports (EasyBI), and self-service data fetching (EasyFetch)."}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/53\/53177c8cbe45b5d445c929be51060fc6.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Self-service data fetching is aimed at data analysts, while Youshu Reports is aimed at BI engineers. Both let users obtain the data or reports they need simply by dragging and dropping controls in the UI, which reduces the workload of data developers and related engineers and greatly improves the efficiency of analysts' data fetching and of BI report production."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/4d\/4def37f3f3e7293965022a2def9607fc.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":
"Youshu products are now used at scale both by internal NetEase businesses, including NetEase Cloud Music, Yanxuan, and NetEase Media, and by external customers such as Deppon Express, MINISO, Wens Group, and Guming."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As the martial-arts saying goes, speed beats every technique. For self-service data fetching and Youshu Reports, being fast is a crucial part of the user experience. Youshu currently uses Impala as its main query engine; compared with other open-source OLAP query engines such as Presto, Impala has a clear performance advantage. Unlike the community edition of Apache Impala, Youshu uses an enhanced build of Impala maintained by NetEase Big Data."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Large-scale use of Youshu in Cloud Music and other business scenarios has also exposed quite a few problems with Youshu + Impala, including a high number of query errors and some slow queries. To address them, the Impala kernel team worked with the business teams and the big data product team on extensive optimization, which raised the query success rate and reduced the number of slow queries."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We will analyze how to optimize with concrete cases later. Before that, here are the two tools used for the optimization work:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On the Impala side, our main tool for analyzing problems and finding optimizations is the "},{"type":"text","marks":[{"type":"strong"}],"text":"Impala management server"},{"type":"text","text":", which is covered in the next section;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The other tool is Youshu Reports. Yes, we use the Youshu BI product to optimize Youshu queries, turning the analysis results obtained from the Impala management server into intuitive chart reports. Along the way we came to appreciate how powerful the Youshu product is."}]}]}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"II. The Impala Management Server"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Impala is an MPP query engine in the Hadoop ecosystem, known for its performance. Its core components are Catalog, Statestore, and Impalad; an Impalad can act as a coordinator or an executor depending on whether it accepts client query requests. The Impala system architecture is shown below:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/b6\/b6a58744bcf051b5b335cd3bd3b57444.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Operational Shortcomings of Community Impala"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Community Impala still has a number of weaknesses in operations and maintenance, mainly the following:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Community Impala gives each coordinator a web UI that shows the queries that coordinator is currently running or has recently completed. However, Impala has no cluster-level query view, that is, the query information of all coordinator nodes is not aggregated into a single web UI. To observe the query state of the cluster, you have to open the web UI of every coordinator and switch between them frequently;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The query information shown in a coordinator's web UI is not persisted, so it is lost as soon as the process restarts, and restarts are unavoidable because of release upgrades, system bugs, and similar factors;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A further consequence of non-persistent query information is that, even if the process never restarts, the number of queries it can cache is limited and is configured with the --query_log_size parameter;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The information cached by a coordinator lives in the process address space and is not exposed, so external tools cannot retrieve it for analysis."}]}]}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Management Server Features"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To address these problems, the NetEase Big Data team developed a centralized management module on top of community Apache Impala: the Impala management server (managerd). Its main features include:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Aggregating the running and completed queries of every coordinator node in the Impala cluster into a unified web query view, so that finding a running or completed query no longer requires opening each coordinator's web UI;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Persisting historical query information to MySQL and object storage, which effectively prevents queries from being lost because of process restarts or an excessive number of queries."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The management server stores information about all historical queries executed on the cluster in a basic query information table, basic_info, and a query detail table, detail_info, as shown below:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/37\/37d2323312426fc7e9744eeccbfe404f.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bor
dertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"detail_info has fewer columns than basic_info, but several of them are mediumblob columns that hold much richer query information."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/0e\/0ea94fff3f2f13fcb8bbe4acc1525a35.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The detail_info table has a profile column that stores a key in NOS (NetEase Object Storage); the NOS object for that key holds the complete query profile file."}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"The Profile File"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The profile file is the key source of information for analyzing a query's entire execution in depth. It includes the query timeline, the counters of each execution fragment, whether the tables involved have statistics, and more."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/a7\/a7667331118c3591253e66870936c760.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"III. Pain Points and Optimizations"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As mentioned earlier, the business pain points fall into two categories: slow queries and query errors. The examples below, taken from production, illustrate the specific problems."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Slow Query Root Causes and Optimization"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"a
lign":null,"origin":null},"content":[{"type":"text","text":"Slow queries have many causes. The following sections cover them from the Impala, Youshu BI product, and HDFS perspectives."}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"1. Impala-Related Causes"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Missing Statistics"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Like mainstream databases and data warehouse query engines, Impala uses cost-based optimization (CBO) to choose execution plans. It can only pick a good plan when it has sufficient statistics."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"However, as a query engine Impala is usually not responsible for creating schemas or loading data, so it cannot compute statistics at load time. Because the engine relies on CBO, query performance suffers if users do not manually run compute [incremental] stats. The figure below shows an execution plan produced with missing statistics: a 531.35 GB table partition is used as the right-hand table and broadcast to the other nodes in the cluster for the join."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/2b\/2b1e6b241cacf08b37a1ca57dc923676.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/43\/43accaee87f377cc817b65cb065fba3a.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Optimization and Improvement"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}
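,{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As a rough sketch of what computing statistics involves, the statements below use standard Impala SQL. The table names music_dw.dim_user and music_dw.user_play_fact and the partition column dt are hypothetical placeholders, not objects from the clusters described in this article."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"sql"},"content":[{"type":"text","text":"-- Hypothetical table and partition names, for illustration only.\n-- Smaller or unpartitioned table: compute table and column statistics in one pass.\nCOMPUTE STATS music_dw.dim_user;\n\n-- Large partitioned table: compute statistics one partition at a time.\nCOMPUTE INCREMENTAL STATS music_dw.user_play_fact PARTITION (dt='2021-01-01');\n\n-- Inspect the result; #Rows shows -1 where statistics are still missing.\nSHOW TABLE STATS music_dw.user_play_fact;\nSHOW COLUMN STATS music_dw.user_play_fact;\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}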
,{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After statistics were computed for the tables involved in the query and the query was run again, the join strategy changed to partitioned mode."}]},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/9a\/9a0fb3ff98c0f77bde097b6f1b8cfde8.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/c9\/c92a01430c0a57985940e8f761793964.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The effect is obvious: performance improved (from timing out at 10 minutes to finishing in 46 s), and resource consumption also dropped sharply (see the mem-estimate value of 04:hash join)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Slow queries caused by missing statistics are common. As a stopgap, the production clusters manually configure the list of tables that need statistics and run a compute stats script against them."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We have now built, on top of the Impala management server, an automatic statistics computation feature driven by historical query data. Based on configured parameters, it automatically selects the tables to process and records them in the compute_stats_info table."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/31\/31b0de57f1a529342c7fa716e953c5e9.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A background thread in the management server reads these records and computes the statistics. The feature is expected to go live in Q1."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Metadata Cache Misses"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Besides using CBO to choose execution plans, Impala improves query performance by caching table metadata locally. For example, Hive table metadata is loaded from the Hive Metastore (hms) into Catalogd and the coordinators, so that planning a query does not have to spend time on RPC calls to hms to fetch the table metadata it needs."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"However, Hive tables keep receiving metadata updates, such as partitions being added, dropped, or renamed, or table properties being modified, and all of these leave the metadata cached in Impala stale. The NetEase build of Impala adds a metadata synchronization feature: when metadata changes on the hms side, it refreshes (refresh table) or invalidates (invalidate metadata table) the cached metadata."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Root Cause and Optimization"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Metadata synchronization fixes the query errors caused by stale metadata, but it invalidates cached metadata and therefore hurts performance. In addition, Hive tables support custom properties, so users can attach table state information for special purposes. For example, the metadata center of the NetEase big data platform adds statistics such as access counts to tables, as in the following example."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"'metahub.table.accessCount'='xxx', 'metahub.table.readCount'='xxx', 'metahub.table.readTimes'='xxx', 'metahub.table.referCount'='xxx', \n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The metadata center calls the hms alter table interface to update this information quite frequently (sampling showed as many as 191 updates within 15 minutes). Yet this information is of no use to Impala and has no effect on its execution plans. As described above, each alter table operation invalidates the cache on the Impala side, so the metadata has to be reloaded at query time."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/da\/daa46aaa56dae7495495ce53ae2936fa.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The figure above shows a query with a total execution time of 21.5 seconds, of which 11 seconds were spent loading table metadata from hms."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With support from the big data development team, we can now identify the alter table operations that do not affect Impala and filter them out, which raises the cache hit rate for queries."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Many necessary alter table operations, such as the daily output of offline data, still invalidate metadata. As a next step, we plan to optimize how metadata is updated: promptly collect the metadata cache entries that have been invalidated for any reason and reload them into the cache with a background thread."}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"2. Youshu BI Query-Related Causes"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Users fetch data or build charts by dragging and dropping controls in the Youshu UI, and Youshu has to translate those actions into SQL supported by Impala or other query engines. Whether the generated SQL is reasonable has a major impact on query performance. The following are some SQL query optimization cases."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Example Problems"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Performance problems caused by time and date conversion"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Within the company, fact table data is generally partitioned by time, for example one partition per day, and the partition column is a string. Analysis reports often need to convert the time column to a timestamp type, or further truncate it to minute, hour, day, week, or month granularity, as shown below:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/b8\/b88ea05dd8f8e2a77ead9bae40700360.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Operations like these apply several time conversions to every single record, which inevitably hurts query performance. Below is a performance comparison of a query with and without time conversion."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/b4\/b4318cc5db1bcb6ec1225bc966c335ac.png","alt":null,"title":null,"style":[{"key":"width","value":"75
%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Large queries slow down HDFS scan performance"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"HDFS scan performance often becomes the query bottleneck. One reason is that the storage is shared with other workloads such as offline analysis; another is that Impala issues a large number of queries that each scan too much data."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Some business tables hold more than 500 GB per daily partition, and some Youshu queries specify an overly wide partition range, such as a full quarter, or no partition filter at all. A single query of this kind has to scan 50+ TB of data. The figure below shows one such case."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/4a\/4af886356b4f0044b6fcd512c842b4ac.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Worse still, in a BI reporting product like Youshu, one report may contain many similar charts, so similar queries tend to arrive in batches, which magnifies the impact."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/4a\/4acd2109bca40a043ab24fc03a36a6d2.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragr
aph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Product-Side Optimization"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Many SQL query problems, including the examples above, have been optimized step by step across Youshu releases. The figures below show two optimizations made in Youshu 7.3."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/be\/be41c8c694982867c2cdf698a9ac9117.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/a2\/a2123fe74be91ccdb8b3aee7abd40e49.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"3. HDFS Storage-Related Causes"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"These problems fall into two types: accessing the HDFS NameNode (NN) to obtain file metadata, and reading file data from DataNodes (DN). (Note that an HDFS bottleneck is relative and depends on the scenario: performance that is a bottleneck for Impala queries may not be one for offline batch jobs.) This article focuses on DN-related problems and their optimization."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Small files: "},{"type":"text","text":"When individual files are too small and too numerous, large data blocks cannot be read with sequential I\/O, and the open-file-then-read sequence has to be repeated over and over, which is inefficient. Some production tables have a fairly serious small-file problem. In the example below the files are only 10+ MB each, and a few production tables even have files at the KB or byte level."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/1b\/1bf50d4c47c83083b7a24b1a2bcbde8f.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Oversized partitions: "},{"type":"text","text":"In general, the smallest unit of a data scan is a partition, so the larger the partition, the more data has to be scanned. The table shown below has one partition per day, 1k+ files and 400+ GB per partition, and 200+ partitions in total. In other words, analyzing one week of data means scanning nearly 3 TB, and analyzing one month means scanning about 15 TB."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/ab\/abba6b4d22a75ebaf4f2467b7d89af93.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Table storage format: "},{"type":"text","text":"SQL-on-Hadoop query engines perform best when querying tables stored as Parquet or ORC. For Impala, for example, the Runtime Filters (RF) feature can be exploited fully with Parquet or ORC, whereas for the TEXT file format RF only works on partitioned tables. The figure below shows a 100+ GB non-partitioned table in TEXT format; a sizable share of the cluster's daily slow queries involve this table."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/81\/81e6f965d22abd29bc635c4ec55ad106.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Data Warehouse Governance"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"DN-related performance problems are a matter of data warehouse governance, and for now they mainly depend on each business's data warehouse team optimizing for its actual scenarios. For TEXT tables, we recommend that the business convert them to Parquet wherever possible."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For the small-file problem: first, configure the concurrency of data production and import sensibly to minimize the chance of producing small files; second, merge existing small files within each partition where appropriate; third, if the total data volume of each partition is too small, consider not partitioning the table at all."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For oversized partitions, consider cleaning the data to improve quality and remove data of no value. As we understand it, Cloud Music has 800+ million users. Fact tables are normally generated by computing behavior or recommendation data for every user, yet a certain share of these users are inactive, some not having logged in for years. For some user-related fact tables, dropping the data of inactive users can reduce the amount of data in each partition."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In addition, the big data platform will provide one-click data warehouse optimization features for businesses, such as small-file merging and file format conversion."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"Query Error Root Causes and Reduction"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text
":"When businesses use Youshu BI, query errors on the Impala side can prevent fetch results or BI reports from being generated, which seriously harms the Youshu product experience. Business teams reported that some queries fail every day, but they did not know why: the error logs shown in the front end are hard to read, so they also did not know how to improve or optimize."}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"1. Error Classification"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We use the state column of the basic_info table to find failed (exception) queries and, combined with the status in the detail_info table, have gradually sorted the errors by cause. The most frequent errors are listed below."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Errors in the SQL itself"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"These are mainly SQL syntax errors, parameter limits, and misuse of UDFs, for example:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"org.apache.impala.common.AnalysisException: aggregate function not allowed in WHERE clause\n...\norg.apache.impala.common.AnalysisException: Exceeded the maximum number of child expressions (10000).\n...\norg.apache.impala.common.AnalysisException: No matching function with signature: default.dcount(BIGINT)\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Metadata errors"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"These mainly include files that cannot be opened, incompatible column types, incompatible Parquet schemas, and columns that cannot be found, for example:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"Disk I\/O error on xxx.jd.163.org:22000: Failed to open HDFS file hdfs:\/\/hz-cluster11\/user\/da_music\/hive\/warehouse\/xxx\nError(2): No such file or directory\n...\nError: File 'hdfs:\/\/hz-cluster11\/user\/da_music\/hive\/warehouse\/xxx' has an incompatible Parquet schema for column 'xxx'. Column type: STRING, Parquet schema:\n...\norg.apache.impala.common.AnalysisException: Could not resolve column\/field reference: 't2.current_card'\n...\norg.apache.impala.common.AnalysisException: Failed to load metadata for table: xxx\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"System load errors"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"These mainly include queue full, queue timeout, per-query memory limit exceeded, and process memory limit exceeded."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"Rejected query from pool root.default: queue full, limit=160, num_queued=165\n...\nAdmission for query exceeded timeout 300000ms in pool root.default. Queued reason: queue is not empty (size 51); queued queries are executed first\n...\nMemory limit exceeded: Failed to allocate row batch\nEXCHANGE_NODE (id=31) could not allocate 16.00 KB without exceeding limit.\n...\nFailed to increase reservation by 68.00 MB because it would exceed the applicable reservation limit for the \"Process\"\n"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Cancelled queries"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The error messages are simple and come in two forms, Cancelled and Session closed. Both result from the product side actively killing the corresponding Impala query. There are many possible reasons; for now we mainly focus on queries whose execution time exceeded the threshold. For example, the Youshu deployment used by Cloud Music sets the threshold to 10 minutes, and we analyze these timed-out queries as slow queries."}]},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"2. Reducing Errors"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For errors in the SQL itself, as with the SQL query performance optimization described above, we mainly work with the Youshu team to sort out the causes and rewrite Youshu's SQL generation rules."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Metadata errors"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Metadata errors are mostly caused by Impala metadata synchronization problems. Take the Youshu EasyFetch cluster of Cloud Music as an example. Before optimization, this cluster had many query errors caused by metadata synchronization. Colleagues had previously traced the issue to Impala not synchronizing the metadata of tables for which the “Impala sync” option was enabled, but had not investigated further why synchronization failed."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/50\/50a22e4a861bbc79852c27c346eed126.png","alt":null,"title":null,"style":[{"key":"width","value":"75%"}
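,{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"During this round of optimization, we went through the function and implementation logic of the “Impala sync” option from scratch and determined that the cause was a code bug in a platform component. After the fix, errors of this kind dropped significantly."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"System load errors"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This is a multifaceted problem. The typical scenario is that a few slow queries hold system resources for a long time. Queue full and queue timeout errors can be eased by raising the query concurrency limit or the queue timeout, but higher concurrency may overload the cluster and degrade query performance further, which in turn lengthens the wait of queued queries. Another workable approach is to tell the user directly that “the system is currently under heavy load, please try again later”, so that users do not make things worse by repeatedly refreshing the page within a short time."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Errors such as per-query or process memory limit exceeded are generally caused either by large, complex queries or by inaccurate estimates of the resources a query needs. The former calls for query optimization, such as narrowing the range of data scanned. The latter can be addressed by filling in the table statistics to make the estimates more accurate."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"System load errors are generally resolved by improving query performance. Of course, if a cluster sees a large concentration of system load errors for several hours every day, the likely cause is insufficient cluster resources, and the cluster should be scaled out promptly."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"IV. Further Optimization Plans"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Through the joint efforts of the big data team and the business teams, the Impala clusters of NetEase Cloud Music, Yanxuan, and other businesses have made progress in query performance and error reduction. The work has been recognized by the Cloud Music data warehouse team and met the performance targets set for Yanxuan's “Double 11” shopping festival."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"ty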
pe":"text","text":"Impala performance optimization is still ongoing. Here is a brief list of the work in progress."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Upgrading the Impala clusters of all internal businesses from Impala 2.12 to Impala 3.4, which offers more features and better performance. The Cloud Music Impala cluster has already been upgraded;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Introducing Alluxio as a cache layer between Impala and HDFS;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"Automatic computation of table statistics based on historical query information;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"SQL rewriting based on materialized views (temporary tables), which optimizes query performance by creating pre-aggregated tables;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","text":"Carrying out a Youshu query diagnosis project together with the Youshu product team."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"About the author:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Wen Zhenghu (溫正湖) is a database development expert at NetEase Shufan with 10+ years of experience in database and storage development. He joined the NetEase Hangzhou Research Institute in 2013 and has worked on database development ever since. He now leads the OLTP & OLAP database kernel team in the Big Data Product Department, where he is mainly responsible for the design, development, rollout, and troubleshooting of kernel features for MySQL, Impala, and other systems. He focuses on database kernel technology and distributed system architecture and enjoys taking on and solving hard problems. He has filed and been granted 10+ technical invention patents and is a co-author of 《MySQL 內核:InnoDB 存儲引擎 卷1》."}]}]}