30% Efficiency Gain, 50% Drop in Offline Tracking Bug Rate: How NetEase Cloud Music Built Its Data Warehouse

{"type":"doc","content":[
{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The data warehouse is one of the core components of today's data middle platform and the engine behind data-driven operations at NetEase Cloud Music. This article summarizes the core work of the NetEase Cloud Music data warehouse team in 2020, the progress it made, and the practical lessons it learned, in the hope that readers find them useful."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With 2020 behind us, the NetEase Cloud Music (hereafter Cloud Music) data warehouse team can look back on solid results and real growth. Over the year, we focused on two things:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Faster data delivery"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Better data quality"}]}]}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Faster Delivery"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"I joined Cloud Music in 2019. My first impression of the data warehouse team was that it was busy and young: nearly everyone was born in the 1990s, everyone worked overtime every day, and some stayed past 10 p.m. The days were spent clearing tickets for data reports, event tracking, ad hoc data pulls, and the like. The work was fragmented, and delivery efficiency was unsatisfactory. Everyone recognized that while individual tasks can be fragmented, a data warehouse has to pull the pieces into a whole. So management launched what was then called the “任督計劃” (Ren-Du Plan) to build a new data warehouse system for Cloud Music."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Building a data warehouse system is a large undertaking that spans data architecture, data models, data quality, metadata management, and data services. Most teams agree that the first thing to get right is warehouse modeling, that is, the abstraction of a common data layer. Cloud Music was no exception: the first priority of our data system effort was data modeling for the warehouse's common layer."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Common-Layer Modeling Standard for the Cloud Music Data Warehouse"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"You may wonder why common-layer data modeling was the top priority. Here is a real example. In November 2019, less than two months after I joined, I received a requirement: business planners wanted to run a series of operational strategies that would turn hosts on the look live-streaming service into high-quality creators on the main Cloud Music app as well (for example mlog creators, radio creators, DJs, and musicians). At the data level, this meant that look hosts and the various creator roles on the main app had to be linked. Unfortunately, the common layer was underdeveloped at the time: many person-to-person relationships and the entity tag system for people did not exist, so everything had to be built from scratch. The requirement ended up pulling in every data warehouse developer at Cloud Music and took more than a week to finish. My conclusion was that we needed to work out a common-layer modeling standard and methodology for the Cloud Music data warehouse quickly and put it into practice."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The details of that work follow the standard warehouse modeling process, so I will not go through them here. In the end, we wrote the “Cloud Music Data Warehouse Modeling Standard” (《雲音樂數據倉庫建模規範》) and, on top of it, built the “EasyDesign” warehouse model design system together with the NetEase Shufan (網易數帆, hereafter Shufan) team, which develops the group's general-purpose infrastructure software. The data subject areas of Cloud Music and the relationships between them were also settled, as shown below:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/d2\/d2f3e7368fa66f76f9b87c0684b116ba.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Screenshot of the EasyDesign warehouse model design system:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/0d\/0de4c8da387b0218a905102a105663bc.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With this common-layer modeling standard in place, the accumulation of Cloud Music data assets gained a clear direction. Within half a year, a large number of high-frequency data assets about entities such as people, items, and scenarios had been designed for general use and put into production. The development path for ticket requests became shorter, and overall delivery efficiency began to improve."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"However, although the common layer reduced repetitive development work, it is still a middle layer. A bottleneck remained in the “last mile” of delivering on frontline business requests: delivery through manual work, Excel, and reports was always reactive, had to be scheduled, and was not very efficient. How could we deliver more efficiently, or even let frontline business users get the data they want by themselves? EasyFetch was created to solve this problem."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Self-Service Data Retrieval: The “Last Mile” of Data Delivery"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The birth of EasyFetch was not straightforward. Once I recognized the last-mile problem, I contacted Guo Yi (郭憶), a big data expert at Shufan, to see whether an existing tool could solve it. As I recall, after I explained the requirement, his first reaction was to adopt something like the data extraction product used at Kaola (考拉海購). After several rounds of discussion, we decided instead to create a new layered OLAP tool that sits between Youshu (有數) reports and the data models, with Impala as the compute engine. It would be developed independently at first so that we could iterate and experiment quickly, and merged into Youshu once it showed results. That is how the self-service data retrieval product EasyFetch came about."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Screenshot of an early version of EasyFetch:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/4b\/4bd53dada4e2164b9423d75bb5a2d99b.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After EasyFetch went live, the team quickly received feedback from the Cloud Music business planning and operations teams. It fell into two groups:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Some colleagues were still used to queuing for manual data pulls and did not understand the new self-service way of working"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"It works well and data pulls no longer have to be scheduled; users wanted better performance plus features such as reports, push delivery, and sharing"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The first point reflects the time people need to adapt to a new work habit. We took the following steps:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Added a concise getting-started video to help operations and planning staff get up to speed quickly;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Made retrieving dimensions and metrics a drag-and-drop operation wherever possible;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Defined every metric and its calculation rules, and documented how to use it;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Added a data home page that describes the use cases for each data module;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Ran more than 30 online and offline training sessions, set up a self-service data retrieval support group on POPO for the operations and planning staff of each business line, and even held one-on-one sessions."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"These efforts paid off. After September 2020, traffic in the POPO support groups dropped sharply while data retrieval throughput rose steadily, which suggests that self-service retrieval had become a common habit."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As for the second point, features had to be iterated on top of Youshu. So in May of last year, the Cloud Music data warehouse team and Shufan's Youshu BI team completed the migration to the Youshu edition of EasyFetch to keep up with users' growing feature requests. In parallel, we worked with Shufan's big data platform R&D team on extensive Impala performance tuning for slow-query scenarios. This also gave Cloud Music data developers and the Shufan big data platform R&D team the opportunity to set up a solid mechanism for sharing data development knowledge."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here is a brief list of the features EasyFetch implemented in response to requests from planning and operations staff at the time:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Audience selection, with the selected audience exported as an audience package for campaign delivery"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Charts generated from self-service query results"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"User-created data tasks that run on a schedule and sync results to other systems or Excel"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Sharing of a user's own queries, reports, and even dimensions and metrics with other people"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Screenshots of the Youshu edition of EasyFetch:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/90\/902363bb72d19220354e85581050fb91.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/68\/687f1a8ffd04492f8c4afa3e8ba73ad3.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the end, through the sustained effort of both Cloud Music and Shufan, EasyFetch took root and grew at Cloud Music. It became an important tool for ad hoc data pulls and multidimensional analysis by business planning and operations staff, and it freed up a large share of Cloud Music's data development capacity. Overall, "},{"type":"text","marks":[{"type":"strong"}],"text":"the time frontline planning and operations staff need to get data dropped from days to seconds or minutes, greatly shortening their data retrieval cycle, and the estimated saving in data development effort is 4-5 people (a 30% efficiency gain). "},{"type":"text","text":"This freed-up capacity gave the Cloud Music data warehouse team room to breathe and to put more people on foundational warehouse work, which set a virtuous cycle in motion."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Usage figures since June 2020:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/56\/5651f1bfdcf185ef2fd69f36ab1dae60.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"A Management Board for Data Requests"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Task management is part of the job. Doing it earns little praise, but not doing it is sure to draw complaints."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Over the past year, the Cloud Music data warehouse team took on about 1,200 functional requests of all sizes. Early on, many planning and operations colleagues complained that after submitting a request they had no idea how long it would take or when it would be delivered. Even when a delivery date had been agreed verbally, it could slip for reasons such as an urgent request from management jumping the queue. Meanwhile, our data developers were busy every day and often running at full capacity. They had many requests on hand but did not know which had the highest priority, and even after delivering they were often the target of complaints about delays."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Given this, we had to set up a task management board for each business line, with two goals:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Planning and operations colleagues know exactly where and how to submit a request, how long it will take, and where it currently stands"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Data developers work to a plan with clear priorities"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We classified tasks using the four-quadrant rule of task management and defined a set of management elements for three stages: task intake, task execution, and task completion, as shown below:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/1b\/1bf2d3c9925d8369471fc2b639d730d0.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/83\/83ec5122621395178831500606b45892.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We also built a task board and shared it with frontline planning and operations colleagues."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/23\/239128973db361ef31937f832d177032.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"From then on, data development tasks were fully transparent to planning and operations colleagues. As a result, the business side's overall satisfaction rating for data development rose by 40%."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Better Quality"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Data quality is a high-priority topic that no data-driven internet company can avoid, and Cloud Music is no exception. Governing the quality of event tracking data is the core of data quality governance at Cloud Music."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For a commercial company built on producing and consuming music content, such as Cloud Music, data quality problems are concentrated in the collection of user behavior logs, that is, event tracking. As at most social internet companies, core KPIs, product feature iteration, A\/B testing, marketing campaigns, traffic distribution, and content efficiency evaluation at Cloud Music are all built on tracking data, which shows how important tracking is. Unfortunately, as of early 2020, Cloud Music's tracking still had quite a few bugs, and business lines complained from time to time that tracking data was inaccurate."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/83\/8399cdaca2233a8f38a9078079be7fb4.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Some may ask why we did not adopt a mature commercial tracking product to solve the quality problem. To answer that, consider the three stages that tracking typically goes through as a company grows:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Stage 1: record events such as exposures and clicks in the app's key flows to support basic conversion statistics, for example clicks \/ exposures;"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Stage 2: on top of stage 1, add correlation tracking between events to enable user path analysis;"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Stage 3: on top of stage 2, also capture any goods, information, or content consumed along the user path, and use ETL to build data assets such as user profile analysis, resource profile analysis, user traffic analysis, and traffic maps."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Cloud Music has clearly entered the third stage. Mature commercial tracking products do not allow the goods, information, and content along the user path to be freely defined, managed, and collected, because these are tightly bound to each company's own business. General-purpose commercial products therefore could not meet Cloud Music's tracking needs."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For these reasons, we again chose to build jointly with the Shufan team. We defined processes and standards and turned them into a system, and the tracking management platform “EasyTracker” was born."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With the EasyTracker platform, Cloud Music manages tracking before, during, and after implementation."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Before - Requirement Management Based on SPM\/SCM"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First, some concepts."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"SPM (Super Position Model): much like the utm_source, utm_medium, and similar parameters that Google Analytics appends to URLs, it is mainly used to identify position;"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"SCM (Super Content Model): tracking data delivered together with business content and used to uniquely identify a piece of business content. When the client logs an event, it uploads the SCM code as a tracking parameter."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Cloud Music SPM format consists of six segments, A.B.C.D.E.F (A and E are filled in during collection). Details are as follows:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/e5\/e5fb6bb097bb7b6686f559bd3414e40a.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the JIRA tickets Cloud Music previously used for tracking requirements, business planners and business analysts each described the data format and content they wanted collected in their own way. There was no common language for describing tracking requirements, which led to the following:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Requirements were very costly to understand and took a lot of time to discuss;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"There was no unified implementation standard, so technical know-how could not accumulate. Different client developers might use different trigger timing and enum values for the same tracking point, which ultimately caused data quality problems;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"SCM enum values were fairly arbitrary, so a lot of hard-coding was needed to handle abnormal values, making efficiency and quality hard to guarantee."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"It therefore became essential to agree in advance on a unified tracking data requirement specification (DRD)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Standardizing new tracking data requirements also centers on UI slots (坑位). Cloud Music ultimately worked with all product planners, business developers, and business QA to define a standardized data requirement document format, shown below:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/8a\/8a41885056ce763bc1b1d4990b32951e.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The tracking data requirement (DRD) as a whole spells out the three elements of a user requirement: SPM, Action, and SCM."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First, once an SCM is defined, the requirement must be registered in EasyTracker:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/85\/8507de105a2c8e54066792810c4fa87c.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Second, SCM enum values are managed in EasyTracker:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/34\/34c93d7f872f83c90fb2bc5ea7ed9f90.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With this, Cloud Music has a common language for managing tracking requirements up front."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"During - Standardized Multi-Party Collaboration and Task Flow"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Process and efficiency sometimes pull in opposite directions, but process does help standards get followed. An overly heavy process certainly hurts efficiency, so this round of process changes at Cloud Music stayed lightweight and focused on the misunderstandings of data requirements and the data quality management problems caused by non-standard requirements."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After discussion with all parties involved in tracking, we defined the following process:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/b2\/b2e79278cba4dd2193d931a050d44fe9.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In this process, data developers take on the requirement and design the tracking format, and must check two things during review:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"The tracking format is standard and consistent across all app versions and all client platforms"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"The tracking content and trigger timing match what the business requirement expects"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Only after both are confirmed can tracking development begin. This process keeps the gap between data analysis requirements and the tracking implementation under control."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The system implementation looks like this:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/e6\/e63c9742b0aa19b5fbaa4865e49615d4.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"After - A Data Audit Mechanism"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Once tracking is implemented, monitoring focuses on auditing data in two scenarios:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Gray releases (staged rollouts)"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Production releases"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The data audit report focuses on the data quality of two parts: tracking logs and slots."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Overall design of the tracking monitoring rules:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/a9\/a954c3488fce8e5f4aa594c094f40248.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Based on the rule design above, we launched the complete tracking data audit module in August 2020:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/06\/067ec3d2a089b079e54a453380f024a9.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/dc\/dc2d1b06950f68d268def79d856bed29.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"After governance across all three stages (before, during, and after), the offline bug rate for tracking had fallen by 50% by the end of 2020."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Summary"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To close, here are some things I learned this year about managing a data warehouse development team:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"numberedlist","attrs":{"start":1,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Better delivery efficiency comes neither from simply working harder nor from rules and process alone. It is more like a small ecosystem in balance between requesters and developers: when developers make their tasks orderly and transparent, requesters in turn submit more reasonable and more focused requests;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Quality control is a long campaign. Keep iterating across the before, during, and after stages with PDCA (plan, do, check, act), learn to summarize and generalize, and find the root cause; the problems will then gradually converge;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"Whether for efficiency or for quality management, accumulated experience and rules are best built into system products, and data development should have its own CI\/CD pipeline;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","text":"Most important of all, the growth of team members is critical. Our weekly sharing and study sessions refreshed everyone's data-driven thinking, gradually taught the team a new way of managing tasks, and strengthened each person's development skills."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"About the author:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Huang Haoran (黃浩然) is a data platform development expert and head of data warehousing at NetEase Cloud Music, with 16 years of experience in big data architecture design, data governance, data model design, and data warehousing."}]}
]}