DàYé on Mastering Data Strategy, Step By Step

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"I'm a Scorpio. Since childhood I have disliked joining crowds, and I will not force myself to blend in. National Day holidays, trendy restaurants... anything that makes me queue feels like a waste of life, and I will never give in to it."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This quirk, not quite a flaw, has more or less shaped my professional judgment as well. When the “middle platform” craze blew in during 2019, I basically held my nose and kept my distance. Only recently, while adjusting our organizational strategy, did data operations come back into view. After repeated failed attempts to put it into practice, I had to sit down and work out what the much-hyped data middle platform is actually about."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"I went through a lot of material and recommend two books here, “"},{"type":"text","marks":[{"type":"strong"}],"text":"Data Strategy"},{"type":"codeinline","content":[{"type":"text","text":": How to Profit from Big Data, Big Data Analytics and the Internet of Everything"}]},{"type":"text","text":"” and “"},{"type":"text","marks":[{"type":"strong"}],"text":"Data Middle Platform"},{"type":"codeinline","content":[{"type":"text","text":": Putting Data to Use"}]},{"type":"text","text":"” (parts of this article draw on them). I learned a great deal from both. The keywords in their subtitles, "},{"type":"text","marks":[{"type":"strong"}],"text":"profit and put to use"},{"type":"text","text":", are the points I want to make today."}]}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Learning from History"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Let's first look at how the "},{"type":"text","marks":[{"type":"strong"}],"text":"industrial revolution"},{"type":"text","text":" evolved: from the steam-engine era of 1.0, to the 2.0 era of electricity, assembly lines and mass production, then the 3.0 era of computer automation, and finally "},{"type":"text","marks":[{"type":"strong"}],"text":"the intelligent era of 4.0"},{"type":"text","text":"."}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/5c/5c416fcb0ce12d5b4960a69732aacc0e.jpeg","alt":"","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Now take a look at how "},{"type":"text","marks":[{"type":"strong"}],"text":"the Web"},{"type":"text","text":" has developed
, from the 1.0 era of connected computers, a "},{"type":"text","marks":[{"type":"strong"}],"text":"read-only"},{"type":"text","text":" era in which the network served static content to people in a "},{"type":"text","marks":[{"type":"strong"}],"text":"one-way"},{"type":"text","text":" fashion; to the 2.0 era of interaction, sharing and social networking, a "},{"type":"text","marks":[{"type":"strong"}],"text":"read-write"},{"type":"text","text":" era marked by "},{"type":"text","marks":[{"type":"strong"}],"text":"two-way"},{"type":"text","text":" communication, creation, distribution and collaboration between people "},{"type":"codeinline","content":[{"type":"text","text":"(e.g. Facebook/YouTube, HTML5/CSS3)"}]},{"type":"text","text":"; then to the 3.0 era of mobile, semantics and IoT, a "},{"type":"text","marks":[{"type":"strong"}],"text":"read-write-execute"},{"type":"text","text":" era in which computers can intelligently generate, understand and distribute the content users need, with a better grasp of semantics and of people "},{"type":"codeinline","content":[{"type":"text","text":"(e.g. Apple Siri, Xiaodu)"}]},{"type":"text","text":"; and the 4.0 era starting in 2020, which has no authoritative definition yet. Candidates include "},{"type":"codeinline","content":[{"type":"text","text":"cloud operating systems, decentralized blockchains and the symbiotic web"}]},{"type":"text","text":", and the Web OS in the figure below is just one possible form."}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/f9/f9689f6713d1974d8338f131207d1337.jpeg","alt":"","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Image source: https://medium.com/@vivekmadurai/web-evolution-from-1-0-to-3-0-e84f2c06739"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://medium.com/@vivekmadurai/web-evolution-from-1-0-to-3-0-e84f2c06739","title":null},"content":[{"type":"text","text":"Reference 1: Web Evolution from 1.0 to 3.0"}],"marks":[{"type":"strong"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://medium.com/@tuhfatussalisah/world-wide-web-from-web-1-0-to-web-4-0-and-society-5-0-48690a43b776","title":null},"content":[{"type":"text","text":"Reference 2: World Wide Web: From Web 1.0 to Web 4.0 and Society 
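5.0"}],"marks":[{"type":"strong"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"http://ahmadfaizar.blogspot.com/2018/08/evolution-of-web-web-10-web-20-web-30.html","title":null},"content":[{"type":"text","text":"Reference 3: Evolution of the Web: From Web 1.0 to Web 4.0"}],"marks":[{"type":"strong"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Whether it is the industrial revolution or the World Wide Web, their histories are strikingly similar. Did the steam engine merely make vehicles faster and machines stronger? More importantly, once trains entered the transport industry, the printing industry benefited enormously and knowledge spread far more widely. Electricity goes without saying: it gave rise to television and the telephone and further accelerated the spread of information. Meanwhile, the dominant role of "},{"type":"text","marks":[{"type":"strong"}],"text":"people"},{"type":"text","text":" in the industrial or information chain keeps shrinking. As connections (the internet, the Internet of Things) keep multiplying, information flows ever faster and the volume of data grows explosively, which keeps raising the bar for the efficiency and accuracy of information processing. The "},{"type":"text","marks":[{"type":"strong"}],"text":"essence"},{"type":"text","text":" of this evolution is that we are no longer satisfied with "},{"type":"text","marks":[{"type":"strong"}],"text":"sharing and spreading information"},{"type":"text","text":" and care more about "},{"type":"text","marks":[{"type":"strong"}],"text":"the rapid transfer of value"},{"type":"text","text":". What is value transfer? For example, a food-delivery platform detects bad weather and automatically extends its promised delivery times so that couriers need not rush so desperately. That is value transferred from weather-forecast information to courier road safety. The key ingredient behind all of this is information, which is the data we are discussing today."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The slogans that companies of every kind are shouting today, “digital transformation”, “data middle platform”, “data operations”, “industrial internet” and so on, are all different packagings or presentations of a data (middle platform) strategy. "},{"type":"text","marks":[{"type":"strong"}],"text":"Digital transformation"},{"type":"text","text":" should not be confused with the earlier "},{"type":"text","marks":[{"type":"strong"}],"text":"informatization"},{"type":"text","text":". Informatization can be understood simply as moving offline manual work into online systems: the information is stored, but it is scattered and fragmented (even when it is structured data). Digitalization comes after informatization: it is the value mining and business application that follow resource integration and data connection."}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"A Shared Language"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Before we start, let me align on data vocabulary with readers to avoid misunderstandings, and list what a data strategy framework contains as I understand it."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#ffffff","name":"user"}}],"text":"1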
. Terminology"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Data asset"},{"type":"text","text":": Only data that acts directly on the business domain, that business people can read and understand, and that is measurable, controllable and monetizable can be called a data asset. The raw or source-layer data sitting in a "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"data lake"},{"type":"text","text":" only counts as a "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"data resource"},{"type":"text","text":". A poorly maintained data lake can turn into a "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"data swamp"},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Data classification"},{"type":"text","text":": Definitions of data vary widely. Different scenarios use different names, and the categories often overlap heavily. I have listed the data types and names I know of; corrections for errors or omissions are welcome."}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/14/14285b793c6af9ae276f9e807e790cfd.jpeg","alt":"","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#ffffff","name":"user"}}],"text":"2. 
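Four Systems"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Rolling out a data strategy generally requires building four systems: "},{"type":"text","marks":[{"type":"strong"}],"text":"the technology system, the data system, the application system and the monitoring system"},{"type":"text","text":". The "},{"type":"text","marks":[{"type":"strong"}],"text":"technology system"},{"type":"text","text":" is simply the development of platform systems, data components and the like. The "},{"type":"text","marks":[{"type":"strong"}],"text":"data system"},{"type":"text","text":" is the core: it focuses on doing the collection, aggregation and processing of data well, the process that produces data assets. The "},{"type":"text","marks":[{"type":"strong"}],"text":"application system"},{"type":"text","text":" provides data services, offering applications such as user profiling, credit assessment, and early warning and alerting on top of data assets. To keep the first three systems running healthily and sustainably, a set of supervisory and supporting functions is needed, including standards, processes, evaluation, optimization and improvement. That is the "},{"type":"text","marks":[{"type":"strong"}],"text":"monitoring system"},{"type":"text","text":"."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#ffffff","name":"user"}}],"text":"3. Five Stages"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The five stages of data development: data collection -> data aggregation -> data development -> data application -> data optimization."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#ffffff","name":"user"}}],"text":"4. 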
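Five Key Steps"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Assess the current state -> plan the architecture -> develop data assets -> apply data to business scenarios -> operate and optimize."}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Step 1: Get Moving & Put It to Use"}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/09/09a5078ff46ed28ab7b242939c600434.jpeg","alt":"horse.png","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First, a data strategy initiative must be a “"},{"type":"text","marks":[{"type":"strong"}],"text":"top-leader"},{"type":"text","text":"” project, because only the person at the top can drive a data strategy into practice. Yet even the most forceful leader sometimes punches into cotton. Teams with fierce execution are rare, and more often the problem is not execution but organization: "},{"type":"text","marks":[{"type":"italic"}],"text":"orders that never leave the “court”, directives that do not flow, departmental walls, outward compliance with quiet resistance"},{"type":"text","text":", and the list goes on."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"So for the first step, I want to stress this: be sure to start from business scenarios that are "},{"type":"text","marks":[{"type":"strong"}],"text":"actionable, valuable and visible"},{"type":"text","text":". I paid dearly for getting this step wrong. Scenarios that meet this bar are usually easiest to find in the pain points of the business operations team."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Actionable"},{"type":"text","text":" refers to how hard it is to actually carry something out in terms of technical implementation, policy and regulation, and organizational coordination. Imagine:"}]},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"You want to see PV/UV data, but the system has not even planned event tracking, so actionability is weak. Planning data upload and storage on the back end is not much trouble, but the front end needs careful design of how to instrument, where, and when to trigger, plus battery drain, permissions, performance and so on, which is no small matter. By the time tracking is implemented, the app is released and users have updated, the campaign page's PV 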
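UV requirement may already be a thing of the past;"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Risk control and credit granting need large amounts of supporting data: credit data, social data, behavioral data, device data... Some require user authorization, and for some even authorization does not help because collecting them is not compliant in the first place. More annoying still, users who did authorize may complain that you misled them into clicking... Acquiring this kind of data calls for extreme caution;"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"When your data analysis department only has relational database skills and pushes back endlessly against Hive HQL syntax or GraphQL syntax, what can you do?"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Valuable"},{"type":"text","text":" is easier to understand. A data strategy initiative without value has no vitality and is not sustainable. It should not cater to someone's personal preferences, nor be a costly vanity project. It must be a "},{"type":"text","marks":[{"type":"strong"}],"text":"scenario-based application"},{"type":"text","text":" of data: in operations, in risk control or in marketing, but preferably not just on the boss's desktop. For me, judging whether a data project is valuable means focusing on four areas: "},{"type":"text","marks":[{"type":"strong"}],"text":"users, market and competitors, finance, and operations"},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Visible"},{"type":"text","text":" lays especially important groundwork for scaling up the data strategy. Put bluntly, data results must be showcased loudly so the team feels their value and a sense of achievement. Take the monitoring dashboard wall that many dismiss as a vanity project: it lets the team feel the pulse of the business and the breath of users face to face. What could be more energizing than that?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Get these three right and this step is basically solid. I don't recommend doing more: the more you do, the longer the rollout takes and the longer your data strategy stays out of sight. To sum up, this phase is characterized by "},{"type":"text","marks":[{"type":"strong"}],"text":"business driving data"},{"type":"text","text":": no grand, all-encompassing top-level design; good enough is enough. Don't chase perfect planning or architecture; even hard 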
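code is fine. Start from small independent projects, and don't mind if several small projects each fight their own battle. At least you are moving, aren't you?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"One reminder: getting moving first does not mean skipping design or planning. It means not chasing perfection and completeness. Where to draw that line is for you to judge."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At the very least, I disagree with the idea of “"},{"type":"text","marks":[{"type":"strong"}],"text":"collecting as much of the full data as possible first"},{"type":"text","text":"”. Instrumenting and collecting piles of unusable or junk data wastes money on storage alone, not to mention the misleading decisions junk data can cause. As the saying goes: "},{"type":"text","marks":[{"type":"strong"}],"text":"don't let your data become a white elephant (costly yet useless)."}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Step 2: Build a Data Culture"}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/8c/8c538b46a3abdf2cc624b356d90bb1b6.jpeg","alt":"","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Strike while the iron is hot and push on in one go."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Once the organization has tasted the benefits of data, it will want to extract more from it. Naturally, the resistance grows too. The hardware cost of hosting massive data assets is frighteningly high. However hard you push, the data requirements from business departments are never quite up to scratch, or are always the same few trivial business metrics. As the data strategy rolls out in full and the difficulty keeps rising, conflicts of every kind surface: those who cannot cope and want to back out, those who want to throw money at it, those who don't want to spend..."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On the inability to come up with requirements, I believe many who have worked with data will relate. Holding a pile of data is like knowing the answer without knowing the question."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"It is just like the supercomputer in the science-fiction bible The Hitchhiker's Guide to the Galaxy, whose ultimate answer to “life, the universe and everything” is the number "},{"type":"text","marks":[{"type":"strong"}],"text":"42"},{"type":"text","text":". Yet you don't know what question 42 answers, and that is the real problem."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The "},{"type":"text","marks":[{"type":"strong"}],"text":"point of a data culture"},{"type":"text","text":" is to make employees aware of the unlimited possibilities of data, absorbed in mining its value, and ultimately proud of its business applications, commercial decisions and economic returns. Condensed into one sentence: "},{"type":"text","marks":[{"type":"strong"}],"text":"a data culture amplifies the lure of the data gold mine so that employees flock to it and relish it"},{"type":"text","text":". Every organization has its own cultural traits, and data culture is no different. It simply embeds data thinking into the full product lifecycle: Data Driven Everything."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here are a few hard-won lessons from my own perspective:"}]},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"The top leader must personally take part in building the data culture. When democratic votes conflict, when rollout stalls, when priorities are ranked and when proposals are reviewed, the leader must not slack off or go through the motions at these critical moments. If the leader is not serious, the culture cannot be serious. This is the first principle;"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"A data culture must not become a vanity project. When data output becomes part of performance reviews or bragging capital, stay alert. Just as code has Code Smell, a data culture has Data 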
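Smell. One example is boasting about how amazing your data service, say user profiling, is when the foundation of the data system (metadata/tags/data warehouse) has not even been laid. This top-heavy smell is especially common;"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Building data capabilities requires professionals. As the think tank and practitioners of the data culture, the data team needs a very high level of expertise and must cover everything: from architecture to tools, from models to services, from presentation to security;"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"A data culture needs sound standards and process rules to escort it. Beyond people and technology, managing data assets comes down to rules. Standards and rules are a prerequisite for a data strategy to keep running healthily. Otherwise, with an aircraft carrier on the scale of a data middle platform, a single piece of legacy baggage can leave you beyond repair. An effective “carrier operations manual” is far more useful than earnest lecturing;"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Early on, landing the culture may require some top-down commands, but as organizational awareness matures, the culture must not drift into a constraint in the middle and later stages. Bringing up data on every occasion and fixating on it can complicate decisions and lengthen the product process. Worse, if the organization's data capability is not up to par, it will draw wrong conclusions from data. So trust the power of data, but do not treat its conclusions as gospel"},{"type":"text","text":"."}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The focus of this phase is to build a reasonably complete data middle platform architecture and organization: integrate scattered data, eliminate data silos, and standardize how data is collected, stored and analyzed. So "},{"type":"text","marks":[{"type":"strong"}],"text":"planning, integration and standardization"},{"type":"text","text":" are the keywords of this phase, aimed squarely at "},{"type":"text","marks":[{"type":"strong"}],"text":"data driving the business"},{"type":"text","text":". The systems and steps listed in the “"},{"type":"text","marks":[{"type":"strong"}],"text":"A Shared Language"},{"type":"text","text":"” section above are the core of the architecture. Your data middle platform is either building them or on its way to building them. As for what “complete” means: “data security”, for example, is easily overlooked yet is the top priority of a data strategy. It must never be neglected, and even deferring it to patch in later is unacceptable. On the organizational side, the data team's committee and its product, development, quality, modeling and operations engineers are all indispensable."}]},{"type":"heading","attrs":{"align":null,"level":1},"content":[{"type":"text","text":"Step 3: 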
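Data Optimization and Sustainability"}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/f2/f2cb0e38d6a26c7b2562195fa44cc8e4.jpeg","alt":"","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To be honest, we too are still feeling for the stones as we cross the river in this phase."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When you have distilled massive data assets from enormous data sources, no one can vouch for their quality. Their value is hard to assess, their security is unknown or worrying, and data management may be superficial, hastily put together to serve business applications. In addition, as organizational strategy shifts, historical data that strains and burdens the data middle platform should be cleaned up in step, shedding the legacy baggage. All of this calls for "},{"type":"text","marks":[{"type":"strong"}],"text":"governance, monitoring, optimization and upgrading"},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First, "},{"type":"text","marks":[{"type":"strong"}],"text":"data governance"},{"type":"text","text":". The concept has a long history and many theoretical frameworks, such as DAMA, CMMI and DGI. Its goals are basically these: "},{"type":"text","marks":[{"type":"strong"}],"text":"improve data quality, build unified data standards, agree within the organization on how to resolve data problems, establish transparent and complete data management processes, and operate data sustainably"},{"type":"text","text":". Approaches to data governance are diverging as well: some use AI to make governance more efficient, others adopt metadata-centric distributed governance. We chose the latter."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"What is metadata? Data that describes data. An abstract definition... It is generally divided into technical metadata, such as table schemas, field constraints and field dictionaries; business metadata, such as business metrics and business terms; and management metadata, such as the data owner and the data security level."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Metadata has many uses. Two typical ones: an ETL program needs the structure and dictionary of the source data when transforming it, and data lineage analysis reveals which systems are affected when a field's length is changed."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Next, "},{"type":"text","marks":[{"type":"strong"}],"text":"data quality"},{"type":"text","text":", which directly determines whether data-driven decisions are right or wrong. So the quality of source data, of processing and of usage value must all be assessed and improved. To expand briefly: you at least need to confirm the accuracy and timeliness of source data. Would you dare use a user's credit data from a year ago? 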
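Processing quality matters even more. Generated tag data also has accuracy, timeliness and coverage. For instance, not every user has registered a gender, so a male/female tag only covers users who did. The models used in processing need continual tuning too: it used to be that looking at diapers got you diaper recommendations; now looking at diapers gets you beer recommendations. Usage-value quality is probably harder to grasp. One signal is how much a tag is used: the more business departments use it, the more valuable it is. That said, a frequently used tag can have low value (user name), and a rarely used tag can have high value (user credit score)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Now a few words on "},{"type":"text","marks":[{"type":"strong"}],"text":"strategies for optimizing data cost"},{"type":"text","text":". Common problems in production include duplicate and redundant computation that wastes resources, the low-value computations mentioned above that consume large amounts of compute, poor task scheduling or logic that turns parallel work into serial work, and data assets refreshed too often, hourly when a daily report would do. Tackling these is an important part of data operations, and the two key dimensions for evaluating data operations are "},{"type":"text","marks":[{"type":"strong"}],"text":"return on investment + data quality and security"},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Finally, on "},{"type":"text","marks":[{"type":"strong"}],"text":"sustainability"},{"type":"text","text":", I will just touch on "},{"type":"text","marks":[{"type":"strong"}],"text":"data security"},{"type":"text","text":". Data security spans the full data lifecycle (creation, storage, transmission, use, sharing, destruction), and a fragile security system can destroy an organization in an instant. I am really no expert here and know only the basics, such as authentication and permission management, resource isolation, encryption, data masking, and disaster recovery and backup. There is also an implicit question of whether the data source is compliant and lawful: if it is not, the data itself is unsafe, any data application you build on it is even less safe, and in the extreme your personal safety is at stake."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This phase is essentially continuous operation along the path of iterative optimization. The technical, organizational and procedural sides all need tracking, and how to keep the data culture alive over the long term will be the central topic."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"And we will be walking this data-paved road for a long time, growing as we go..."}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#6a737d","name":"user"}},{"type":"bgcolor","attrs":{"color":"#fff9f9","name":"user"}}],"text":"Articles about middle platforms usually open with the Finnish company Supercell, which is rather dull. Today I came across an interesting historical note: the Shangshu Tai (尚書檯), the central organ of China's Eastern Han dynasty, was known as the Zhongtai (中臺, the same word now used for “middle platform”). Under the Tang dynasty's Three Departments and Six Ministries system, the Shangshu Sheng (尚書省) was also the Zhongtai and oversaw the six ministries. Of course, that Zhongtai is not this middle platform; take it as a bit of fun."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"References"},{"type":"te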
xt","text":":"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"“Data Strategy: How to Profit from Big Data, Big Data Analytics and the Internet of Everything”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"“Data Middle Platform: Putting Data to Use”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"“Middle Platform Strategy: Middle Platform Construction and Digital Business”"}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/76/765846a42244156e258a346cc326a096.jpeg","alt":"","title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}