從入職到放棄再到改革成功:我是如何從0到1建立數據團隊的?

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"本文最初發表於作者個人博客,經原作者 Erik Bernhardsson 授權,InfoQ 中文站翻譯並分享。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這篇文章所提及的故事背景是在一家處於創業中期階段的初創企業(年收入約 1000 萬美元)組建了一支小型數據團隊(大約 4 人),儘管這個故事可能發生在很多不同的公司。這個故事是根據第 n (n≤3) 手經驗編造的,側重於團隊和組織,而非技術本身。爲了表示準確,我特意使用了“數據科學家”這一術語來代表非常寬泛的概念。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"初出茅廬,困難重重"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這是你成爲超級大公司數據團隊負責人的第一天,在你的面試過程中,首席執行官迅速而充滿激情地介紹了世界正在發生的變化,以及公司爲什麼需要跟上數據增長的趨勢,整個執行團隊都很興奮。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在最初幾個小時內,你可以訪問所有的主要系統。你開始在 Git repo 中瀏覽,並發現了一些有趣的代碼。這看上去像是一個用於預測流失率的神經網絡。你開始分析這些代碼,但是你被日曆通知打斷了,提醒你要與首席營銷官進行 30 分鐘的對話。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"首席營銷官充滿激情。“我們對你的到來感到很興奮。近期,我與 HyperCorp 公司的銷售夥伴談過,他們正在與一家供應商合作,利用人工智能對用戶進行細分。太好了!我已經等不及要你沉下心來。” 在閒聊了幾句之後,你開始研究營銷團隊的數據操作。你問:“客戶獲取成本如何?”首席營銷官回答說:“嗯……其實還不錯。數據科學家們計算了這些數字,我們的在線廣告每次點擊成本都在下降。”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你有點困惑,因爲你被告知所有的數據科學家都會向數據團隊報告,但是顯然其他部門也有自己的數據科學家?爲了後續行動,你做了筆記。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"首席營銷官繼續說道:“真正的問題是,增長團隊並沒有把我們帶來的所有流量都轉化到網站上。”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"所以你會問是否有一個能查看轉化漏斗(conversion funnel)的儀表盤,但是首席營銷官卻說,轉化銷售渠道是增長團隊的工作。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"那天晚些時候,你與產品經理進行交談。剛剛對首頁進行了大改版,負責此項工作的產品經理非常激動,因爲用戶註冊量增加了 14%。當你詢問這種差別是否有統計意義時,你得到的卻是茫然的目光。“搞清楚,這不是我的工作,而是你們團隊的工作”,這位產品經理說。“上次我們問他們時,他們說他們沒有數據,這需要幾個月的時間。”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"無論什麼原因,你會發現產品經理有更多的話要說,所以你讓她繼續。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"“此外,令人驚訝的事情並不是基於增量變化。我們決定不做 A\/B 測試,因爲有時你需要下很大的賭注,這會使你偏離最高值。喬布斯在發佈 iPhone 時並沒有做 A\/B 測試!我們團隊在最後期限的前兩天完成了這個版本,這纔是關鍵!”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你在筆記本上潦草地記下筆記,以顯得很忙碌的樣子。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"剩下的時間就是和新團隊聊天。這是一支只有三個人的小團隊,但你得到的消息是在年底前將其擴大到 10 人。你的團隊成員顯然爲你的到來而激動。他們向你介紹了迄今爲止所建立的一切。這裏有你之前見過的用於預測流失率的神經網絡。還有一個 Notebook,裏面有完整的推薦系統實現,可以幫助你找到相關購買項目。還有很多東西,有些還很酷。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你會注意到,很多代碼要經過非常複雜的預處理步驟,其中的數據必須從許多不同的系統中提取。看起來好像要運行幾個腳本,必須按照正確的順序手動運行,纔可以順利啓動。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你詢問爲什麼團隊還沒有投入生產。數據團隊似乎感到沮喪:“當我們和工程師交談時,他們說要將這個項目達到生產級別是一項很大的工程。產品經理已經將其納入待處理項目中,但是由於其他事情不斷,他們一直在推諉。對於這個項目,我們需要管理方面的支持。”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"當天晚些時候,你要和供應鏈負責人談話。看來他並不像首席營銷官那麼激動。他說:“老實說,我不知道我是否需要數據團隊的幫助。我們沒有這類問題。我們需要的是業務分析師。我們有一支團隊,他們每天都要花上好幾個小時做一個複雜的模型。他們連回答我基本問題的時間都沒有。我有一整張電子表格,裏面都是我渴望得到答案的問題。”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你看一下電子表格,就會發現如下內容:提交支持請求並在 1 小時內得到解決的客戶轉化率和 1 小時之外得到解決的客戶轉化率分別是多少?以 100 美元爲間隔對訂單價值進行細分。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"問起“模型”,你會發現在谷歌表格中,這是一個非常複雜的東西,有很多 VLOOKUP 和數據,必須以正確的格式複製粘貼到正確的標籤。這些數據每天都會更新,模型的輸出決定了團隊當天的工作重點。不僅僅是這樣,他們還依賴電子表格來計算支付給供應商。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這基本上是很多公司在數據成熟的早期階段可能發生的事情:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"缺乏數據,數據碎片化。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 該產品的儀表化非常糟糕,所以數據通常一開始就沒有。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 數據系統碎片化,並且數據分佈在許多不同的系統中。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 脆弱的業務流,雖由數據驅動,但很少或者沒有自動化。"}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"對數據團隊的工作內容期望不明確。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 僱傭數據科學家是爲了進行研發,找出一些部署人工智能的其他方法,結果是沒有任何明確的業務目標。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 數據團隊抱怨機器學習難以生產,但是看起來產品團隊並不關心這個功能。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 需要“英語到 SQL 翻譯”的人。"}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"未經過數據驅動培訓的產品團隊。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 產品經理沒有把數據作爲構建更好功能的工具來考慮。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 在產品團隊想要構建的東西與數據團隊所擁有的之間缺乏一致性。"}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"從根本上說,一種與數據驅動相沖突的文化。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 慶祝交付的文化,而不是慶祝可以衡量的進展和學習文化。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 在團隊實際使用指標的情況下,它們是不一致的,衡量標準不高,而且在某些情況下與其他團隊有衝突。"}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"沒有數據領導力。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 一個分裂的數據組織,不同的數據人員向其他職能領域報告。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 其他部門沒有得到所需的幫助,因此他們圍繞着數據團隊,並僱傭了很多分析師。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 缺乏標準化的工具鏈和最佳實踐。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"下面我們來談談如何才能真正擺脫這種困境。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"開始爲團隊制定方向"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在接下來的一週,你將爲數據團隊確定新的方向。數據團隊中的一個人在基礎設施方面有較多的經驗,因此你讓他負責建立一箇中央數據倉庫。目前你只需以最快的方式將數據發送到一個位置。計劃基本上就是每小時將生產數據庫的錶轉儲到數據倉庫中。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"結果表明,你在前端用於廣告跟蹤的框架能夠輕鬆地將大量事件日誌導出到數據倉庫中,因此你也可以進行設置。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你將這些記在心裏,這是你以後要重新考慮的技術債務。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/aa\/aa384278859fe96e83c50072422ac29d.jpeg","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖 1:對數據如何進入數據倉庫的極其粗略的概括"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你與招聘團隊合作,爲通用數據角色定義簡介,強調核心軟件技能,但應具有通用的態度,並深入瞭解業務需求。現在,你將所有涉及人工智能和機器學習的內容從招聘廣告中刪除。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你花更多時間與不向你報告的各種數據人員接觸。營銷團隊中的數據科學家是個年輕人,你可以看得出來,她和你交談非常興奮。她說:“我一直想成爲數據科學家,我等不及要向你學習。”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"當天晚些時候,你打電話給經營編碼訓練營的朋友,詢問他們是否有 SQL 培訓方面的好課程。他們說有,所以在那個月的晚些時候,你做了一些安排。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你開始爲產品團隊做一個關於 A\/B 測試及其工作原理的演講 PPT。你提供了很多從以前的經驗中獲得意想不到結果的測試實例,並使演示的部分內容具有互動性,讓觀衆去選擇。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你跟蹤首席執行官的執行助理,並在那一週晚些時候在她的日曆上得到了一些安排。你的目標是弄清楚她每週要通過自動電子郵件彙報的指標。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"那周晚些時候,你和供應鏈團隊的幾個業務分析師交談,你意識到他們也很通情達理,但是他們似乎在與數據團隊之前的互動中受到了傷害。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"他們中的一位在過去的工作中有過 SQL 經驗。他有一個關於轉化率的問題,你意識到應該用一些已經複製到數據倉庫的表來回答這個問題,所以你給他權限,讓他試試。你真的不知道會發生什麼,但是你覺得這值得一試。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你每週都會和整個組織中需要數據的關鍵員工建立一對一的關係。重點是發現數據差距和機會,然後把它們交給數據科學家。有些數據科學家對研究工作的輕重緩急感到失望。你說:“我們需要集中精力盡快實現業務價值”,但你補充道:“我們也許很快又回到機器學習的領域……讓我們來看一看”。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"三個月之後:陷入無盡的溝通和協調"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"已經過去三個月了,但是你感覺自己開始在某些方面有所進步。每星期與客戶進行一對一的會談,都會不斷髮現巨大的盲點和數據發揮作用的機會。你使用這些內容作爲許多核心平臺工作的強制功能。特別是,需要建立許多管道來生成“衍生”數據集。這些分析的前期成本非常昂貴,但是一旦建立正確的數據集,後續的分析就會更容易。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你已經開始將訪問數據倉庫的權限向其他部門的其他團隊開放。有些人開始學習 SQL,自己做很多基礎分析。一位初級產品經理髮現 iOS Safari 的轉化率很低,這是一個早期的成功。結果發現在本地存儲中出現了前端錯誤,並且只需一行代碼就可以修復。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在思考自己所有進步時,你突然被一封來自供應鏈主管的郵件打斷。他非常生氣。很明顯,他們的模型都不管用,這是他們的一個大問題。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你立即給你在那裏認識的人發了一條 Slack 消息。他是業務分析師,當你給他權限時,他急切地開始寫 SQL。壓力超級大。“數據庫中的表發生了變化,我們用來填充電子表格的 SQL 查詢突然產生了無意義的輸出”。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"看到這個 SQL 查詢的時候,你差點崩潰,這個查詢長達 500 行,查詢的作者也有些生氣。他說:“我們曾多次來找你們,請你們幫忙解決這些問題,但是你們說沒資源,所以我們就自己建造。”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你團隊中的數據科學家被指派負責大型 SQL 查詢,他們很不開心。他說:“那個團隊寫這些查詢很傻,我們告訴他們,這種情況將會發生。這種 MBA 類型的人沒有什麼用處。另外,我受僱研究機器學習,而非調試 SQL 查詢。”你很絕望,你試着給他許下物質承諾。你說:“請盡力而爲,我保證本月晚些時候給你找到一些很酷的機器學習問題。”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"當天晚些時候,你正在參加一個會議,討論最近的版本。結算團隊的產品經理對信用卡流程進行了重大改革。但是當你問他,他們是否看到了相關指標的改進,他卻感到困惑。他說:“我們還沒有時間去研究這個問題。”你很失望,因爲作爲一個數據科學家,只進行一項粗略的分析還是很容易的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"最起碼,那天晚些時候你感覺好些了。營銷團隊中的數據科學家發郵件告訴你,她已經跟她的經理談過了。首席營銷官對她向你彙報完全沒有意見,但明確表示:“我需要她 100% 的時間來做營銷。”你聯繫人力資源部門,要求他們對內部系統進行更新,以便作出管理方面的改變。雖然她的資歷顯然很淺,但她對複雜業務問題的把握能力令你印象深刻。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"那天晚上 9 點,你結束了工作。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"開始改革"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你已經開始爲最緊迫的需求打下最基本的基礎:所有重要的數據都在同一個位置,查詢起來很容易。公開 SQL 訪問和培訓其他團隊使用 SQL,意味着很多“SQL 翻譯”將消失。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"而另一方面,有些團隊將會用他們新獲得的自由走得太遠。爲數據訪問設置嚴格的“護欄”來防止這種情況的發生是很有誘惑力的,但這常常帶來更多的弊端。一般而言,人們都是理性的,做一些能給企業帶來正面投資回報的事,但是他們可能不明白數據團隊能爲他們做什麼。你的工作就是爲了證明!"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"同樣,在結算團隊中,你也會看到類似的情況:有一個簡單的分析,你的團隊本可以完成,但並沒有,因爲團隊不知道該問誰。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這主要是組織方面的挑戰。團隊不知道如何與數據團隊合作。即使你沒有意識到,你也可能成爲瓶頸。其他團隊將圍繞數據團隊開展工作。許多“簡單的”分析都沒有完成。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在我看來,最應該推動的是集中的報告結構,但同時保持工作管理的分散。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"爲什麼?很大程度上是因爲它在數據和決策之間形成了一個更加緊密的反饋循環。若每一個問題都要通過中心瓶頸,則交易成本將非常高。而你又不想把管理權力下放。有能力的數據人員希望向瞭解數據的經理報告,而非業務人員。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/3f\/3f214a3b0bf6946b3d57877a34dd7661.jpeg","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖 2:擁有集中積壓和集中管理的數據團隊"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"取而代之,將資源管理工作推給其他團隊。給他們一小撮數據人員,讓他們一起工作。這些數據人員將能夠更快地完成迭代,而且還可以開發寶貴的領域技能。這樣可以減少其他團隊對數據團隊工作的依賴,並且可以形成自己的資源。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/ae\/ae0db52b61b75cfae06730510bd5c0c9.jpeg","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖 3:數據團隊,積壓分散但管理集中"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"一個好的方面是,在某種程度上,你的結果本身會推動整個組織的集中化:營銷團隊中的初級數據科學家轉到你的團隊中,因爲她想爲你工作。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"團隊擴大了"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"此時,你的數據團隊已經擴大到六人。其中一人忙於處理與數據倉庫有關的基礎設施。對於其他五人,你將他們每人都分配到一個團隊:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"一個被分配到產品團隊。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"一個被分配到供應鏈團隊。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"一個被分配到結算團隊。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你已經有來自營銷團隊的數據科學家在從事營銷工作。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"最後一個人被分配服務於首席執行官和投資者 \/ 董事會。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你向一大羣人發送了一封電子郵件,概述了這一變化,並且清楚說明了人們應該與誰合作以滿足他們的數據需求。當你僱用員工時,你計劃在公司內將他們分配到不同的團隊。大部分都是產品 \/ 工程團隊,但是在某些情況下還有其他團隊。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"團隊人員出現變動"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你以一封令人沮喪的電子郵件開始一天的工作。你的一位數據科學家決定離開,他寫道:“我要去 XXX 公司,加入他們新的機器學習團隊。”你不想說服他留下。老實說,有一陣子他看起來並不高興,而且你也沒有什麼工作能使他興奮。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"相反,你的團隊裏有一羣興奮的新人。他們中的大多數人都懂得一點軟件工程,一點 SQL,但是最重要的是要從數據中發現有趣的洞察力。你認爲他們是“數據記者”,因爲他們的目標是從數據中發現“爆料”。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你的團隊中有一位特殊成員直接與業務團隊合作。她幾乎每天都和產品經理談話,團隊也很喜歡她,因爲她提出了很多見解。舉例來說,當前業務中有一個很大的阻礙是需要問客戶要地址,儘管其實運算中並不需要。在隨後的 A\/B 測試中,除去這一步驟,轉化率增加了 21%。在一開始就很難發現這個問題,因爲數據庫中的數據模型非常複雜,必須建立一套 ETL 作業,以便數據“扁平化”成表格,纔可以便於查詢。然而,一組 Python 作業的組合,就能發揮作用。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"那天晚些時候,所有主要項目都進行了季度回顧。這是件大事兒,首席執行官也在場,一切進展都讓她感到興奮。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"輪到增長計劃時,主要的產品經理介紹了他們推出的新的引人注目的登錄頁面。產品經理多次指出,由 20 名工程師組成的團隊正在加班加點地趕着最後期限,她把設計師們的工作介紹給大家。首席營銷官對此參與度極高,她沉默了片刻,說道:“迄今爲止的指標是什麼?我們知道客戶獲取成本是否已經下降了嗎?”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"產品經理回答道,已經做過 A\/B 測試,並且在演示的附錄中有數字。它顯示了一個雜亂無章的畫面。有些指標上升,有些下降。並未表明有什麼明顯的結果。有一張表格,是對早期客戶獲取成本數據的總結,但是這個數據看上去很糟糕。首席營銷官強調,這些數據“還在發酵”,對於這類行爲,可能要花費數月時間來處理。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你給數據團隊的人員發消息,並告訴他們下一次應該將這些數據做成隊列圖。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這是怎麼回事?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"值得慶幸的是,產品團隊開始了 A\/B 測試。壞消息是,它忽視了結果,項目看起來主要受里程碑事件和人爲截止日期的驅動。好消息是,首席執行官鼓勵團隊將數據當作事實。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"當組織的壓力越來越大,要求更多的數據驅動時,就應該加快數據團隊與其他團隊的合作。特別是,最高層的人會開始把注意力集中在指標上,而與他們合作是你的職責。做一件簡單的事兒能起到很大的作用,那就是和每一個團隊合作,確保他們都有自己的儀表盤,其中包含他們關心的最重要的一組指標。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/wechat\/images\/f4\/f4f7c7c9a74e6bb4deff95b1f2640e0c.jpeg","alt":null,"title":null,"style":null,"href":null,"fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","text":"圖 4:在組織的不同級別上,不同的服務推動了最大的進展"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"除了一個例外,幾乎所有數據團隊過去做的機器學習工作都是毫無結果的。在庫存產品團隊工作的數據科學家中有一位對早期推薦很有興趣。她是你僱傭的新成員之一,而且她有的背景更加全面。她在 Notebook 上找到推薦系統,並能夠將其轉變爲內部部署的小型 Flask 應用程序。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"庫存團隊的產品經理看到它時欣喜若狂。“我們如何交付?”她問道。該團隊跟蹤的指標之一是平均訂單值,她認爲這能推動訂單顯著提高。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"一項快速評估表明,要大規模使用它仍然是個問題。但是你的數據科學家有一個想法。她說:“如果我們只爲所有客戶中的 1% 推出會怎麼樣?我們可以讓它被 cron job 驅動,並在數據庫中預先生成所有建議。我認爲幾天之內我就能搞定事情。”大家都很興奮,於是她開始工作。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"你已經在供應鏈團隊中花費了很多時間,並且發現了更多大型 SQL 查詢,用於各種關鍵業務。它們中斷了很多,但是你的團隊正在重新編寫代碼,使之成爲合適的運行管道。供應鏈負責人希望可以和你的團隊深入合作。他說:“一旦你開始參與進來,我的業務分析師團隊將會做得更好。爲了支持你,我願意爲你們做任何事,僱傭更多的數據科學家!”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"如今,一些很酷的機器學習工作帶來了希望。看起來產品團隊終於因爲推薦系統的小型測試而興奮不已。它之前被卡住了,因爲產品工程團隊不能評估工作,也不想承諾,數據團隊又沒有實際的軟件技能,不能將其帶到生產業務中。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"數據團隊更深入地解決了這個問題,真正建立了演示。這樣做,不但使其接近於生產,而且潛力也更加清晰。在這些項目停滯不前時,數據團隊很容易感到失敗,就像他們被僱來做人工智能的工作一樣,但是現在沒有了管理支持。在時間中,我認爲更普遍的情況是,他們並不主動將工作做得有價值和更容易交付。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"另外一件事是關注供應鏈團隊在做什麼。這個過程大致如下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在開始的時候,團隊有自己的“業務分析師”(數據團隊之外),但是需要數據團隊爲他們運行查詢來獲取數據。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"在數據團隊的幫助下,這些業務分析師開始自己運行查詢。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"他們開始積聚“影子技術債務”(在本例中是一個大型 SQL 查詢),這首先會引起大量與數據團隊的摩擦。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"數據團隊開始嵌入到業務中,幫助他們進入更好的位置。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"因爲嵌入,業務分析師的需求下降,對數據科學家的需求上升。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"請注意,在你開始直接將生產數據庫錶轉儲到數據倉庫時,你需要承擔大量的“技術債務”。下游的數據消費者會有很多中斷的 SQL 查詢。久而久之,你就必須在兩者之間添加某種層,從生產數據庫中提取元數據,並將它們轉換成各種派生數據集,使之更穩定,更易於查詢。從安全角度來看,這很有必要:你需要從生產數據中分離出大量 PII。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"終於迎來轉機"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"這是第三季度的計劃會議。在此之前,這些會議常常變成一場關於公司在未來幾個季度重要方向的大辯論。這次,你首先瀏覽了公司的高級關鍵結果。每個團隊都有子級指標,從而形成更細化的高層指標劃分。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"顯然,你和產品管理團隊的工作得到了回報。對於他們在運行測試時所學到的或在數據中發現的東西,產品經理們常常爲他們對各種項目的投資是合理的提供證據。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"一項重要的成就是,你的一位數據科學家和結算團隊一起發現了一個嚴重的錯誤,即用戶在確認頁面點擊“返回”按鈕,最終會導致購物車對象出現問題。解決了這個問題之後,轉化率就大大提高了。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"另外一種見解是,不同廣告活動所帶來的流量一登陸網站,就會產生截然不同的轉化情況。結果發現,一些網站的點擊價格低廉,但是轉化率並不高。有些廣告活動價格很高,但是這些用戶的轉化率很高。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"因爲現在你跟蹤 UTM 參數並將它與賬戶創建聯繫起來,你現在就可以衡量廣告點擊到購買的轉化率。除非所有數據都進入相同的數據倉庫並進行歸一化,否則無法做到這一點,因此你可以輕鬆查詢。目前,主要的 KPI 是與營銷團隊合作,以端到端獲取客戶的成本,而非每次點擊成本。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"另外一個令人振奮的消息是,推薦系統的 1% 測試表現非常出色。雖然把它擴展到 100% 的用戶是一個非常重要的項目,但是首席執行官還是給這個項目開了綠燈。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"當然,並非所有結果都是正面的,也有一些不成功的測試都不成功,但整體是向好的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"經過了這麼長時間的磨練,你已經將組織轉變爲真正的數據原生型架構。數據團隊與許多不同的利益相關者進行跨職能的合作。數據和見解被用於規劃,使用數據推動業務價值,而非目標不明確的獨立作坊。公司採用迭代的方式工作,而非大型的“瀑布”式規劃,具有快速數據驅動的反饋週期。衡量標準的定義讓人們覺得產生業務價值是一種責任。數據文化由上面(首席執行官推動)和下面(基層員工)共同推動。失敗並沒有什麼,至少你可以從中吸取教訓。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"作者介紹:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Erik Bernhardsson,Better 的前首席技術官,目前正在從事數據領域的一些創業想法。他寫了很多代碼,其中一些最終被開源了,如 Luigi 和 Annoy。他還共同組織了紐約市機器學習聚會。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"原文鏈接:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/erikbern.com\/2021\/07\/07\/the-data-team-a-short-story.html","title":"","type":null},"content":[{"type":"text","text":"https:\/\/erikbern.com\/2021\/07\/07\/the-data-team-a-short-story.html"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章