Palfish Event Analytics Platform: Design

{"type":"doc","content":[
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Background"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At Palfish, our servers collect over 100 million user behavior logs every day. We want to make full use of these logs to understand user behavior patterns and answer questions such as:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Over the last three months, which channel brought in the most user registrations?"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Over the last week, what is the age distribution of users in the Beijing area who viewed picture books?"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Over the last week, what is the 7-day retention rate of users who registered for Palfish Picture Books, and how is it trending?"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Over the last week, what is the conversion rate at each step of the users' order conversion path?"}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The event analytics platform was built to answer these questions. This article first introduces the platform's features and then discusses some of the thinking behind its architecture."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Features"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Broadly, to answer a wide range of business analysis questions, the event analytics platform supports queries that combine event-based metric computation, grouping by attribute, and conditional filtering. Here, an event is a user action, such as logging in, viewing a Palfish picture book, or buying a paid picture book. More specifically, the platform supports three kinds of analysis: event analysis, funnel analysis, and retention analysis."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Event analysis"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In event analysis, the user specifies a set of conditions and queries a target metric in order to answer one specific analytical question. The conditions include:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Event type: the user action, collected from tracking data; for example, logging in to Palfish Picture Books or buying a paid picture book"}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Metric: there are two kinds, basic metrics and custom metrics. Basic metrics: total count (pv), total users (uv), count per user (pv\/uv). Custom metrics: an event attribute plus an aggregation type, for example the sum, mean, or maximum of “order amount”."}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Filter conditions: select the group of users the query is concerned with"}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Dimension grouping: split the results into groups so that the groups can be compared"}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Time range: the time range in which the events occurred"}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Let's take a concrete example. Suppose we want to answer the question “Over the last week, in the Beijing area, how does the average order amount for one-on-one courses compare across age groups?” This question breaks down naturally into the event analysis shown in the figure below, where:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Event type = ordering a one-on-one course"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Metric = average order amount"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Filter condition = Beijing area"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Dimension grouping = by age group"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Time range = last week"}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/06\/06bf82c2a7c180a5aa00005e10a8279a.png","alt":"event_analysis_flow","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Figure: Creating an event analysis"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/ac\/ac2ef277ea93fde442bca918dcd41d72.png","alt":"event_analysis","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Figure: Event analysis interface"}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Funnel analysis"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Funnel analysis measures conversion and drop-off at each step of a multi-step process."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For example, the full purchase flow of a Palfish Picture Books user might consist of these steps: log in to the app -> view a picture book -> buy a paid picture book. We can define this flow as a funnel and analyze both the overall conversion and the conversion at each step."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Funnel analysis also requires a “window period”: the whole flow must take place within the window to count as a successful conversion. As with event analysis, funnel analysis supports dimension grouping and a time range."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/7d\/7d26aab9dcb331d3ff35646e9bbf0a5d.png","alt":"funnel_flow","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Figure: Creating a funnel analysis"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/91\/91ad9606510c58b3af9fb92fc3b5cd90.png","alt":"funnel","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Figure: Funnel analysis interface"}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Retention analysis"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In retention analysis, the user defines an initial event and a follow-up event, and the platform computes the rate at which the follow-up event occurs on day N after the initial event. This rate is a good measure of how sticky Palfish is for its users."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the example below, we want to know whether the Palfish Picture Books app is engaging enough, so we run a retention analysis with logging in to the app as the initial event, viewing a picture book as the follow-up event, and a retention period of 7 days."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/25\/2529bf8e2fd12e53d6c454a5e5e36b29.png","alt":"retention_flow","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Figure: Creating a retention analysis"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/4d\/4d739b2a5f030c707e1888b8f4ec2ee6.png","alt":"retention","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Figure: Retention analysis interface"}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Architecture"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Architecturally, the event analytics platform consists of two modules, as shown in the figure below:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Data ingestion: after tracking logs are reported from the client or the server, they pass through a Kafka message queue, go through ETL in Flink, and are then written to ClickHouse."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Analytical queries: the user selects events, conditions, and dimensions on the front-end page; the back end assembles them into a SQL statement, queries ClickHouse, and returns the results to the front end for display."}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/30\/305a3a6f8f3c8c245b6c4f5333fca343.png","alt":"design","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Figure: Overall architecture"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"ClickHouse is clearly the core component of the event analytics platform. To make sure the platform performs well, we investigated the use of ClickHouse in detail and answered the following three questions:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How should event data be stored in ClickHouse?"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How can we write to ClickHouse efficiently?"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How can we query ClickHouse efficiently?"}]}]}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"How should event data be stored in ClickHouse?"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The event analytics platform has two main data sources: user behavior data from tracking logs, and user attribute data from the “user profile platform”. This article covers only the storage of tracking log data; readers interested in the user profile platform can look forward to our follow-up technical articles."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Before choosing a storage system for tracking logs, we first pinned down the core requirements:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Storage for massive data volumes. Palfish currently generates on the order of 100 million tracking log entries per day."}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Real-time aggregate queries. Because product and operations colleagues use the event analytics platform to explore many kinds of user behavior patterns, the analytics engine must handle a wide variety of aggregations flexibly and efficiently."}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"ClickHouse is widely used for storing massive data sets, handles all kinds of aggregate queries efficiently, and has a mature and active community. These factors led us to choose ClickHouse as the storage engine."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In our tests on real tracking data, simple queries such as PV and UV over data on the order of 100 million rows return within 1 second, and complex queries such as retention analysis and funnel analysis return within 10 seconds."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With “where to store” settled, the next question is “how to store”. ClickHouse's columnar storage is very well suited to large wide tables, which support efficient queries. In the event analytics scenario, however, we also need to store “custom attributes”, and for these a wide table is less than ideal."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Custom attributes are attributes that are specific to certain events in the tracking logs. For example, the “order a one-on-one course” event is reported with an “order amount” attribute that many other events do not have. If supporting this event meant altering the ClickHouse table schema to add a column, the schema would become extremely costly to maintain, because every new event may bring several custom attributes of its own."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To solve this, we store the frequently changing custom attributes together in a single "},{"type":"link","attrs":{"href":"https:\/\/clickhouse.tech\/docs\/en\/sql-reference\/data-types\/map\/","title":null,"type":null},"content":[{"type":"text","text":"Map"}]},{"type":"text","text":" and store the largely stable common attributes as columns, which combines the efficiency of the wide-table approach with the flexibility of the Map approach."}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"How can we write to ClickHouse efficiently?"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When designing the ClickHouse deployment, we adopted the read\/write separation pattern that is common in the industry: write to local tables, read from the distributed table. On the write side, the data is split into 3 shards, and each shard has two replicas."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because the vast majority of event analytics queries are per user, we shard the data evenly by user_id at write time and write it to different local tables to improve query efficiency, as shown in the figure below:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"image","attrs":{"src":"https:\/\/static001.geekbang.org\/infoq\/16\/166d8161965ddb5ca70d8e951431c93f.png","alt":"import_to_clickhouse","title":null,"style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":null,"fromPaste":true,"pastePass":true}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Figure: Writing tracking data to ClickHouse"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We do not write to the distributed table because we ran into several problems when testing writes of large data volumes to it:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Too many parts error: after the node hosting the distributed table receives data, it has to split the data into multiple parts by sharding_key and forward them to other nodes, which produces too many parts in a short time and increases merge pressure;"}]}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"numberedlist","attrs":{"start":2,"normalizeStart":2},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Write amplification: if a large amount of data is written to the node hosting the distributed table in a short time, it generates a large amount of temporary data, causing write amplification."}]}]}]},
{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"How can we query ClickHouse efficiently?"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With ClickHouse's built-in functions, the three features the platform needs (event analysis, funnel analysis, and retention analysis) are straightforward to implement."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Event analysis can be implemented with plain SQL. For example, the age distribution of users in the Beijing area who viewed picture books over the last week can be expressed as:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"SELECT\n    count(1) AS cnt,  -- number of matching events; use uniqExact(user_id) instead to count distinct users\n    toDate(toDateTime(intDiv(event_ms, 1000))) AS date,  -- event_ms is in milliseconds, toDateTime expects seconds\n    age\nFROM event_analytics\nWHERE\n    event = 'view_picture_book_home_page' AND\n    city = 'beijing' AND\n    event_ms >= 1613923200000 AND event_ms <= 1614528000000\nGROUP BY date, age;"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Retention analysis uses ClickHouse's "},{"type":"link","attrs":{"href":"https:\/\/clickhouse.tech\/docs\/en\/sql-reference\/aggregate-functions\/parametric-functions\/#retention","title":null,"type":null},"content":[{"type":"text","text":"retention"}]},{"type":"text","text":" function. For example, the next-day and 7-day retention of picture book viewing after registering for Palfish Picture Books can be expressed as:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"SELECT\n    sum(ret[1]) AS original,\n    sum(ret[2]) AS next_day_ret,\n    sum(ret[3]) AS seven_day_ret\nFROM\n(\n    SELECT\n        user_id,\n        retention(\n            event = 'register_picture_book' AND toDate(toDateTime(intDiv(event_ms, 1000))) = toDate('2021-03-01'),\n            event = 'view_picture_book' AND toDate(toDateTime(intDiv(event_ms, 1000))) = toDate('2021-03-02'),\n            event = 'view_picture_book' AND toDate(toDateTime(intDiv(event_ms, 1000))) = toDate('2021-03-08')\n        ) AS ret\n    FROM event_analytics\n    WHERE\n        -- 2021-03-01 00:00 up to the end of 2021-03-08 (UTC+8), so that day-7 events are included\n        event_ms >= 1614528000000 AND event_ms < 1615219200000\n    GROUP BY user_id\n);"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Funnel analysis uses ClickHouse's "},{"type":"link","attrs":{"href":"https:\/\/clickhouse.tech\/docs\/en\/sql-reference\/aggregate-functions\/parametric-functions\/#windowfunnel","title":null,"type":null},"content":[{"type":"text","text":"windowFunnel"}]},{"type":"text","text":" function. For example, for the conversion path view a picture book -> buy a picture book with a 2-day window, the conversion computation can be expressed as:"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"codeblock","attrs":{"lang":"text"},"content":[{"type":"text","text":"SELECT\n    -- users who reached at least step 1, and users who reached step 2\n    array( sumIf(user_count, level >= 1), sumIf(user_count, level >= 2) ) AS funnel_uv\nFROM (\n    SELECT\n        level,\n        count() AS user_count\n    FROM (\n        SELECT\n            user_id,\n            -- window of 2 days, in the same unit as event_ms (milliseconds)\n            windowFunnel(172800000)(\n                event_ms,\n                event = 'view_picture_book' AND event_ms >= 1613923200000 AND event_ms <= 1614009600000,\n                event = 'buy_picture_book'\n            ) AS level\n        FROM event_analytics\n        WHERE\n            event_ms >= 1613923200000 AND event_ms <= 1614182400000\n        GROUP BY user_id\n    )\n    GROUP BY level\n)"}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Summary"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With the feature scoping and architecture design complete, we have begun building the event analytics platform step by step. We look forward to sharing its further evolution once it is in large-scale use."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"References"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[1] Fast and Reliable Schema-Agnostic Log Analytics Platform. "},{"type":"link","attrs":{"href":"https:\/\/eng.uber.com\/logging\/","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/eng.uber.com\/logging\/"}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[2] How ClickHouse saved our data. "},{"type":"link","attrs":{"href":"https:\/\/mux.com\/blog\/from-russia-with-love-how-clickhouse-saved-our-data\/","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/mux.com\/blog\/from-russia-with-love-how-clickhouse-saved-our-data\/"}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"[3] 最快開源 OLAP 引擎!ClickHouse 在頭條的技術演進 "},{"type":"link","attrs":{"href":"https:\/\/www.infoq.cn\/article\/ntwo*yr2ujwlmp8wcxoe","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/www.infoq.cn\/article\/ntwo*yr2ujwlmp8wcxoe"}]}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Author: 應京含"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Original article: https:\/\/tech.ipalfish.com\/blog\/2021\/06\/21\/event-analytics-design\/"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Original title: 伴魚事件分析平臺:設計篇"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Source: Palfish Tech Blog"}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Reprints: Copyright belongs to the author. For commercial reprints, please contact the author for authorization; for non-commercial reprints, please credit the source."}]}
]}
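The storage layout described in the article (stable common attributes as columns, custom attributes in a Map) and its write path (write to local tables, read from the distributed table, shard by user_id) might look roughly like the ClickHouse DDL below. This is only a sketch under stated assumptions, not the platform's actual schema. The cluster name events_cluster, the local table name event_analytics_local, the properties column, the ZooKeeper path, the partition and ordering keys, the order_amount key, the event name in the last query, and all column types are hypothetical. Only event_analytics, event, event_ms, user_id, city, and age are taken from the article's queries.

```sql
-- Hypothetical local table, created on every shard replica; the ETL job writes here directly.
-- The {shard} and {replica} macros are assumed to be defined in each server's configuration.
-- On ClickHouse releases where Map is still experimental, it has to be enabled first with
-- SET allow_experimental_map_type = 1.
CREATE TABLE event_analytics_local ON CLUSTER events_cluster
(
    event_ms   UInt64,                  -- event time, Unix timestamp in milliseconds
    event      LowCardinality(String),  -- event type, e.g. 'view_picture_book'
    user_id    UInt64,
    city       LowCardinality(String),  -- stable common attributes are plain columns
    age        UInt8,
    properties Map(String, String)      -- frequently changing custom attributes
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/event_analytics_local', '{replica}')
PARTITION BY toYYYYMMDD(toDateTime(intDiv(event_ms, 1000)))
ORDER BY (event, user_id, event_ms);

-- Hypothetical distributed table, used only for reads; it fans queries out to the local tables.
-- The last engine argument is the sharding key.
CREATE TABLE event_analytics ON CLUSTER events_cluster
AS event_analytics_local
ENGINE = Distributed(events_cluster, currentDatabase(), event_analytics_local, user_id);

-- Reading a custom attribute: average order amount per age value for one event type.
SELECT
    age,
    avg(toFloat64OrZero(properties['order_amount'])) AS avg_order_amount
FROM event_analytics
WHERE event = 'order_one_on_one_course'
GROUP BY age;
```

With this layout a new event type that carries new custom attributes needs no ALTER TABLE, since the attribute is just another key in properties. The trade-off is that Map values share one type here (String), so numeric attributes have to be cast at query time, as the last query does.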