Multidimensional Data Analysis (OLAP) Technology Selection (2): How Data Analysis and OLAP Differ

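{"type":"doc","content":[{"type":"heading","attrs":{"align":null,"level":1}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"This article discusses how data analysis and OLAP differ. Only when that difference is clear can we understand the scenarios in which a candidate technology will be used; technology selection detached from its usage scenario is like water without a source or a tree without roots. How well we understand the scenario determines the solution we choose. Take the parable of the blind men and the elephant: if you have only touched the elephant's ear, your solution is to fetch a basket and carry the ear away. If you have touched the whole elephant, your solution becomes a shipping container and a truck to haul it off. But if you have touched the whole herd, your solution may be to build bridges and roads.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The size of your world depends on how much of the world you have discovered, and your solution will be only as big as that world.","attrs":{}}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"So, to choose an OLAP technology that fits our business scenarios, we first need a solid understanding of the scenario OLAP serves: data analysis.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"What Is Data Analysis","attrs":{}}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Data analysis means applying appropriate statistical methods to large amounts of collected data, summarizing, understanding, and digesting that data so as to make the fullest use of it and bring out its value. It is the process of studying data in detail and summarizing it in order to extract useful information and form conclusions. 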
(Baidu Baike)","attrs":{}}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"That is Baidu Baike's definition of data analysis, and it sounds rather abstract. Put more concretely, ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"data analysis means turning a business question into a data question, then processing and analyzing the data to produce results that can ultimately support business decisions","attrs":{}},{"type":"text","text":".","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/3e/3ef6738c2634de6cbbee25c0166c1ac7.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"That is still hard to grasp, so let us make the definition concrete with an example.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The figure below shows order details. Company leadership wants to know which product is the most popular and which product contributes the most; please provide an analysis.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/ed/edff8953fc4759b277e503986c92c77d.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Applying the definition of data analysis above, we need to pin down a few points:","attrs":{}}]},{"type":"paragraph"
,"attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"What the business question is: which product is the most popular?","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"What it becomes as a data question: product sales volume","attrs":{}}]}]}],"attrs":{}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"The Data Analysis Process","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/da/dad1f67aaa1cbf0502bf923a099eed76.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Step 1: Define the purpose","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Clarifying the purpose of the analysis and settling on an analytical approach is the precondition for effective data analysis, and it gives the later steps a clear direction.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Step 2: Collect the data","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Data collection supplies the raw material and evidence for the analysis.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Step 3: Preprocess the data","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Preprocessing cleans and organizes the collected data into a form suited to analysis; it is an indispensable stage before the analysis itself.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Step 4: Analyze the data","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Analysis means applying appropriate methods and tools to the data to extract valuable information and reach sound conclusions.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Step 5: Present the data","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Analysis results can be presented as:","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Tables: describe the data more precisely","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Charts: convey the point more intuitively","attrs":{}}]}]}],"attrs":{}},{"type"
:"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Step 6: Write the report","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The data analysis report summarizes the whole analysis process; it uses rigorous analysis to assess how well the business is operating and gives decision makers a basis for action.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Analytical Approach","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"When defining the purpose, you need to work out the analytical approach, break the purpose down, and build an analysis framework: decide which angles to analyze from and which metrics to use. The approach guides the analysis from a high level, decomposing complex problems and refining vague questions into specific ones.","attrs":{}}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Which dimensions must the analysis cover to be complete, with nothing left out?","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"How many steps can the analysis be divided into?","attrs":{}}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Which KPIs need to be defined to express the results?","attrs":{}}]}]}],"attrs":{}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Five typical analytical approaches","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/75/751d1280cc5da04b85e24c8d1fe98e3e.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","tex
t":"Data Analysis Methods","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To understand data analysis, start with its basic methods. Knowing them helps you better understand what your users (data analysis engineers) need and communicate with them more efficiently.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Three broad categories of commonly used data analysis methods","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/14/14fc231f25acd8d6ff2a1429525eb3ed.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"How Data Analysis and OLAP Differ","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For an introduction to OLAP, see the article ","attrs":{}},{"type":"link","attrs":{"href":"http://www.woshipm.com/data-analysis/486373.html","title":"","type":null},"content":[{"type":"text","text":"OLAP Online Analytical Processing: An Essential Skill for Data Product Managers","attrs":{}}]},{"type":"text","text":", so it is not explained again here. The core idea of OLAP is to build a multidimensional data cube, with dimensions and measures as its basic concepts and metadata in support, so that data can be presented flexibly, systematically, and intuitively through operations such as drill-down, slice, dice, and pivot.","attrs":{}}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/28/28b82d9373d678bd2636dde8085d2c6d.webp","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Based on the discussion of data analysis and OLAP above, the differences between them can be summarized as follows:","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":
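null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"1. OLAP is better described as one solution for data analysis. ","attrs":{}},{"type":"text","text":"OLAP first preprocesses the data into a data cube, computing every likely aggregation in advance. When a user then selects a particular aggregation, OLAP can derive the answer quickly from the precomputed results, which lets it support timely analysis of very large data volumes.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"2. OLAP's analytical approach runs from hypothesis to verification. ","attrs":{}},{"type":"text","text":"OLAP is a top-down, progressively deeper verification tool: it typically queries and analyzes data on the basis of a hypothesis the user has formed, and then extracts the relevant information. For example, when planning shelf layout for a supermarket, a data analyst might first assume that baby diapers and baby formula are often bought together, and then use an OLAP tool to check whether that assumption holds.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"3. OLAP summarizes existing data. ","attrs":{}},{"type":"text","text":"OLAP provides descriptive models; it tells you, for example, how a given product's sales in a given region compare with last year's.","attrs":{}}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Summary","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Return to the reason we examined the differences between data analysis and OLAP: choosing an OLAP technology that fits the business scenario. The analysis above shows that OLAP is only one solution for data analysis: by building a multidimensional data cube and cross-cutting it along multiple dimensions, it lets analysts observe and understand data from many angles and at many levels. Along with these conveniences, the solution has shortcomings. The biggest is that the business is flexible and changes often, so the business model changes often as well, and once the business dimensions and measures change, engineers have to ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"redefine and regenerate the entire Cube (multidimensional data cube)","attrs":{}},{"type":"text","text":". Another is that business users ","attrs":{}},{"type":"text","marks":[{"type":"strong","attrs":{}}],"text":"can only perform multidimensional analysis on that Cube (multidimensional data cube)","attrs":{}},{"type":"text","text":", which keeps them from quickly changing the angle from which they analyze a problem and turns a so-called business intelligence system into a rigid daily reporting system.","attrs":{}}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"So is our OLAP technology selection only about satisfying the OLAP scenario? Clearly not. There is no need to be boxed in by the OLAP concept; the selection should be driven by business scenarios and requirements, since technology is only a tool for delivering the business. When selecting an OLAP technology, then, the aim is not merely to cover the OLAP scenario but to find out how the chosen technology can better serve data analysis. For example, some selections need to consider whether the technology supports cross-database queries, or whether it can connect to message middleware to enable real-time data analysis.","attrs":{}}]}]}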
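The precompute-then-query idea behind a data cube can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular OLAP engine works: the order rows, the dimension names (`product`, `region`, `year`), the measure (`amount`), and the helper names `build_cube` and `query` are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical order details; every value here is made up for illustration.
ORDERS = [
    {"product": "A", "region": "North", "year": 2020, "amount": 100},
    {"product": "A", "region": "South", "year": 2020, "amount": 50},
    {"product": "B", "region": "North", "year": 2020, "amount": 70},
    {"product": "B", "region": "North", "year": 2021, "amount": 30},
]
DIMENSIONS = ("product", "region", "year")


def build_cube(rows, dimensions, measure):
    """Precompute the sum of `measure` for every subset of `dimensions`."""
    cube = {}
    for size in range(len(dimensions) + 1):
        # combinations() keeps the input order, so keys follow `dimensions`.
        for dims in combinations(dimensions, size):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in dims)
                totals[key] += row[measure]
            cube[dims] = dict(totals)
    return cube


def query(cube, **filters):
    """Answer an aggregate query by looking up one precomputed cell."""
    # Order the filtered dimensions the same way the cube was built.
    dims = tuple(d for d in DIMENSIONS if d in filters)
    key = tuple(filters[d] for d in dims)
    return cube[dims].get(key, 0)


cube = build_cube(ORDERS, DIMENSIONS, "amount")
total = query(cube)                                 # grand total over all rows
product_a = query(cube, product="A")                # slice on one dimension
b_north = query(cube, product="B", region="North")  # drill down one level
```

Each query is a dictionary lookup rather than a scan of the order rows, which is the benefit of pre-aggregation. The sketch also makes the cost visible: adding a dimension or changing the measure means calling `build_cube` again over all rows, which mirrors having to regenerate the whole cube when the business model changes.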