Cloud-Native Data Middle Platform: Technologies and Trends

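{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Data middle platforms have gone through roughly four major stages on the way to their current form: database, data warehouse, big data platform, and data middle platform. Each shift came about to solve the problems left by the previous stage."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Today, going cloud native has become both inevitable and necessary for the data middle platform."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Where does cloud native come from?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Cloud native is a methodology for building and running applications on the cloud. 地雷, senior technical expert at 奇點雲 and general lead of its data intelligence platform DataSimba, points out that “cloud native” is not a new concept. Looking back over the history of cloud computing, applications from personal to enterprise-grade began “moving to the cloud” long ago."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At first, these “non-native” applications carried over the architecture of their on-premises deployments: local software was migrated to the cloud on ECS without modification. The drawback of ECS is that it can only host compute and cannot provide storage. Although the migrated applications did connect the business end to end, the “availability” of the original architecture dropped noticeably as the business grew."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To solve the data storage problem, domestic cloud vendors built cloud disks that attach to cloud hosts and back up data without any change to the program. This solved the “high availability” problem for traditional software on the cloud. It introduced another drawback, however: high cost. A customer who deploys unmodified Hadoop directly on ECS nodes stores data through HDFS on cloud disks, which is very expensive. The lower layer of HDFS therefore has to be modified so that data lands in object storage."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As requirements keep growing, systems must be re-architected around the technical characteristics of IaaS and PaaS to keep up with the explosive growth of business and data. Building on the architecture and operations methodology of on-premises deployment and the previous generation of software, and adding properties such as “high availability” and “low cost”, “cloud native” emerged as the upgrade."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"What are the technical elements of a cloud-native data middle platform?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The concept of “cloud native” is by now familiar. Why, then, stress that “the cloud-native data middle platform is the future”? The hard requirement for tiered, multi-domain data governance, the inherent cost and efficiency benefits of cloud-native technology, and the requirement that domestic infrastructure be independently controllable all push the data middle platform toward cloud native."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"奇點雲 therefore groups the technical elements of a cloud-native data middle platform into six points: CI\/CD (continuous integration and continuous delivery), containerization, an object system, storage-compute separation, cross-cloud multi-domain data governance, and metadata management. Of these, the object system, cross-platform support, and independent controllability are the newly emerging elements. DataSimba, 奇點雲's cloud-native data middle platform, provides cross-cloud management of multiple workspaces to help customers govern and migrate data and applications across clouds."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"CI\/CD (continuous integration and continuous delivery)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The essence of CI\/CD is to make development and deployment more efficient. At large business volumes, the labor cost of operating big data and cloud systems is extremely high, so operations need to be automated with a wide range of automation tools and big data prediction algorithms. A version control system and DevOps infrastructure provide automated testing and continuous integration. In a typical flow, a developer pushes code to a specific tag, which triggers the automated API test scripts to run and send a report. Testing, release, and deployment are automated in this way. On top of that, a dedicated data environment is built to automatically check critical interfaces and pipelines."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Containerization"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Containerization is essentially a virtualization technique: a single host can run upwards of a thousand containers. Each container starts faster and takes up less space, and resources can be allocated elastically according to the actual size of the application, with no need to buy extra servers, which speeds up development. Container orchestration infrastructure governs services and jobs, eliminates version hell, and greatly improves operations and integration efficiency. Container orchestration and CI\/CD go hand in hand."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In a data middle platform, dozens of machines and hundreds of processes often run at the same time, and those processes run customers' programs as well as the platform's own. The underlying microservices therefore involve a large number of processes. Security and compliance also require that different customers' programs be isolated from one another. As a result, a data middle platform places higher demands on containerization than other cloud-native applications do."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Object system"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Core objects are abstracted from the existing business and exposed as API services in standard RESTful style, decoupling the core objects from business-layer services so the platform can meet the needs of different environments and business scenarios. This set of orthogonal core objects makes up the platform's object system, on which upper-layer business can build applications and evolve efficiently."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The object system's API should be elegant and backward compatible, because once released it is very hard to change. For example, a misspelled word introduced during the development of WIN32 still cannot be corrected decades later. The object system therefore needs to be designed thoroughly and precisely. As an open platform, 奇點雲's cloud-native data middle platform exposes a set of objects through its API, such as projects, jobs, data, source data, and accounts, behind a centralized data interface."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Storage-compute separation"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Because the cloud is distributed by nature, data cannot naturally be stored on ECS instances in the cloud. Critical data and stateful data must therefore be kept in object storage, and a large number of on-premises components need to be rewritten. If conventional open-source big data engines such as Hadoop and Spark are run directly on cloud hosts, the storage cost and throughput pressure of massive data volumes will quickly overwhelm the customer."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"An intermediate cache must therefore be introduced to separate compute from storage: data is stored in object storage while remaining compatible with the HDFS protocol, and capacity scales elastically with business demand, which greatly reduces cost and improves cluster performance."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Cross-cloud multi-domain data governance"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A major advantage of a cloud-native data middle platform is that it can span clouds and domains. For example, if a customer running the data middle platform on AWS needs to move to another platform, a cloud-native data middle platform can be migrated directly without code changes. For large enterprises with multiple lines of business and huge data volumes, suppliers must be diversified so that data assets are not locked in to a single platform."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"So even while working with one supplier, a customer also needs an independent third-party data middle platform that provides cross-cloud multi-domain data governance, improving the controllability and security of the infrastructure. DataSimba, 奇點雲's cloud-native data middle platform, acts as that “third party” and solves the cross-cloud multi-domain governance problem for enterprises whose data lives on several clouds."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Metadata management"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With data volumes growing sharply, managing the data has become a major problem. The metadata management capability of a cloud-native data middle platform standardizes the management of meta-information such as data structures, metrics, tags, permissions, upstream and downstream lineage, and production jobs, and establishes an intelligent data governance system. It also supports applications such as data inventory, security auditing, lineage analysis, and criticality grading, ultimately turning data into assets. For example, one top brand has 73 business systems whose data is stored in different databases and storage media, and needs to bring all 73 systems onto a single data middle platform and connect their tags. Under that requirement data governance is essential, and its core is metadata management. A cloud-native data middle platform must therefore provide metadata management."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"What problems can a cloud-native data middle platform solve for users?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A data middle platform with the six technical capabilities above is a significant upgrade brought by going cloud native. Building on these capabilities, which problems can it actually solve for users, and how does it cut costs and raise efficiency?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Higher R&D efficiency"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Microservices, CI\/CD, the object system, DevOps, and related techniques speed up iteration and strengthen control and automated operations in the complex cloud environment. They make code development, testing, and release more efficient and lower the cost of each iteration."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Lower operations cost"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The same techniques also enable efficient collaboration between development and operations, speed up the response to failures, and deliver continuous integration and delivery. Rapid application deployment becomes an important part of business processes and of the company's competitiveness, and operations costs fall."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Lower storage and compute cost"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The storage and compute costs of big data infrastructure are staggering. Storage-compute separation and containerization use IaaS resources more efficiently and reduce storage cost. Once storage and compute nodes are separated, compute resources can be added quickly without expanding storage. In addition, each container starts faster and takes up less space, and resources can be allocated elastically according to the actual size of the application, with no need to buy extra servers."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Higher governance efficiency"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Governance efficiency is not limited to data governance. It also covers microservice governance, system governance, and API governance, all of which call for automated design and frameworks. Cross-cloud governance, metadata management, and related techniques greatly improve the efficiency with which an enterprise accumulates data assets, reduce security risk, and increase supplier diversity."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"About the author"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"地雷 is a senior technical expert at 奇點雲, general lead of its data intelligence platform DataSimba, and the first-generation product manager of ODPS, Alibaba's core underlying big data engine, with prior experience supporting algorithm and application development at Ant Financial, Cainiao, and others."}]}]}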