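What Kind of Database Do We Need in the Cloud Era?

{"type":"doc","content":[{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Introduction"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Database technology has been evolving for half a century. Turing Award winner Michael Stonebraker, in "},{"type":"text","marks":[{"type":"italic"}],"text":"Readings in Database Systems"},{"type":"text","text":", divided data model technology into nine eras and types. With the arrival of the cloud era, we can look at the past and future of foundational technologies such as databases from an entirely new perspective."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"With cloud computing, foundational IT technologies, databases included, have changed substantially, from their technical form to the way online and offline markets combine. Database technology is migrating from traditional centralized systems to the distributed systems of the cloud era, which brings both opportunities and challenges for China's domestic databases."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In November 2020, Gartner released its 2020 evaluation report on database vendors. Three Chinese database vendors made the list, marking a new stage of development for databases in China."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Gartner also predicts that by 2022, three quarters of the world's databases will run in the cloud. In our view, cloud databases are "},{"type":"text","marks":[{"type":"strong"}],"text":"currently moving from a first stage, “databases going to the cloud”, that is, from database to cloud database, to a second stage, “from cloud database to cloud-native database”"},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"So what exactly have cloud databases done to earn industry recognition? Where is database technology heading? How can we keep pace with technical innovation in this era of cloud convergence and new opportunity? With domestic databases now a hot topic, we would like to share our understanding and thinking."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"How foundational IT has changed in the cloud era"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As cloud computing has developed, foundational IT has changed dramatically in several respects:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"IT infrastructure deployment has moved from fragmented to centralized and large-scale."},{"type":"text","text":" In the past, each enterprise built its own data center and IT infrastructure, from servers and networks to operating systems and databases, so enterprise IT as a whole was fragmented. Today, with cloud services, enterprise IT infrastructure is centralized and benefits from scale, and expectations for efficiency, performance, and cost have risen accordingly."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"IT delivery has moved from delivering software to delivering services."},{"type":"text","text":" In the past, whether you bought commercial software or used open source products, software was distributed as a commercial or open source package. Now it is delivered entirely as services. As a result, users no longer need to work out how many servers to buy; when they need a database, they simply use one in the cloud."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Development is shifting from low-level work, in which business teams build on and call low-level APIs directly, to SaaS and Serverless services."},{"type":"text","text":" In the cloud, developers can draw on a wide range of SaaS services. In terms of both efficiency and underlying technical capability, this is a huge change."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As for data formats and application scenarios, both used to be relatively uniform. Traditional databases, for example, were used mainly in traditional sectors such as finance, telecom operators, and government. "},{"type":"text","marks":[{"type":"strong"}],"text":"With the growth of the internet, the mobile internet, and the industrial internet, every industry is accelerating its digitization, and application services are diversifying. The data formats and application scenarios found in industry are therefore increasingly diverse, placing more demands and challenges on the underlying database."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the past, industry workloads consisted mostly of structured data, and relational databases could support the vast majority of needs. Now many other kinds of database have emerged, such as NoSQL and graph databases; NoSQL itself subdivides into key-value, document, and other types, and the number of database types keeps growing. This is entirely reasonable. It means that databases will themselves diversify, and that they will converge and innovate as they do. Conventional experience says that a technology product with a single form should aim to be as general-purpose as possible; under today's diversified demands, however, applying a technology requires all kinds of trade-offs."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"These are the new challenges and requirements that the cloud era brings to databases. With cloud databases now the prevailing trend, we believe that for domestic cloud databases to develop well, they need continued breakthroughs in core capabilities, cost efficiency, productization, and the integration of future technologies."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Challenges in the evolution of cloud database technology"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Given the characteristics of cloud computing, domestic cloud databases face two sets of demands: continued breakthroughs in core capabilities such as availability and consistency, high-concurrency performance, and elastic scalability; and building a new generation of distributed database products for the diversity of the cloud era."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First, availability and consistency."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For a database, high availability and data consistency are the most basic challenges. High availability means reaching 99.999% or better, which allows only about five minutes of downtime a year; strong data consistency means the data is never wrong and the database is highly reliable."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the cloud era, upgraded infrastructure changes how technology is implemented. In finance, for example, systems used to rely on highly stable, centralized mainframes or minicomputers to guarantee availability and consistency. But centralized architectures have clear technical limits on performance and throughput. Today they face serious throughput and performance bottlenecks and cannot meet the needs of cloud-era industry."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Naturally, the industry is moving to distributed architectures on open, distributed platforms such as x86. "},{"type":"text","marks":[{"type":"strong"}],"text":"Traditional systems rely on extensive redundancy designed into mainframe or minicomputer hardware, which guarantees availability and consistency at the hardware level. New-generation distributed systems deployed on x86 machines, by contrast, face a new challenge: guaranteeing data consistency and high availability while also delivering performance and unlimited horizontal scaling"},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Second, performance and cost."}]},
{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the cloud era, achieving scale without lowering cost is unacceptable. Cloud computing should raise the utilization of resources across society, so the cost of performance must be kept as low as possible."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"For Tencent Cloud's services, the question is how to let customers buy the best service at the lowest price: the most disk space and the best TPS, for example, for the least money. The key to this is resource utilization."},{"type":"text","text":" For example, if a cloud provider raises resource utilization by 20%, it can substantially cut costs for both its customers and itself."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Third, cloud native necessarily means elastic scaling."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Elastic scaling means allocating and using resources according to the user's actual needs"},{"type":"text","text":", rather than through advance purchase or pre-allocation as in the past. Customers used to estimate first and then buy, so resource utilization was a constant complaint. Now users no longer need to forecast how many resources they might use; capacity scales with real-time demand. This is how cloud databases gain their cost advantage: by raising resource utilization. Extreme elasticity, however, places higher demands on a database's SQL support and distributed transaction capabilities."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Fourth, the degree to which cloud databases are productized and offered as services."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Databases in China have gone through several stages of development, but it was the rise of cloud computing and the internet that let many Chinese cloud vendors seize the opportunity and, starting from the characteristics and needs of their own business scenarios, develop a new generation of databases and other foundational software. Building a technology stack around internal business scenarios is an advantage for internet companies, but opening it up to enterprise (to B) customers also raises challenges in product standardization, generality, and user experience. Offering a technology product to industry customers is far more demanding than supporting internal use. For traditional enterprise customers, Tencent Cloud wants to deliver a complete product, not a half-finished one. That is why productization is a capability Tencent has consistently emphasized."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Fifth, validation across a massive range of scenarios."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A final key point: for a cloud database, the development of core capabilities, including stability and features, depends on having enough application scenarios in which to refine them. Developing and perfecting a database system is a very complex process. How does a database get put into practice and into use? From where we stand today, we believe sustained refinement across a massive range of scenarios is a key condition for a product's development."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"These challenges are an unavoidable part of cloud database development, and they are also our opportunity to create a new generation of distributed database products in the cloud era."}]},
{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Key future trends for cloud databases"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Given these challenges and the opportunities of the cloud era, we expect the following major trends in cloud databases:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Elastic scaling: addressing resource utilization, the core cost problem"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As noted above, cost and performance are core factors. This points to something distinctive about the cloud era: we need to schedule infrastructure resources such as CPU, memory, and disk flexibly."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the cloud database era, we will address performance, efficiency, and cost together by exploring architectures with extreme elasticity. Depending on the scenarios they emphasize, cloud-native distributed databases fall into two architectures: Shared Nothing and Shared Storage. Both can gain better overall elasticity by separating compute from storage, overcoming drawbacks of traditional architectures such as limited storage capacity, difficult scaling, and high primary-replica replication lag, while also keeping costs lower and fully realizing the cost benefits of leading-edge technology."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Database services in which both compute and storage are fully Serverless are another direction worth watching. On top of automatic, transparent scaling up and down, they bill by actual usage, with no charge when nothing is used, which increases the utility of cloud databases."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Hyperconvergence of database internals and services amid the multi-model, multi-engine trend"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"New infrastructure and the industrial internet are developing quickly, digitization is accelerating in every industry, and data is becoming ever more varied and voluminous. To solve database performance, cost, and service problems as efficiently as possible, hyperconvergence is the inevitable trend."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We are in the middle of a wave in which every industry is pursuing digitization and digital transformation, and new scenarios keep emerging. As the foundational software underpinning all kinds of IT architectures, databases have taken on many new forms in response: numerous NoSQL implementations; in storage, the traditional B+ Tree and the more recent LSM Tree, along with row-store and column-store products; and, classified by workload, OLTP, OLAP, and HTAP databases that combine the two."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"These many engines will, in most cases, not serve an enterprise or a system on their own. One size fits none. Technically, extreme performance and cost efficiency are inherently at odds with generality. In diversified scenarios, therefore, multiple engines must coexist, each playing to its own characteristics and strengths, if we are to get both peak performance and generality."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"But should we, as a cloud database provider, simply expose all these engines to customers and developers and let them choose? From the standpoint of product experience, certainly not. The current multi-model, multi-engine landscape inevitably makes selection and application development harder for developers, who must fit different scenarios while still getting high enough performance; this is a dilemma for database development today. To resolve it, we hope that in future users will not have to make these complex choices. Instead, the system will use solutions such as AI-based intelligent scheduling and Serverless to provide a unified, standardized service across engines. At the lowest level, developers will not need to be aware of specific product choices; when they run data analysis, for example, the system will automatically schedule the option that performs best while guaranteeing transactional consistency."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"On this basis, another trend in cloud database services is the convergence of delivery models, including integrated hardware and software and the merging of private and public cloud platforms, so that customers can manage the balance between sensitive workloads and operating costs at a finer grain."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Intelligence: AI+DB"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Another future trend is the integration of intelligent technologies into the underlying technology ecosystem to achieve autonomous, intelligently managed databases. In the past, a few DBAs could perhaps manage an enterprise's few dozen instances. At Tencent, however, with hundreds of thousands of database instances, operations cannot be sustained by adding staff, which forces us to solve operational efficiency with tools or platforms. In addition, with the current shift to distributed microservices, enterprise IT operations will have an ever stronger need for autonomy. Integrating intelligent technologies into the database's lower layers enables intelligent management across the database's entire life cycle."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Realizing the benefits of new hardware faster"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the past, new hardware took a long time to spread, and many traditional enterprises were very conservative about buying it. Cloud vendors are better placed to explore new hardware early and step by step, for example by starting with non-critical applications and drawing on a massive range of scenarios for validation, and then rolling it out steadily at scale. In this sense, cloud-native databases built on cloud services can explore and realize the benefits of new hardware more easily."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We are also in an era of constant hardware innovation, including SSD, NVM, RDMA+SPDK, thousand-core servers, and heterogeneous processors. Through cloud database services, customers and ordinary developers alike can benefit from new hardware sooner."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Convergence, autonomy, and utility are therefore the basic characteristics of future enterprise-grade distributed databases. Tencent Cloud Database will put these trends into practice to meet the diverse database needs of customers in every industry."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"About the author:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Pan Anqun (潘安羣),"},{"type":"text","text":" head of database technology at Tencent Cloud and a member of the China Computer Federation (CCF) blockchain technical committee. Since joining Tencent, Pan has led the development of Tencent Cloud's distributed database, blockchain, and related technologies as a core technical lead, and his team has built the Tencent Cloud database TDSQL and the enterprise blockchain platform TBaaS, among others. He has more than 13 years of experience in distributed computing and distributed database research and development, and his work has been accepted several times at top international conferences such as VLDB. TDSQL, the secure and controllable distributed database his team built, is the first domestic distributed database in the industry to be used in the core transaction system of an internet bank, the first to enter a traditional bank's core system, and the first to help a large traditional bank move off the mainframe onto a distributed platform, a first for the banking industry."}]}]}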