From NoSQL to Lakehouse: Apache Doris's 13-Year Technical Evolution

{"type":"doc","content":[{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"整理:蔡芳芳、Tina"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"採訪嘉賓:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"百度 Apache Doris 主創團隊"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"馬如悅、張志強、陳明雨、武雲峯、楊政國、繆翎、魯志敬等"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"從2008年第一個版本開始到今天,"},{"type":"link","attrs":{"href":"https:\/\/github.com\/apache\/incubator-doris","title":null,"type":null},"content":[{"type":"text","text":"Apache Doris"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"已經走過了13個年頭。從推出之初爲了滿足百度商業系統的業務專用需求,到後來爲解決通用報表與數據分析需求進一步改造,並在2017年改名Palo開源(詳見InfoQ當時"},{"type":"link","attrs":{"href":"https:\/\/www.infoq.cn\/article\/baidu-palo-database-announces-open-source","title":null,"type":null},"content":[{"type":"text","text":"報道"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"),再到2018年用回Doris這個名字並進入Apache軟件基金會孵化,Apache Doris的願景一直是成爲世界頂級的分析型數據庫產品。但與此同時,進入雲原生時代,Apache 
Doris也已經有了它新的定位和目標。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"早在Apache Doris開源之初,InfoQ就曾採訪過項目負責人馬如悅,而今年正好是這個項目開源的第四個年頭,我們再一次找到百度Apache Doris主創團隊,跟大家聊聊Apache Doris的過去、現在和未來。據透露,目前Apache Doris的畢業籌備工作已經啓動,團隊接下來的工作重心之一就是推動 Apache Doris儘快從Apache基金會畢業成爲頂級項目。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"以下內容整理自訪談實錄。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Apache Doris的新目標"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"InfoQ:Apache Doris發展至今,已經13年了,如果要將發展歷史劃分成幾個階段,您們認爲是怎樣的?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Apache Doris團隊:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Doris的十多年曆史,走到今天,我們重新去審視,去掉細枝末節,大體可以分爲三個階段:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"“NoSQL\"階段(2008-2011年)"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這個階段主要是滿足百度商業系統幾個大業務的專用需求。這幾個業務,需要給幾十萬到幾百萬的客戶或者用戶提供實時的報表分析與可視化能力。而傳統的分析數據庫,基本上主要支撐公司內部自己的BI需求,而這些BI需求,對數據入庫的時效性、查詢的併發性、查詢的延遲性要求都不是很高。所以使用傳統的分析數據庫根本無法支撐互聯網公司全新的分析需求。當時,我們採用了那時候市場上比較火的NoSQL KV數據庫來存取數據,並且自己實現了一個專用的分佈式查詢引擎,這個查詢引擎不是SQL接口,而是類似REST API,提供了一些聚合函數調用給業務使用來解決需求。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"“NewSQL\"階段(2012-2020年)"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這一階段的研發重點主要是滿足以下新的需求:1) 通用的報表與數據分析需求開始增多,大家需要SQL接口;2) 
原來的KV存儲引擎無法提供足夠的性能支撐越來越多的需求。所以,我們開始研發新的Doris系統。首先,我們研發了全新的單機列式存儲引擎olapengine,先是使用單機MySQL來做SQL查詢引擎,通過分庫分表方式來解決分佈式大規模問題;後來又將單機的列式存儲引擎改造爲全分佈式列式存儲引擎,把單機MySQL查詢引擎改造爲MPP的SQL查詢引擎。分佈式存儲和分佈式SQL查詢引擎的改進,大大提升了性能和應用場景滿足度,Doris在百度被大規模採用。2017年,Doris也正式對外開源。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"“"},{"type":"link","attrs":{"href":"https:\/\/www.infoq.cn\/theme\/106","title":null,"type":null},"content":[{"type":"text","text":"LakeHouse"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"\"階段 (2021年開始)"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"隨着用戶需求不斷進化和雲計算技術的廣泛推進,Doris需要考慮離線在線一體化、存算分離、實時更新、半結構化數據分析支持等需求。這些需求總結下來,簡單地說就是用戶希望擁有傳統MPP數倉和基於數據湖的湖分析融合能力。目前Doris就處在這一階段,正在全力研發這些新的功能。"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"InfoQ:Apache Doris的設計目標是爲了解決什麼問題?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Apache Doris團隊:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"因爲技術和需求會隨着時間發生變化,Doris也會跟着每個階段去制定不同的目標。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"第一階段Doris主要還是滿足專用系統的統計分析需求,第二階段主要是滿足通用的報表與數據分析可視化需求。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"到今天,我們發現用戶或者客戶對數據的分析需求,逐漸收斂爲三大塊:"}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"50%的需求依舊是各類報表和數據分析可視化需求,就是我們經常提的BI的需求;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"20-30%的需求,是對日誌等半結構化數據的搜索分析需求;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","marks":[{
"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"20-30%的需求,是對數據科學與機器學習的需求;"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"而新的Doris將會針對這三類場景,進行重點功能和性能設計,以便支撐這三類需求。"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"InfoQ:Apache Doris最初的定位是什麼?10多年過去後,這個目標定位是否有了變化?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Apache Doris團隊:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Doris最初的定位是"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"新式數倉"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",滿足在線的數據分析場景,主要以高併發小查詢的性能最爲出色。但是發展到今天,它的定位正在發生變化,這個主要變化可以用一個"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"T形(一縱兩橫)"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"來說明。一縱就是指把原來Doris最擅長的在線結構化MPP數據分析性能優化到最快,而導入實時化、存儲讀寫性能優化、計算性能優化,這些會學習和借鑑ClickHouse的一些設計。兩橫之一是"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"支持半結構化數據"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",當前全球很多對日誌等半結構化數據分析都使用Elasticsearch,Doris後續會加強對ES所支持場景的滿足能力;另一橫,就是"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user
"}},{"type":"strong"}],"text":"擁抱雲原生技術"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",支持存算分離,支持較大的查詢,滿足對數據科學與機器學習場景的支持,這一塊需要多去借鑑Snowflake和Databricks的一些設計。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"當前Doris的新目標,就是主攻這個類似T形的一縱兩橫。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"只關注性能過於片面"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"InfoQ:現在業內出現了越來越多的各種OLAP軟件,相比較起來,您認爲Doris具有什麼樣的優缺點?適合什麼樣的使用場景?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Apache Doris團隊:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Doris和很多其它競品不大相同的,主要是源於產業實踐。數據庫技術不同於應用層軟件,數據庫技術的研發需要積累多年,並且還要經歷大規模的實踐檢驗。在實踐中發現問題、發現需求,然後解決,這樣整個系統纔會比較實用。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Doris運維非常友好:很多數據庫公司研發數據庫,但是自己又沒有大規模使用,所以對運維友好性支持欠缺。Doris來自於實踐,所以在多年的發展中增加了大量方便運維的特性,比如高可用、方便的擴縮容等。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"比如爲了節省成本,Doris支持分層存儲,即一個表的一個Partition分區,可以設置爲過了多久以後自動從SSD磁盤轉移到SATA硬盤上。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"比如Doris的後端節點,需要管理員在前端主節點手動添加,好多人可能不理解,爲什麼不是後端節點自動彙報?問出這個問題,就可以發現其沒有一線工程經驗,自動彙報會帶來很多潛在的運維風險,都是我們曾經有過的血淚教訓,比如一個很久以前死掉的節點,突然重新啓動,那麼很可能就會誤加入進來,造成查詢不可控。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"比如Doris支持物化視圖和基礎表的數據一致性,這都是源自一線業務對數據一致性的強烈要求,業務無法接受物化視圖表和基礎表的不一致,因爲對終端用戶來講,不一致會帶來很多的理解問題。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"綜上,Doris裏面有大量的這種設計,這些功能對於不是一線運維的同學,或者運維經驗不豐富的同學,可能不會了解到其好處,反而還會認爲是壞處。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Doris主要做的不好的我認爲有兩處,一個是對傳統數倉的兼容性,畢竟它來自互聯網公司,在推廣到傳統數倉領域時,在一些SQL兼容性上遇到了一些問題,當前正在優化解決;另一個是對雲原生技術的全面擁抱,Doris最初設計時,主要還是考慮私有化部署,那時雲計算還不火。但當前雲技術的採用正在加速,所以Doris後續也會加強對雲原生的深度融合適配。"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"InfoQ:2017年,您在"},{"type":"link","attrs":{"href":"https:\/\/www.infoq.cn\/article\/baidu-palo-database-announces-open-source","title":null,"type":null},"content":[{"type":"text","text":"InfoQ的採訪"}]},{"type":"text","text":"中說過“性能不該是唯一關注點”,現在您們對Apache Doris的要求是否有變化?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Apache 
Doris團隊:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們的觀點還是沒有變化,雖然市場上依舊是看性能爲主。我們認爲一個生產級別的數據庫,要綜合考慮各個方面,穩定性、易用性等,都需要考慮在內。比如,很多人一直抱怨Doris沒有"},{"type":"link","attrs":{"href":"https:\/\/github.com\/ClickHouse\/ClickHouse","title":null,"type":null},"content":[{"type":"text","text":"ClickHouse"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"快,這個我是認爲比較片面的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"就拿性能來說,一個在線系統,尤其針對高併發的在線分析系統,需要關注整個系統對衆多併發查詢都能提供穩定的響應,還要充分考慮預留足夠的資源給可能突發的一些查詢。如果一個查詢就把所有磁盤和CPU全部用滿,那麼其它查詢如何保證得到足夠的資源進行響應?多併發來了,如何保證系統內存不崩?所以,"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"有些設計不是能不能做到的問題,而是要考慮應該不應該這樣做的問題。"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"比如Doris的每個查詢,就會控制內存和IO線程的使用,並不是全量將系統的算力資源耗盡,而是在儘量滿足性能響應需求的情況下,理性控制其使用量。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"而易用性、運維友好這個可以追求極致,你會看到Doris爲了"},{"type":"link","attrs":{"href":"https:\/\/www.infoq.cn\/article\/PHF3gFjUTDhWmctg6kXe","title":null,"type":null},"content":[{"type":"text","text":"不額外引入ZooKeeper"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這種系統造成運維複雜,自己研發了一套內置的多FE系統。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"當然,我們在面向To B推廣Doris時,很多人經常會通過單一SQL的查詢性能來衡量這個系統優還是劣,POC測試對性能非常看重。針對這些情況,Doris後面會採用類似汽車中的駕駛模式那種形式,提供Normal和Sport模式。當你將Doris設定爲Sport模式時,Doris將會以性能最快方式運行,榨取系統每一滴算力。而Normal模式,我們更建議在線上使用,以保持系統的穩定性和應對突發請求的能力,不要讓系統始終運行在崩潰邊緣。"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"InfoQ:您們團隊在這幾年的維護過程中,投入了多少人力,解決了哪些比較關鍵的技術問題?做了哪些功能優化?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Apache Doris團隊:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這幾年團隊成員有過變化,但團隊規模一直在穩步增加,目前好幾個方向的人員數量加起來有40多人,既包含了Doris Core 
核心數據庫的研發,也包含了百度智能雲上產品和外圍生態組件的前後端開發人員,還有一支實力強大的產品和運營團隊。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"從開源至今,在社區的共同努力下,Doris得到了前所未有的飛速發展,做了非常多的功能迭代和更新。主要包括以下幾方面:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"流式導入功能幫助 Doris 從分鐘甚至小時級別的導入延遲推進到了秒級,更好地支撐了準實時的業務需求;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"完全重構了存儲引擎,提升擴展性的同時,支持了包括二級索引、字典壓縮編碼在內的多項實用功能;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"進行了大量的大數據生態打通工作,包括 
"},{"type":"link","attrs":{"href":"https:\/\/spark.apache.org\/","title":null,"type":null},"content":[{"type":"text","text":"Spark"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"、"},{"type":"link","attrs":{"href":"https:\/\/flink.apache.org\/","title":null,"type":null},"content":[{"type":"text","text":"Flink"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"、"},{"type":"link","attrs":{"href":"https:\/\/github.com\/elastic\/elasticsearch","title":null,"type":null},"content":[{"type":"text","text":"ES"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"、"},{"type":"link","attrs":{"href":"https:\/\/github.com\/apache\/hive","title":null,"type":null},"content":[{"type":"text","text":"Hive"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"、"},{"type":"link","attrs":{"href":"https:\/\/kafka.apache.org\/","title":null,"type":null},"content":[{"type":"text","text":"Kafka"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
的直接連通,使得Doris不再成爲數據孤島;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在明細數據上擴展了預聚合模型,完成了明細、聚合模型的數據統一訪問;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"全新的向量化執行引擎和資源隔離方案也即將發佈,將進一步提升 Doris 的數據分析性能和業務應用場景;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"還有其他非常多的穩定性和易用性的提升,也是得益於開源後社區用戶的不斷打磨和反饋。"}]}]}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"InfoQ:Apache Doris和數據湖架構之間有哪些區別和聯繫?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Apache Doris團隊:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Doris最初設計是存算一體化的MPP數據倉庫,偏在線分析。而數據湖架構的分析,主要是存算分離,偏離線或者交互式分析,存儲引擎一般是HDFS或者對象存儲,而分析引擎類似Spark\/Hive\/Presto。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"從去年開始,大家已經開始廣泛地推進Data Warehouse和Data Lake架構的融合,即是所謂的湖倉一體,"},{"type":"link","attrs":{"href":"https:\/\/www.infoq.cn\/theme\/106","title":null,"type":null},"content":[{"type":"text","text":"Lakehouse"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"的架構。Doris也正在從數倉架構向Lakehouse演進。"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"InfoQ:在周邊生態上,最近幾年有了一個什麼樣的變化?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Apache Doris團隊:最大的變化就是SQL的取勝,實時的取勝,雲原生的取勝。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"SQL的取勝:從使用Java寫MapReduce、Pig,用Scala寫Spark程序到PySpark,最終還是SQL笑到了最後,SQL佔據了數據分析的80%;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"實時的取勝:人們對於速度的追求是無止境的,一個事情不能做,希望可以做到,這個事情可以做到了,希望能越快越好。數據分析領域正在全面擁抱實時化的需求,希望實時的數據導入,希望實時的數據產出。從離線做起的Hive、Spark正在不斷優化查詢性能,而那些直接從實時性能切入的MPP數倉和實時湖分析,比如Presto,正在全面攻佔在線實時市場;"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"雲原生的取勝:雲原生已經不再是噱頭,而是正在成爲關鍵賦能技術,Snowflake的大賣,讓雲原生成爲每個數據分析產品都繞不開的領域。"}]}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"基礎設施軟件必然要開源"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"InfoQ:您們當初是如何選擇開源的時機的?Doris加入Apache經過了一個什麼樣的流程?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Apache 
Doris團隊:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Doris從13年設計新版時,就考慮到了未來會開源出去,所以,我們在13年設計時,就沒有依賴百度內部任何一個庫,並且整個系統也不依賴百度任何服務就可以獨自運作。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"百度很多系統難以開源,主要是開始設計時,對百度內部閉源庫和內部系統的依賴較多,導致開源的時候需要大量重寫,最終使得開源難度非常大。Doris沒有這個問題。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Doris從13年就堅信未來基礎設施軟件必然是開源的"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",只有開源才能保持活力和持續迭代。並且像Doris這種基礎軟件,需要較大投入,如果不開源,不尋找其它價值點,是很難讓一個大公司持續投入資源來維持其不斷髮展的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https:\/\/www.apache.org\/","title":null,"type":null},"content":[{"type":"text","text":"Apache"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"是對開源極其友好的基金會,在大數據領域,Apache軟件基金會的項目都極具影響力,比如Hadoop和Spark都是Apache軟件基金會的項目,所以Doris開源時也選擇了Apache軟件基金會。"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"InfoQ:您們認爲什麼樣的開源軟件可以稱之爲是開源成功的?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Apache 
Doris團隊:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們認爲衡量開源的成功與否關鍵在於以下三點:"}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"被廣泛認可的產品價值"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"繁榮、自治、良性發展的社區生態"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"開源與商業化的平衡與共存"}]}]}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"InfoQ:您們怎麼看開源文化?您們團隊是如何構建開源文化的?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Apache 
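Doris team: "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"For any engineer, open source has become something of a belief. On one hand, there is the sense of achievement that comes from solving problems for more people; on the other, broad community participation is bound to bring more vitality to a project. So we strongly encourage team members to take part in open source."}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"InfoQ: What experience from participating in open source can you share with everyone?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Apache Doris team: "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"An open source community is more than its maintainers. Every user of an open source product is in fact a member of the community. While using open source products, users can also give back to the community, and that is what gives an open source product stronger vitality."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Here are two quotes from users in our community: “In the course of open source work, you will meet like-minded friends, earn their recognition and support, and even get to exchange ideas with industry leaders you admire.” and “Each of us is capable of making the community better. While the community helps us support our business successfully, we should also do what we can to give back to and help the community. Even a single documentation fix is a help.”"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"That is also the idea we want to convey: there is really no barrier to participating in open source, and we hope more people will join in building the community. Whether it is filing an Issue or joining discussions to help us polish the product and enrich its features, revising and improving the system documentation, contributing use cases that show us Apache Doris can do more in real business scenarios than we imagined, or spreading the word so that more people learn about Apache Doris, every one of these helps Apache Doris 
take another step forward on its path of growth!"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"InfoQ: How do you view competition and cooperation between open source project communities? Do you have any advice or words to share about China's open source market?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Apache Doris team: There is really no such thing as competition between open source communities; on the contrary, there is a great deal of room for cooperation."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Code and community should not be lumped together: code is code, and community is community. "},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The people who use the code are users, and users are completely free. How they choose an open source product and its code is determined by their own technical judgment and business needs, so the competition here exists at the code level. The open source community, however, sits above the code, which is the Apache principle of Community Over Code. Anyone can take part in a community; whether or not they are users, and whether or not they have a need, they can join in their own independent capacity."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Communities start and mature at different times. Cooperation between communities helps each of them reach a wider audience, helps emerging communities grow faster, and lets open source code draw on richer nourishment."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 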
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"對於中國開源市場,希望能有更多的開源項目可以蓬勃發展,這也會讓每一個人從中受益。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"開源與商業化協同"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"InfoQ:您們如何理解開源和商業化之間的關係?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Apache Doris團隊:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"當前大量底層技術產品都採用開源模式,客戶也願意採用開源產品,所以大環境也會逼着你去開源;另外,在商業市場中存在着 2\/8 原則,即 80% 的收入來自 20% 的付費用戶,而另外 80% 的用戶貢獻收入並不高,然而前者無論開源與否,都可能付費;而後者則更喜歡開源產品;但是,其中最重要的一條規律是,前面 20% 付費用戶的選擇會參考後面 80% 用戶的選擇。因此從商業上來看,讓產品開源,讓 80% 的用戶免費使用你的產品,必然會帶來很好的口碑,這直接會影響到那 20% 的高付費用戶,20% 的這羣高付費用戶更多地關注服務。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"所以,"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"對於未來的技術產品,開源可能成爲必須,這個“必須”不一定損害商業模式,反而會促進商業上的成功。"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"最近一兩年我們也跟很多面向開源軟件領域的投資人有過多次溝通,開源和商業化的之間必定是相互成就的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"但開源與商業化如何協同是當前和未來開源面臨的問題。"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"開源與商業化需要找到一個良性並存的方式,才能將開源推向另一個高度。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"當前開源與商業化如何協同,業內都在探索,還在苦苦尋求中。付費技術支持、Open Core、SaaS模式仍然是三個主要的商業化模式,但是在實際操作中都有其大的問題。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"但是,我相信,隨着各類基於開源的商業化公司的不斷探索,成功與失敗,最終一定會探索出比較好的商業模式。"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"InfoQ:Doris的商業化路徑是怎麼規劃的?目前已經有哪些商業客戶?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Apache Doris團隊:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"商業化路徑方面,我們認爲雲上纔是未來,因此我們數年前就在百度智能雲上推出了基於Apache Doris的企業版產品Palo並提供了雲端託管服務,通過雲服務的優勢(比如按需取用和更加可控的海量資源、從繁瑣的運維工作中解放人力等)去滿足更多企業上雲的需求。我們"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"雲上Palo的核心代碼與開源版完全一致"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":",避免用戶可能擔心被公有云廠商強綁定。我們公有云託管服務的價格,比用戶購買物理機甚至雲上虛機自行搭建的費用還要低。我們還基於Palo提供了管控運維平臺等一系列雲上組件,通過豐富的外圍組件給用戶帶來體驗更加的雲上服務,目前我們的自助分析平臺Studio和可視化運維監控Manager已經逐步成熟起來。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"目前我們已經拿下的商業化客戶大概有接近50家,包括銀聯商務、知乎、四川航空等,更具體的數字就不進一步展開了。"}]},{"type":"heading","attrs":{"align":null,"level":4},"content":[{"type":"text","text":"InfoQ:Doris所在的市場或所覆蓋的應用場景,市場潛力還有多大?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Apache Doris團隊:"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"數據分析場景主要是三大塊:數據倉庫與商業智能、日誌檢索與分析、數據科學與機器學習場景,這三大場景佔據了客戶80%的數據分析需求。這三大場景的不斷髮展,未來一定會將數據分析的需求推爲企業No.1的需求。從各大諮詢調研報告來看,數據分析產品的增長依舊位列各種軟件產品的第一位。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"本文選自《中國卓越技術團隊訪談錄》(2021 年第五季),"},{"type":"link","attrs":{"href":"https:\/\/www.infoq.cn\/minibook\/LxX7bFUwKH17bzxQkSKt","title":null,"type":null},"content":[{"type":"text","text":"點擊下載電子書"}]},{"type":"text","text":",查看更多獨家專訪!"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"《中國頂尖技術團隊訪談錄》品牌升級,現正式更名爲《中國卓越技術團隊訪談錄》"},{"type":"text","text":",這是InfoQ打造的重磅內容產品,以各個國內優秀企業的IT技術團隊爲線索策劃系列採訪,希望向外界傳遞傑出技術團隊的做事方法\/技術實踐,讓開發者瞭解他們的知識積累、技術演進、產品錘鍊與團隊文化等,並從中獲得有價值的見解。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"如果你身處傳統企業經歷了完整的數字化轉型過程或者正在互聯網公司進行創新技術的研發,並希望 InfoQ 可以關注並採訪你所在的技術團隊,可以添加微信:caifangfang842852,請註明來意及公司名稱。"}]}]}]}