A ClickHouse High-Availability Architecture Supporting 70 Billion Rows of Data

**Speaker introduction**: Cai Yueyi, senior R&D manager and senior architect at the Hotel R&D Center of Ctrip, responsible for the hotel accommodation data intelligence platform, the merchant-side data center, and big-data innovation work.

Hello everyone, I'm Cai Yueyi from Ctrip. Today I'll share how ClickHouse is used in our big-data platform, mainly from an application point of view, focusing on our high-availability architecture. I didn't worry much about the "tens of billions" in the title; I checked before coming, and the platform currently holds close to 70 billion rows, about 8 TB before compression and 1.8 TB after compression on disk. I used to believe that giving users a good big-data experience inevitably meant scaling out across a large fleet of machines, but ClickHouse changed my mind.

Today's talk is mainly about using ClickHouse well: using hardware resources sensibly, and matching the architecture to our data volume and usage scenarios so the platform delivers a good user experience. Everything currently runs on about ten physical ClickHouse machines, and by my own estimate, even if the data volume doubled, only the SSD storage would need expanding; CPU and memory would not become bottlenecks.

I'll cover these topics: why we chose ClickHouse, the high-availability architecture we run in production, ClickHouse's strengths, and finally its current problems and our plans for the future.

## I. Why ClickHouse

### Choosing based on real business scenarios

#### 1. Query conditions and aggregation dimensions are not fixed

Our underlying data all sits in Hive, and we provide the business with a real-time big-data computation platform, so the query conditions are not fixed. For hotels, for example, there is a pile of filters: domestic, province, city, business district, star rating, and so on. Conventional SQL databases cannot support this kind of scenario: there are too many filter conditions and we cannot force users to pick particular ones, so database indexes and the leftmost-prefix rule are basically useless. The same goes for aggregation dimensions: in big data, at least eight out of ten queries sum data in some way rather than fetch row-level detail, so the aggregation dimensions are not fixed either.

#### 2. Data volume keeps growing, and so does the volume updated each day

The platform has been built up over roughly the last five years, accumulating reports for every business line. As everyone has probably experienced at work, once something is launched it is very hard to retire it regardless of its value, so you have to keep operating it. As things accumulate, the data volume keeps growing, the scenarios multiply, and the maintenance cost keeps rising.

#### **3. Business scenarios keep increasing and the coverage keeps widening**

The system serves every business line, and many of its users are executives.

#### **4. High availability with responses in seconds**

If we cannot return results within seconds, that is our technical problem. From an engineering point of view, delivering a product with a good user experience means delivering results in seconds.

#### **5. Continuous research and practice across SQL Server, ES, Kylin, Ignite, CrateDB, MongoDB, and HBase**

Driven by the scenarios above, we evaluated many databases.

We started with SQL Server. When we began building the platform in 2015, today's wealth of mature big-data technology simply did not exist.

Around 2016 I started applying ES (Elasticsearch) in some scenarios. ES has been widely adopted in recent years, so most people know it well: it is very strong at search and very fast, especially for wide-table order queries and product-information search, and it easily handles QPS in the thousands.

Kylin is also a big-data technology; essentially it spends machine time precomputing results in advance, but if your dimensions are not fixed, building the Cube takes a very long time.

Ignite is an in-memory database. I tested it in 2018 on ten 8-core/24 GB virtual machines; it really did deliver second-level performance, but it cannot sustain high concurrency: once concurrency rises, memory gets blown out. Being entirely based on distributed memory, its hardware cost compares poorly with ClickHouse, so I dropped it.

CrateDB is built on top of ES but solves ES's inability to join; however, it likewise blows out memory under high concurrency. ES's query syntax is fairly complex, and CrateDB fixes that by letting you fetch data with plain SQL.

MongoDB needs indexes and depends heavily on the leftmost-prefix rule. When a query hits an index, performance is indeed good, but our search conditions are not fixed, so we cannot count on hitting an index every time.
HBase is a database for unstructured data storage and is also unsuitable for real-time aggregate computation.

### **Characteristics of ClickHouse**

After this series of trials and comparisons, we finally settled on ClickHouse around July 2018. I am not sure how familiar everyone is with it; it is an OLAP database that only gained recognition among most large Chinese companies in the last year or two. When I started using it in 2018 there was very little material about it on Baidu. It is a column-oriented database open-sourced by a Russian search-engine company (Yandex).

#### 1. Advantages

##### **1) High compression ratio, low storage cost**

ClickHouse's biggest trait is speed; on top of that come the high compression ratio, low storage cost, and so on. We used to keep a lot of event-tracking data in ES, but since the start of this year we have moved essentially all ES tracking data to ClickHouse. Based on ClickHouse's compression ratio alone, we can already estimate hardware cost at least 60% lower than the ES-based solution; I will not expand here on the log query-performance comparison between ES and ClickHouse.

##### 2) Supports common SQL syntax, writes extremely fast, well suited to large-scale data updates

Its syntax is similar to MySQL, with one quirk: joins cannot be chained directly. When table A joins table B, you cannot join table C directly; you first have to wrap "A join B" as an aliased subquery and then join C. So its syntax is mainly distinctive around joins; complex queries end up with long join chains and are less readable than ordinary SQL. On the other hand, its write speed is extremely fast, ideal for our offline data where hundreds of millions to billions of rows are updated every day; the official material quotes an import speed of 50-200 MB per second.

##### 3) Sparse indexes and columnar storage make full use of CPU and memory for excellent compute performance, with no leftmost-prefix rule to worry about

It relies on sparse indexes and columnar storage. Queries usually touch only a few columns, and column-wise storage is IO-friendly, reducing the number of IO operations, which also helps query speed. It also makes heavy use of CPU: MySQL fetches data single-threaded, whereas ClickHouse by default uses about half of the server's cores per query; on our usual 40-core or 32-core physical machines, that means roughly half the cores pulling data. The number of CPUs per query can be changed in the configuration. Because a single query consumes so much CPU, high concurrency is a weak point. On the plus side, there is no leftmost-prefix rule to worry about: even when the filter condition is not in the index, ClickHouse queries remain very fast.

#### 2. Disadvantages

##### 1) No transactions, no real update/delete

There are no transactions and no real update/delete, but the main weakness is high concurrency, so we only deploy it in scenarios we can keep under control. If a service faces the public internet, where QPS is hard to bound, ClickHouse should be used with caution.

##### 2) Does not support high concurrency; tune the qps-related settings to your situation

ClickHouse is CPU-hungry: ten team members running the same query can saturate a 40-core physical machine. So why did I say earlier that ten physical machines are enough to carry 70 billion rows? Because we protect ClickHouse in many ways.

## II. ClickHouse in the Big-Data Platform

### Architecture of ClickHouse in the hotel data intelligence platform

This is the architecture diagram from our production system. Most of the underlying data is offline, with a smaller real-time part. For offline data we now have close to 3,000+ jobs pulling data from Hive into ClickHouse every day; real-time data mostly comes from external feeds and is written into ClickHouse in batches. More than 80% of the data in our data intelligence platform lives in ClickHouse.

![](https://static001.infoq.cn/resource/image/12/89/1276cdf3ffyyd07686f27a0474f71b89.png)

On the application side there is also a caching layer for queries; in effect all data passes through the cache first, because if everything were fetched straight from ClickHouse, it could not withstand the concurrency. When I first ran ClickHouse, CPU on several physical machines would spike every morning around 9 o'clock; once CPU reaches roughly 60-70%, response times slow down, queries start queuing up, a vicious circle forms, the servers can get knocked over, and no queries get through at all.

To address this we cache the common queries; I will expand on the caching scheme later.

### **ClickHouse full-data synchronization flow**
ClickHouse is very friendly toward syncing data from MySQL; it is much like dumping one MySQL table into a temp table: you just supply the server address, account, and password. **The diagram below shows our data synchronization flow:**

1. **Truncate table A_temp, then ETL the latest data from Hive into A_temp;**
2. **rename A to A_temp_temp;**
3. **rename A_temp to A;**
4. **rename A_temp_temp to A_temp.**

![](https://static001.infoq.cn/resource/image/96/13/965bef46ec57d34d1ebcc7612ccf8413.png)

Our ETL tool initially could not load from Hive into ClickHouse directly, so we went through MySQL with a program we wrote ourselves to push from MySQL into ClickHouse, and a job keeps polling to check whether the insert-into process has finished.

Full synchronization is actually simple; with our current ETL tooling it could be made even simpler, with no MySQL staging table to maintain and no program job doing the renames. We keep it this way only for a uniform technical approach, because the incremental flow below works the same way.

When loading through MySQL, tens of millions of rows do not land in ClickHouse in one second. Every insert gets a process ID, and renaming before the insert finishes would lose data and corrupt the result, so a job must poll until the insert's process ID has completed, and only then run the series of renames.

### **ClickHouse incremental synchronization flow**

Many of our single tables exceed several hundred million rows, so daily updates are done incrementally wherever possible; if everyone did full reloads, even the best servers would be dragged down. **The diagram below shows our incremental synchronization flow:**

1. **Truncate table A_temp, then ETL the most recent 3 months of data from Hive into A_temp;**
2. **insert the data older than 3 months from A into A_temp;**
3. **rename A to A_temp_temp;**
4. **rename A_temp to A;**
5. **rename A_temp_temp to A_temp.**

![](https://static001.infoq.cn/resource/image/13/b1/13c0eebe8865810fcd8a85e542669db1.png)

Take order performance data, for example. Our orders are special: the status of an order from two months ago may still change, and bookings can be made many months in advance. Unlike shopping orders, whose status is basically settled after a month, or express and logistics orders, settled within two weeks, hotel and flight orders can be booked far ahead, and whether a historical order eventually materialized can change long afterwards. The time span of status changes is long, so the amount of historical data we update is relatively large.

Our incremental updates mainly cover the window from three months ago into the future, because the last three months capture the great majority of changes. We first load those three months into a temp table; as the diagram shows, a job polls until the 3-month import has completed, and only then moves the data older than three months from the live table into the temp table. That step is not done in a second or two either, so another job polls it, and in practice we discovered a hidden risk in this step.
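The rename-based swap in the numbered steps above, with the polling the talk keeps stressing, can be sketched as follows. This is a dry run that only records the SQL it would issue; the `execute` stub, the query id, and the table name are all illustrative, not Ctrip's actual tooling:

```python
# Dry-run sketch of the truncate -> load -> poll -> rename rotation.
# `execute` just records statements so the ordering can be inspected;
# a real client (e.g. clickhouse-driver) would send them to the server.
import time

issued = []

def execute(sql):
    issued.append(sql)

def insert_finished(query_id):
    # Stand-in for polling whether the ETL insert is still running,
    # e.g. checking system.processes for the insert's query id.
    return True

def full_sync(table):
    tmp, tmp2 = f"{table}_temp", f"{table}_temp_temp"
    execute(f"TRUNCATE TABLE {tmp}")
    # ...ETL loads the latest Hive data into the temp table here (elided)...
    query_id = "etl-insert-1"          # illustrative id
    while not insert_finished(query_id):
        time.sleep(10)                 # poll until the insert has completed
    # Only after the insert is confirmed complete do the renames run;
    # renaming a half-loaded table would publish incorrect data.
    execute(f"RENAME TABLE {table} TO {tmp2}")
    execute(f"RENAME TABLE {tmp} TO {table}")
    execute(f"RENAME TABLE {tmp2} TO {tmp}")

full_sync("A")
print(issued)
```

The incremental variant differs only in step 2, where data older than three months is first copied from the live table into the temp table before the same three renames.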
For instance, a table may already hold 500 million rows while the daily update touches only the 30-50 million rows of the last three months; moving the remaining hundreds of millions of rows from the live table into the temp table causes CPU and memory fluctuations. This is a problem I have been trying to solve. Our current approach is exactly that second step, copying the live table's history into the temp table, after which the flow is the same as the full reload, cycling every day.

I find the whole incremental flow rather complicated, so two weeks ago I discussed it with Tencent Cloud's ClickHouse team. For this kind of case there is no good solution from the ClickHouse application side for now, but their suggestion was to partition the problem by business scenario, letting data that no longer changes settle into fixed scenarios for user queries. That is one option.

We also experimented with views, but views cost us about 2 extra seconds per query, which I cannot accept. So we are now adopting REPLACE PARTITION; it involves file operations, and although it executes in milliseconds, we are still rolling it out cautiously as a gray release.

### **Protection mechanisms around ClickHouse**

Our caching architecture is how we protect the ClickHouse cluster, and it comes in an active and a passive form.

![](https://static001.infoq.cn/resource/image/b4/41/b4e284ef6c3c6b27f30dd1b3d01a1b41.png)

The figure above shows the active caching architecture. First, most data is loaded from Hive into ClickHouse, and every dataset is stored in two or three copies.
We do not use ClickHouse's built-in distributed tables. In early 2019 we hit a problem with them: when you query node A and it has to fetch data from nodes B and C, there is internal data exchange inside the distributed layer; under high enough QPS and pressure, one node going down can affect the whole cluster. Of course, with enough nodes you do not need to care about this, but with only three or four nodes we felt a distributed setup added little, so at the time I simply removed it.

Another consideration is that the distributed setup depends on ZooKeeper. We operate most of our ClickHouse ourselves, and to make ClickHouse highly available we would first have to make ZooKeeper highly available.

That architecture would make us strongly dependent on ZooKeeper and add a lot of operational burden for the team. We discussed the same topic with the Tencent Cloud colleagues: the bottleneck ends up in ZooKeeper, because the huge daily data updates make ZooKeeper produce huge volumes of logs. So I rely entirely on physical machines: the same dataset is kept on two or three machines, a job checks whether the data for a given table has finished loading on each of them, and then polls to verify that the copies on the machines are consistent, which doubles as data monitoring. Doing the same correctness checking on a distributed cluster would require even more hardware, so at our scale I consider single-machine multi-copy storage reasonable.

Back to caching. To cache table A, for example, a job is configured with the relevant ETL flows and keeps polling whether today's runs have completed; once they have, a cache flag is set for table A, say a timestamp. Applications then read that cache-flag timestamp and build it into a new cache key, so our cache works as the figure above shows.

First fetch the value of the cache flag and build it into a cache-lookup key, then check whether the cache holds data for that key. If it does, return straight from the cache; only if it does not do we fetch from ClickHouse and write the result into the cache, so the next identical request is served from the cache.

Anyone who builds big-data products knows that if the product is good enough, the default conditions already show users most of the data they need, and they rarely change the query conditions. Based on this habit, we simulate the users who accessed certain hot data over the past 5 days and pre-build caches for their default scenarios. This is what we usually call active caching.
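The flag-in-the-key scheme described above can be sketched in a few lines. The point is that bumping a table's flag timestamp after its ETL completes implicitly invalidates every old cache entry, with no explicit deletion. All names are illustrative, and a real deployment would use Redis or similar rather than in-process dicts:

```python
# Read-through cache whose key embeds a per-table "cache flag" timestamp.
import hashlib
import json

cache_flags = {}   # table -> timestamp, set by the job that watches ETL flows
cache_store = {}   # cache key -> cached result

def build_key(table, query_params):
    flag = cache_flags.get(table, 0)
    payload = json.dumps(query_params, sort_keys=True)
    # The flag is part of the key, so a new flag makes all old entries stale.
    return hashlib.md5(f"{table}:{flag}:{payload}".encode()).hexdigest()

def query(table, params, run_clickhouse):
    key = build_key(table, params)
    if key in cache_store:                  # hit: ClickHouse is never touched
        return cache_store[key]
    result = run_clickhouse(table, params)  # miss: query and fill the cache
    cache_store[key] = result
    return result

calls = []
def fake_clickhouse(table, params):        # stand-in for the real query path
    calls.append(params)
    return {"rows": 42}

cache_flags["A"] = 1600000000              # today's ETL finished, flag bumped
query("A", {"city": "SH"}, fake_clickhouse)
query("A", {"city": "SH"}, fake_clickhouse)  # second call served from cache
cache_flags["A"] = 1600086400              # next day's ETL bumps the flag
query("A", {"city": "SH"}, fake_clickhouse)  # stale key misses, queries again
print(len(calls))
```

Active caching then amounts to replaying recent users' default queries through `query` right after the flag is bumped, so real users hit a warm cache at 9 a.m.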
With caches pre-built by simulating users, most of the queries users run when they start work at 9 o'clock hit the cache, which both improves the platform's overall response time and reduces the concurrency hitting ClickHouse. We obviously cannot intercept 100% of user queries, but according to our tracking the platform now serves 600,000+ queries a day, more than half of them from the cache. Passive caching works much like active caching, except that active caching has a job simulate users to build caches after the ETL flows complete, while passive caching relies entirely on real users.

![](https://static001.infoq.cn/resource/image/34/53/34b8e43d42ca586d3915e0e438edd353.png)

The figure above shows our CPU fluctuation. When we first went live in 2018 I had to be in the office before 9 a.m. every day because I worried the ClickHouse servers would not hold up; I no longer worry about that, because the caching mechanism protects them well enough.

### **ClickHouse cluster architecture**

The figure below shows our ClickHouse cluster architecture; these are all virtual clusters.

![](https://static001.infoq.cn/resource/image/74/32/742yy8576a16dab77797b7bc89cf6232.png)

#### 1. Reads are load-balanced by the application
Our machines sit in two machine rooms, which serve both as mutual backups and as load balancing.

#### 2. Each virtual cluster has at least two machines in different machine rooms

Every virtual cluster must have at least one machine in each of the two machine rooms, so that if one room has network trouble, the other room carries the queries.

#### 3. Independent data, multiple writes, no mutual interference

I call them virtual clusters because they are clusters created by our own program: the application reads the data on a pair of machines through per-cluster connection strings. Put simply, we can split along business lines; for example, domestic data can live on machines 01 and 02, international data on 03 and 04, and other operational data on 05 and 06.

#### 4. Flexibly create different virtual clusters for the right situations

One benefit of this approach is full machine utilization, especially when the data people watch in day-to-day work differs from what they watch over holidays; new clusters can be put together for that, which keeps things flexible.

#### 5. Adjust servers at any time, adding or removing them

Server resources can be used fully, instead of every new scenario requiring a machine request and a lengthy onboarding process.
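The five points above can be condensed into a small sketch: a "virtual cluster" is nothing more than a named group of connection strings in the application, each group spanning both machine rooms, with the application round-robining reads across the replicas. Host names and groupings below are illustrative, not the actual production layout:

```python
# Application-side virtual clusters: connection-string groups plus
# round-robin replica selection; no ClickHouse distributed tables involved.
import itertools

VIRTUAL_CLUSTERS = {
    # each virtual cluster keeps at least one replica per machine room
    "domestic":      ["ch01.roomA:8123", "ch02.roomB:8123"],
    "international": ["ch03.roomA:8123", "ch04.roomB:8123"],
    "ops":           ["ch05.roomA:8123", "ch06.roomB:8123"],
}

_rr = {name: itertools.cycle(hosts) for name, hosts in VIRTUAL_CLUSTERS.items()}

def pick_replica(cluster):
    """Round-robin load balancing done entirely in the application layer."""
    return next(_rr[cluster])

picked = [pick_replica("domestic") for _ in range(4)]
print(picked)
```

Adding or removing a server, or carving out a holiday-only cluster, is then just an edit to the `VIRTUAL_CLUSTERS` map rather than a cluster-topology change.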
### **Platform query performance after adopting ClickHouse**

The figure below shows the platform's query performance. These screenshots are all from 2019, because the platform adopted ClickHouse in 2018. In early 2019 our PC version still exceeded 15 seconds more than ten thousand times a day, because some scenarios were still running on the SQL Server + ES architecture. When we started with ClickHouse in 2018 there were few industry case studies and I could not guarantee it was highly available, so at first we could only trial it in limited scopes. Around March-April 2019 we began replacing the SQL Server and ES architecture at scale, and within about five months we had fully decommissioned the old SQL Server/ES servers in favor of ClickHouse, after which query responses became very stable, with 99% within 3 seconds.

![](https://static001.infoq.cn/resource/image/57/31/57d564afe62162bc319808be2d219131.png)

The next chart is the response-time monitoring for mobile. Mobile initially exceeded 2 seconds several hundred times a day; after adopting ClickHouse, each mobile query interface now exceeds 2 seconds only three to five times a day.

![](https://static001.infoq.cn/resource/image/96/83/9680fd433642ded47bc132ab645eef83.png)

Our main KPI for the PC version is 3 seconds, and so far more than 99.6% of queries respond within 3 seconds. Of the total query volume, the 600,000+ queries a day mentioned above, over 98% respond within 1 second. The app KPI is 2 seconds. On both counts the effect of adopting ClickHouse is obvious, though some particularly complex queries still need case-by-case optimization, and big-data query performance needs long-term follow-up tuning.

![](https://static001.infoq.cn/resource/image/d8/e5/d894bdfa6f13744ce3710265ce7939e5.png)

From another angle, I see these numbers as the payoff of our accumulated big-data engineering: users have no idea how many billions of rows or how many tables sit behind a query, so when the business need is reasonable we cannot excuse slowness by saying "big data is big".

### **ClickHouse application summary**

Below is my summary of working with ClickHouse.

#### 1. Evaluate the partition field before importing data

ClickHouse stores data as partition files, and if the real-world granularity of your partition field is very fine, imports will overwhelm the physical machine. The data volume itself may be modest, but an ill-chosen field produces a huge number of small part files and the disk fills right up.

#### 2. Pre-sort data by partition before importing, to avoid writing too many partitions at once and outrunning ClickHouse's internal merges
Sorting the data before import reduces the number of partitions touched when ClickHouse merges parts asynchronously in the background afterwards; the fewer partitions involved, the lower the pressure on the server.

#### 3. Watch how data volumes evolve when joining left and right tables

Then there is the left/right join issue: in ClickHouse the large table must be on the left and the small table on the right. In some business scenarios, however, the volumes flip over time, so we need monitoring that detects this promptly so the join order can be corrected.

#### 4. Decide on a distributed setup based on data volume and application scenario

Going distributed should follow the application scenario. If your aggregated data volume already exceeds what a single physical machine's storage or CPU/memory can handle and you have to go distributed, ClickHouse does have a well-developed MPP architecture, but you then also have to maintain the keys it shards on.

#### 5. Monitor the servers' CPU/memory fluctuations

And do the monitoring well. As I said earlier, once ClickHouse's CPU climbs to 60%, slow queries appear almost immediately, so I monitor CPU and memory fluctuations with dump-like captures that we can pull down and analyze.

#### 6. Use SSDs for data storage wherever possible

Use SSDs for storage if you can. I started out on mechanical disks, and their problem is that after server maintenance and a restart the data has to be loaded; our single-machine storage now exceeds 20 billion rows, and that was my count from a few months ago. At that volume, with mechanical disks a restart could mean waiting several hours before the server is usable again, so use SSDs; restarts are much faster.

Restarts have another pitfall: they can leave data merges in disarray. So when I maintain machines, I never take down several machines of the same cluster at once, only one at a time; once machine A is back, its data is compared against its backup machine, because otherwise the machine may be up while the data is wrong, possibly across a large swath.

#### 7. Reduce redundant storage of Chinese text in the data

Reduce redundant Chinese free-text content, because it drives server IO very high, especially during imports.

#### 8. Especially suitable for large data volumes with controllable query frequency, such as analytics and event-tracking log systems

As for where to apply it, from a cost angle: we used to have lots of change logs for business data that developers habitually stored in MySQL, but this kind of data is an excellent fit for ClickHouse, at lower cost and with faster queries than MySQL.

## III. Current Problems and Roadmap for ClickHouse

### Problems to solve:

#### 1. Memory leaks in some scenarios

One problem we see is that when the data stored on a server exceeds its memory, memory leaks occur, just a little each day; for example a 128 GB machine may be down to roughly 60% usable memory after 2-3 months. That was still on the 2018 version; we are now gray-upgrading to this year's 20.9 release, and so far the issue does not appear to be resolved.

#### **2. CPU cost of updating historical data**

#### 3. Deadlocks

After the large daily data updates we swap tables in via rename to minimize the impact on users, but on tables with high query concurrency, renames could deadlock; this has been fixed in version 20.9.

### Suggestions:

#### 1. How can high-priority tables go back into production first after server maintenance?

One suggestion I have for ClickHouse: after a server restart, if the machine holds a lot of data, it can take a very long time to load and reorganize files before the server is usable. I have discussed this with the Russian development team: tier the tables by priority, start the high-priority tables first so the server can go back into production early, and load the remaining tables lazily.

##### New features in practice:

##### 1. The new 20.9 release supports syncing data by subscribing to the MySQL binlog

Among the new features, there is now a way to sync data by subscribing to MySQL's binlog, but we have seen recent releases still fixing bugs in it, so we have not adopted it yet. Once it matures it can serve more scenarios and make real-time data ingestion easier.
:null}},{"type":"heading","attrs":{"align":null,"level":5},"content":[{"type":"text","text":"2、Viewing execution plans"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In older versions you could only go to the server and read the execution logs, which was tedious; now you can view the execution plan directly with a SQL statement."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q&A"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q1: Hello Mr. Cai. We have not adopted ClickHouse yet, but I did some preliminary research and found two issues. First, when compared with ES, ClickHouse does not support high concurrency. Is that conclusion correct?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A1: Correct. ClickHouse should only be used where QPS can be kept under control. If you expose it to the public internet with no throttling, ten people clicking by hand can bring down a 40-core, 128 GB physical machine."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q2: You said over 80% of your business runs on ClickHouse, yet you still use Hive. Is another copy of the data stored there?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A2: Yes. The data warehouse team syncs production data into the warehouse every day, and a DM team then curates it, applying each business line's logic and building consolidated subject tables and wide tables. My team syncs the data we need into ClickHouse. My applications cannot call Hive directly, because as everyone knows that is slow, and alternatives based on Spark or Presto do not meet my performance or availability requirements. ClickHouse solves this, so the data we use exists once in Hive and once in ClickHouse."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q3: Got it, so datasets and data processing are still handled in the Hive layer, right?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A3: Right, we always sync Hive data to the application side through ETL."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q4: How often does ClickHouse sync the MySQL binlog?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A4: ClickHouse is better suited to batch writes, and the interval is configurable. We were going to try it, but then found every monthly release was still fixing related bugs, so we paused testing; we will test again once it is more stable."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q5: You mentioned that restarting a ClickHouse host can leave data inconsistent. We are testing some business scenarios and trying out ClickHouse, but I do not understand why data would become inconsistent after a restart. What is the cause?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A5: We have used ClickHouse for over two years, and in our experience, if a machine holds little data, say 60-70 GB, restarts are fine; but beyond 100 GB its file merges can go wrong. I have not tested exactly what data size becomes the bottleneck. I reported the issue to the Russian development team, but as far as I know there is no good solution yet."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q6: We export fixed-format batch files from Oracle, but when loading them into ClickHouse the data comes out inconsistent: some rows are missing, yet the import reports no errors. I have never known how to troubleshoot or handle this."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":
{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A6: You are syncing from Oracle directly into ClickHouse, correct?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q7: Not syncing; we export CSV flat files and load them in by other means, but it loses data."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A7: I have honestly never run into that problem; data loss would surface very quickly on my side. What differs in your setup is the extra CSV hop in the middle, which I advise against. CSV has many delimiter conventions, some fields may contain special characters, and you cannot realistically open the file and check it line by line. So I suggest skipping the CSV hop: it is fine for ad-hoc testing with some sample data, but I would not use it in production. More than two thousand users rely on my platform every day and my data is kept in multiple copies, so our monitoring would catch data loss easily, yet we have never encountered it."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q8: Could it be a quality problem with the data I export?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A8: Check the logs; data loads write logs too, so look at ClickHouse's internal logs. I have never seen anyone in the community report losing data during an import."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"I advise against CSV because it is an intermediate hop where you have no idea what happens to the data, and the number of rows in the file may change along the way."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q9: Which approach would be better in that case?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A9: ClickHouse supports syncing over from MySQL, batch writes from application code, and Spark ETL writes; there are many ways to write data in, but I do not recommend doing it through a CSV intermediate."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"Q10: When we tried creating a ClickHouse table on the MySQL engine, it took a very long time, and I could not estimate how long a given SQL statement would run."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"A10: At first we did not load data into ClickHouse through an ETL pipeline either; we wrote it to MySQL first and moved it with a scheduled job in our program. It is essentially copying one table to another on MySQL, just with a server IP, account and password added, so it is quite simple. That approach is fine up to tens of millions of rows, but the bottleneck falls on your MySQL server: with tens of millions to hundreds of millions of rows, it is questionable whether the data can even be read out, and large reads put heavy pressure on MySQL. So once our ETL tooling supported it, I migrated everything off and cut the dependency on MySQL. For each dataset you should evaluate what the source is; there are many ways to write into ClickHouse."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Reprinted from: DBAplus Community (ID: dbaplus)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Original link: "},{"type":"link","attrs":{"href":"https:\/\/mp.weixin.qq.com\/s\/vNHFOkG_2j9J2gvQA-fJWw","title":"xxx","type":null},"content":[{"type":"text","text":"支撐700億數據量的ClickHouse高可用架構實踐"}]}]}]}
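The daily-reload pattern the talk describes, loading into a staging table and then swapping it in with a single multi-table RENAME so readers never see a half-updated table, can be sketched as below. This is a minimal illustration, not the speaker's production tooling: the database and table names (including the `_source` table standing in for the ETL load step) are hypothetical, and the statements are only generated as strings rather than executed against a server.

```python
def build_reload_statements(db: str, table: str) -> list[str]:
    """Build the SQL sequence for an atomic staging-table swap.

    A multi-table RENAME in ClickHouse happens in one statement, so
    concurrent queries see either all old data or all new data.
    All names here are illustrative placeholders.
    """
    staging = f"{db}.{table}_staging"
    old = f"{db}.{table}_old"
    live = f"{db}.{table}"
    return [
        # 1. Start from a fresh staging table with the live table's schema.
        f"DROP TABLE IF EXISTS {staging}",
        f"CREATE TABLE {staging} AS {live}",
        # 2. Bulk-load the new day's data into staging (placeholder source).
        f"INSERT INTO {staging} SELECT * FROM {db}.{table}_source",
        # 3. Atomic swap: live -> old and staging -> live in ONE statement.
        f"RENAME TABLE {live} TO {old}, {staging} TO {live}",
        # 4. Discard yesterday's data once the swap is done.
        f"DROP TABLE IF EXISTS {old}",
    ]


for stmt in build_reload_statements("dw", "hotel_summary"):
    print(stmt)
```

Each string could then be run in order by whatever client the ETL job uses; the key point is step 3, where both renames ride in one statement instead of two.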