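InfluxDB is a popular time series database. It is written in Go, has no external dependencies, and is easy to install and configure, which makes it a good fit for building monitoring for large distributed systems.
1 Download and install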
wget https://dl.influxdata.com/influxdb/releases/influxdb-1.4.3_linux_amd64.tar.gz
tar xvfz influxdb-1.4.3_linux_amd64.tar.gz
mv influxdb-1.4.3_linux_amd64 ~/disk/influxdb14
Start the daemon
cd ~/disk/influxdb14/usr/bin
./influxd &
Create an admin user
./influx
show users
create user fsj with password 'fsj' with all privileges
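Set the auth-enabled field to true in the configuration file
Restart the service (if influxd was started by hand as above rather than installed as a system service, stop that process and start it again instead)
service influxdb restart
Log in again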
./influx -username fsj -password fsj
2 Configuration
View the current configuration: influxd config
Set a password
Enable authentication by setting the auth-enabled option to true in the [http] section of the configuration file
https://kiswo.com/article/1020
$ cat run_influxd.sh
app=/home/work/workspace/apps
log=/home/work/log/influxdb
influxd=$app/influxdb14/usr/bin/influxd
conf=$app/influxdb14/etc/influxdb/influxdb.conf
$influxd -config $conf 1>$log/stdout 2>$log/stderr & # -config specifies the configuration file
3 Basic usage
influx -precision rfc3339 -username fsj -password fsj
CREATE DATABASE mydb
SHOW DATABASES
> show measurements;
name: measurements
name
----
TableTest
> select * from TableTest limit 10 # names are case sensitive
name: TableTest
time App Area Cid ContentId FloorId FloorName Imei Page count
---- --- ---- --- --------- ------- --------- ---- ----- -----
1520921431000000000 TMall list 0 271 居家 863276004580322 $Home$ 1
1520921431000000000 JD banner 77 首頁 352042013052762 $Home$ 1
...
4 Aggregate functions
COUNT()
Returns the number of non-null values in a single field.
5 Continuous queries
List CQs
SHOW CONTINUOUS QUERIES
name: TableTest_stat
name query
---- -----
pv CREATE CONTINUOUS QUERY pv ON TableTest_stat BEGIN SELECT sum(count) INTO TableTest_stat.autogen.TableTest_pv FROM TableTest_stat.autogen.TableTest GROUP BY time(30m) END
pv_day CREATE CONTINUOUS QUERY pv_day ON TableTest_stat BEGIN SELECT sum(count) INTO TableTest_stat.autogen.TableTest_pv_day FROM TableTest_stat.autogen.TableTest GROUP BY time(1d) END
pv_hour CREATE CONTINUOUS QUERY pv_hour ON TableTest_stat BEGIN SELECT sum(count) INTO TableTest_stat.autogen.TableTest_pv_hour FROM TableTest_stat.autogen.TableTest GROUP BY time(1h) END
pv_minute CREATE CONTINUOUS QUERY pv_minute ON TableTest_stat BEGIN SELECT sum(count) INTO TableTest_stat.autogen.TableTest_pv_minute FROM TableTest_stat.autogen.TableTest GROUP BY time(1m) END
uv_day CREATE CONTINUOUS QUERY uv_day ON TableTest_stat BEGIN SELECT count(distinct(Imei)) INTO TableTest_stat.autogen.TableTest_uv_day FROM TableTest_stat.autogen.TableTest GROUP BY time(1d) END
Create a CQ
CREATE CONTINUOUS QUERY pv_all_1h ON TableTest_stat BEGIN SELECT sum(count) INTO "h.pv.all.1h" FROM TableTest GROUP BY time(1h),App,Area,Cid,ContentId,FloorId,FloorName,Imei END
CREATE CONTINUOUS QUERY uv_all_1d ON TableTest_stat BEGIN SELECT count( distinct(Uid)) INTO "h.uv.all.1d" FROM TableTest GROUP BY time(1d),App,Area,Cid,ContentId,FloorId,FloorName,Imei END
After creation, INTO "h.pv.all.1h" is automatically expanded to INTO TableTest_stat.autogen."h.pv.all.1h"
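The CQ statements above all share one shape, so they can be generated rather than typed by hand. The following is a minimal sketch, not part of any InfluxDB client library: the helper name build_cq and its parameters are made up for illustration, and it does no quoting or escaping of identifiers beyond the target measurement.

```python
def build_cq(name, database, select_expr, target, source, interval, tags=()):
    """Return a CREATE CONTINUOUS QUERY statement as a string (hypothetical helper)."""
    # time(<interval>) always comes first, followed by any tag keys to group by.
    group_by = ", ".join([f"time({interval})", *tags])
    return (
        f"CREATE CONTINUOUS QUERY {name} ON {database} "
        f'BEGIN SELECT {select_expr} INTO "{target}" FROM {source} '
        f"GROUP BY {group_by} END"
    )


stmt = build_cq(
    "pv_all_1h", "TableTest_stat", "sum(count)",
    "h.pv.all.1h", "TableTest", "1h", ["App", "Area"],
)
print(stmt)
```

Keeping the interval as a single argument makes it harder for the CQ name, the target measurement, and the time() bucket to drift apart.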
Drop a CQ
drop continuous query pv_all_1m on TableTest_stat;
Aggregation with GROUP BY
all_1d CREATE CONTINUOUS QUERY all_1d ON TableTest_stat BEGIN SELECT count(distinct(Imei)) as CDImei, sum(count) as SCount INTO TableTest_stat.autogen."h.all.1d" FROM TableTest_stat."10day".TableTest GROUP BY time(1d), App, Area END
all_1h CREATE CONTINUOUS QUERY all_1h ON TableTest_stat BEGIN SELECT count(distinct(Imei)) as CDImei, sum(count) as SCount INTO TableTest_stat.autogen."h.all.1h" FROM TableTest_stat."10day".TableTest GROUP BY time(1d), App, Area END
Note: all_1h is named for hourly aggregation and writes to "h.all.1h", yet it groups by time(1d); time(1h) was presumably intended.
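6 Retention policies
InfluxDB offers no convenient way to delete individual points (DELETE can filter only on time and tag values). Instead it provides retention policies, which specify how long data is kept: once data is older than the specified duration, it is deleted.
Create a retention policy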
> CREATE RETENTION POLICY "10day" ON "TableTest_stat" DURATION 240h REPLICATION 1 DEFAULT
> SHOW RETENTION POLICIES
name duration shardGroupDuration replicaN default
---- -------- ------------------ -------- -------
autogen 0s 168h0m0s 1 false
10day 240h0m0s 24h0m0s 1 true
> ALTER RETENTION POLICY "10day" ON TableTest_stat SHARD DURATION 1w
> SHOW RETENTION POLICIES
name duration shardGroupDuration replicaN default
---- -------- ------------------ -------- -------
autogen 0s 168h0m0s 1 false
10day 240h0m0s 168h0m0s 1 true
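The default autogen policy has a duration of 0, which means data is kept forever.
So how does shardGroupDuration affect data retention? That is covered in the section on shardGroupDuration further down.
Note that a retention policy works somewhat like a data partition: after the default policy is changed, data written under the old policy does not appear under the new one.
To query data stored under the old policy, prefix the measurement with the policy name.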
> select * from "h.pv.all.1h" limit 10;
> select * from "autogen"."h.pv.all.1h" limit 10;
name: h.pv.all.1h
....
InfluxDB also supports writing over HTTP.
See also:
- https://stackoverflow.com/questions/37729008/can-i-create-different-retention-policy-for-different-measurements-in-influxdb
- http://www.oznetnerd.com/influxdb-retention-policies-shard-groups/
- https://www.linuxdaxue.com/retention-policies-in-influxdb.html
7 Tags
> show series from TableTest limit 10;
key
---
TableTest,Action=Pay,App=JD,Area=list,Cid=0,FloorId=1,FloorName=辦公,Page=Home,Rid=4
TableTest,Action=Pay,App=TM,Area=list,Cid=0,FloorId=2,FloorName=運動,Page=Home,Rid=1
...
> show tag keys from TableTest;
name: TableTest
tagKey
------
Action
App
Area
Cid
ContentId
FloorId
FloorName
Page
Rid
Sid
> show tag values from TableTest with key="App" limit 30;
name: TableTest
key value
--- -----
App JD
App TM
...
> show tag values from TableTest with key="Area" limit 30;
name: TableTest
key value
--- -----
Area list
Area banner
...
8 Fields
Fields hold the values that are actually recorded. Like tags, they use the key=value form, and multiple fields are separated by ','.
> show field keys
name: TableTest
fieldKey fieldType
-------- ---------
Imei string
count integer
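Field columns cannot be used in GROUP BY.
If a column is first written as a field and later written as a tag, it ends up being both a field and a tag, and it still cannot be used in GROUP BY.
The only fix is to drop the measurement and recreate it.
9 Queries in practice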
select count(Imei) as PV,count(distinct(Imei)) as UV from TableTest where (time>=1521993600000000000 and time <1522080000000000000) and ((App ='JD')) and Action = 'View' group by App,time(86400s) tz('Asia/Shanghai')
select count(distinct(Imei)) AS CDImei, sum(count) AS SCount from TableTest where App='JD' and time>'2018-03-29' tz('Asia/Shanghai')
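In a WHERE clause, tag values must be wrapped in single quotes, not double quotes (a double-quoted value is read as an identifier, so the filter matches nothing).
> select time,App,Action,count from TableTest where Action='View' order by time desc limit 10;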
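The nanosecond time bounds in the first query of this section correspond to one calendar day in Asia/Shanghai. A small sketch of how such bounds can be derived is shown here; it assumes a fixed UTC+8 offset rather than a full time zone database, and the helper name day_bounds_ns is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

SHANGHAI = timezone(timedelta(hours=8))  # assumption: fixed UTC+8 offset, no DST


def day_bounds_ns(year, month, day, tz=SHANGHAI):
    """Return (start, end) of a local calendar day as epoch nanoseconds."""
    start = datetime(year, month, day, tzinfo=tz)
    end = start + timedelta(days=1)

    def to_ns(dt):
        # Whole seconds only, so the integer conversion is exact.
        return int(dt.timestamp()) * 1_000_000_000

    return to_ns(start), to_ns(end)


start_ns, end_ns = day_bounds_ns(2018, 3, 26)
print(f"time >= {start_ns} and time < {end_ns}")
```

The half-open range (>= start, < end) matches the query above and avoids counting the midnight point twice across adjacent days.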
Query from the command line
$ influx -precision rfc3339 -username admin -password x -database TableTest_stat -execute "YOUR SQL" -format=csv
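When the same command is driven from a script, building the argument list explicitly sidesteps shell quoting problems with single-quoted tag values. The sketch below only assembles the arguments, using the flags from the command above; the helper name influx_argv is hypothetical, and the credentials are the placeholders from the example.

```python
import shlex


def influx_argv(database, query, username, password):
    """Assemble the influx CLI arguments used above (hypothetical helper)."""
    return [
        "influx",
        "-precision", "rfc3339",
        "-username", username,
        "-password", password,
        "-database", database,
        "-execute", query,
        "-format=csv",
    ]


query = "select * from TableTest where App='JD' limit 10"
argv = influx_argv("TableTest_stat", query, "admin", "x")
print(shlex.join(argv))  # shell-safe rendering, for logging or copy and paste
```

Assuming the influx binary is on PATH, subprocess.run(argv, capture_output=True, text=True) would execute it and return the CSV text on stdout.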
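10 Advanced topics
10.1 Choosing a database
InfluxDB vs Prometheus
InfluxDB builds on familiar concepts: its query syntax resembles SQL and its storage engine is derived from the LSM tree, so the learning curve is relatively gentle.
InfluxDB supports float, integer, string, and boolean values; Prometheus currently supports only floats.
InfluxDB timestamps have nanosecond precision; Prometheus timestamps have millisecond precision.
InfluxDB on its own is only a database, whereas Prometheus is a complete monitoring solution. That said, a full monitoring stack built around InfluxDB is also available.
InfluxDB supports fewer math functions than Prometheus, but so far they have covered everything we need.
As of 2015 Prometheus was still under development, so InfluxDB was comparatively more stable.
InfluxDB supports event logs; Prometheus does not.
For a more detailed comparison, see: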
https://db-engines.com/en/system/Graphite%3BInfluxDB%3BPrometheus
Open-source InfluxDB cluster: http://mysql.taobao.org/monthly/2018/02/02/
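10.2 Architecture
Data that shares the same measurement, tag set, and retention policy forms one series. Understanding this concept is essential, because the series index is held in memory, and too many series will lead to OOM.
Leaving the RP aside, one can also say series = measurement + tag set.
Insert a point into a new measurement: INSERT Cpu,host=serverA,region=us_west value=0.64
This follows the format: INSERT measurement,tag_set field_set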
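To make the measurement,tag_set field_set layout concrete, here is a minimal sketch that formats a point as a line protocol string. It is an illustration rather than a client library: the function names are made up, and it handles only the common escaping rules (commas, spaces, and equals signs in keys and tag values; quotes and backslashes in string fields).

```python
def _escape(value, chars):
    """Backslash-escape each character in `chars`."""
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value


def _format_field(value):
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an 'i' suffix
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'  # strings are double quoted


def to_line(measurement, tags, fields, timestamp_ns=None):
    """Format one point as: measurement,tag_set field_set [timestamp]."""
    if not fields:
        raise ValueError("a point needs at least one field")
    head = _escape(measurement, ", ")
    for key in sorted(tags):  # sorted tag keys give a stable series key
        head += "," + _escape(key, ",= ") + "=" + _escape(str(tags[key]), ",= ")
    body = ",".join(
        _escape(key, ",= ") + "=" + _format_field(val) for key, val in fields.items()
    )
    line = f"{head} {body}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


print(to_line("Cpu", {"region": "us_west", "host": "serverA"}, {"value": 0.64}))
```

The single space between the tag set and the field set is what separates the indexed part of the point (the series key) from the recorded values.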
10.3 How shardGroupDuration affects data retention
http://www.oznetnerd.com/influxdb-retention-policies-shard-groups/
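A shard stores the data for a given time interval. Each directory corresponds to one shard, and the directory name is the shard ID. Every shard has its own cache, WAL, TSM files, and compactor. The point is to locate the resources relevant to a query quickly by time, which speeds up queries and also makes later bulk deletion of data simple and efficient.
A shard group is a logical container for shards. Every RP that holds data has at least one associated shard group,
and the time range covered by a shard group is determined by the RP's SHARD DURATION parameter.
Default shard group duration
Retention Policy's DURATION      Shard Group Duration
< 2 days                         1 hour
>= 2 days and <= 6 months        1 day
> 6 months                       7 days
A smaller shard group duration lets the system drop expired data more efficiently.
For example, with an RP duration of 1d and a shard group duration of 1h, the system drops one shard group every hour.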
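The default table and the expiry rule can be expressed in a few lines. This is a sketch of the rules as stated in this section, not InfluxDB's own code: "6 months" is assumed to mean 180 days, a duration of zero stands for "keep forever", and both function names are made up.

```python
from datetime import datetime, timedelta

SIX_MONTHS = timedelta(days=180)  # assumption: "6 months" taken as 180 days


def default_shard_group_duration(rp_duration):
    """Default shard group duration for an RP; timedelta(0) means keep forever."""
    if rp_duration == timedelta(0) or rp_duration > SIX_MONTHS:
        return timedelta(days=7)
    if rp_duration >= timedelta(days=2):
        return timedelta(days=1)
    return timedelta(hours=1)


def expiry_time(shard_group_end, rp_duration):
    """A shard group becomes eligible for deletion one RP duration after it ends."""
    return shard_group_end + rp_duration


print(default_shard_group_duration(timedelta(hours=240)))  # the "10day" policy
print(expiry_time(datetime(2018, 5, 29, 9), timedelta(hours=24)))
```

Because data is dropped one whole shard group at a time, the oldest points can outlive the RP duration by up to one shard group duration.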
Test case
> show retention policies;
name duration shardGroupDuration replicaN default
---- -------- ------------------ -------- -------
autogen 0s 168h0m0s 1 false
d1 24h0m0s 1h0m0s 1 true
> show shard groups;
name: shard groups
id database retention_policy start_time end_time expiry_time
2 mydb autogen 2018-03-05T00:00:00Z 2018-03-12T00:00:00Z 2018-03-12T00:00:00Z
80 mydb autogen 2018-05-28T00:00:00Z 2018-06-04T00:00:00Z 2018-06-04T00:00:00Z
81 mydb d1 2018-05-29T08:00:00Z 2018-05-29T09:00:00Z 2018-05-30T09:00:00Z
82 mydb d1 2018-05-29T09:00:00Z 2018-05-29T10:00:00Z 2018-05-30T10:00:00Z
83 mydb d1 2018-05-29T10:00:00Z 2018-05-29T11:00:00Z 2018-05-30T11:00:00Z
84 mydb d1 2018-05-29T11:00:00Z 2018-05-29T12:00:00Z 2018-05-30T12:00:00Z
85 mydb d1 2018-05-29T12:00:00Z 2018-05-29T13:00:00Z 2018-05-30T13:00:00Z
87 mydb d1 2018-05-29T13:00:00Z 2018-05-29T14:00:00Z 2018-05-30T14:00:00Z
4 nodes autogen 2018-03-12T00:00:00Z 2018-03-19T00:00:00Z 2018-03-19T00:00:00Z
5 TableTest autogen 2018-03-12T00:00:00Z 2018-03-19T00:00:00Z 2018-03-19T00:00:00Z
> select * from autogen.cpu;
name: cpu
time host region value
---- ---- ------ -----
2018-03-06T06:30:57.464227026Z serverA us_west 0.64
> select * from Cpu;
name: Cpu
time host region value
---- ---- ------ -----
2018-05-29T08:36:13Z serverX us_west 0.64
2018-05-29T08:48:54Z serverF us_west 0.64
2018-05-29T08:48:54Z serverF1 us_west 0.64
2018-05-29T09:06:58.532829069Z serverC us_west 0.64
2018-05-29T09:09:36.977236835Z serverC us_west 0.64
... (one point per second)
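11 Lessons from practice
1. In InfluxDB, a point is identified by its measurement, tag set, and timestamp. If two points share all of these, the field set of the later point overwrites the values of the earlier one.
2. Do not store columns with a large number of distinct values as tags.
The configuration file defaults to max-values-per-tag = 100000. Raising this value does make the "max-values-per-tag limit exceeded (100000/100000)" error go away, but it is better to leave it alone and store such columns as fields.
InfluxDB keeps an in-memory index of every series in the system. As the number of unique series grows, so does RAM usage. Excessive series cardinality can cause the operating system to kill the InfluxDB process with an OOM error.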
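3. Excessive logging filled up the server's disk
After the log file was deleted, the space was still not released, because the running process kept the deleted file open. The space is only freed once the process is restarted, or if the file is truncated instead of deleted.
4. Daily storage volume
The current setup handles 6 GB of new data per day.
The upper limit should be above 8 TB per day (see http://www.infoq.com/cn/articles/storage-in-sequential-databases )
Changing the storage path: https://stackoverflow.com/questions/28350290/how-to-change-location-of-influxdb-storage-folder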
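Points 1 and 2 above can be illustrated with a toy in-memory model. This is only a sketch of the semantics described there, not how InfluxDB stores data internally; the class name PointStore and the generated Imei values are hypothetical.

```python
class PointStore:
    """Toy model: points keyed by (measurement, tag set, timestamp)."""

    def __init__(self):
        self._points = {}  # (series key, timestamp) -> field set

    @staticmethod
    def _series_key(measurement, tags):
        return (measurement, tuple(sorted(tags.items())))

    def write(self, measurement, tags, fields, timestamp):
        key = (self._series_key(measurement, tags), timestamp)
        # Same series and timestamp: new field values replace the old ones.
        self._points.setdefault(key, {}).update(fields)

    def fields_at(self, measurement, tags, timestamp):
        key = (self._series_key(measurement, tags), timestamp)
        return dict(self._points.get(key, {}))

    def point_count(self):
        return len(self._points)

    def series_cardinality(self):
        return len({series for series, _ in self._points})


store = PointStore()
ts = 1  # hypothetical timestamp
store.write("Cpu", {"host": "serverF", "region": "us_west"}, {"value": 0.64}, ts)
store.write("Cpu", {"host": "serverF", "region": "us_west"}, {"value": 0.99}, ts)   # overwrites
store.write("Cpu", {"host": "serverF1", "region": "us_west"}, {"value": 0.64}, ts)  # new series

as_tag, as_field = PointStore(), PointStore()
for i in range(1000):
    imei = str(863276004580000 + i)  # hypothetical device IDs
    as_tag.write("TableTest", {"App": "JD", "Imei": imei}, {"count": 1}, i)
    as_field.write("TableTest", {"App": "JD"}, {"Imei": imei, "count": 1}, i)

print(store.point_count(), as_tag.series_cardinality(), as_field.series_cardinality())
```

Both stores hold the same 1000 points, but storing Imei as a tag creates one series per device, while storing it as a field keeps a single series.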