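I. Overview
For writes, OpenTSDB already has the open-source tcollector collection framework, so this article does not cover the write APIs. It focuses on the commonly used read APIs, which make it easy to run customized checks and statistics on monitoring data.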
II. Common APIs
OpenTSDB address: 192.168.180.128:16002
*1. Query all metrics*
http://192.168.180.128:16002/api/suggest?max=1000&q=&type=metrics
Response (at most max entries are returned):
["test.delete","test.delete2"]
*2. Query all tag keys (tagk)*
http://192.168.180.128:16002/api/suggest?max=1000&q=&type=tagk
["host","idc"]
*3. Query all tag values (tagv)*
http://192.168.180.128:16002/api/suggest?max=1000&q=&type=tagv
["0.0.0.1","0.0.0.2","test1"]
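The three suggest calls above differ only in the type parameter, so they can be wrapped in one small helper. This is a minimal sketch using only the Python standard library; the function names suggest_url and suggest are illustrative and not part of OpenTSDB, and the address is the one used in this article.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://192.168.180.128:16002"  # OpenTSDB address used in this article


def suggest_url(kind, prefix="", limit=1000, base_url=BASE_URL):
    """Build an /api/suggest URL; kind is 'metrics', 'tagk' or 'tagv'."""
    if kind not in ("metrics", "tagk", "tagv"):
        raise ValueError("kind must be 'metrics', 'tagk' or 'tagv'")
    params = urlencode({"max": limit, "q": prefix, "type": kind})
    return base_url + "/api/suggest?" + params


def suggest(kind, prefix="", limit=1000, base_url=BASE_URL):
    """Call /api/suggest and return the decoded JSON list of names."""
    with urlopen(suggest_url(kind, prefix, limit, base_url), timeout=10) as resp:
        return json.load(resp)


# Building the URL needs no network access:
print(suggest_url("metrics"))
```

Calling suggest("tagk") against a reachable OpenTSDB instance should return a list such as the one shown in item 2.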
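*4. Given a metric, query all of its tagk*
http://192.168.180.128:16002/api/search/lookup?limit=1000&m=test.delete
The result lists every time series under this metric together with its tagk/tagv pairs; it has to be post-processed to obtain the full set of tagk.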
{"type":"LOOKUP","metric":"test.delete","tags":[],"limit":1000,"time":14.0,"results":[{"tsuid":"C66AC1000001000001000002000002","metric":"test.delete","tags":{"host":"0.0.0.1","idc":"test1"}}],"startIndex":0,"totalResults":1}
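The post-processing mentioned above amounts to collecting the keys of each result's tags object. A minimal sketch, assuming the response has the shape shown in this item; tag_keys is an illustrative name:

```python
import json


def tag_keys(lookup_response):
    """Return the sorted set of tag keys found in a /api/search/lookup response."""
    keys = set()
    for series in lookup_response.get("results", []):
        keys.update(series.get("tags", {}))
    return sorted(keys)


# The sample response from this item, trimmed to the fields that matter here
sample = json.loads(
    '{"type":"LOOKUP","metric":"test.delete","results":['
    '{"tsuid":"C66AC1000001000001000002000002","metric":"test.delete",'
    '"tags":{"host":"0.0.0.1","idc":"test1"}}],"totalResults":1}'
)
print(tag_keys(sample))  # ['host', 'idc']
```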
*5. Given a metric and a tagk, query all tagv*
http://192.168.180.128:16002/api/search/lookup?limit=3000&m=test.delete{idc=*}
{"type":"LOOKUP","metric":"test.delete","tags":[{"key":"idc","value":"*"}],"limit":3000,"time":13.0,"results":[{"tsuid":"C66AC1000001000001000002000002","metric":"test.delete","tags":{"host":"0.0.0.1","idc":"test1"}}],"startIndex":0,"totalResults":1}
*6. Given a metric and one fixed tagk=tagv pair, query all tagv of another tagk*
(For example: for test.delete, all values of host where idc=test1)
http://192.168.180.128:16002/api/search/lookup?limit=3000&m=test.delete%7Bhost%3D*,idc%3Dtest1%7D
{"type":"LOOKUP","metric":"test.delete","tags":[{"key":"host","value":"*"},{"key":"idc","value":"test1"}],"limit":3000,"time":4.0,"results":[{"tsuid":"C66AC1000001000001000002000002","metric":"test.delete","tags":{"host":"0.0.0.1","idc":"test1"}}],"startIndex":0,"totalResults":1}
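Items 5 and 6 follow the same pattern: build the m=metric{tagk=tagv,...} expression, URL-encode it, then read one tag's values out of the results. The sketch below assumes the response shape shown above; lookup_url and tag_values are illustrative names, not OpenTSDB functions.

```python
from urllib.parse import quote

BASE_URL = "http://192.168.180.128:16002"  # OpenTSDB address used in this article


def lookup_url(metric, tags=None, limit=3000, base_url=BASE_URL):
    """Build a /api/search/lookup URL; tags maps tagk to a tagv or '*'."""
    m = metric
    if tags:
        pairs = ",".join(f"{k}={v}" for k, v in tags.items())
        m = f"{metric}{{{pairs}}}"
    # Percent-encode braces and '=' but keep '*' and ',' readable
    return f"{base_url}/api/search/lookup?limit={limit}&m={quote(m, safe='*,')}"


def tag_values(lookup_response, tagk):
    """Return the sorted distinct values of tagk across all returned series."""
    values = set()
    for series in lookup_response.get("results", []):
        if tagk in series.get("tags", {}):
            values.add(series["tags"][tagk])
    return sorted(values)


print(lookup_url("test.delete", {"host": "*", "idc": "test1"}))
```

With the sample response from item 6, tag_values(response, "host") yields ["0.0.0.1"].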
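III. Use case: inspection
- Function 1: find hosts in the CMDB that have no monitoring configured:
Use the /api/search/lookup?limit=3000&m=monitor.alive{endpoint=*} API to list all monitored hosts, compare them with the hosts in the CMDB, and derive the list of hosts that have no monitoring.
- Function 2: find database instances in the CMDB that have no monitoring configured:
Use the /api/search/lookup?limit=3000&m=mysql.alive{endpoint=*} API to list all monitored hosts and their database port numbers, compare them with the database instances in the CMDB, and derive the list of instances that have no monitoring.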
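The comparison in Functions 1 and 2 is a set difference between the CMDB inventory and the endpoints seen in the lookup result. A minimal sketch: cmdb_hosts is a hypothetical list standing in for whatever the CMDB returns, the sample response is made up for illustration, and the endpoint tag name follows the queries above.

```python
def unmonitored(cmdb_items, lookup_response, tagk="endpoint"):
    """Return CMDB items that do not appear as tagk values in the lookup result."""
    monitored = {
        series["tags"][tagk]
        for series in lookup_response.get("results", [])
        if tagk in series.get("tags", {})
    }
    return sorted(set(cmdb_items) - monitored)


# Hypothetical CMDB inventory and an illustrative lookup response
cmdb_hosts = ["0.0.0.1", "0.0.0.2", "0.0.0.3"]
lookup_response = {
    "results": [
        {"metric": "monitor.alive", "tags": {"endpoint": "0.0.0.1"}},
        {"metric": "monitor.alive", "tags": {"endpoint": "0.0.0.3"}},
    ]
}
print(unmonitored(cmdb_hosts, lookup_response))  # ['0.0.0.2']
```

For database instances, the same idea applies with a composite key such as (endpoint, port), provided the port is available as a tag on the series; which tag carries it depends on how the collector reports mysql.alive.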
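- Function 3: find hosts that reported no data for a given metric within a given time range:
# Uses the /api/query API
# ipList is the list of IPs to check
tsdbQuery = {
    'start': starttime,
    'end': endtime,
    'queries': [{
        'aggregator': 'avg',
        'downsample': '1h-avg',
        'metric': 'monitor.alive',
        'rate': False,  # serialized as JSON false rather than the string 'false'
        'filters': [{
            'type': 'literal_or',
            'tagk': 'endpoint',
            'filter': "|".join(ipList),
            'groupBy': True  # one result series per endpoint
        }]
    }]
}
Pick out the endpoints that come back with no data; these are the hosts whose data collection failed.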
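Picking out the silent endpoints can be sketched as follows. This assumes the /api/query response is a JSON list with one object per endpoint (because of groupBy), each carrying its tags and a dps map of timestamp to value. The names post_query and endpoints_without_data are illustrative, and the sample response is made up.

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "http://192.168.180.128:16002"  # OpenTSDB address used in this article


def post_query(tsdb_query, base_url=BASE_URL):
    """POST a query body to /api/query and return the decoded JSON list."""
    req = Request(
        base_url + "/api/query",
        data=json.dumps(tsdb_query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req, timeout=30) as resp:
        return json.load(resp)


def endpoints_without_data(ip_list, query_response, tagk="endpoint"):
    """Return the IPs that have no series, or only empty series, in the response."""
    reporting = {
        series["tags"][tagk]
        for series in query_response
        if series.get("dps") and tagk in series.get("tags", {})
    }
    return [ip for ip in ip_list if ip not in reporting]


# Illustrative response: only 0.0.0.1 reported data in the window
sample_response = [
    {"metric": "monitor.alive", "tags": {"endpoint": "0.0.0.1"},
     "aggregateTags": [], "dps": {"1500000000": 1.0}},
]
print(endpoints_without_data(["0.0.0.1", "0.0.0.2"], sample_response))  # ['0.0.0.2']
```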
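- Function 4: list the top 10 hosts by full-day average CPU, disk and memory usage:
Same API as Function 3; just change the metric to the corresponding CPU, disk or memory metric.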
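One way to rank hosts from such a response is to average each series' data points and sort. This is a sketch under the same response-shape assumption as Function 3; top_n_by_average is an illustrative name, and cpu.usage is a hypothetical metric name used only in the sample data.

```python
def top_n_by_average(query_response, n=10, tagk="endpoint"):
    """Return up to n (host, average) pairs, highest average first."""
    averages = {}
    for series in query_response:
        host = series.get("tags", {}).get(tagk)
        values = list(series.get("dps", {}).values())
        if host and values:
            averages[host] = sum(values) / len(values)
    ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Illustrative response for a hypothetical cpu.usage metric
sample_response = [
    {"metric": "cpu.usage", "tags": {"endpoint": "0.0.0.1"},
     "dps": {"1500000000": 10.0, "1500003600": 30.0}},
    {"metric": "cpu.usage", "tags": {"endpoint": "0.0.0.2"},
     "dps": {"1500000000": 50.0, "1500003600": 70.0}},
]
print(top_n_by_average(sample_response, n=1))  # [('0.0.0.2', 60.0)]
```

Averaging on the client like this weights every downsampled point equally; with the 1h-avg downsample from Function 3 that gives a mean of hourly means over the day.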
Function n: the monitoring data returned by /api/query can be put to many other uses, depending on your needs.