Optimizing Elasticsearch as the Zabbix History Storage Backend

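Scenario Analysis

Our company stores Zabbix history data in Elasticsearch, and the requirement is to keep monitoring history for as long as possible, ideally one year. The current cluster has three ES nodes and takes in about 5 GB of history data per day. It can hold at most one month of data, and anything older than 30 days is deleted by a scheduled job. Each node is allocated 8 GB of memory and all of them use mechanical hard disks. Indices have 5 primary shards and 1 replica. Queries usually fetch only the last week of history, with occasional requests for one to two months.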

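Node Planning

To let ES keep history for longer, and to allow for data growth as monitoring items are added, I increased the cluster to four nodes, gave some nodes more memory, and put some nodes on SSD storage.

192.168.179.133  200 GB SSD  4 GB RAM   tag:hot   node.name=es1
192.168.179.134  200 GB SSD  4 GB RAM   tag:hot   node.name=es2
192.168.179.135  1 TB HDD    32 GB RAM  tag:cold  node.name=es3  node.master=false
192.168.179.136  1 TB HDD    32 GB RAM  tag:cold  node.name=es4  node.master=false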

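Optimization Approach

The plan is to remodel the data mappings so that str values are not analyzed, and to spread the data across hot and cold nodes. Indices for the most recent seven days use 2 primary shards and 1 replica and live on the hot nodes. Indices older than seven days are moved to the cold nodes. Indices older than 30 days are set to 2 primary shards and 0 replicas, using the shrink API that ES provides. ES is a search engine built on Lucene, and a Lucene index is made up of multiple segments. Every segment consumes file handles, memory and CPU cycles, so too many segments increase resource usage and slow down searches. I therefore force-merge each shard of the previous day's indices into a single segment and raise the refresh interval to 60s so that segments are created less often. Indices older than three months are closed. All of these operations are scheduled with Curator, the ES index management tool.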
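To see whether this tiering fits on the planned disks, the retention rules can be turned into a rough estimate. This is only a sketch under assumptions that the article does not state: the 5 GB/day figure is treated as primary data only, and savings from force-merging or compression are ignored. The function name `tier_usage_gb` is made up for illustration.

```python
# Rough steady-state disk estimate for the hot/cold plan.
# Assumption: daily_gb is the size of primary data only (not stated in the article).

def tier_usage_gb(daily_gb=5.0, hot_days=7, replica_days=30, retention_days=365):
    """Return the estimated disk usage per tier, in GB."""
    # Days 1-7: hot nodes, primary + 1 replica.
    hot = hot_days * daily_gb * 2
    # Days 8-30: cold nodes, still primary + 1 replica.
    cold_replicated = (replica_days - hot_days) * daily_gb * 2
    # Days 31-365: cold nodes, 0 replicas (closed indices still occupy disk).
    cold_single = (retention_days - replica_days) * daily_gb
    return {"hot": hot, "cold": cold_replicated + cold_single}


if __name__ == "__main__":
    print(tier_usage_gb())  # {'hot': 70.0, 'cold': 1905.0}
```

Under these assumptions the hot tier needs about 70 GB of the 2 x 200 GB of SSD, while the cold tier needs about 1905 GB of the 2 x 1 TB of HDD. The cold tier is the tight one, which is why dropping replicas and compressing old indices matters.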

Connecting Zabbix to ES

1. Edit /etc/zabbix/zabbix_server.conf and add the following

Any node in the cluster can be used as the ES address.

HistoryStorageURL=http://192.168.179.133:9200
HistoryStorageTypes=str,text,log,uint,dbl
HistoryStorageDateIndex=1

2. Edit /etc/zabbix/web/zabbix.conf.php and add the following

global $DB, $HISTORY;
$HISTORY['url']   = 'http://192.168.179.133:9200';
// Value types stored in Elasticsearch.
$HISTORY['types'] = ['str', 'text', 'log','uint','dbl'];

3. Edit the ES configuration file and add the hot/cold node attribute

vim elasticsearch.yml
Hot node setting:

node.attr.box_type: hot

Cold node setting:

node.attr.box_type: cold

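4. Create the templates and pipelines in ES

A template has to be created for every value type. The API request bodies can be derived from the elasticsearch.map file. Each template defines the index pattern it matches, the number of primary and replica shards, the refresh interval, the node allocation for new indices, and the mappings. Only the uint and str indices are shown here as examples.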

PUT _template/uint_template
{
   "index_patterns": ["uint*"],
   "settings" : {
      "index" : {
         "routing.allocation.require.box_type": "hot",
         "refresh_interval": "60s",
         "number_of_replicas" : 1,
         "number_of_shards" : 2
      }
   },
   "mappings" : {
      "values" : {
         "properties" : {
            "itemid" : {
               "type" : "long"
            },
            "clock" : {
               "format" : "epoch_second",
               "type" : "date"
            },
            "value" : {
               "type" : "long"
            }
         }
      }
   }
}

PUT _template/str_template
{
   "index_patterns": ["str*"],
   "settings" : {
      "index" : {
         "routing.allocation.require.box_type": "hot",
         "refresh_interval": "60s",
         "number_of_replicas" : 1,
         "number_of_shards" : 2
      }
   },
   "mappings" : {
      "values" : {
         "properties" : {
            "itemid" : {
               "type" : "long"
            },
            "clock" : {
               "format" : "epoch_second",
               "type" : "date"
            },
            "value" : {
               "index" : false,
               "type" : "keyword"
            }
         }
      }
   }
}
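Since every value type needs its own template, the request bodies can also be generated rather than written by hand. The following is a minimal sketch, not the author's method: `build_template` and `VALUE_MAPPINGS` are hypothetical names, and the `dbl` mapping to `double` is an assumption that should be checked against elasticsearch.map. The text and log types are left out because their value mappings are not shown in this article.

```python
import json


def build_template(value_type, value_mapping):
    """Build a template body with the same settings and field layout as the examples above."""
    return {
        "index_patterns": [f"{value_type}*"],
        "settings": {"index": {
            "routing.allocation.require.box_type": "hot",  # new indices start on hot nodes
            "refresh_interval": "60s",
            "number_of_replicas": 1,
            "number_of_shards": 2,
        }},
        "mappings": {"values": {"properties": {
            "itemid": {"type": "long"},
            "clock": {"format": "epoch_second", "type": "date"},
            "value": value_mapping,
        }}},
    }


VALUE_MAPPINGS = {
    "uint": {"type": "long"},                    # from the article
    "str": {"index": False, "type": "keyword"},  # from the article
    "dbl": {"type": "double"},                   # assumption, verify against elasticsearch.map
}

if __name__ == "__main__":
    for name, mapping in VALUE_MAPPINGS.items():
        print(f"PUT _template/{name}_template")
        print(json.dumps(build_template(name, mapping), indent=2))
```

The printed bodies can be pasted into the console in the same way as the two templates above.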

The pipelines preprocess each document before it is written, so that the data goes into one index per day.

PUT _ingest/pipeline/uint-pipeline
{
  "description": "daily uint index naming",
  "processors": [
    {
      "date_index_name": {
        "field": "clock",
        "date_formats": ["UNIX"],
        "index_name_prefix": "uint-",
        "date_rounding": "d"
      }
    }
  ]
}

PUT _ingest/pipeline/str-pipeline
{
  "description": "daily str index naming",
  "processors": [
    {
      "date_index_name": {
        "field": "clock",
        "date_formats": ["UNIX"],
        "index_name_prefix": "str-",
        "date_rounding": "d"
      }
    }
  ]
}
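To make the effect of these pipelines concrete, the index name that a given `clock` value ends up in can be reproduced offline. This sketch assumes the processor's defaults for the options not set above, namely an index name format of `yyyy-MM-dd` and the UTC timezone. The helper `daily_index_name` is made up for illustration.

```python
from datetime import datetime, timezone


def daily_index_name(prefix, clock):
    """Map a Zabbix clock value (Unix seconds) to the daily index name."""
    # Assumption: default date_index_name behaviour, i.e. day boundaries in UTC.
    day = datetime.fromtimestamp(clock, tz=timezone.utc).strftime("%Y-%m-%d")
    return f"{prefix}{day}"


if __name__ == "__main__":
    print(daily_index_name("uint-", 1577836800))  # uint-2020-01-01
    print(daily_index_name("str-", 1577923200))   # str-2020-01-02
```

With these defaults the day rolls over at midnight UTC, so on a server in UTC+8 a new index starts at 08:00 local time. The resulting names are what the Curator filters later parse with the `%Y-%m-%d` timestring.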

5. After making the changes, restart Zabbix and check that Zabbix is receiving data

systemctl restart zabbix-server
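Besides looking at the Zabbix frontend, one way to confirm that data is arriving is to list the daily indices and their document counts through the `_cat/indices` API. The sketch below uses only the standard library. The helper names are hypothetical, and the address is the hot node from the plan above.

```python
import json
from urllib.request import urlopen


def cat_indices_url(base_url, pattern):
    """Build a _cat/indices URL that returns index names and document counts as JSON."""
    return f"{base_url.rstrip('/')}/_cat/indices/{pattern}?format=json&h=index,docs.count"


def list_indices(base_url, pattern):
    """Query ES and return a list of {'index': ..., 'docs.count': ...} dicts."""
    with urlopen(cat_indices_url(base_url, pattern), timeout=10) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Requires network access to the cluster.
    for row in list_indices("http://192.168.179.133:9200", "uint-*"):
        print(row["index"], row["docs.count"])
```

If the pipelines are working, a `uint-` index carrying the current date should appear shortly after the restart and its document count should keep growing.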

Managing Indices with Curator

The official Curator documentation is available at
https://www.elastic.co/guide/en/elasticsearch/client/curator/5.8/installation.html

1. Install Curator

pip install -U elasticsearch-curator

2. Create the Curator configuration file

mkdir /root/.curator
vim /root/.curator/curator.yml
---
client:
  hosts:
    - 192.168.179.133
    - 192.168.179.134
  port: 9200
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  ssl_no_validate: False
  http_auth:
  timeout: 30
  master_only: False

logging:
  loglevel: INFO
  logfile:
  logformat: default
  blacklist: ['elasticsearch', 'urllib3']

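3. Edit action.yml and define the actions

Every action below sits under the top-level actions: key of action.yml.

Move indices older than 7 days to the cold nodes

actions:
  1: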
    action: allocation
    description: "Apply shard allocation filtering rules to the specified indices"
    options:
      key: box_type
      value: cold
      allocation_type: require
      wait_for_completion: true
      timeout_override:
      continue_if_exception: false
      disable_action: false
    filters:
    - filtertype: pattern
      kind: regex
      value: '^(uint-|dbl-|str-).*$'
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y-%m-%d'
      unit: days
      unit_count: 7
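For reference, the allocation action boils down to an index settings update on every index that passes the filters. The sketch below only builds that request so it can be inspected. `allocation_request` is a hypothetical helper, not part of Curator.

```python
def allocation_request(index, key="box_type", value="cold", allocation_type="require"):
    """Return the (method, path, body) of the settings update that moves an index."""
    path = f"/{index}/_settings"
    # Same setting family the templates use to pin new indices to the hot nodes.
    body = {f"index.routing.allocation.{allocation_type}.{key}": value}
    return "PUT", path, body


if __name__ == "__main__":
    print(allocation_request("uint-2020-01-01"))
```

Once the setting changes from hot to cold, ES relocates the shards to nodes whose `box_type` attribute is `cold`. Because `wait_for_completion` is true, Curator waits for that relocation to finish before moving on.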

Force-merge the previous day's indices down to 1 segment per shard.

  2:
    action: forcemerge
    description: "Perform a forceMerge on selected indices to 'max_num_segments' per shard"
    options:
      max_num_segments: 1
      delay:
      timeout_override: 21600 
      continue_if_exception: false
      disable_action: false
    filters:
    - filtertype: pattern
      kind: regex
      value: '^(uint-|dbl-|str-).*$'
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y-%m-%d'
      unit: days
      unit_count: 1

Shrink indices older than 30 days to 2 primary shards and 0 replicas. The node that performs the shrink must not be a master-eligible node.

  3:
    action: shrink
    description: "Change the number of primary shards to 2 and the number of replicas to 0"
    options:
      ignore_empty_list: True
      shrink_node: DETERMINISTIC
      node_filters:
        permit_masters: False
        exclude_nodes: ['es1','es2']
      number_of_shards: 2
      number_of_replicas: 0
      shrink_prefix:
      shrink_suffix: '-shrink'
      delete_after: True
      post_allocation:
        allocation_type: include
        key: box_type
        value: cold
      wait_for_active_shards: 1
      extra_settings:
        settings:
          index.codec: best_compression
      wait_for_completion: True
      wait_for_rebalance: True
      wait_interval: 9
      max_wait: -1
    filters:
      - filtertype: pattern
        kind: regex
        value: '^(uint-|dbl-|str-).*$'
      - filtertype: age
        source: name
        direction: older
        timestring: '%Y-%m-%d'
        unit: days
        unit_count: 30

Close indices older than 3 months

  4:
    action: close
    description: "Close selected indices"
    options:
      delete_aliases: false
      skip_flush: false
      ignore_sync_failures: false
    filters:
     - filtertype: pattern
       kind: regex
       value: '^(uint-|dbl-|str-).*$'
     - filtertype: age
       source: name
       direction: older
       timestring: '%Y-%m-%d'
       unit: days
       unit_count: 90

Delete indices older than one year

  5:
    action: delete_indices
    description: "Delete selected indices"
    options:
      continue_if_exception: False
    filters:
    - filtertype: pattern
      kind: regex
      value: '^(uint-|dbl-|str-).*$'
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y-%m-%d'       
      unit: days
      unit_count: 365
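The five age filters overlap, so it helps to check which actions a given index will match on a given day. The sketch below is a simplification, not Curator's own logic: it compares whole days only and treats an index as "older" once its age in days reaches `unit_count`. The rule list mirrors the thresholds above, and `matching_actions` is a hypothetical name.

```python
import re
from datetime import date, datetime

# (unit_count in days, action), taken from the action definitions above.
AGE_RULES = [
    (1, "forcemerge"),
    (7, "allocation"),
    (30, "shrink"),
    (90, "close"),
    (365, "delete_indices"),
]


def matching_actions(index_name, today):
    """Return the actions whose age filter an index matches on the given day."""
    found = re.search(r"\d{4}-\d{2}-\d{2}", index_name)  # same layout as timestring '%Y-%m-%d'
    if found is None:
        return []  # no date in the name, so the age filter cannot match
    index_day = datetime.strptime(found.group(), "%Y-%m-%d").date()
    age_days = (today - index_day).days
    return [action for min_age, action in AGE_RULES if age_days >= min_age]


if __name__ == "__main__":
    print(matching_actions("uint-2020-01-01", date(2020, 1, 9)))  # ['forcemerge', 'allocation']
```

An index keeps matching the earlier filters as it ages, so each daily run selects it for every action whose threshold it has already passed.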

4. Run Curator as a test

curator action.yml

5. Add the Curator run to cron so that it executes every day

crontab -e
10 0 * * * curator /root/action.yml

That covers everything I did to optimize Elasticsearch as the storage backend for Zabbix. Reference link:
https://www.elastic.co/cn/blog/hot-warm-architecture-in-elasticsearch-5-x

Feel free to follow my WeChat official account, "沒有故事的陳師傅".
