Elasticsearch Startup Analysis and Troubleshooting: bootstrap checks



0 Overview

This article uses ES 5.6 on CentOS 6.5.

1 Elasticsearch bootstrap checks

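1.1 Development environment

If network.host is not set in the ES configuration, ES binds to localhost by default and assumes it is being used in a development environment. ES still runs the bootstrap checks, which compare the user's settings against the safe values ES expects, but in keeping with its out-of-the-box principle the outcome is lenient:

  • If the settings match, no warning is logged and ES starts normally.
  • If they do not match, warnings are logged, but because this is a development environment ES still starts normally.

1.2 Production environment

Once network.host is set to a usable non-loopback address, ES assumes it is being started in a production environment. It runs the same checks, but every check that fails is promoted from a warning to an error, so ES fails to start.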
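
The difference between the two modes can be sketched as a small shell function. This is only an approximation under stated assumptions: ES decides from the addresses it actually binds to, whereas the hypothetical bootstrap_mode helper below just classifies a network.host string, and the non-loopback address is an example rather than part of the original setup.

```shell
# Hypothetical helper: guess which mode ES would pick for a network.host value.
# Assumption: an unset value or a loopback address means development mode;
# anything else is treated as production.
bootstrap_mode() {
  case "$1" in
    ""|localhost|_local_|127.*|::1|"[::1]") echo "development" ;;
    *) echo "production" ;;
  esac
}

bootstrap_mode ""              # network.host not set
bootstrap_mode 127.0.0.1       # loopback address
bootstrap_mode 192.168.1.10    # example non-loopback address
```

In development mode a failed check is only logged as a warning; in production mode the same failure stops startup.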

2 Bootstrap checks when starting in a development environment

Starting ES without configuring network.host produces the following warnings:

[2018-12-07T04:15:44,735][INFO ][o.e.d.DiscoveryModule    ] [PQ85ukj] using discovery type [zen]
[2018-12-07T04:15:45,702][INFO ][o.e.n.Node               ] initialized
[2018-12-07T04:15:45,703][INFO ][o.e.n.Node               ] [PQ85ukj] starting ...
[2018-12-07T04:15:46,071][INFO ][o.e.t.TransportService   ] [PQ85ukj] publish_address {127.0.0.1:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}
[2018-12-07T04:15:46,090][WARN ][o.e.b.BootstrapChecks    ] [PQ85ukj] max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]
[2018-12-07T04:15:46,090][WARN ][o.e.b.BootstrapChecks    ] [PQ85ukj] max number of threads [1024] for user [hadoop] is too low, increase to at least [2048]
[2018-12-07T04:15:46,090][WARN ][o.e.b.BootstrapChecks    ] [PQ85ukj] max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[2018-12-07T04:15:46,090][WARN ][o.e.b.BootstrapChecks    ] [PQ85ukj] system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk
[2018-12-07T04:15:49,269][INFO ][o.e.c.s.ClusterService   ] [PQ85ukj] new_master {PQ85ukj}{PQ85ukjdSoeVEpSpByAjMw}{Dbb3lzTWTN-eUEKXO8z-sw}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2018-12-07T04:15:49,313][INFO ][o.e.h.n.Netty4HttpServerTransport] [PQ85ukj] publish_address {127.0.0.1:9200}, bound_addresses {[::1]:9200}, {127.0.0.1:9200}
[2018-12-07T04:15:49,313][INFO ][o.e.n.Node               ] [PQ85ukj] started
[2018-12-07T04:15:49,553][INFO ][o.e.g.GatewayService     ] [PQ85ukj] recovered [0] indices into cluster_state

The warning messages, extracted from the log, are:

File descriptors:
max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]

Thread count:
max number of threads [1024] for user [hadoop] is too low, increase to at least [2048]

Virtual memory:
max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

system call filters:
system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk

There are four problems: file descriptors, thread count, virtual memory, and system call filters.

Despite the warnings, ES treats this as a development environment and, true to its out-of-the-box behavior, still starts normally.
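
Pulling these messages out of a longer log by hand is tedious, so a filter along the following lines may help. It is a minimal sketch that assumes the log layout shown in this article; bootstrap_warnings is a hypothetical helper name, and the sample input is two lines taken from the log above.

```shell
# Hypothetical helper: read an ES log on stdin and print only the
# BootstrapChecks warning messages, without timestamp, logger and node name.
bootstrap_warnings() {
  grep -F '[WARN ][o.e.b.BootstrapChecks' |
    sed 's/^.*BootstrapChecks *] \[[^]]*] //'
}

# Demo on two lines from the log above: only the WARN message is printed.
printf '%s\n' \
  '[2018-12-07T04:15:46,090][WARN ][o.e.b.BootstrapChecks    ] [PQ85ukj] max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]' \
  '[2018-12-07T04:15:49,313][INFO ][o.e.n.Node               ] [PQ85ukj] started' |
  bootstrap_warnings
```

To use it on a real installation, redirect the node's log file into the function; the location of that file depends on the setup.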

3 Bootstrap checks when starting in a production environment

After binding to an IP address and starting again, the following errors appear:

ERROR: [4] bootstrap checks failed
[1]: max file descriptors [4096] for elasticsearch process is too low, increase to at least [65536]
[2]: max number of threads [1024] for user [hadoop] is too low, increase to at least [2048]
[3]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[4]: system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk

These are reported as errors, so startup fails until the settings above are changed to meet the required values.
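
Before starting ES in production, the three numeric limits can be compared against the thresholds quoted in the error messages. The script below is a rough pre-flight sketch rather than a reimplementation of the ES checks: check_min is a hypothetical helper, the thresholds are copied from the ES 5.6 output above, and it should be run as the user that starts ES.

```shell
# Hypothetical helper: compare a current value against a required minimum.
# usage: check_min NAME CURRENT REQUIRED
check_min() {
  case "$2" in
    unlimited) echo "OK   $1 = unlimited" ;;
    ''|*[!0-9]*) echo "SKIP $1: value not readable here" ;;
    *)
      if [ "$2" -ge "$3" ]; then
        echo "OK   $1 = $2"
      else
        echo "FAIL $1 = $2, need at least $3"
      fi ;;
  esac
}

# Thresholds taken from the error messages above (ES 5.6).
check_min "max file descriptors" "$(ulimit -n 2>/dev/null)" 65536
check_min "max number of threads" "$(ulimit -u 2>/dev/null)" 2048
check_min "vm.max_map_count" "$(cat /proc/sys/vm/max_map_count 2>/dev/null)" 262144
```

The system call filter check is not numeric and is not covered by this sketch.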

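4 Configuration for a normal start in production

Fixing the problems above requires the following configuration.

4.1 File descriptors

  • Temporary change:
ulimit -n 65536

This applies only to the current session; the default value is restored after logging in again.

  • Permanent change

Edit /etc/security/limits.conf as follows:

hadoop          soft    nofile  65536   # soft: the limit actually enforced; the user may raise it up to the hard limit
hadoop          hard    nofile  100000  # hard: the ceiling that the soft limit cannot exceed

Log in again, then run ulimit -n to verify.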
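
A misspelled column in limits.conf is easy to miss, so it can be worth reading the value back from the file before logging in again. The following is a small sketch, not part of pam_limits: limits_value is a hypothetical helper that ignores comments and, as an assumption about duplicates, keeps the last matching entry.

```shell
# Hypothetical helper: print the value configured for USER / TYPE / ITEM.
# usage: limits_value USER TYPE ITEM < limits.conf
limits_value() {
  awk -v u="$1" -v t="$2" -v i="$3" '
    { sub(/#.*/, "") }                          # strip comments
    $1 == u && $2 == t && $3 == i { v = $4 }    # remember the last match
    END { if (v != "") print v }'
}

# Demo on two entries in the format used above.
printf '%s\n' \
  'hadoop          soft    nofile  65536   # soft limit' \
  'hadoop          hard    nofile  100000  # hard limit' |
  limits_value hadoop hard nofile
```

Feeding it /etc/security/limits.conf instead of the demo input shows what is actually configured; an empty result for hadoop hard nofile would point to a typo in one of the columns.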

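4.2 Thread count

Edit /etc/security/limits.conf as follows:

hadoop          soft    nproc   2048
hadoop          hard    nproc   4096

Note that the configuration file itself describes nproc as the number of processes rather than threads. On Linux each thread counts against this limit, which is why it governs the thread count that ES checks: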

# /etc/security/limits.conf
#
#Each line describes a limit for a user in the form:
#
#<domain>        <type>  <item>  <value>
#
#Where:
#<domain> can be:
#        - an user name
#        - a group name, with @group syntax
#        - the wildcard *, for default entry
#        - the wildcard %, can be also used with %group syntax,
#                 for maxlogin limit
#
#<type> can have the two values:
#        - "soft" for enforcing the soft limits
#        - "hard" for enforcing hard limits
#
#<item> can be one of the following:
#        - core - limits the core file size (KB)
#        - data - max data size (KB)
#        - fsize - maximum filesize (KB)
#        - memlock - max locked-in-memory address space (KB)
#        - nofile - max number of open files
#        - rss - max resident set size (KB)
#        - stack - max stack size (KB)
#        - cpu - max CPU time (MIN)
#        - nproc - max number of processes
#        - as - address space limit (KB)
#        - maxlogins - max number of logins for this user
#        - maxsyslogins - max number of logins on the system
#        - priority - the priority to run user process with
#        - locks - max number of file locks the user can hold
#        - sigpending - max number of pending signals
#        - msgqueue - max memory used by POSIX message queues (bytes)
#        - nice - max nice priority allowed to raise to values: [-20, 19]
#        - rtprio - max realtime priority
#
#<domain>      <type>  <item>         <value>
#

#*               soft    core            0
#*               hard    rss             10000
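
To see how nproc relates to threads in practice, the threads of a single process can be counted through /proc, where each thread appears as one entry under the process's task directory. The sketch below assumes a Linux system with /proc mounted; thread_count is a hypothetical helper name.

```shell
# Hypothetical helper: print the number of threads of a process by counting
# its /proc/<pid>/task entries (one entry per thread).
thread_count() {
  [ -d "/proc/$1/task" ] || return 0   # no such process or no /proc: print nothing
  ls "/proc/$1/task" | wc -l
}

echo "threads in this shell: $(thread_count $$)"
echo "nproc soft limit:      $(ulimit -u 2>/dev/null)"
```

Passing the process ID of a running ES node instead of $$ shows how many threads it holds against the nproc limit.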

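4.3 Virtual memory

  • View the current value
sysctl vm.max_map_count
  • Temporary setting
sysctl -w vm.max_map_count=262144

This setting is lost when the system reboots.

  • Permanent setting

Edit the configuration file /etc/sysctl.conf as follows:

vm.max_map_count=262144

This takes effect after a reboot, or immediately by running sysctl -p as root.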
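
When this edit is scripted, appending the line blindly leaves duplicate entries after repeated runs. The sketch below shows one way to keep it idempotent, assuming a plain KEY=VALUE file; set_sysctl_conf is a hypothetical helper, and the demo works on a temporary file rather than the real /etc/sysctl.conf.

```shell
# Hypothetical helper: set KEY=VALUE in a sysctl-style file, replacing any
# existing assignment of KEY so that repeated runs leave exactly one entry.
# usage: set_sysctl_conf KEY VALUE FILE
set_sysctl_conf() {
  tmp=$(mktemp) || return 1
  # Drop existing assignments of KEY (dots in KEY act as regex wildcards,
  # which is harmless for this key), then append the new one.
  grep -v "^[[:space:]]*$1[[:space:]]*=" "$3" > "$tmp"
  echo "$1=$2" >> "$tmp"
  cat "$tmp" > "$3" && rm -f "$tmp"
}

# Demo on a temporary file.
demo=$(mktemp)
echo "vm.max_map_count=65530" > "$demo"
set_sysctl_conf vm.max_map_count 262144 "$demo"
cat "$demo"
rm -f "$demo"
```

On a real system the target would be /etc/sysctl.conf, edited as root and followed by sysctl -p to load the new value.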

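4.4 system call filters

  • Cause
    CentOS 6 does not support seccomp, while ES 5.4.0 sets bootstrap.system_call_filter to true by default and checks for it. The check therefore fails, and the failure prevents ES from starting.

  • Solution
    In elasticsearch.yml, set bootstrap.system_call_filter to false, placing it under the Memory section:
    bootstrap.memory_lock: false
    bootstrap.system_call_filter: false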
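
Before disabling the filter, it may be worth confirming that the kernel really lacks seccomp. One rough heuristic, sketched below, is to look for the Seccomp line that newer kernels expose in /proc/<pid>/status. This is an assumption-laden check and not what ES itself does: seccomp_reported is a hypothetical helper, and a missing line only suggests, rather than proves, that seccomp is unavailable.

```shell
# Hypothetical helper: report whether a status file contains a "Seccomp:" line.
# Rough heuristic only; defaults to the status file of the current process.
seccomp_reported() {
  if grep -q '^Seccomp:' "${1:-/proc/self/status}" 2>/dev/null; then
    echo "reported"
  else
    echo "not reported"
  fi
}

seccomp_reported
```

If the result is "not reported" on the ES host, setting bootstrap.system_call_filter to false as described above is the practical workaround.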

Reference: https://www.jianshu.com/p/89f8099a6d09
