Installing Calamari on CentOS 7

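Ceph is an open-source software-defined storage (SDS) system. With open-source software, getting it installed is only the first step; monitoring and operations afterwards are what really matter. To see a cluster's running state at a glance, monitoring software is essential. Commonly used tools for monitoring Ceph include Zabbix, inkScope, and Calamari. This article walks through installing Calamari on CentOS 7 in detail.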

Get the Calamari source

#> git clone https://github.com/ceph/calamari.git
#> git clone https://github.com/ceph/calamari-clients.git
#> git clone https://github.com/ceph/Diamond 

Build the calamari server RPM package

#> cd calamari
#> yum remove prelink    // avoids a cpio digest mismatch error during installation
#> ./build-rpm.sh

When the build finishes, the RPM packages are generated under the rpmbuild directory in the parent directory.

Install calamari server

#> cd ..    // go back from the calamari directory to its parent directory
#> yum localinstall rpmbuild/RPMS/x86_64/calamari-server-1.3.1.1-101_g945d16a.el6.x86_64.rpm
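
The package file name embeds a version and a git hash, so it will probably differ between checkouts. The following is a minimal sketch for locating the package instead of typing the name by hand. It assumes the rpmbuild/RPMS/x86_64 layout used in the command above; the helper name find_calamari_rpm is hypothetical.

```shell
# Hypothetical helper: print the newest calamari-server RPM under a build tree.
# $1 = rpmbuild directory (defaults to ./rpmbuild); assumes RPMS/x86_64 below it.
find_calamari_rpm() {
    rpm_dir="${1:-rpmbuild}/RPMS/x86_64"
    # ls -t sorts newest first; head keeps only the most recent build.
    ls -t "$rpm_dir"/calamari-server-*.rpm 2>/dev/null | head -n 1
}

# Usage (not run here):
#   yum localinstall "$(find_calamari_rpm rpmbuild)"
```

If nothing has been built yet, the helper prints nothing, so check for an empty result before passing it to yum.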

Build and install calamari client

Install dependencies

#> yum install npm ruby rubygems ruby-devel
#> npm install -g grunt grunt-cli bower grunt-contrib-compass
#> gem update --system && gem install compass

If updating the gem source fails because of network problems, handle it as follows:

#> gem sources 
#> gem sources -r https://rubygems.org/
#> gem sources -a https://ruby.taobao.org/ 
#> gem sources -u

Compile and install calamari client

#> cd calamari-clients
#> make build-real 
#> make dist     // generates the calamari-clients_1.2.2.tar.gz tarball in the parent directory
#> cd ..         // return to the parent directory of calamari-clients
#> tar -zxvf calamari-clients_1.2.2.tar.gz     // extract
#> mkdir -p /opt/calamari/webapp/content       // create the target directory
#> cd calamari-clients-1.2.2
// copy the contents into the directories below
#> for dir in manage admin login dashboard
do 
    mkdir -p /opt/calamari/webapp/content/"$dir"
    cp -pr "$dir"/dist/* /opt/calamari/webapp/content/"$dir"/
done
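
After the copy loop it can be worth confirming that all four applications actually landed in the web root. This is a minimal sketch, assuming the directory names and target path used above; the function name check_webapp_content is hypothetical.

```shell
# Hypothetical check: confirm each client app was copied into the web root.
# $1 = web content root (defaults to the path used in this article).
check_webapp_content() {
    root="${1:-/opt/calamari/webapp/content}"
    missing=0
    for dir in manage admin login dashboard; do
        # A directory counts as populated if it contains at least one entry.
        if [ -z "$(ls -A "$root/$dir" 2>/dev/null)" ]; then
            echo "missing or empty: $root/$dir" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage (not run here):
#   check_webapp_content && echo "client files are in place"
```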

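If make build-real fails to download a dependency because of network problems, as in the output below, replace the download address in the corresponding file with a reachable URL. For example: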

phantomjs@1.9.18 install /datapool/calamari-clients/manage/node_modules/karma-phantomjs-launcher/node_modules/phantomjs   // directory containing install.js
> node install.js

Downloading https://bitbucket.org/ariya/phantomjs/downloads/phantomjs-1.9.8-linux-x86_64.tar.bz2
Saving to /datapool/calamari-clients/manage/node_modules/karma-phantomjs-launcher/node_modules/phantomjs/phantomjs/phantomjs-1.9.8-linux-x86_64.tar.bz2
Receiving...

Error making request.
Error: connect ETIMEDOUT    // download timed out because of the GFW
    at errnoException (net.js:905:11)
    at Object.afterConnect [as oncomplete] (net.js:896:19)

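The error above shows that downloading the dependency phantomjs-1.9.8-linux-x86_64.tar.bz2 failed. Replace the download address in install.js as follows (the Taobao mirror is used here):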

var cdnUrl = process.env.npm_config_phantomjs_cdnurl || process.env.PHANTOMJS_CDNURL || 'http://npm.taobao.org/mirrors/phantomjs'
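
The line above checks two environment variables before falling back to the hard-coded URL. Assuming the unmodified install.js performs the same lookup (not verified here), the mirror could instead be supplied through the environment, which avoids editing files inside node_modules. A minimal sketch; the wrapper name with_phantomjs_mirror is hypothetical.

```shell
# Mirror URL taken from the install.js line shown above.
PHANTOMJS_MIRROR="http://npm.taobao.org/mirrors/phantomjs"

# Hypothetical wrapper: run any command with PHANTOMJS_CDNURL set for it only.
with_phantomjs_mirror() {
    env PHANTOMJS_CDNURL="$PHANTOMJS_MIRROR" "$@"
}

# Usage (not run here):
#   with_phantomjs_mirror make build-real
```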

Initialize calamari

After the steps above, calamari server and calamari client are installed. Before using calamari for the first time, initialize it as follows:

#> calamari-ctl initialize

If the service restart hangs during initialization, upgrade supervisor to 3.0 or later:

 #> git clone https://github.com/Supervisor/supervisor.git
 #> cd supervisor && python setup.py install 
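
To decide whether the upgrade is needed at all, the installed version can be compared against 3.0 first. This is a rough sketch: version_ge is a hypothetical helper, it relies on GNU sort -V, and it assumes the installed supervisord supports --version.

```shell
# Hypothetical helper: succeed if version $1 >= version $2.
# Relies on GNU sort -V (version sort) from coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report only when supervisord is installed.
if command -v supervisord >/dev/null 2>&1; then
    if version_ge "$(supervisord --version)" "3.0"; then
        echo "supervisor is 3.0 or later"
    else
        echo "supervisor is older than 3.0; upgrade it"
    fi
fi
```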

Configure calamari server

Configure the firewall

### for salt-master 
#> iptables -A INPUT -m state --state NEW -m tcp -p tcp --dport 4505 -j ACCEPT 
#> iptables -A INPUT -m state --state NEW -m tcp -p tcp --dport 4506 -j ACCEPT 
### for carbon 
#> iptables -A INPUT -m state --state NEW -m tcp -p tcp --dport 2003 -j ACCEPT 
#> iptables -A INPUT -m state --state NEW -m tcp -p tcp --dport 2004 -j ACCEPT

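Configure saltstack authentication

Once the salt-minion service starts on a ceph node, it automatically requests authentication from the salt-master. On the Calamari server, list the salt-minion keys with the following command: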

#> salt-key -L

A ceph node whose salt-minion service has just started appears under the Unaccepted Keys list. For Calamari to manage ceph nodes through saltstack, these keys must be accepted:

#> salt-key -A
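
salt-key -A accepts every pending key at once. To review the pending minions first, the listing can be filtered down to the unaccepted section. This is a minimal sketch: unaccepted_keys is a hypothetical helper, and it assumes the listing consists of header lines ending in "Keys:" followed by one minion ID per line.

```shell
# Hypothetical helper: read `salt-key -L` output on stdin and print only the
# minion IDs listed under "Unaccepted Keys:".
unaccepted_keys() {
    awk '
        /Keys:$/ { in_section = ($0 == "Unaccepted Keys:"); next }
        in_section && NF { print $1 }
    '
}

# Usage (not run here):
#   salt-key -L | unaccepted_keys
# A single reviewed minion can then be accepted with:
#   salt-key -a <minion-id>
```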

Install diamond and salt-minion

Build the diamond RPM package

#> cd Diamond
#> git checkout origin/calamari
#> make rpm      // generates diamond-3.4.67-0.noarch.rpm in the dist directory

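Install salt-minion and diamond on all ceph nodes

First copy the diamond RPM package built above to every ceph node, then run the following commands to install the required packages: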

#> yum localinstall diamond-3.4.67-0.noarch.rpm
#> yum install -y salt-minion

Configure and start salt-minion on all ceph nodes

#> touch /etc/salt/minion.d/calamari.conf 
### {calamari-server-name} is the address of the calamari server (IP or domain name)
#> echo "master: {calamari-server-name}" > /etc/salt/minion.d/calamari.conf
### note the space between the colon and the address that follows it
#> echo "master: {calamari-server-name}" >> /etc/salt/minion
#> service salt-minion restart 
#> service diamond start 
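
When many ceph nodes need the same setting, the configuration step can be wrapped in a small function. This is a minimal sketch, assuming the file path used above; write_minion_master is a hypothetical name and calamari.example.com is a placeholder address.

```shell
# Hypothetical helper: write the salt master address to a minion config file.
# $1 = calamari server address, $2 = target file (defaults to the path above).
write_minion_master() {
    server="$1"
    conf="${2:-/etc/salt/minion.d/calamari.conf}"
    # YAML requires a space after the colon.
    printf 'master: %s\n' "$server" > "$conf"
}

# Usage (not run here):
#   write_minion_master calamari.example.com
#   service salt-minion restart
```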

If diamond fails to start and the diamond log shows the following error:

#> tail -f /var/log/diamond/diamond.log
[2015-11-03 19:06:35,044] [MainThread] pysnmp.entity.rfc3413.oneliner.cmdgen failed to load

run the diamond service as the root user by making the following change:

#> echo "user=root,group=root" >> /etc/diamond/diamond.conf

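At this point the calamari monitoring setup is installed and configured, and you can view the state of the ceph cluster in the web UI. If you are unlucky, read on for the troubleshooting notes below.

Pitfalls I ran into

Q: The following error appears in the diamond log: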

-- Unit diamond.service has begun starting up.
Nov 03 18:46:27 ceph-osd2 diamond[3790]: Failed to acquire lockfile: /var/lock/subsys/diamond.
Nov 03 18:46:27 ceph-osd2 diamond[3790]: Held by 14377
Nov 03 18:46:27 ceph-osd2 diamond[3790]: [FAILED]
Nov 03 18:46:27 ceph-osd2 systemd[1]: diamond.service: control process exited, code=exited status=1
Nov 03 18:46:27 ceph-osd2 systemd[1]: Failed to start LSB: System statistics collector for Graphite.

Deleting the files under the /var/lock/subsys directory fixes it:

#> rm -f /var/lock/subsys/*
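
Removing everything under /var/lock/subsys also clears the locks of other services. A more cautious variant first checks whether the holder reported in the log is still running, and removes only the diamond lock file if it is not. A minimal sketch, assuming it runs as root; lock_holder_alive is a hypothetical helper.

```shell
# Hypothetical helper: succeed if the PID reported as "Held by" is still alive.
# kill -0 sends no signal; it only tests whether the process can be signalled,
# so run it as root to avoid false negatives on other users' processes.
lock_holder_alive() {
    kill -0 "$1" 2>/dev/null
}

# Usage (not run here), with the PID and lock file from the log above:
#   lock_holder_alive 14377 || rm -f /var/lock/subsys/diamond
```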

Q: Opening the web page returns a 500 error
1) The cthulhu-manager process may not be running; restarting supervisor fixes it:

#> /usr/bin/python /usr/bin/supervisord -c /etc/supervisord.conf

2) The service may lack permission to write its logs under /var/log/calamari/*:

#> chmod -R a+rwX /var/log/calamari    // grant write access to the log directory only

Q: Opening the /dashboard/ page always reports internal server error(5), with the following error logs:

 #> vi /var/log/calamari/calamari.log
 2015-11-05 20:25:39,252 - ERROR - django.request Internal Server Error: /api/v1/cluster/4a4dd60f-c8bb-4982-a1b4-9b891f78c30b/osd
Traceback (most recent call last):
  File "/opt/calamari/venv/lib/python2.6/site-packages/django/core/handlers/base.py", line 117, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/opt/calamari/venv/lib/python2.6/site-packages/rest_framework/viewsets.py", line 78, in view
    return self.dispatch(request, *args, **kwargs)
  File "/opt/calamari/venv/lib/python2.6/site-packages/calamari_rest_api-0.1-py2.6.egg/calamari_rest/views/rpc_view.py", line 94, in dispatch
    return super(RPCViewSet, self).dispatch(request, *args, **kwargs)
  File "/opt/calamari/venv/lib/python2.6/site-packages/django/views/decorators/csrf.py", line 77, in wrapped_view
    return view_func(*args, **kwargs)
  File "/opt/calamari/venv/lib/python2.6/site-packages/rest_framework/views.py", line 399, in dispatch
    response = self.handle_exception(exc)
  File "/opt/calamari/venv/lib/python2.6/site-packages/calamari_rest_api-0.1-py2.6.egg/calamari_rest/views/rpc_view.py", line 111, in handle_exception
    return super(RPCViewSet, self).handle_exception(exc)
  File "/opt/calamari/venv/lib/python2.6/site-packages/rest_framework/views.py", line 396, in dispatch
    response = handler(request, *args, **kwargs)
  File "/opt/calamari/venv/lib/python2.6/site-packages/calamari_rest_api-0.1-py2.6.egg/calamari_rest/views/v1.py", line 423, in get
    osds, osds_by_pg_state = self.generate(pg_summary, osd_map, server_info, servers)
  File "/opt/calamari/venv/lib/python2.6/site-packages/calamari_rest_api-0.1-py2.6.egg/calamari_rest/views/v1.py", line 371, in generate
    for osd_id, osd_pg_summary in pg_summary['by_osd'].items():
TypeError: 'NoneType' object is unsubscriptable

#> vi /var/log/calamari/cthulhu.log
  2015-11-04 17:38:59,278 - ERROR - cthulhu Exception handling message with tag ceph/cluster/4a4dd60f-c8bb-4982-a1b4-9b891f78c30b
Traceback (most recent call last):
  File "/opt/calamari/venv/lib/python2.6/site-packages/calamari_cthulhu-0.1-py2.6.egg/cthulhu/manager/cluster_monitor.py", line 244, in _run
    self.on_heartbeat(data['id'], data['data'])
  File "/opt/calamari/venv/lib/python2.6/site-packages/calamari_cthulhu-0.1-py2.6.egg/cthulhu/gevent_util.py", line 35, in wrapped
    return func(*args, **kwargs)
  File "/opt/calamari/venv/lib/python2.6/site-packages/calamari_cthulhu-0.1-py2.6.egg/cthulhu/manager/cluster_monitor.py", line 346, in on_heartbeat
    cluster_data['versions'][sync_type.str])
  File "/opt/calamari/venv/lib/python2.6/site-packages/calamari_cthulhu-0.1-py2.6.egg/cthulhu/manager/cluster_monitor.py", line 99, in on_version
    self.fetch(reported_by, sync_type)
  File "/opt/calamari/venv/lib/python2.6/site-packages/calamari_cthulhu-0.1-py2.6.egg/cthulhu/manager/cluster_monitor.py", line 109, in fetch
    client = LocalClient(config.get('cthulhu', 'salt_config_path'))
  File "/usr/lib/python2.6/site-packages/salt/client/__init__.py", line 136, in __init__
    listen=not self.opts.get('__worker', False))
  File "/usr/lib/python2.6/site-packages/salt/utils/event.py", line 114, in get_event
    return MasterEvent(sock_dir, opts)
  File "/usr/lib/python2.6/site-packages/salt/utils/event.py", line 559, in __init__
    super(MasterEvent, self).__init__('master', sock_dir, opts)
  File "/usr/lib/python2.6/site-packages/salt/utils/event.py", line 181, in __init__
    self.get_event(wait=1)
  File "/usr/lib/python2.6/site-packages/salt/utils/event.py", line 410, in get_event
    ret = self._get_event(wait, tag, tags_regex)
  File "/usr/lib/python2.6/site-packages/salt/utils/event.py", line 351, in _get_event
    socks = dict(self.poller.poll(wait * 1000))
  File "/opt/calamari/venv/lib/python2.6/site-packages/zmq/green/poll.py", line 81, in poll
    select.select(rlist, wlist, xlist)
  File "/opt/calamari/venv/lib/python2.6/site-packages/gevent/select.py", line 68, in select
    result.event.wait(timeout=timeout)
  File "/opt/calamari/venv/lib/python2.6/site-packages/gevent/event.py", line 77, in wait
    result = self.hub.switch()
  File "/opt/calamari/venv/lib/python2.6/site-packages/gevent/hub.py", line 337, in switch
    switch_out()
  File "/opt/calamari/venv/lib/python2.6/site-packages/calamari_cthulhu-0.1-py2.6.egg/cthulhu/gevent_util.py", line 15, in asserter
    raise ForbiddenYield("Context switch during `nosleep` region!")

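This problem occurs because saltstack (salt, salt-master, salt-minion) is incompatible with calamari. CentOS 7 installs salt-2015.5.5 by default.
To fix it, uninstall that version and install a salt-2014.1.x release; salt-2014.1.13-1 works for me. salt packages can be downloaded from: http://rpmfind.net/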
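
Before digging through tracebacks, a quick check of the installed salt version can tell whether this incompatibility is the likely cause. A minimal sketch; salt_version_ok is a hypothetical helper that accepts only the 2014.1.x series reported to work above.

```shell
# Hypothetical helper: succeed only for salt versions in the 2014.1.x series.
salt_version_ok() {
    case "$1" in
        2014.1.*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Usage (not run here): query the installed package version with rpm.
#   salt_version_ok "$(rpm -q --qf '%{VERSION}' salt)" || echo "unsupported salt version"
```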

Q: The following error appears in the diamond log:

#> tail -f /var/log/diamond/diamond.log
 GraphiteHandler: Failed to connect to 10.168.122.165:2003. timed out.

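1) It may be a firewall problem; check the firewall settings.
2) It may be a routing problem; check the routing settings.
3) The cthulhu-manager process on the server may not be running; start it.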
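
To narrow down which of these causes applies, test from the ceph node whether the port on the calamari server is reachable at all. A minimal sketch, assuming bash and the coreutils timeout command are available; port_open is a hypothetical helper.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port opens within 3 s.
# Uses bash's /dev/tcp pseudo-device; $1 = host, $2 = port.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage (not run here), with the address from the log above:
#   port_open 10.168.122.165 2003 || echo "carbon port unreachable"
```

A timeout usually points to a firewall or routing issue, while an immediate refusal suggests that nothing is listening on the server.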

Q: Installing salt-minion reports the following error:

Error: Package: python-msgpack-0.4.6-1.el6.x86_64 (epel)
           Requires: python(abi) = 2.6
           Installed: python-2.7.5-16.el7.x86_64 (@anaconda)
               python(abi) = 2.7
               python(abi) = 2.7

Install python2.6 alongside the system Python, then specify python 2.6 when installing salt-minion:

configure --with-python2.6=/usr/local/python2.6