Problem: during a full deployment of BlueKing Community Edition, installing bkdata fails with "databus.service.consul start failed."
[root@paas-1 install]# ./bkcec start bkdata
[192.168.50.117]20181212-091416 72 starting bkdata(ALL) on host: 192.168.50.115
"-":23: bad minute
errors in crontab file, can't install.
[192.168.50.117]20181212-091427 79 going to init snapshot data. this may take a while.
E
======================================================================
ERROR: init_snapshot_config (databus.tests.DatabusHealthTestCase)
------------------------------------------------------------------------------------------------------------------------------
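Troubleshooting steps:
Running dig databus.service.consul returns a normal result.
Running ./bkcec start bkdata databus reports "ERROR: init_snapshot_config (databus.tests.DatabusHealthTestCase)".
After running ./bkcec stop bkdata, run ./bkcec install bkdata 1 (removes the previous environment and installs over it).
Run ./bkcec initdata bkdata (initializes bkdata).
Run ./bkcec start bkdata; it again reports "databus.service.consul start failed."
Note: cat .bk_install.step shows the installation progress...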
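The DNS check in the first step can be scripted. The following is a minimal sketch, assuming dig is installed on the host; the function names has_ipv4_answer and check_consul_name are illustrative and are not part of the BlueKing install scripts.

```shell
# Read `dig +short` output on stdin; succeed if at least one line
# is a dotted-quad IPv4 address.
has_ipv4_answer() {
    grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# Resolve one name and print a one-line verdict.
check_consul_name() {
    if dig +short "$1" | has_ipv4_answer; then
        echo "OK   $1 resolves"
    else
        echo "FAIL $1 does not resolve"
        return 1
    fi
}
```

Usage: check_consul_name databus.service.consul. As this case shows, a name that resolves does not prove the service behind it is healthy, so this check only rules out the DNS layer.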
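Resolution: restarted cmdb.
Root cause analysis:
bkdata could not fetch the basic business information from cmdb, which caused the error.
There is also a script bug.
(On the bkdata machine, run vim /data/bkce/bkdata/dataapi/databus/tests.py and comment out the code that references "update_bizid". This issue will be fixed in the next version.)
[Screenshot: tests.py after the code has been commented out]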
Other useful commands:
Confirm the location of the controller host:
[root@paas-1 install]# cat /data/install/.controller_ip
192.168.50.117
View logs:
[root@paas-1 install]# cd /data/bkce/logs/
[root@paas-1 logs]# ll
Load the ssh helper ($ denotes a variable; see cat /data/install/utils.fc):
[root@paas-1 install]# source utils.fc
[root@paas-1 install]# ssh $BKDATA_IP
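# After logging in over ssh, you can run ifconfig to check the host's IP. utils.fc is a script file; it is loaded mainly so that you can log in to a host by service name instead of by IP.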
[root@rbtnode1 install]# ssh $FTA_IP
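If a variable such as $BKDATA_IP is empty, ssh fails with a confusing message. The following is a minimal sketch of a guard, assuming utils.fc defines variables named <MODULE>_IP as in the examples above; module_ip is a hypothetical helper and does not exist in utils.fc.

```shell
# Hypothetical helper: map a module name such as "bkdata" to the value of
# the matching <MODULE>_IP variable (assumed to be set by utils.fc).
module_ip() {
    name=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    # Accept only letters, digits and underscores before using eval.
    case "$name" in
        ''|*[!A-Z0-9_]*) echo "invalid module name: $1" >&2; return 2 ;;
    esac
    eval "ip=\${${name}_IP:-}"
    if [ -z "$ip" ]; then
        echo "${name}_IP is not set; was utils.fc sourced?" >&2
        return 1
    fi
    printf '%s\n' "$ip"
}
```

Usage, after source utils.fc: ssh "$(module_ip bkdata)". How utils.fc represents a module that runs on several hosts is not covered here, so treat the single-address case as an assumption.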
List files in detail:
[root@rbtnode1 bkdata]# ls -lsrt
Follow the log output:
[root@rbtnode1 bkdata]# tail -f kernel.log
Check resource usage:
[root@rbtnode1 bkdata]# top
top - 09:35:26 up 18:53, 1 user, load average: 17.46, 12.08, 10.62
Tasks: 361 total, 1 running, 359 sleeping, 1 stopped, 0 zombie
%Cpu(s): 64.0 us, 14.2 sy, 0.0 ni, 20.8 id, 0.2 wa, 0.0 hi, 0.9 si, 0.0 st
KiB Mem : 16267340 total, 2454172 free, 12075284 used, 1737884 buff/cache
KiB Swap: 6160380 total, 6160380 free, 0 used. 3669524 avail Mem
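A 1-minute load average of 17.46, as in the top output above, only means something relative to the number of CPUs on the host. The following is a minimal sketch of that comparison for Linux, assuming /proc/loadavg and getconf _NPROCESSORS_ONLN are available; the function names are illustrative.

```shell
# Succeed (exit 0) when the load average exceeds the CPU count.
# $1 = load average, $2 = number of CPUs.
load_exceeds_cpus() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { exit !(load + 0 > cpus + 0) }'
}

# Report on the current host using the 1-minute load average.
report_load() {
    read -r load1 _ < /proc/loadavg
    cpus=$(getconf _NPROCESSORS_ONLN)
    if load_exceeds_cpus "$load1" "$cpus"; then
        echo "WARNING: load $load1 exceeds $cpus CPUs"
    else
        echo "load $load1 is within $cpus CPUs"
    fi
}
```

Usage: run report_load on the bkdata host before and after starting the services. A load that stays well above the CPU count is consistent with the resource exhaustion described in this article, although it does not identify which process is responsible.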
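Check the scheduled tasks (crontab) to confirm that the services are running normally, or check the configuration files to see whether a cluster leader has been elected.
If the crontab contains garbled entries, clear the scheduled tasks and then run [root@rbtnode1 install]# ./bkcec install cron 1 to reinstall crontab. The entries are written to crontab automatically when the services start.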
[root@rbtnode1 ~]# crontab -l
* * * * * export INSTALL_PATH=/data/bkce; /data/bkce/bin/process_watch consul >/dev/null 2>&1
* * * * * export INSTALL_PATH=/data/bkce; /data/bkce/bin/process_watch nginx >/dev/null 2>&1
* * * * * export INSTALL_PATH=/data/bkce; /data/bkce/bin/process_watch zk >/dev/null 2>&1
* * * * * export INSTALL_PATH=/data/bkce; /data/bkce/bin/process_watch rabbitmq >/dev/null 2>&1
* * * * * /usr/local/gse/agent/bin/gsectl watch
* * * * * export INSTALL_PATH=/data/bkce; /data/bkce/bin/process_watch paas_agent >/dev/null 2>&1
* * * * * export INSTALL_PATH=/data/bkce; /data/bkce/bin/process_watch es >/dev/null 2>&1
* * * * * export INSTALL_PATH=/data/bkce; /data/bkce/bin/process_watch kafka >/dev/null 2>&1
*/10 * * * * /data/bkce/bkdata/dataapi/bin/update_cc_cache.sh
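The "bad minute" error at the top of this article means that crontab rejected an entry because of its first field. The following is a rough sketch for locating such entries, assuming standard five-field crontab syntax; find_bad_minute_lines is an illustrative name, and the check is deliberately loose (it validates characters only, not the 0-59 range).

```shell
# Read crontab text on stdin and print "line-number: text" for every entry
# whose first field contains characters that cannot appear in a minute
# expression. Comments, blank lines, @keyword entries and VAR=value
# settings are skipped.
find_bad_minute_lines() {
    awk '
        /^[ \t]*(#|$)/ { next }
        /^[ \t]*@/ { next }
        /^[ \t]*[A-Za-z_][A-Za-z0-9_]*[ \t]*=/ { next }
        $1 !~ "^[0-9*/,-]+$" { print NR ": " $0 }
    '
}
```

Usage: crontab -l | find_bad_minute_lines. An empty result means no entry was flagged by this loose check; it does not guarantee that crontab will accept the file.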
Check process information:
[root@rbtnode1 ~]# ps -ef |grep bkdata
[root@rbtnode1 ~]# ps -ef |grep gse_agent
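Note: the script bug workaround addresses the following problem, encountered while initializing bkdata during BlueKing deployment:
Fix: on the bkdata machine, run vim /data/bkce/bkdata/dataapi/databus/tests.py and comment out the code that references "update_bizid".
Cause: if it is not commented out, the referenced code consumes a large amount of host resources, and the resulting performance bottleneck prevents the BlueKing services from starting.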
[root@paas-1 install]# ./bkcec initdata bkdata
initdata for bkdata()
[192.168.50.117]20181212-101752 153 exec initdata_bkdata on 192.168.50.115
[192.168.50.115]20181212-101755 103 start to make migration for bkdata ...
[192.168.50.115]20181212-101755 111 on-migrate ... /data/bkce/bkdata/dataapi/on_migrate
[192.168.50.115]20181212-101757 9 init dataserver zk config
[192.168.50.115]20181212-101757 12 create topic
[192.168.50.115]20181212-101758 15 run trt migration
System check identified some issues:
WARNINGS:
trt.TrtResultTableField.field_index: (fields.W122) 'max_length' is ignored when used with IntegerField
HINT: Remove 'max_length' from field
Operations to perform:
Apply all migrations: trt
Running migrations:
No migrations to apply.
Your models have changes that are not yet reflected in a migration, and so won't be applied.
Run 'manage.py makemigrations' to make new migrations, and then re-run 'manage.py migrate' to apply them.
[192.168.50.115]20181212-101801 18 insert reserved dataid
E=================set reserved dataid========================================
======================================================================
ERROR: update_reserved_dataid (databus.tests.DatabusHealthTestCase)
------------------------------------------------------------------------------------------------------------------------------
Traceback (most recent call last):
File "/data/bkce/bkdata/dataapi/databus/tests.py", line 46, in update_reserved_dataid
blueking_bizid = utils.get_blueking_bizid()
File "/data/bkce/bkdata/dataapi/databus/init/utils.py", line 19, in get_blueking_bizid
raise Exception('Failed to get application id of BlueKing. The response is error %s' % json.dumps(ret))
Exception: Failed to get application id of BlueKing. The response is error {"message": "Component request third-party system [CC] interface [get_app_list] error: Status Code: 404, Error Message: Third-party system does not find this interface, please try again later or contact component developer to handle this", "code": 1306201, "data": null, "result": false, "request_id": "47ef124353824f7a898900c0defc93e1"}
----------------------------------------------------------------------
Ran 1 test in 0.759s
FAILED (errors=1)
[192.168.50.115]20181212-101804 21 running 'update_reserved_dataid' for databus health test failed.
[192.168.50.115]20181212-101804 130 migrate failed for bkdata(dataapi)
[192.168.50.117]20181212-101803 453 create database bksuite_common
[192.168.50.117]20181212-101803 455 add version info to db
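In long output like the above, the failing test is easy to miss. The following is a small sketch that pulls the failed test names out of the output, assuming the "ERROR: <method> (<test case>)" line format shown in the transcript; failed_tests is an illustrative name.

```shell
# Read bkcec output on stdin and print the name of each failed test method,
# taken from lines of the form "ERROR: <method> (<test case>)".
failed_tests() {
    sed -n 's/^ERROR: \([^ ]*\) (.*)$/\1/p'
}
```

Usage: ./bkcec initdata bkdata 2>&1 | tee initdata.log | failed_tests. For the run above this would print update_reserved_dataid, which points to the traceback to read in initdata.log.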
Environment:
[root@paas-1 install]# cd /data/src/
You have new mail in /var/spool/mail/root
[root@paas-1 src]# grep . VERSION */VERSION */*/VERSION
VERSION:4.1.16
cmdb/VERSION:0.0.42
fta/VERSION:4.1.12
gse/VERSION:3.2.12
job/VERSION:4.3.3
license/VERSION:3.1.4
open_paas/VERSION:3.0.83
paas_agent/VERSION:3.0.8
bkdata/dataapi/VERSION:1.2.105
bkdata/databus/VERSION:1.2.23
bkdata/monitor/VERSION:0.2.6
[root@paas-1 src]#
Note: this article summarizes what I have recently learned about BlueKing and is offered for reference only.