Building a CDH 6.3.2 cluster on a CentOS 7 image installed through VMware Pro 15 on Windows 10

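Background: with the release of Cloudera Enterprise 6.3.3 there is no longer a free edition, so I decided to build a CDH 6.3.2 cluster on Windows 10, using VMware Pro 15 with the CentOS-7-x86_64-DVD-1810 image. However, Cloudera only provides the CDH-6.3.1 rpm packages for installing CM and the CDH-6.3.2 parcel for building the cluster, as shown below: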

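For the detailed installation steps, see reference 1: "How to install CDH 6.3.3 on Redhat 7.6"; and reference 2: "CDH 6.3.1 enterprise cluster, a truly offline deployment (rpm + http file deployment method): the most detailed on the web, with accompanying videos, documents and installation packages, production-ready".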

Here are the pitfalls I ran into:

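1. Online tutorials for building cdh6.x all recommend matching versions, such as "cdh6.3.0 rpm + cdh6.3.0 parcel", but with what Cloudera officially provides the only option is "cdh6.3.1 rpm + cdh6.3.2 parcel". I have tested this combination: the CDH 6.3.2 cluster builds and runs normally. I will not discuss here what cdh6.3.1/2 improves over cdh6.3.0, but the version numbers of the individual CDH components are the same; see: https://archive.cloudera.com/cdh6/6.3.2/redhat7/yum/RPMS/noarch/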

2. When the scm_prepare_database.sh script is used to generate the databases CDH needs (CM, AM, RM and so on), it may fail with "Unable to find JDBC driver for database type: MySQL", as shown below:

[root@cdh632_master01 ~]# /opt/cloudera/cm/schema/scm_prepare_database.sh mysql cmf cmf 'password-of-your-mysql-user'
JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing:  /usr/java/jdk1.8.0_181-cloudera/bin/java -cp :/opt/cloudera/cm/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
2020-03-03 15:04:59,945 [main] INFO  com.cloudera.enterprise.dbutil.DbCommandExecutor  - Unable to find JDBC driver for database type: MySQL
2020-03-03 15:04:59,946 [main] ERROR com.cloudera.enterprise.dbutil.DbCommandExecutor  - JDBC Driver com.mysql.jdbc.Driver not found.
2020-03-03 15:04:59,946 [main] ERROR com.cloudera.enterprise.dbutil.DbCommandExecutor  - Exiting with exit code 3
--> Error 3, giving up (use --force if you wish to ignore the error)

The cause most commonly cited online is that mysql-connector-java-5.1.47.jar was not correctly renamed and placed in the fixed location on every host:

mkdir -p /usr/share/java && cp /root/jcz/mysql-connector-java-5.1.47.jar /usr/share/java/mysql-connector-java.jar

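In my case, though, the cause of the "MySQL driver jar not found" error was different: cloudera-manager-server was not installed on this machine. I had not installed it because a client host in a CDH cluster does not need the server, which also means this machine could never start the server.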
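Both causes described above (a missing or misnamed driver jar, and a host without the Cloudera Manager server package) can be checked before running scm_prepare_database.sh. The following is only a minimal sketch: the helper names driver_present and server_installed are hypothetical, while the jar path and the package name are the ones used in this post.

```shell
# Hypothetical pre-flight helpers; not part of Cloudera Manager.

# Succeeds if the renamed MySQL driver jar exists at the given path
# (default: the fixed location used in this post).
driver_present() {
    [ -f "${1:-/usr/share/java/mysql-connector-java.jar}" ]
}

# Succeeds if the cloudera-manager-server rpm is installed on this host.
server_installed() {
    rpm -q cloudera-manager-server >/dev/null 2>&1
}
```

A possible way to use them: `driver_present || echo "driver jar missing"` and `server_installed || echo "cloudera-manager-server not installed on this host"`.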

3. After starting the agent, checking its status shows an error:

[root@cdh632_worker02 ~]# systemctl status cloudera-scm-agent
● cloudera-scm-agent.service - Cloudera Manager Agent Service
   Loaded: loaded (/usr/lib/systemd/system/cloudera-scm-agent.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Tue 2020-03-03 15:44:40 CST; 44s ago
  Process: 6057 ExecStart=/opt/cloudera/cm-agent/bin/cm agent (code=exited, status=0/SUCCESS)
 Main PID: 6057 (code=exited, status=0/SUCCESS)

Mar 03 15:44:40 cdh632_worker02 cm[6057]: [03/Mar/2020 15:44:40 +0000] 6057 MainThread agent        INFO     Re-using pre-existing directory.../cgroups
Mar 03 15:44:40 cdh632_worker02 cm[6057]: [03/Mar/2020 15:44:40 +0000] 6057 MainThread agent        INFO     Re-using pre-existing directory.../process
Mar 03 15:44:40 cdh632_worker02 cm[6057]: [03/Mar/2020 15:44:40 +0000] 6057 MainThread tmpfs        INFO     Reusing mounted tmpfs at /var/r.../process
Mar 03 15:44:40 cdh632_worker02 cm[6057]: [03/Mar/2020 15:44:40 +0000] 6057 MainThread main         ERROR    Top-level exception: Hostname i...aracter.
Mar 03 15:44:40 cdh632_worker02 cm[6057]: Traceback (most recent call last):
Mar 03 15:44:40 cdh632_worker02 cm[6057]: File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/main.py", line 105, in main_impl
Mar 03 15:44:40 cdh632_worker02 cm[6057]: ag.configure_service()
Mar 03 15:44:40 cdh632_worker02 cm[6057]: File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/agent.py", line 608, in configure_service
Mar 03 15:44:40 cdh632_worker02 cm[6057]: raise Exception("Hostname is invalid; it contains an underscore character.")
Mar 03 15:44:40 cdh632_worker02 cm[6057]: Exception: Hostname is invalid; it contains an underscore character.
Hint: Some lines were ellipsized, use -l to show in full.

The line "Exception: Hostname is invalid; it contains an underscore character." means that a hostname must not contain an underscore. See: https://stackoverflow.com/questions/17830232/cloudera-agent-giving-error-hostname-is-invalid-it-contains-an-underscore-ch

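Then simply run hostnamectl --static set-hostname cdh632-worker02 to rename the host. Note that there is no need to rename the virtual machine in VMware Pro 15, nor to rename the many files auto-generated for that host under the VirtualMachines directory on Windows 10.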
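The underscore check and the rename can be scripted so that every host is treated the same way. This is a sketch under the assumption that replacing each underscore with a hyphen is the naming you want (as in cdh632_worker02 becoming cdh632-worker02); has_underscore and suggest_hostname are hypothetical helper names.

```shell
# Hypothetical helpers for spotting and fixing underscore hostnames.

# Succeeds if the given name contains an underscore.
has_underscore() {
    case "$1" in
        *_*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Prints the name with every underscore replaced by a hyphen.
suggest_hostname() {
    printf '%s\n' "$1" | tr '_' '-'
}
```

For example, `has_underscore "$(hostname)" && hostnamectl --static set-hostname "$(suggest_hostname "$(hostname)")"` would rename the current host only when needed. If the hosts resolve each other through /etc/hosts, those entries presumably need the new names as well.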

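4. Next, after starting the server and the agents, I opened master-ip:7180 in the browser to install the cluster, and suddenly remembered that I had not set up passwordless ssh again after the rename. The ssh setup I had done earlier was as follows: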

[root@worker11 ~]# ssh-keygen
[root@worker11 ~]# ls ~/.ssh/
id_rsa  id_rsa.pub
[root@worker11 ~]# netstat -tlunp | grep sshd
-bash: netstat: command not found
[root@worker11 ~]# yum install net-tools
[root@worker11 ~]# netstat -tlunp | grep sshd
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      43404/sshd
tcp6       0      0 :::22                   :::*                    LISTEN      43404/sshd
[root@worker11 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@master11
Now try logging into the machine, with:   "ssh 'root@master11'"
[root@worker11 ~]# ls ~/.ssh/
authorized_keys (new)  id_rsa  id_rsa.pub  known_hosts (only added at this point)
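After a rename, the key has to be copied again for the new hostnames. The sketch below only prints the ssh-copy-id commands so they can be reviewed before being run; print_ssh_setup is a hypothetical helper, and the key path and root user follow the transcript above.

```shell
# Hypothetical dry-run helper: prints one ssh-copy-id command per host.
# Usage: print_ssh_setup <public-key-path> <host> [<host> ...]
print_ssh_setup() {
    key="$1"
    shift
    for host in "$@"; do
        printf 'ssh-copy-id -i %s root@%s\n' "$key" "$host"
    done
}
```

For example, `print_ssh_setup /root/.ssh/id_rsa.pub cdh632-master01 cdh632-worker02` lists the commands for the two renamed hosts named in this post; piping the output to `sh` would execute them.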

During cluster installation, none of the three hosts could get past "Distribute" in the Install Parcels step, as shown below:

Clicking the three hosts shown in blue leads into CM, where "Inspect All Hosts" can be viewed, as shown below:

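First of all, the JDK on all three hosts could not possibly be the problem, because I installed oracle-j2sdk1.8-1.8.0+update181-1.x86_64.rpm, the package Cloudera provides alongside the rpm downloads. The agent status (systemctl status cloudera-scm-agent) showed nothing wrong either, and I was baffled.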

Because of this, I mistakenly uninstalled and reinstalled the server and the agent:

[root@cdh632-master01 ~]# systemctl stop cloudera-scm-agent
[root@cdh632-master01 ~]# systemctl stop cloudera-scm-server
[root@cdh632-master01 ~]# yum -y remove cloudera-manager-agent
[root@cdh632-master01 ~]# yum -y remove cloudera-manager-server
[root@cdh632-master01 ~]# yum -y remove cloudera-manager-daemons
[root@cdh632-master01 ~]# cd /var/www/html/cloudera-repos/
[root@cdh632-master01 cloudera-repos]# sudo yum install -y cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm
[root@cdh632-master01 cloudera-repos]# sudo yum install -y cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm
[root@cdh632-master01 cloudera-repos]# sudo yum install -y cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm
[root@cdh632-master01 cloudera-repos]# /opt/cloudera/cm/schema/scm_prepare_database.sh mysql cmf cmf 'password-of-your-mysql-user'
[root@cdh632-master01 cloudera-repos]# sed -i "s/server_host=localhost/server_host=cdh632-master01/g" /etc/cloudera-scm-agent/config.ini
[root@cdh632-master01 cloudera-repos]# systemctl start cloudera-scm-server
[root@cdh632-master01 cloudera-repos]# systemctl start cloudera-scm-agent
[root@cdh632-master01 cloudera-repos]# systemctl status cloudera-scm-server
[root@cdh632-master01 cloudera-repos]# systemctl status cloudera-scm-agent
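The sed command in the transcript above only changes server_host when its current value is exactly localhost, so it can be worth reading the value back afterwards. A minimal sketch; agent_server_host is a hypothetical helper and the default path is the config.ini used above.

```shell
# Hypothetical helper: prints the server_host value from the agent config.
# Usage: agent_server_host [<path-to-config.ini>]
agent_server_host() {
    sed -n 's/^server_host=//p' "${1:-/etc/cloudera-scm-agent/config.ini}"
}
```

Running `agent_server_host` on each host should print the master's hostname (cdh632-master01 in this post) if the replacement took effect.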

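The same failure to "Distribute" appeared again, and only then did I realize that uninstalling and reinstalling had been the wrong approach.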

So I looked at the detailed agent log (opened with vi /var/log/cloudera-scm-agent/cloudera-scm-agent.log), as shown below:

The fix for "Error, CM server guid updated, expected xxxx, received yyyy":

rm -rf /var/lib/cloudera-scm-agent/cm_guid
systemctl restart cloudera-scm-agent
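Since the agent status looked healthy while the log held the real error, a quick log check on each host can save the detour through a reinstall. This is a sketch only: guid_mismatch_in_log and fix_guid_mismatch are hypothetical helper names, the search string is taken from the error message above, and the paths are the ones used in this post.

```shell
# Hypothetical helpers around the guid fix described in this post.

# Succeeds if the agent log contains the "CM server guid updated" error.
# Usage: guid_mismatch_in_log [<path-to-agent-log>]
guid_mismatch_in_log() {
    grep -q 'CM server guid updated' \
        "${1:-/var/log/cloudera-scm-agent/cloudera-scm-agent.log}" 2>/dev/null
}

# Removes the cached guid and restarts the agent (run as root on the host).
fix_guid_mismatch() {
    rm -f /var/lib/cloudera-scm-agent/cm_guid
    systemctl restart cloudera-scm-agent
}
```

On each affected host, `guid_mismatch_in_log && fix_guid_mismatch` would apply the fix only where the error actually appears in the log.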

With that, the CDH 6.3.2 cluster was installed successfully.
