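This article walks through a fully offline deployment of a CDH6 cluster on CentOS 7.4. Deploying a CDH6 cluster differs considerably from deploying a CDH5 cluster.
Note: all operations in this article are performed as the root user.
File download
First, download the files required to install the CDH6 cluster on a machine with internet access.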
Cloudera Manager 6.3.0
CM6 RPM:https://archive.cloudera.com/cm6/6.3.0/redhat7/yum/RPMS/x86_64/
Download every RPM file under this link and save them to the cloudera-repos directory.
ASC file: https://archive.cloudera.com/cm6/6.3.0/allkeys.asc
Also download this asc file and save it to the same cloudera-repos directory:
[root@node01 upload]# tree cloudera-repos/
cloudera-repos/
├── allkeys.asc
├── cloudera-manager-agent-6.3.0-1281944.el7.x86_64.rpm
├── cloudera-manager-daemons-6.3.0-1281944.el7.x86_64.rpm
├── cloudera-manager-server-6.3.0-1281944.el7.x86_64.rpm
├── cloudera-manager-server-db-2-6.3.0-1281944.el7.x86_64.rpm
├── enterprise-debuginfo-6.3.0-1281944.el7.x86_64.rpm
└── oracle-j2sdk1.8-1.8.0+update181-1.x86_64.rpm
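Before copying the directory to the offline network, it can help to confirm that nothing is missing. The following is a minimal sketch; check_repo_dir is a hypothetical helper name, and the package list is taken from the tree listing above (the debuginfo and server-db packages are treated as optional here, which is an assumption).

```shell
#!/usr/bin/env bash
# Sketch: confirm that a download directory holds the files listed above.
# check_repo_dir is a hypothetical helper, not part of any Cloudera tooling.
check_repo_dir() {
    local dir="$1" missing=0 pkg
    # The key file published next to the RPMs.
    if [ ! -f "$dir/allkeys.asc" ]; then
        echo "missing: allkeys.asc"
        missing=1
    fi
    # One RPM per package name; version and build number are matched by glob.
    for pkg in cloudera-manager-agent cloudera-manager-daemons \
               cloudera-manager-server oracle-j2sdk1.8; do
        if ! ls "$dir/$pkg"-[0-9]*.rpm >/dev/null 2>&1; then
            echo "missing: $pkg"
            missing=1
        fi
    done
    # Returns 0 when everything was found, 1 otherwise.
    return "$missing"
}

# Example: check_repo_dir /data6/upload/cloudera-repos
```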
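MySQL JDBC driver
A JDBC driver of version 5.1.26 or later is required, for example mysql-connector-java-5.1.47.tar.gz.
Configure the Cloudera Manager yum repository
Note: do not try to use FTP to host the CM yum repository!
First install httpd and createrepo: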
yum -y install httpd createrepo
Start the httpd service and enable it at boot:
systemctl start httpd
systemctl enable httpd
Then change into the cloudera-repos directory prepared earlier, which holds the Cloudera Manager RPM packages:
cd /data6/upload/cloudera-repos/
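Generate the RPM metadata:
createrepo .
Go back to the parent directory and make the repository readable by httpd:
cd ..
chmod -R 755 cloudera-repos
Then move the cloudera-repos directory into httpd's html directory:
mv cloudera-repos /var/www/html/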
Make sure the RPM packages can be viewed in a browser.
Next, create the CM6 repo file (required on every node):
cd /etc/yum.repos.d
vim cloudera-manager.repo
Add the following content:
[cloudera-manager]
name=Cloudera Manager 6.3.0
baseurl=http://node01/cloudera-repos/
gpgcheck=0
enabled=1
Save and exit, then run yum clean all && yum makecache:
[root@master02 ~]# yum clean all && yum makecache
Loaded plugins: fastestmirror, langpacks
Cleaning repos: ChinaUnicom-Packages cloudera-manager
Cleaning up everything
Maybe you want: rm -rf /var/cache/yum, to also free up space taken by orphaned data from disabled or removed repos
Loaded plugins: fastestmirror, langpacks
ChinaUnicom-Packages | 3.6 kB 00:00:00
cloudera-manager | 2.9 kB 00:00:00
(1/7): ChinaUnicom-Packages/group_gz | 156 kB 00:00:00
(2/7): ChinaUnicom-Packages/filelists_db | 3.1 MB 00:00:00
(3/7): ChinaUnicom-Packages/primary_db | 3.1 MB 00:00:00
(4/7): ChinaUnicom-Packages/other_db | 1.2 MB 00:00:00
(5/7): cloudera-manager/filelists_db | 118 kB 00:00:00
(6/7): cloudera-manager/other_db | 1.0 kB 00:00:00
(7/7): cloudera-manager/primary_db | 8.6 kB 00:00:00
Determining fastest mirrors
Metadata Cache Created
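Because the repo file has to exist on every node, editing it by hand in vim gets tedious. As a rough sketch, the same file can be written non-interactively; write_cm_repo is a hypothetical helper name, and the repo id, name and settings simply copy the file shown above.

```shell
#!/usr/bin/env bash
# Sketch: write the Cloudera Manager repo file without an editor.
# write_cm_repo is a hypothetical helper; its output mirrors the file above.
write_cm_repo() {
    local baseurl="$1" outfile="$2"
    cat > "$outfile" <<EOF
[cloudera-manager]
name=Cloudera Manager 6.3.0
baseurl=${baseurl}
gpgcheck=0
enabled=1
EOF
}

# Example (run as root on each node):
#   write_cm_repo http://node01/cloudera-repos/ /etc/yum.repos.d/cloudera-manager.repo
#   yum clean all && yum makecache
```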
Install Cloudera Manager Server
This step is only needed on the CM Server node.
Run the following commands:
# Install Oracle JDK 8 (the oracle-j2sdk1.8 package)
yum install oracle-j2sdk1.8
# Install Cloudera Manager (server node only)
yum install cloudera-manager-daemons cloudera-manager-agent cloudera-manager-server
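Many dependency packages are required, so it is worth setting up a yum mirror on the local network; otherwise the RPM packages have to be installed by hand.
Configure the local Parcel repository
Once Cloudera Manager Server is installed, change into the local Parcel repository directory: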
cd /opt/cloudera/parcel-repo
Upload the CDH parcel files downloaded in the first part to this directory, then rename the .sha1 file to .sha:
mv /data6/upload/parcels/* /opt/cloudera/parcel-repo/
mv CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel.sha1 CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel.sha
Then run the following command to change the file owner:
chown -R cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo
In the end, /opt/cloudera/parcel-repo contains:
├── CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel
├── CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel.sha
└── manifest.json
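A parcel that was damaged during transfer will not be accepted, so it may be worth comparing the parcel against its .sha file before starting the server. This is a minimal sketch that assumes the .sha file holds the hex SHA-1 digest as its first field, as the original .sha1 file name suggests; verify_parcel_sha is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: compare a parcel's SHA-1 digest with the value stored in <parcel>.sha.
# Assumption: the .sha file contains the hex SHA-1 digest as its first field.
verify_parcel_sha() {
    local parcel="$1" expected actual
    expected=$(awk '{print $1; exit}' "$parcel.sha") || return 2
    actual=$(sha1sum "$parcel" | awk '{print $1}')
    # Succeeds only when a digest was read and it matches the computed one.
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Example:
#   cd /opt/cloudera/parcel-repo
#   verify_parcel_sha CDH-6.3.0-1.cdh6.3.0.p0.1279813-el7.parcel && echo "parcel OK"
```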
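Install the database
MySQL installation is covered in the environment preparation part, so it is skipped here.
Database configuration
Cloudera provides a recommended MySQL configuration: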
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
symbolic-links = 0
key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1
max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M
#log_bin should be on a disk with enough free space.
#Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
#system and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log
#In later versions of MySQL, if you enable the binary log and do not set
#a server_id, MySQL will not start. The server_id must be unique within
#the replicating group.
server_id=1
binlog_format = mixed
read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M
# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
sql_mode=STRICT_ALL_TABLES
Configure the MySQL JDBC driver
Extract mysql-connector-java-5.1.47-bin.jar from the mysql-connector-java-5.1.47.tar.gz package downloaded earlier, copy it to the /usr/share/java/ directory on the CM Server node, and rename it to mysql-connector-java.jar (create /usr/share/java/ manually if it does not exist):
tar zxvf mysql-connector-java-5.1.47.tar.gz
mkdir -p /usr/share/java/
cp mysql-connector-java-5.1.47/mysql-connector-java-5.1.47-bin.jar /usr/share/java/mysql-connector-java.jar
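Create the databases required by CDH
Create the databases and database users listed in the table below for the services you plan to install. The databases must use utf8 encoding; record each user name and its password when creating them:
Service | Database | User |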
---|---|---|
Cloudera Manager Server | scm | scm |
Activity Monitor | amon | amon |
Reports Manager | rman | rman |
Hue | hue | hue |
Hive Metastore Server | metastore | hive |
Sentry Server | sentry | sentry |
Cloudera Navigator Audit Server | nav | nav |
Cloudera Navigator Metadata Server | navms | navms |
Oozie | oozie | oozie |
Create the databases and their users:
# scm
CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY 'scm';
# amon
CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON amon.* TO 'amon'@'%' IDENTIFIED BY 'amon';
# rman
CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON rman.* TO 'rman'@'%' IDENTIFIED BY 'rman';
# hue
CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY 'hue';
# hive
CREATE DATABASE metastore DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON metastore.* TO 'hive'@'%' IDENTIFIED BY 'hive';
# sentry
CREATE DATABASE sentry DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON sentry.* TO 'sentry'@'%' IDENTIFIED BY 'sentry';
# nav
CREATE DATABASE nav DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON nav.* TO 'nav'@'%' IDENTIFIED BY 'nav';
# navms
CREATE DATABASE navms DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON navms.* TO 'navms'@'%' IDENTIFIED BY 'navms';
# oozie
CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie';
# flush
FLUSH PRIVILEGES;
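The statements above repeat one pattern per service, so they can also be generated from the table. The sketch below is only an illustration: gen_cdh_sql is a hypothetical helper name, and it reuses the user name as the password exactly as the statements above do, which is a placeholder and not a recommendation. The GRANT ... IDENTIFIED BY form is the one used above; MySQL 8.0 no longer accepts it.

```shell
#!/usr/bin/env bash
# Sketch: print CREATE DATABASE / GRANT statements for "database:user" pairs.
# gen_cdh_sql is a hypothetical helper. The password equals the user name,
# mirroring the statements above; replace it before real use.
gen_cdh_sql() {
    local pair db user
    for pair in "$@"; do
        db=${pair%%:*}      # text before the colon
        user=${pair##*:}    # text after the colon
        echo "CREATE DATABASE ${db} DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;"
        echo "GRANT ALL ON ${db}.* TO '${user}'@'%' IDENTIFIED BY '${user}';"
    done
    echo "FLUSH PRIVILEGES;"
}

# Example: only the services that will actually be installed.
#   gen_cdh_sql scm:scm amon:amon metastore:hive | mysql -u root -p
```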
Set up the Cloudera Manager database
Cloudera Manager Server includes a script that configures its database.
- MySQL and CM Server are on the same host
Run: /opt/cloudera/cm/schema/scm_prepare_database.sh mysql scm scm
- MySQL and CM Server are on different hosts
Run: /opt/cloudera/cm/schema/scm_prepare_database.sh mysql -h <mysql-host-ip> --scm-host <cm-server-ip> scm scm
[root@master02 ~]# /opt/cloudera/cm/schema/scm_prepare_database.sh mysql -h 10.172.54.51 --scm-host 10.172.54.52 scm scm
Enter SCM password:
JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing: /usr/java/jdk1.8.0_181-cloudera/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/java/postgresql-connector-java.jar:/opt/cloudera/cm/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
[ main] DbCommandExecutor INFO Successfully connected to database.
All done, your SCM database is configured correctly!
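The two cases above differ only in the -h and --scm-host options. As a small sketch, the choice can be made from the two addresses; scm_prepare_cmd is a hypothetical helper that only prints the command line, using the script path and options shown above.

```shell
#!/usr/bin/env bash
# Sketch: print the scm_prepare_database.sh command for the two cases above.
# scm_prepare_cmd is a hypothetical helper; it prints, it does not execute.
scm_prepare_cmd() {
    local db_host="$1" cm_host="$2" db="$3" user="$4"
    local script=/opt/cloudera/cm/schema/scm_prepare_database.sh
    if [ "$db_host" = "$cm_host" ]; then
        # Same host: no host options are needed.
        echo "$script mysql $db $user"
    else
        # Different hosts: name both the MySQL host and the CM Server host.
        echo "$script mysql -h $db_host --scm-host $cm_host $db $user"
    fi
}

# Example: scm_prepare_cmd 10.172.54.51 10.172.54.52 scm scm
```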
Install the CDH nodes
Start the Cloudera Manager Server service
systemctl start cloudera-scm-server
Then wait for Cloudera Manager Server to start, which may take a while. You can monitor the startup with tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log.
Once the log contains "INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server.", the service has started and the Cloudera Manager web UI can be reached from a browser.
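Instead of watching the log by eye, a script can poll for that line. The following is a minimal sketch; wait_for_log is a hypothetical helper name and the 300-second default timeout is an arbitrary assumption.

```shell
#!/usr/bin/env bash
# Sketch: poll a log file until a fixed string appears or a timeout expires.
# wait_for_log is a hypothetical helper; the default timeout is an assumption.
wait_for_log() {
    local logfile="$1" pattern="$2" timeout="${3:-300}" waited=0
    while [ "$waited" -lt "$timeout" ]; do
        # -F treats the pattern as a literal string, -q suppresses output.
        if grep -qF -- "$pattern" "$logfile" 2>/dev/null; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}

# Example:
#   systemctl start cloudera-scm-server
#   wait_for_log /var/log/cloudera-scm-server/cloudera-scm-server.log \
#       "Started Jetty server." && echo "CM Server is up"
```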
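Access the Cloudera Manager web UI
Open a browser and go to http://<server_host>:7180. The default user name and password are both admin.
Welcome page
The first page is the Cloudera Manager welcome page. Click [Continue] in the lower-right corner to proceed.
Accept the terms
Tick the box to accept the terms, then click [Continue].
Edition selection
I chose the free edition here.
Second welcome page
After the edition is selected, a second welcome page appears; this one is the welcome page for cluster installation.
Select hosts
This step searches for and selects the hosts on which the CDH cluster will be installed. Enter the hostname of each node in the input box next to the host name label, separated by commas, click Search, and tick the nodes to install CDH on in the result list.
Specify repositories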
Cloudera Manager Agent
Choose the custom option here and enter the URL of the Cloudera Manager yum repository set up with httpd above.
CDH and other software
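If the [Configure the local Parcel repository] step was done correctly, [Use Parcels] is selected automatically here and the CDH version is loaded. Confirm it and click [Continue].
As a result, there is no need to install Cloudera Manager Agent manually.
JDK installation options
I did not tick the JDK installation option here, because the JDK was already installed in the environment preparation part. Untick it and continue.
SSH login configuration
This configures SSH login to the cluster hosts. Enter the root password and choose a [Number of simultaneous installations] value that suits the cluster.
Install Agent
This step installs the Agent on each node automatically; it finishes after a short wait.
Install Parcels
This step is automatic as well. The speed of the distribution stage depends mainly on the network, so wait patiently.
Host inspection
Wait for the inspection to complete.
Cloudera recommends setting /proc/sys/vm/swappiness to at most 10. It is currently set to 30. Use the sysctl command to change the setting at runtime and edit /etc/sysctl.conf so that the setting persists after a reboot. You can continue with the installation, but Cloudera Manager may report that your hosts are unhealthy because of swapping.
Solution:
Temporary change: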
sysctl vm.swappiness=10
cat /proc/sys/vm/swappiness
The change takes effect immediately, but after a reboot the value reverts to 30.
Permanent change:
Add the following parameter to /etc/sysctl.conf:
vm.swappiness=10
Or:
echo 'vm.swappiness=10'>> /etc/sysctl.conf
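Appending with echo adds a duplicate line every time the command is repeated, and an older vm.swappiness entry in the file would remain alongside the new one. As a rough sketch of an idempotent alternative, the helper below removes any existing assignment for the key and writes exactly one; set_conf_value is a hypothetical name, and the key is used as a regular expression, so a dot in it matches any character.

```shell
#!/usr/bin/env bash
# Sketch: set key=value in a sysctl-style file exactly once.
# set_conf_value is a hypothetical helper. The key is interpolated into a
# regular expression, so "." in the key matches any single character.
set_conf_value() {
    local key="$1" value="$2" file="$3" tmp
    tmp=$(mktemp) || return 1
    # Keep every line that does not already assign this key.
    if [ -f "$file" ]; then
        grep -vE "^[[:space:]]*${key}[[:space:]]*=" "$file" > "$tmp" || true
    fi
    # Append the single desired assignment.
    echo "${key}=${value}" >> "$tmp"
    # Overwrite the content in place so the file keeps its owner and mode.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Example (as root):
#   set_conf_value vm.swappiness 10 /etc/sysctl.conf && sysctl -p
```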
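Transparent Huge Page compaction is enabled and can cause significant performance problems. Run echo never > /sys/kernel/mm/transparent_hugepage/defrag and echo never > /sys/kernel/mm/transparent_hugepage/enabled to disable this setting, then add the same commands to an init script such as /etc/rc.local so that they are applied on reboot.
Just follow the prompt above.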
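After running the two echo commands, the active mode can be read back from the same files, where the kernel marks the current choice with square brackets (for example "always madvise [never]"). The following is a small sketch for checking it; thp_mode is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: print the active Transparent Huge Page mode from a sysfs file.
# The kernel marks the active value with brackets: "always madvise [never]".
# thp_mode is a hypothetical helper name.
thp_mode() {
    sed -E 's/.*\[([a-z+]+)\].*/\1/' "$1"
}

# Example: report any file that is not set to "never".
#   for f in /sys/kernel/mm/transparent_hugepage/enabled \
#            /sys/kernel/mm/transparent_hugepage/defrag; do
#       [ "$(thp_mode "$f")" = "never" ] || echo "$f is not disabled"
#   done
```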
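Install the CDH cluster
Select services
I chose custom services here: ZooKeeper, HDFS and YARN.
It is easier to install the base components first and add others as they are needed.
Installing every service at once can lead to many problems during installation.
Role assignment
CDH proposes a role assignment automatically. If it looks unreasonable, adjust it manually, keeping the roles balanced across hosts.
Database setup
Errors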
no leveldbjni-1.8 in java.library.path
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
Exception in thread "main" java.lang.UnsatisfiedLinkError: Could not load library. Reasons: [Can't load library: /run/cloudera-scm-agent/process/68-cloudera-mgmt-SERVICEMONITOR/libleveldbjni-1.8.so, Can't load library: /run/cloudera-scm-agent/process/68-cloudera-mgmt-SERVICEMONITOR/libleveldbjni.so, no leveldbjni64-1.8 in java.library.path, no leveldbjni-1.8 in java.library.path, no leveldbjni in java.library.path, /run/cloudera-scm-agent/process/68-cloudera-mgmt-SERVICEMONITOR/libleveldbjni-64-1-4792575431304239050.8: libstdc++.so.6: : , /tmp/libleveldbjni-64-1-6079277982211108711.8: libstdc++.so.6: : ]
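Cause: the error is related to a glibc upgrade on the current node. Since the leveldbjni library cannot be found, install one.
The way to install the leveldbjni library is rather interesting:
1) Download leveldbjni-all-1.8.jar
2) Unpack the jar and find libleveldbjni.so under the META-INF/native/linux64 directory
3) Copy libleveldbjni.so into a directory on java.library.path
Uninstall Cloudera Manager
If Cloudera Manager has to be uninstalled for any reason, run the following steps on each node.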
systemctl stop cloudera-scm-server
systemctl stop cloudera-scm-agent
yum -y remove 'cloudera-manager-*'
yum clean all
umount cm_processes
umount /var/run/cloudera-scm-agent/process
rm -Rf /usr/share/cmf /var/lib/cloudera* /var/cache/yum/cloudera* /var/log/cloudera* /var/run/cloudera*
rm -rf /tmp/.scmpreparenode.lock
rm -Rf /var/lib/flume-ng /var/lib/hadoop* /var/lib/hue /var/lib/navigator /var/lib/oozie /var/lib/solr /var/lib/sqoop* /var/lib/zookeeper
rm -rf /var/lib/hadoop-* /var/lib/impala /var/lib/solr /var/lib/zookeeper /var/lib/hue /var/lib/oozie /var/lib/pgsql /var/lib/sqoop2 /data/dfs/ /data/impala/ /data/yarn/ /dfs/ /impala/ /yarn/ /var/run/hadoop-*/ /var/run/hdfs-*/ /usr/bin/hadoop* /usr/bin/zookeeper* /usr/bin/hbase* /usr/bin/hive* /usr/bin/hdfs /usr/bin/mapred /usr/bin/yarn /usr/bin/sqoop* /usr/bin/oozie /etc/hadoop* /etc/zookeeper* /etc/hive* /etc/hue /etc/impala /etc/sqoop* /etc/oozie /etc/hbase* /etc/hcatalog
systemctl stop mariadb
yum -y remove 'mariadb-*'
rm -rf /var/lib/mysql
rm -rf /var/log/mysqld.log
rm -rf /usr/lib64/mysql
rm -rf /usr/share/mysql
rm -rf /opt/cloudera
Problems
ParcelUpdateService:com.cloudera.parcel.components.ParcelDownloaderImpl: Unable to retrieve remote parcel repository manifest
2019-08-27 20:35:50,469 INFO WebServerImpl:com.cloudera.server.cmf.WebServerImpl: Started Jetty server.
2019-08-27 20:35:50,600 ERROR ParcelUpdateService:com.cloudera.parcel.components.ParcelDownloaderImpl: Unable to retrieve remote parcel repository manifest
java.util.concurrent.ExecutionException: java.net.UnknownHostException: archive.cloudera.com: Name or service not known
at com.ning.http.client.providers.netty.future.NettyResponseFuture.abort(NettyResponseFuture.java:231)
at com.ning.http.client.providers.netty.request.NettyRequestSender.abort(NettyRequestSender.java:422)
at com.ning.http.client.providers.netty.request.NettyRequestSender.sendRequestWithNewChannel(NettyRequestSender.java:290)
at com.ning.http.client.providers.netty.request.NettyRequestSender.sendRequestWithCertainForceConnect(NettyRequestSender.java:142)
at com.ning.http.client.providers.netty.request.NettyRequestSender.sendRequest(NettyRequestSender.java:117)
at com.ning.http.client.providers.netty.NettyAsyncHttpProvider.execute(NettyAsyncHttpProvider.java:87)
at com.ning.http.client.AsyncHttpClient.executeRequest(AsyncHttpClient.java:506)
at com.ning.http.client.AsyncHttpClient$BoundRequestBuilder.execute(AsyncHttpClient.java:229)
at com.cloudera.parcel.components.ParcelDownloaderImpl.getRepositoryInfoFuture(ParcelDownloaderImpl.java:592)
at com.cloudera.parcel.components.ParcelDownloaderImpl.getRepositoryInfo(ParcelDownloaderImpl.java:544)
at com.cloudera.parcel.components.ParcelDownloaderImpl.syncRemoteRepos(ParcelDownloaderImpl.java:357)
at com.cloudera.parcel.components.ParcelDownloaderImpl$1.run(ParcelDownloaderImpl.java:464)
at com.cloudera.parcel.components.ParcelDownloaderImpl$1.run(ParcelDownloaderImpl.java:459)
at com.cloudera.cmf.persist.ReadWriteDatabaseTaskCallable.call(ReadWriteDatabaseTaskCallable.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.UnknownHostException: archive.cloudera.com: Name or service not known
at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
at java.net.InetAddress.getAllByName(InetAddress.java:1192)
at java.net.InetAddress.getAllByName(InetAddress.java:1126)
at java.net.InetAddress.getByName(InetAddress.java:1076)
at com.ning.http.client.NameResolver$JdkNameResolver.resolve(NameResolver.java:28)
at com.ning.http.client.providers.netty.request.NettyRequestSender.remoteAddress(NettyRequestSender.java:358)
at com.ning.http.client.providers.netty.request.NettyRequestSender.connect(NettyRequestSender.java:369)
at com.ning.http.client.providers.netty.request.NettyRequestSender.sendRequestWithNewChannel(NettyRequestSender.java:283)
... 15 more
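This does not affect usage: the cluster is offline, so archive.cloudera.com cannot be resolved, and the parcels come from the local repository instead.
Hive installation error: org.apache.hadoop.hive.metastore.HiveMetaException: Failed to retrieve schema tables from Hive Metastore DB,Not supported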
[root@master01 ~]# rpm -qa|grep mysql-connector-java
mysql-connector-java-5.1.25-3.el7.noarch
The JDBC driver version is wrong: 5.1.26 or later is required, but 5.1.25 is installed here.
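This check can be done before installing Hive. The following is a minimal sketch that compares two version strings with sort -V from GNU coreutils; version_ge is a hypothetical helper name, and the rpm query in the example assumes the driver was installed from the mysql-connector-java package as in the output above.

```shell
#!/usr/bin/env bash
# Sketch: succeed when version $1 is greater than or equal to version $2.
# Relies on "sort -V" (GNU coreutils); version_ge is a hypothetical helper.
version_ge() {
    # The smaller of the two versions sorts first; $1 >= $2 when that is $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example:
#   installed=$(rpm -q --queryformat '%{VERSION}' mysql-connector-java)
#   version_ge "$installed" 5.1.26 || echo "JDBC driver $installed is too old"
```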