Notes on Building a Big-Data Cluster with Cloudera (Docker)

1. Install Docker

Install the latest stable release:

# Step 1: install the required system utilities
sudo yum install -y yum-utils device-mapper-persistent-data lvm2
# Step 2: add the Docker CE package repository
sudo yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Step 3: refresh the package cache and install Docker CE
sudo yum makecache fast
sudo yum -y install docker-ce
# Step 4: start the Docker service
sudo service docker start

Install a specific version:

# Step 1: list the docker-ce versions available in the repository
yum list docker-ce.x86_64 --showduplicates | sort -r
# Step 2: install the chosen version
sudo yum install -y docker-ce-18.09.9 docker-ce-cli-18.09.9 containerd.io

2. Set up Cloudera on Docker (requires sudo privileges)

# Step 1: pull the Cloudera image
sudo docker pull cloudera/quickstart:latest
    # If the pull is too slow, switch to a registry mirror:
    # add the following to /etc/docker/daemon.json (create the file if it does not exist):

    {
          "registry-mirrors": ["https://9cpn8tt6.mirror.aliyuncs.com"]
    }

    # Restart the service:
    sudo systemctl daemon-reload
    sudo systemctl restart docker
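
If /etc/docker/daemon.json already holds other settings, pasting the snippet above over it would discard them. The following is a minimal Python sketch of merging the mirror into existing content instead. The function name add_registry_mirror and the sample "log-driver" setting are illustrative assumptions, not part of the original setup.

```python
import json


def add_registry_mirror(config_text: str, mirror: str) -> str:
    """Return daemon.json content with `mirror` listed under "registry-mirrors"."""
    # An empty (or newly created) file is treated as an empty JSON object.
    config = json.loads(config_text) if config_text.strip() else {}
    mirrors = config.setdefault("registry-mirrors", [])
    # Avoid adding the same mirror twice when the script is re-run.
    if mirror not in mirrors:
        mirrors.append(mirror)
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Hypothetical existing content; a real file may hold different keys.
    existing = '{"log-driver": "json-file"}'
    print(add_registry_mirror(existing, "https://9cpn8tt6.mirror.aliyuncs.com"))
```

The result still has to be written back to /etc/docker/daemon.json with root privileges, followed by the restart commands shown above.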
    
# Step 2: create the container (each trailing backslash continues the command on the next line)
sudo docker run -t -i -d \
  --name cdh \
  --hostname=quickstart.cloudera \
  --privileged=true \
  -v /data/CDH:/src \
  -p 8020:8020 -p 8022:8022 -p 7180:7180 -p 21050:21050 -p 50070:50070 -p 50075:50075 -p 50010:50010 -p 50020:50020 -p 8890:8890 -p 60010:60010 -p 10002:10002 -p 25010:25010 -p 25020:25020 -p 18088:18088 -p 8088:8088 -p 19888:19888 -p 7187:7187 -p 11000:11000 -p 8888:8888 \
  cloudera/quickstart \
  /bin/bash -c '/usr/bin/docker-quickstart'
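
The long run of -p flags is easy to mistype. As a rough sketch, the same command can be assembled from a list of port numbers in Python. The helper name build_run_command is hypothetical, and it assumes every container port maps to the same port number on the host, as in the command above.

```python
import shlex


def build_run_command(name, hostname, volume, ports, image="cloudera/quickstart"):
    """Assemble the `docker run` argument list for the quickstart container."""
    args = ["docker", "run", "-t", "-i", "-d",
            "--name", name,
            f"--hostname={hostname}",
            "--privileged=true",
            "-v", volume]
    for port in ports:
        # Map each container port to the same port number on the host.
        args += ["-p", f"{port}:{port}"]
    args += [image, "/bin/bash", "-c", "/usr/bin/docker-quickstart"]
    return args


if __name__ == "__main__":
    cmd = build_run_command("cdh", "quickstart.cloudera", "/data/CDH:/src",
                            [8020, 7180, 50070, 8888])
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

Building a list rather than a single string keeps the quoting of the final /usr/bin/docker-quickstart argument unambiguous if the list is later handed to subprocess.run.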

where:

Option | Description
--hostname=quickstart.cloudera | Required: the pseudo-distributed configuration assumes this hostname. It becomes the container's hostname (as recorded in /etc/hosts).
--privileged=true | Required: for HBase, MySQL-backed Hive metastore, Hue, Oozie, Sentry, and Cloudera Manager. HBase in particular needs this mode.
-t | Required: allocate a pseudoterminal. Once services are started, a Bash shell takes over. This switch starts a terminal emulator to run the services.
-i | Required: if you want to use the terminal, either immediately or by connecting to it later.
-p 8888 | Recommended: map the Hue port in the guest to a port on the host. Format: -p 8888:8888, where the port left of the colon is the host port and the port right of the colon is the container port.
-p [PORT] | Optional: map any other ports (for example, 7180 for Cloudera Manager, 80 for a guided tutorial).
-d | Optional: run the container in the background.
--name | The container's name.
-v host_path:container_path | Mount a host directory onto a directory in the container; anything placed in the host directory is directly accessible inside the container.

CDH port summary

Service name | Parameter | Port number
HBase REST Server Port | hbase.rest.port | 20550
HBase REST Server Web UI Port | hbase.rest.info.port | 8085
HBase Thrift Server Port | hbase.regionserver.thrift.port | 9090
HBase Thrift Server Web UI Port | hbase.thrift.info.port | 9095
HBase Master Port | hbase.master.port | 60000
HBase Master Web UI Port | hbase.master.info.port | 60010
HBase RegionServer Port | hbase.regionserver.port | 60020
HBase RegionServer Web UI Port | hbase.regionserver.info.port | 60030
DataNode Protocol Port | dfs.datanode.ipc.address | 50020
DataNode Transceiver Port | dfs.datanode.address | 50010
DataNode HTTP Web UI Port | dfs.datanode.http.address | 50075
Secure DataNode Web UI Port (TLS/SSL) | dfs.datanode.https.address | 50475
REST Port | hdfs.httpfs.http.port | 14000
Administration Port | hdfs.httpfs.admin.port | 14001
JournalNode RPC Port | dfs.journalnode.rpc-address | 8485
JournalNode HTTP Port | dfs.journalnode.http-address | 8480
Secure JournalNode Web UI Port (TLS/SSL) | dfs.journalnode.https-address | 8481
NFS Gateway Server Port | nfs3.server.port | 2049
NFS Gateway MountD Port | nfs3.mountd.port | 4242
Portmap (or Rpcbind) Port | - | 111
NameNode Port | fs.default.name, fs.defaultFS | 8020
NameNode Service RPC Port | dfs.namenode.servicerpc-address | 8022
NameNode Web UI Port | dfs.http.address, dfs.namenode.http-address | 50070
Secure NameNode Web UI Port (TLS/SSL) | dfs.https.port | 50470
SecondaryNameNode Web UI Port | dfs.secondary.http.address, dfs.namenode.secondary.http-address | 50090
Secure SecondaryNameNode Web UI Port (TLS/SSL) | dfs.secondary.https.port | 50495
HBase Indexer HTTP Port | hbaseindexer.http.port | 11060
Solr HTTP Port | solr_http_port | 8983
Solr Admin Port | - | 8984
Solr HTTPS Port | solr_https_port | 8985
Client Port | clientPort | 2181
Quorum Port | - | 3181
Election Port | - | 4181
JMX Remote Port | - | 9010
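
Once the container is running, it can be useful to confirm which of the mapped ports actually accept connections from the host. The sketch below tries a plain TCP connect on each port. The function name open_ports and the sample port list are illustrative choices. A successful connect only shows that something is listening, not that the service behind it is healthy.

```python
import socket


def open_ports(host, ports, timeout=1.0):
    """Return the subset of `ports` that accept a TCP connection on `host`."""
    reachable = []
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                reachable.append(port)
        except OSError:
            # Refused, timed out, or unreachable: treat the port as closed.
            pass
    return reachable


if __name__ == "__main__":
    # A few of the ports published by the docker run command in this post.
    mapped = [8020, 7180, 50070, 50075, 8888]
    print(open_ports("127.0.0.1", mapped))
```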

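If docker run fails with an error while creating the container, the container name stays taken. Remove the leftover container, then run the command again.

# List all containers, including stopped ones
sudo docker ps -a
# Remove the chosen container
sudo docker rm container_id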
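
Looking up the ID by hand can be skipped by filtering the container list on the name. This is a small sketch that assumes the listing is produced with docker ps -a --format "{{.ID}} {{.Names}}", which prints one "ID name" pair per line. The function names are hypothetical, and remove_container needs a local Docker CLI and sufficient privileges.

```python
import subprocess


def find_container_ids(ps_output, name):
    """Return the IDs of containers whose name equals `name`.

    `ps_output` is expected to hold one "<id> <name>" pair per line.
    """
    ids = []
    for line in ps_output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1] == name:
            ids.append(parts[0])
    return ids


def remove_container(name):
    """Remove every container called `name` using the local Docker CLI."""
    listing = subprocess.run(
        ["docker", "ps", "-a", "--format", "{{.ID}} {{.Names}}"],
        capture_output=True, text=True, check=True).stdout
    for container_id in find_container_ids(listing, name):
        # Plain `docker rm` refuses to delete a container that is still running.
        subprocess.run(["docker", "rm", container_id], check=True)
```

Calling remove_container("cdh") would then clear a failed container before the run command is retried.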

3. Start Cloudera Manager

# Start the cdh container
sudo docker start CONTAINER_ID
# Open a shell in the running cdh container
sudo docker exec -it CONTAINER_ID /bin/bash
# The prompt changes to: [root@quickstart /]#
# Run Cloudera Manager
sudo /home/cloudera/cloudera-manager --force --enterprise

Once it has started, open IP:7180 in a browser (7180 is the Cloudera Manager port) and log in with username cloudera and password cloudera.
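
Cloudera Manager can take a while to come up after the script returns. A minimal sketch of waiting for the web UI is to poll the URL until any HTTP response arrives. The function name wait_for_http and the default timeout and interval values are assumptions chosen for illustration.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url, timeout=300.0, interval=5.0):
    """Poll `url` until it returns any HTTP response or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered, just not with a 2xx status; it is up.
            return True
        except OSError:
            # Connection refused or timed out: the service is not ready yet.
            time.sleep(interval)
    return False
```

For example, wait_for_http("http://192.168.31.3:7180/") returns True as soon as the login page responds, assuming 192.168.31.3 is the Docker host's address.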

Then start the cluster services: HDFS, Hive, Hue, YARN, and so on.

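4. Test the components from a client

Create a file named test.py:

# Requires the hdfs package (pip install hdfs)
from hdfs.client import Client
# Connect to the NameNode web port (50070); replace the IP with your Docker host's address
client = Client("http://192.168.31.3:50070", root="/", timeout=100)
# List the entries directly under the HDFS root
print(client.list("/"))

Running it prints the entries under the HDFS root directory.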
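
Listing a directory only exercises the NameNode. A write followed by a read also touches a DataNode, which makes it a slightly stronger smoke test. The sketch below assumes a client object that offers write and read methods like those of the hdfs package. The helper name roundtrip and the path /tmp/hello.txt are hypothetical.

```python
def roundtrip(client, path, text):
    """Write `text` to `path` through `client`, then read it back."""
    # overwrite=True replaces an existing file instead of raising an error.
    client.write(path, data=text, overwrite=True, encoding="utf-8")
    # read() is used as a context manager that yields a file-like reader.
    with client.read(path, encoding="utf-8") as reader:
        return reader.read()
```

With the hdfs package installed, this could be called as roundtrip(InsecureClient("http://192.168.31.3:50070", user="cloudera"), "/tmp/hello.txt", "hello\n"), where InsecureClient comes from the hdfs package and the user name cloudera is an assumption. WebHDFS redirects reads and writes to a DataNode address, so the client machine probably needs to resolve quickstart.cloudera (for example through its hosts file) and reach port 50075.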

5. Install Kafka

https://blog.csdn.net/nevergiveup54/article/details/50545020

 
