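Designing the network topology
All machines are on the 192.168.0.* subnet; the number after each host name is the last octet of its address.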
yum 104
hadoop1 105
hadoop2 106
hadoop3 107
hadoop4 108
hadoop5 109
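# Check the IP address
ip a
IP address of the yum machine
Setting up a local yum repository
Installing nginx on the yum machine
# Install the EPEL repository, which provides the nginx package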
yum install epel-release -y
# Install nginx
yum install nginx -y
# Locate nginx
whereis nginx
nginx: /usr/sbin/nginx /usr/lib64/nginx /etc/nginx /usr/share/nginx
Default configuration file
/etc/nginx/nginx.conf
Disabling the firewall
# Check the firewall state
firewall-cmd --state
running
# Do not start the firewall at boot
systemctl disable firewalld.service
# Stop the firewall now
systemctl stop firewalld.service
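If switching firewalld off entirely is not desirable, one alternative is to leave it running and open only the ports that nginx serves. The following is a minimal sketch and not the method used in this article: the helper name `firewall_commands` is made up for illustration, and the port list (8080 for the yum mirror, 9090 for the package downloads) is taken from the ports used elsewhere in this article. It only prints the `firewall-cmd` invocations so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Hypothetical helper: print the firewall-cmd calls that would open the given TCP ports.
# Nothing is executed here; pipe the output to `sh` as root to apply it.
firewall_commands() {
    for port in "$@"; do
        echo "firewall-cmd --permanent --add-port=${port}/tcp"
    done
    # A reload is needed before permanent rules become active.
    echo "firewall-cmd --reload"
}

# Ports served by nginx in this article: 8080 (yum mirror) and 9090 (JDK/Hadoop archives).
firewall_commands 8080 9090
```

Printing instead of executing keeps the sketch safe to try on any machine; the Hadoop ports themselves would need the same treatment on the cluster nodes.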
nginx can now be reached from other machines.
Editing the nginx configuration
vi /etc/nginx/nginx.conf
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;
# Load dynamic modules. See /usr/share/doc/nginx/README.dynamic.
include /usr/share/nginx/modules/*.conf;
events {
worker_connections 1024;
}
http {
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
access_log /var/log/nginx/access.log main;
sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;
include /etc/nginx/mime.types;
default_type application/octet-stream;
# Load modular configuration files from the /etc/nginx/conf.d directory.
# See http://nginx.org/en/docs/ngx_core_module.html#include
# for more information.
include /etc/nginx/conf.d/*.conf;
server {
listen 80 ;
server_name _ ;
root /usr/share/nginx/html;
# Load configuration files for the default server block.
include /etc/nginx/default.d/*.conf;
location / {
}
error_page 404 /404.html;
location = /40x.html {
}
error_page 500 502 503 504 /50x.html;
location = /50x.html {
}
}
server {
listen 8080;
server_name _ ;
location / {
autoindex on;
root /home/iso/; # replace with your actual directory
}
}
}
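Setting a static IP
The address assigned by DHCP can change, so give the machine a fixed one.
Edit the network interface configuration:
vi /etc/sysconfig/network-scripts/ifcfg-enp0s10f0
IPADDR=192.168.0.104 # IP address
GATEWAY=192.168.0.1 # gateway
DNS1=192.168.1.1 # primary DNS server
DNS2=192.168.0.1 # secondary DNS server
service network restart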
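Uploading and mounting the ISO image
Upload the image to /home/CentOS-7-x86_64-Everything-1908.iso
mkdir /home/iso
mount -o loop /home/CentOS-7-x86_64-Everything-1908.iso /home/iso/
The command prints "mount: /dev/loop0 is write-protected, mounting read-only". This is a warning rather than an error: an ISO image is always mounted read-only.
Change the permissions of the image file
chmod 777 /home/CentOS-7-x86_64-Everything-1908.iso
# Reload nginx
nginx -s reload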
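A syntax error in nginx.conf makes the reload fail, so it can help to test the file first. The sketch below wraps the two steps in a helper; the function name `safe_reload` is an assumption made for illustration, while `nginx -t` (test the configuration) and `nginx -s reload` are standard nginx options.

```shell
#!/bin/sh
# Hypothetical helper: reload nginx only if the configuration parses cleanly.
safe_reload() {
    # `nginx -t` tests the configuration and exits non-zero on an error.
    if nginx -t; then
        nginx -s reload
    else
        echo "nginx.conf has errors; not reloading" >&2
        return 1
    fi
}
```

Calling `safe_reload` in place of the bare `nginx -s reload` leaves the running server untouched when the edited file is broken.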
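Configuring the yum repo on the client machines
On each machine that will use the mirror, create a file named Nginx-yum.repo in /etc/yum.repos.d with the following content.
Change baseurl and gpgkey to the address of your own yum virtual machine; in the topology above that is http://192.168.0.104:8080.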
[Nginx-yum]
name=Nginx-yum
#mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os&infra=$infra
baseurl=http://192.168.1.3:8080
enabled=1
gpgcheck=1
gpgkey=http://192.168.1.3:8080/RPM-GPG-KEY-CentOS-7
Refreshing the yum cache
yum clean all
yum makecache
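Since the same repo file is needed on every client, it can be generated instead of typed. This is a minimal sketch under stated assumptions: the helper name `write_repo_file` is hypothetical, and the example URL assumes the yum machine sits at 192.168.0.104 with nginx on port 8080, as in the topology of this article.

```shell
#!/bin/sh
# Hypothetical helper: write a yum repo definition for a mirror at the given base URL.
# Usage: write_repo_file <base-url> <output-file>
write_repo_file() {
    base_url=$1
    out_file=$2
    cat > "$out_file" <<EOF
[Nginx-yum]
name=Nginx-yum
baseurl=${base_url}
enabled=1
gpgcheck=1
gpgkey=${base_url}/RPM-GPG-KEY-CentOS-7
EOF
}

# Write to a scratch file first; inspect it, then copy it to /etc/yum.repos.d/ as root.
write_repo_file "http://192.168.0.104:8080" "${TMPDIR:-/tmp}/Nginx-yum.repo"
```

Deriving gpgkey from the same base URL keeps the two addresses from drifting apart when the mirror moves.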
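# Note: do not run yum update here, otherwise the virtual IP disappears
Exporting the virtual machine
The configuration is tedious, so export the finished virtual machine as a backup.
Export it to C:\Users\Public\Documents\Hyper-V\localyum; the yum folder inside is the virtual machine.
Properties of the exported machine:
IP 192.168.0.104
yum port 8080
nginx configuration file /etc/nginx/nginx.conf
network interface configuration /etc/sysconfig/network-scripts/ifcfg-enp0s10f0
Configuring one Hadoop machine
Passwordless SSH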
systemctl enable sshd
systemctl start sshd
Changing the host name
vi /etc/hostname
vi /etc/hosts
service network restart
Downloading the packages
wget http://192.168.0.104:9090/jdk1.8.0_181.zip
wget http://192.168.0.104:9090/hadoop-3.2.1.tar.gz
Extracting the packages
unzip jdk1.8.0_181.zip
tar -zxvf hadoop-3.2.1.tar.gz
Configuring Java
export JAVA_HOME=/cloudcomput/jdk1.8.0_181
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
Configuring Hadoop
Learning pssh
pssh is a very handy tool for running one command on many hosts in parallel.
Installing pssh
wget https://pypi.tuna.tsinghua.edu.cn/packages/60/9a/8035af3a7d3d1617ae2c7c174efa4f154e5bf9c24b36b623413b38be8e4a/pssh-2.3.1.tar.gz#sha256=539f8d8363b722712310f3296f189d1ae8c690898eca93627fc89a9cb311f6b4
# This is the official link; if it is not reachable from your network, use the domestic mirror above
wget http://parallel-ssh.googlecode.com/files/pssh-2.3.1.tar.gz
tar xf pssh-2.3.1.tar.gz
cd pssh-2.3.1/
# Note: install with python2, otherwise the installation fails
python setup.py install
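Using pssh
Prerequisite: passwordless SSH login must already work between all hosts.
First create a host list in the current directory
vi hosts.txt
with the following content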
hadoop1
hadoop2
hadoop3
hadoop4
hadoop5
pssh -h hosts.txt -i "<command to run on every host>"
Example: adding the Hadoop paths to /etc/profile on all nodes
# Edit the file on the master node first
vi /etc/profile
Append at the end:
export HADOOP_HOME=/cloudcomput/hadoop-3.2.1
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
# Distribute the file with pscp
pscp -h hosts.txt /etc/profile /etc/profile
# Source the file on every host with pssh
pssh -h hosts.txt -i "source /etc/profile"
Successful run:
[1] 07:24:13 [SUCCESS] hadoop2
[2] 07:24:13 [SUCCESS] hadoop1
[3] 07:24:13 [SUCCESS] hadoop4
[4] 07:24:13 [SUCCESS] hadoop3
[5] 07:24:13 [SUCCESS] hadoop5
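With more hosts it becomes easy to overlook one that is missing from the result list. The following sketch compares pssh output with the host file and prints every host that has no `[SUCCESS]` line. The helper name `missing_hosts` is hypothetical, and the check relies only on the `[SUCCESS] <host>` format shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read pssh output on stdin and print every host from the
# host file that has no "[SUCCESS] <host>" line.
missing_hosts() {
    hosts_file=$1
    output=$(cat)
    while read -r host; do
        [ -n "$host" ] || continue
        # Match the host name followed by whitespace or end of line,
        # so that hadoop1 does not also match hadoop10.
        if ! printf '%s\n' "$output" | grep -Eq "\[SUCCESS\] ${host}([[:space:]]|\$)"; then
            echo "$host"
        fi
    done < "$hosts_file"
}
```

A possible use is `pssh -h hosts.txt -i "source /etc/profile" | missing_hosts hosts.txt`, which prints nothing when every host succeeded.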
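Error when starting Hadoop
ERROR: Attempting to operate on yarn nodemanager as root
In Hadoop's sbin directory (/cloudcomput/hadoop-3.2.1/sbin in this setup), add the following variables at the top of both start-dfs.sh and stop-dfs.sh: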
#!/usr/bin/env bash
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
start-yarn.sh and stop-yarn.sh need the following at the top as well:
#!/usr/bin/env bash
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
After the change, ./start-dfs.sh starts successfully.
To distribute a directory, add the -r option:
pscp -h hosts.txt -r /cloudcomput/hadoop-3.2.1/sbin /cloudcomput/hadoop-3.2.1/sbin
Reformatting HDFS after changing the Hadoop configuration files
stop-all.sh
# Delete the old data and logs on every node
pssh -h hosts.txt -i "cd /cloudcomput/hadoop-3.2.1/tmp && rm -rf * "
pssh -h hosts.txt -i "cd /cloudcomput/hadoop-3.2.1/logs && rm -rf * "
mkdir /cloudcomput/hadoop-3.2.1/tmp/dfs
mkdir /cloudcomput/hadoop-3.2.1/tmp/dfs/name
chmod -R 777 /cloudcomput/hadoop-3.2.1/tmp
hdfs namenode -format
start-all.sh
After starting Hadoop, everything except the NameNode worked; its log showed the following error:
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /cloudcomput/hadoop-3.2.1/tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
Fix
hadoop.tmp.dir: /tmp/hadoop-${user.name}
mkdir /cloudcomput/hadoop-3.2.1/tmp/dfs
mkdir /cloudcomput/hadoop-3.2.1/tmp/dfs/name
After a long search, the cause turned out to be a hyphen (-) that had been typed as a full-width Chinese character.
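A stray full-width character is hard to spot by eye but easy to find mechanically. The sketch below lists every line of a file that contains bytes outside printable ASCII. The helper name `find_non_ascii` is an assumption for illustration; the `grep` options used (`-a`, `-n`) are standard GNU grep options.

```shell
#!/bin/sh
# Hypothetical helper: list lines (with line numbers) that contain bytes outside
# printable ASCII, such as a full-width dash pasted from a web page.
find_non_ascii() {
    # LC_ALL=C makes grep work byte by byte; -a stops it from treating the
    # file as binary; -n prefixes each hit with its line number.
    LC_ALL=C grep -an '[^[:print:][:space:]]' "$1"
}
```

Running it over whichever script or configuration file was just edited, for example `find_non_ascii /cloudcomput/hadoop-3.2.1/etc/hadoop/core-site.xml`, prints nothing when the file is clean. Files that legitimately contain non-ASCII comments will of course be reported too.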
The fully distributed cluster is up
namenode http://192.168.0.105:9870
SecondaryNameNode
secondnamenode http://192.168.0.106:9868
YARN ResourceManager web UI: port 8088
Starting the JobHistory server
mapred --daemon start historyserver
Checking the DataNodes
DataNode 1 hadoop3
http://192.168.0.107:9864/datanode.html
DataNode 2 hadoop4
http://192.168.0.108:9864/datanode.html
DataNode 3 hadoop5
http://192.168.0.109:9864/datanode.html
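Instead of opening each page in a browser, the web interfaces can be probed from the command line. This is a rough sketch: the helper names `ui_urls` and `check_uis` are hypothetical, the URLs are the ones listed in this article, and the `curl` options used are standard (`-s` silent, `-f` fail on HTTP errors, `-o` output file, `--max-time` timeout).

```shell
#!/bin/sh
# Web UI addresses listed in this article (NameNode, SecondaryNameNode, three DataNodes).
ui_urls() {
    cat <<'EOF'
http://192.168.0.105:9870
http://192.168.0.106:9868
http://192.168.0.107:9864/datanode.html
http://192.168.0.108:9864/datanode.html
http://192.168.0.109:9864/datanode.html
EOF
}

# Hypothetical helper: request each page and report whether it answered.
check_uis() {
    ui_urls | while read -r url; do
        # Discard the body; only the exit status matters. Give up after 5 seconds.
        if curl -sf -o /dev/null --max-time 5 "$url"; then
            echo "OK   $url"
        else
            echo "DOWN $url"
        fi
    done
}
```

Running `check_uis` from any machine on the 192.168.0.* subnet gives a one-line status per daemon.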
Checking the services on all five nodes
pssh -h hosts.txt -i '/cloudcomput/java/bin/jps '
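"Big Data Technology and Applications"
Lab Report
Student ID: 20153442
Class: 信1701-3
Name: 李孟凱
School of Information, Shijiazhuang Tiedao University
(Zhangyu Internet Academy platform)
Spring semester 2020
Experiment 1: Building a fully distributed Hadoop cluster with separate NameNode and SecondaryNameNode, and a pseudo-distributed Hadoop
I. Objectives
1. Simulate a fully distributed Hadoop cluster
2. Set up a pseudo-distributed Hadoop
II. System environment
Win10
III. Task description
Build a fully distributed Hadoop cluster in virtual machines
All machines are on the 192.168.0.* subnet
Local mirror used while deploying the cluster: yum 104
namenode hadoop1 105
secondnamenode hadoop2 106
datanode1 hadoop3 107
datanode2 hadoop4 108
datanode3 hadoop5 109
IV. Steps
0. Pseudo-distributed setup
A pseudo-distributed Hadoop container had already been prepared earlier,
so it can be deployed directly with docker: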
docker run -tdi -p 8088:8088 -p 9000:9000 -p 9864:9864 -p 9866:9866 -p 9867:9867 -p 9870:9870 -p 19888:19888 -p 50100:50100 -p 50105:50105 --hostname localhost --privileged -e "container=docker" --name hadoopweifb registry.cn-hangzhou.aliyuncs.com/mkmk/hadoop:weifb3 init && docker exec hadoopweifb /bin/bash -c '/starthadoop.sh'
Notes on the deployment
8088 YARN ResourceManager web UI
9000 NameNode RPC port
9864 DataNode web UI
The remaining ports are likewise ones that Hadoop needs.
The image can be used as is.
Hadoop version 3.2.1
Java version 1.8
The image is hosted in my Alibaba Cloud registry:
registry.cn-hangzhou.aliyuncs.com/mkmk/hadoop:weifb3
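1. Choosing the cluster operating system
I use CentOS 7 the most, so it is the cluster operating system here. To keep the resource footprint of each node small, the roughly 900 MB CentOS 7 minimal install image is used.
Huawei Cloud mirror (China)
https://repo.huaweicloud.com/centos/7/isos/x86_64/CentOS-7-x86_64-Minimal-1908.iso
2. Hyper-V Manager
There is no need to download VMware: Windows 10 ships with its own virtualization tool, Hyper-V.
Search for Hyper-V to start using it.
3. Setting up the local yum repository
A minimal install comes with almost nothing, so packages have to be installed by hand. To speed up installation across the whole cluster, a yum mirror is set up on the local network.
Download the full CentOS 7 image from Huawei Cloud; it is about 10 GB and took about half an hour.
https://repo.huaweicloud.com/centos/7/isos/x86_64/CentOS-7-x86_64-Everything-1908.iso
First create the yum virtual machine and install it from the minimal image
Memory: 4 GB
Disk: 50 GB
Install from CentOS-7-x86_64-Minimal-1908.iso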
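Installing nginx
# Install the EPEL repository, which provides the nginx package
yum install epel-release -y
# Install nginx
yum install nginx -y
# Locate nginx
whereis nginx
nginx: /usr/sbin/nginx /usr/lib64/nginx /etc/nginx /usr/share/nginx
Default configuration file
/etc/nginx/nginx.conf
Disabling the firewall in the virtual machine
# Check the firewall state
firewall-cmd --state
running
# Do not start the firewall at boot
systemctl disable firewalld.service
# Stop the firewall now
systemctl stop firewalld.service
Serving the full installation image through nginx
vi /etc/nginx/nginx.conf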
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;
# Load dynamic modules. See /usr/share/doc/nginx/README.dynamic.
include /usr/share/nginx/modules/*.conf;
events {
worker_connections 1024;
}
http {
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';
access_log /var/log/nginx/access.log main;
sendfile on;
tcp_nopush on;
tcp_nodelay on;
keepalive_timeout 65;
types_hash_max_size 2048;
include /etc/nginx/mime.types;
default_type application/octet-stream;
# Load modular configuration files from the /etc/nginx/conf.d directory.
# See http://nginx.org/en/docs/ngx_core_module.html#include
# for more information.
include /etc/nginx/conf.d/*.conf;
server {
listen 80 ;
server_name _ ;
root /usr/share/nginx/html;
# Load configuration files for the default server block.
include /etc/nginx/default.d/*.conf;
location / {
}
error_page 404 /404.html;
location = /40x.html {
}
error_page 500 502 503 504 /50x.html;
location = /50x.html {
}
}
server {
listen 8080;
server_name _ ;
location / {
autoindex on;
root /home/iso/; # replace with your actual directory
}
}
}
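Configuring the IP manually
The address assigned by DHCP, which is used by default, can change.
Edit the network interface configuration to set a fixed IP:
vi /etc/sysconfig/network-scripts/ifcfg-enp0s10f0
IPADDR=192.168.0.104 # IP address
GATEWAY=192.168.0.1 # gateway
DNS1=192.168.1.1 # primary DNS server
DNS2=192.168.0.1 # secondary DNS server
service network restart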
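Uploading and mounting the Everything image
mkdir /home/iso
mount -o loop /home/CentOS-7-x86_64-Everything-1908.iso /home/iso/
The command prints "mount: /dev/loop0 is write-protected, mounting read-only". This is a warning rather than an error: an ISO image is always mounted read-only.
Change the permissions of the image file
chmod 777 /home/CentOS-7-x86_64-Everything-1908.iso
# Reload nginx
nginx -s reload
Serving Java 1.8 and Hadoop 3.2.1 through nginx
After uploading the files, edit the nginx configuration
and add the following server block: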
server {
listen 9090;
server_name _ ;
add_header Access-Control-Allow-Origin *;
add_header Access-Control-Allow-Methods 'GET,POST';
add_header Access-Control-Allow-Headers 'DNT,X-Mx-ReqToken,Keep-Alive,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Authorization';
location / {
root /home/javainstall/;
autoindex on;
}
}
# Reload nginx
nginx -s reload
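4. Building the Hadoop cluster
Building the template machine
The software environment is nearly identical on every node, so build the NameNode first, clone the virtual machine for the other nodes, and then adjust the configuration files on each clone.
Configuring the local yum repository
Delete all the stock repo files in /etc/yum.repos.d
Create a new file named Nginx-yum.repo in /etc/yum.repos.d (replace the address in baseurl and gpgkey with that of your yum machine, 192.168.0.104 in this topology):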
[Nginx-yum]
name=Nginx-yum
#mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os&infra=$infra
baseurl=http://192.168.1.3:8080
enabled=1
gpgcheck=1
gpgkey=http://192.168.1.3:8080/RPM-GPG-KEY-CentOS-7
Refreshing the yum cache
yum clean all
yum makecache
Configuring host name resolution
vi /etc/hosts
127.0.0.1 localhost
192.168.0.105 hadoop1
192.168.0.106 hadoop2
192.168.0.107 hadoop3
192.168.0.108 hadoop4
192.168.0.109 hadoop5
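The host entries follow a simple rule: hadoopK has the address 192.168.0.(104 + K). A short loop can therefore generate them, which helps if the cluster grows. This is only a sketch; the helper name `hosts_entries` is made up for illustration, and the addressing rule is read off the topology of this article.

```shell
#!/bin/sh
# Hypothetical helper: print /etc/hosts lines for hadoop1..hadoopN, where hadoopK
# gets the address 192.168.0.(104 + K).
hosts_entries() {
    count=$1
    i=1
    while [ "$i" -le "$count" ]; do
        echo "192.168.0.$((104 + i)) hadoop${i}"
        i=$((i + 1))
    done
}

# Five nodes in this article; append the output to /etc/hosts as root.
hosts_entries 5
```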
Downloading and configuring Java and Hadoop
Create the cloudcomput directory
mkdir /cloudcomput
Change into it
cd /cloudcomput
Download the packages from the local mirror
wget http://192.168.0.104:9090/jdk1.8.0_181.zip
wget http://192.168.0.104:9090/hadoop-3.2.1.tar.gz
Extract the packages and set the environment variables
vi /etc/profile
Append at the end:
export JAVA_HOME=/cloudcomput/java
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export HADOOP_HOME=/cloudcomput/hadoop-3.2.1
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
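When the setup is repeated, lines appended to /etc/profile by hand tend to pile up as duplicates. The sketch below appends a line only if it is not already present. The helper name `append_once` is hypothetical; the `grep` options (`-q` quiet, `-x` whole line, `-F` fixed string) are standard.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a file only if that exact line is not
# there yet, so re-running the setup does not duplicate exports.
# Usage: append_once <line> <file>
append_once() {
    line=$1
    file=$2
    # If the file is missing or the line is absent, grep fails and the line is appended.
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

For example, `append_once 'export HADOOP_HOME=/cloudcomput/hadoop-3.2.1' /etc/profile` can be run any number of times and leaves a single copy of the line.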
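See the official Hadoop documentation for the configuration files
Edit the configuration files in /cloudcomput/hadoop-3.2.1/etc/hadoop
Edit /cloudcomput/hadoop-3.2.1/etc/hadoop/hadoop-env.sh
Add around line 55:
export JAVA_HOME=/cloudcomput/java
Edit /cloudcomput/hadoop-3.2.1/etc/hadoop/core-site.xml
Add inside the <configuration> element at the end of the file:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://192.168.0.105:9000</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/cloudcomput/hadoop-3.2.1/tmp/hadoop-${user.name}</value>
</property>
Edit /cloudcomput/hadoop-3.2.1/etc/hadoop/hdfs-site.xml
<property>
  <name>dfs.namenode.http-address</name>
  <value>192.168.0.105:9870</value>
</property>
<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>192.168.0.106:9868</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
<property>
  <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
  <value>false</value>
</property>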
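A typo in an XML file usually shows up only when a daemon fails to start, so it can be worth asking Hadoop which values it actually resolves. The sketch below compares the resolved value of a key with the intended one. The helper name `check_conf` is an assumption for illustration; `hdfs getconf -confKey` is the standard Hadoop command for reading a configuration key, and it needs `$HADOOP_HOME/bin` on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: compare the value Hadoop resolves for a key with the
# value that was meant to be configured.
# Usage: check_conf <key> <expected-value>
check_conf() {
    key=$1
    expected=$2
    actual=$(hdfs getconf -confKey "$key")
    if [ "$actual" = "$expected" ]; then
        echo "OK $key=$actual"
    else
        echo "MISMATCH $key: expected '$expected', got '$actual'"
        return 1
    fi
}
```

On the template machine one might run `check_conf fs.defaultFS hdfs://192.168.0.105:9000` and `check_conf dfs.replication 3` before cloning it.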
Edit /cloudcomput/hadoop-3.2.1/etc/hadoop/mapred-site.xml
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
<property>
  <name>mapreduce.application.classpath</name>
  <value>
    /cloudcomput/hadoop-3.2.1/etc/*,
    /cloudcomput/hadoop-3.2.1/etc/hadoop/*,
    /cloudcomput/hadoop-3.2.1/lib/*,
    /cloudcomput/hadoop-3.2.1/share/hadoop/common/*,
    /cloudcomput/hadoop-3.2.1/share/hadoop/common/lib/*,
    /cloudcomput/hadoop-3.2.1/share/hadoop/mapreduce/*,
    /cloudcomput/hadoop-3.2.1/share/hadoop/mapreduce/lib-examples/*,
    /cloudcomput/hadoop-3.2.1/share/hadoop/hdfs/*,
    /cloudcomput/hadoop-3.2.1/share/hadoop/hdfs/lib/*,
    /cloudcomput/hadoop-3.2.1/share/hadoop/yarn/*,
    /cloudcomput/hadoop-3.2.1/share/hadoop/yarn/lib/*
  </value>
</property>
Edit /cloudcomput/hadoop-3.2.1/etc/hadoop/yarn-site.xml
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>hadoop1</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>mapreduce.application.classpath</name>
  <value>
    /cloudcomput/hadoop-3.2.1/etc/*,
    /cloudcomput/hadoop-3.2.1/etc/hadoop/*,
    /cloudcomput/hadoop-3.2.1/lib/*,
    /cloudcomput/hadoop-3.2.1/share/hadoop/common/*,
    /cloudcomput/hadoop-3.2.1/share/hadoop/common/lib/*,
    /cloudcomput/hadoop-3.2.1/share/hadoop/mapreduce/*,
    /cloudcomput/hadoop-3.2.1/share/hadoop/mapreduce/lib-examples/*,
    /cloudcomput/hadoop-3.2.1/share/hadoop/hdfs/*,
    /cloudcomput/hadoop-3.2.1/share/hadoop/hdfs/lib/*,
    /cloudcomput/hadoop-3.2.1/share/hadoop/yarn/*,
    /cloudcomput/hadoop-3.2.1/share/hadoop/yarn/lib/*
  </value>
</property>
Allowing the Hadoop start and stop scripts to run as root
vi /cloudcomput/hadoop-3.2.1/sbin/start-dfs.sh
vi /cloudcomput/hadoop-3.2.1/sbin/stop-dfs.sh
vi /cloudcomput/hadoop-3.2.1/sbin/start-yarn.sh
vi /cloudcomput/hadoop-3.2.1/sbin/stop-yarn.sh
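In start-dfs.sh and stop-dfs.sh, add the following starting at the second line:
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
In start-yarn.sh and stop-yarn.sh, add the following starting at the second line:
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
Once the template machine is configured, back it up.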
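Configuring each node
Change the host name
vi /etc/hostname
vi /etc/hosts
service network restart
Change the IP
vi /etc/sysconfig/network-scripts/ifcfg-enp0s10f0
Set the IP of the current node
IPADDR=192.168.0.<last octet of this node> # IP address
GATEWAY=192.168.0.1 # gateway
DNS1=192.168.1.1 # primary DNS server
DNS2=192.168.0.1 # secondary DNS server
service network restart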
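Because only the last octet differs between nodes, the address lines can be generated per node. The following is a sketch under stated assumptions: the helper name `static_ip_lines` is hypothetical, and the `BOOTPROTO`, `ONBOOT`, and `NETMASK` lines are not shown in this article; they are standard ifcfg keys added here on the assumption of a /24 network with a statically configured interface.

```shell
#!/bin/sh
# Hypothetical helper: print the static-address lines for one node.
# Usage: static_ip_lines <last-octet>
static_ip_lines() {
    octet=$1
    cat <<EOF
BOOTPROTO=static
ONBOOT=yes
IPADDR=192.168.0.${octet}
NETMASK=255.255.255.0
GATEWAY=192.168.0.1
DNS1=192.168.1.1
DNS2=192.168.0.1
EOF
}

# Lines for hadoop1 (192.168.0.105); merge them into the ifcfg file by hand.
static_ip_lines 105
```

The output is meant to be reviewed and merged into the existing ifcfg file rather than to replace it, since that file also carries device-specific keys.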
Starting sshd at boot
systemctl enable sshd
systemctl start sshd
Setting up passwordless SSH between the nodes
Generate a key pair on every host:
mkdir -p ~/.ssh
ssh-keygen -t rsa
Starting from the first host, append each host's public key to authorized_keys
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Send ~/.ssh/authorized_keys to the next host
scp ~/.ssh/authorized_keys root@<next-host>:~/.ssh/authorized_keys
ssh root@<next-host>
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys root@<next-host>:~/.ssh/authorized_keys
Repeat until ~/.ssh/authorized_keys contains the public key of every host, then send the complete file to each host once more:
scp ~/.ssh/authorized_keys root@<each-host>:~/.ssh/authorized_keys
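An alternative to passing authorized_keys from host to host is `ssh-copy-id`, which appends the local public key to a remote account's authorized_keys. This is a different technique from the scp chain described above, sketched here for comparison. The helper name `copy_key_to_all` and the `SSH_COPY_ID` override variable are hypothetical, and the host file is assumed to list one host name per line.

```shell
#!/bin/sh
# SSH_COPY_ID is a hypothetical override used only for dry runs; by default the
# real ssh-copy-id command is called.
SSH_COPY_ID=${SSH_COPY_ID:-ssh-copy-id}

# Hypothetical helper: install this machine's public key on every host in the list.
# Usage: copy_key_to_all <hosts-file>
copy_key_to_all() {
    hosts_file=$1
    for host in $(cat "$hosts_file"); do
        # Prompts once for the root password of each host.
        "$SSH_COPY_ID" "root@${host}"
    done
}
```

The function would have to be run once on every node, since each node needs its own key on all the others. Setting `SSH_COPY_ID=echo` before calling it prints the targets without contacting any host.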
Testing passwordless login
Log in to hadoop1 first,
then run, one after another:
ssh hadoop2
ssh hadoop3
ssh hadoop4
ssh hadoop5
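The same check can be scripted so that a host that still asks for a password is reported instead of blocking at a prompt. In this sketch the helper name `check_passwordless` is made up for illustration; `BatchMode=yes` and `ConnectTimeout` are standard OpenSSH client options.

```shell
#!/bin/sh
# Hypothetical helper: try a non-interactive login to each host and report the result.
# Usage: check_passwordless <host>...
check_passwordless() {
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password;
        # `true` is a remote command that does nothing and exits successfully.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true < /dev/null; then
            echo "OK   $host"
        else
            echo "FAIL $host"
        fi
    done
}
```

Running `check_passwordless hadoop2 hadoop3 hadoop4 hadoop5` on hadoop1 covers the logins listed above in one go.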
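This completes the configuration of the fully distributed Hadoop cluster.
Using the fully distributed cluster
Starting Hadoop
Log in to hadoop1
and run start-all.sh
Installing the batch tool pssh
Install pssh on hadoop1 (the NameNode)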
wget https://pypi.tuna.tsinghua.edu.cn/packages/60/9a/8035af3a7d3d1617ae2c7c174efa4f154e5bf9c24b36b623413b38be8e4a/pssh-2.3.1.tar.gz#sha256=539f8d8363b722712310f3296f189d1ae8c690898eca93627fc89a9cb311f6b4
# This is the official link; if it is not reachable from your network, use the domestic mirror above
wget http://parallel-ssh.googlecode.com/files/pssh-2.3.1.tar.gz
tar xf pssh-2.3.1.tar.gz
cd pssh-2.3.1/
# Note: install with python2, otherwise the installation fails
python setup.py install
Listing the running processes on all nodes
pssh -h hosts.txt -i '/cloudcomput/java/bin/jps'
Web pages of the individual nodes
NameNode
SecondaryNameNode
DataNode 1
DataNode 2
DataNode 3
V. Summary