Installing a Highly Available Hadoop Ecosystem (3): Installing Hadoop

3. Installing Hadoop

3.1. Unpacking the Distribution

※ Run on all three servers

tar -xf ~/install/hadoop-2.7.3.tar.gz -C /opt/cloud/packages

ln -s /opt/cloud/packages/hadoop-2.7.3 /opt/cloud/bin/hadoop
ln -s /opt/cloud/packages/hadoop-2.7.3/etc/hadoop /opt/cloud/etc/hadoop

mkdir -p /opt/cloud/hdfs/name
mkdir -p /opt/cloud/hdfs/data
mkdir -p /opt/cloud/hdfs/journal
mkdir -p /opt/cloud/hdfs/tmp/java
mkdir -p /opt/cloud/logs/hadoop/yarn

3.2. Setting Environment Variables

Set the Java and Hadoop environment variables:

vi ~/.bashrc

Add:

export HADOOP_HOME=/opt/cloud/bin/hadoop
export HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
export HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
export HADOOP_PID_DIR=/opt/cloud/hdfs/tmp

export YARN_PID_DIR=/opt/cloud/hdfs/tmp
export HADOOP_OPTS="-Djava.io.tmpdir=/opt/cloud/hdfs/tmp/java"

export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

Apply immediately:

source ~/.bashrc
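An optional sanity check that the variables resolve as intended; the export lines below simply mirror the ~/.bashrc additions above:

```shell
# Mirror the ~/.bashrc additions, then confirm PATH picked up the hadoop bin dir.
export HADOOP_HOME=/opt/cloud/bin/hadoop
export HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) echo "PATH ok" ;;
  *) echo "PATH is missing $HADOOP_HOME/bin" ;;
esac
```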

Copy it to the other two servers:

scp ~/.bashrc hadoop2:/home/hadoop
scp ~/.bashrc hadoop3:/home/hadoop

3.3. Modifying Hadoop Parameters

cd ${HADOOP_HOME}/etc/hadoop

Modify log4j.properties, hadoop-env.sh, yarn-env.sh, slaves, core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml, then distribute them to the same directory on hadoop2 and hadoop3.

3.3.1. Modifying the Log Configuration File log4j.properties

hadoop.root.logger=INFO,DRFA

hadoop.log.dir=/opt/cloud/logs/hadoop

3.3.2. Modifying hadoop-env.sh

hadoop-env.sh sets a number of Hadoop environment variables, but up to and including 2.7.3 it has a bug and cannot pick up the correct values from the system environment, so it must be edited by hand. At the top of the file, find

export JAVA_HOME=${JAVA_HOME}

Comment it out and set the value explicitly:

export JAVA_HOME="/usr/lib/jvm/java"

Search the file for #export HADOOP_LOG_DIR and add below it:

export HADOOP_LOG_DIR=/opt/cloud/logs/hadoop

Search the file for export HADOOP_PID_DIR=${HADOOP_PID_DIR} and change it to:

export HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/

To set the Java temporary directory, find

export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true "

and change it to:

export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true -Djava.io.tmpdir=/opt/cloud/hdfs/tmp/java"

3.3.3. Modifying yarn-env.sh

Search for default log directory and add a line after it:

export YARN_LOG_DIR=/opt/cloud/logs/hadoop/yarn

3.3.4. Modifying slaves

# vi slaves

Contents:

Delete: localhost

Add:

hadoop2
hadoop3

3.3.5. Modifying core-site.xml

# vi core-site.xml

Contents:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/cloud/hdfs/tmp</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hadoop.groups</name>
        <value>hadoop</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hadoop.hosts</name>
        <value>hadoop1,hadoop2,hadoop3,127.0.0.1,localhost</value>
    </property>
    <property>
        <name>ipc.client.rpc-timeout.ms</name><!-- [1] -->
        <value>4000</value>
    </property>
    <property>
        <name>ipc.client.connect.timeout</name>
        <value>4000</value>
    </property>
    <property>
        <name>ipc.client.connect.max.retries</name>
        <value>100</value>
    </property>
    <property>
        <name>ipc.client.connect.retry.interval</name>
        <value>10000</value>
    </property>
</configuration>

3.3.6. Modifying hdfs-site.xml

# vi hdfs-site.xml

Contents:

<configuration>
    <property>
      <name>dfs.nameservices</name>
      <value>mycluster</value>
    </property>
    <property>
      <name>dfs.ha.namenodes.mycluster</name>
      <value>nn1,nn2</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn1</name>
      <value>hadoop1:9000</value>
    </property>
    <property>
      <name>dfs.namenode.http-address.mycluster.nn1</name>
      <value>hadoop1:50070</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.mycluster.nn2</name>
      <value>hadoop2:9000</value>
    </property>
    <property>
      <name>dfs.namenode.http-address.mycluster.nn2</name>
      <value>hadoop2:50070</value>
    </property>
    <property>
      <name>dfs.namenode.shared.edits.dir</name>
      <value>qjournal://hadoop1:8485;hadoop2:8485;hadoop3:8485/mycluster</value>
    </property>
    <property>
      <name>dfs.journalnode.edits.dir</name>
      <value>/opt/cloud/hdfs/journal</value>
    </property>
    <property>
      <name>dfs.ha.automatic-failover.enabled</name>
      <value>true</value>
    </property>
    <property>
      <name>dfs.client.failover.proxy.provider.mycluster</name>
      <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
      <name>dfs.ha.fencing.methods</name>
      <value>
        sshfence
        shell(/bin/true)
      </value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
    <property>
       <name>dfs.ha.fencing.ssh.connect-timeout</name>
       <value>30000</value>
    </property>

   <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/opt/cloud/hdfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/opt/cloud/hdfs/data</value>
    </property>
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
    <property>
        <name>dfs.support.append</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
        <value>NEVER</value>
    </property>
    <property>
        <name>dfs.datanode.max.transfer.threads</name>
        <value>8192</value>
    </property>
</configuration>
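The sshfence method above only works if the hadoop user can log into the peer NameNode with the listed private key and no password prompt. A quick pre-check (hostnames as used in this cluster; it needs the live hosts) can save debugging later:

```shell
# Run on hadoop1 as the hadoop user: confirm the fencing key reaches hadoop2.
# BatchMode=yes makes ssh fail immediately instead of prompting for a password.
ssh -i /home/hadoop/.ssh/id_rsa -o BatchMode=yes -o ConnectTimeout=5 hadoop2 true \
  && echo "fencing ssh ok" || echo "fencing ssh FAILED"
```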

3.3.7. Modifying mapred-site.xml

mv mapred-site.xml.template mapred-site.xml
vi mapred-site.xml

Contents:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
     <name>mapreduce.jobhistory.address</name>
     <value>0.0.0.0:10020</value>
  </property>
  <property>
     <name>mapreduce.jobhistory.webapp.address</name>
     <value>0.0.0.0:19888</value>
  </property>
  <property> 
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.command-opts</name>
    <value>-Xmx800m</value>
  </property>
  <property>
     <name>mapreduce.map.memory.mb</name>
     <value>512</value>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx400m</value>
  </property>
  <property>
     <name>mapreduce.reduce.memory.mb</name>
     <value>1024</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx800m</value>
  </property>
</configuration>
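The -Xmx values above are roughly 80% of the matching container sizes, a common rule of thumb (an inference about the author's choice, not a Hadoop requirement) that leaves headroom for non-heap JVM memory:

```shell
# Print the ~80% heap figure for each container size used above.
for mb in 512 1024; do
  awk -v m="$mb" 'BEGIN { printf "container %d MB -> heap ~%d MB\n", m, int(m * 0.8) }'
done
```

which lines up with the -Xmx400m and -Xmx800m settings.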

3.3.8. Modifying yarn-site.xml (non-HA version)

vi yarn-site.xml

Contents:

<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop1</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
         <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
         <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>

3.3.9. Modifying yarn-site.xml (HA version)

vi yarn-site.xml

Contents:

<configuration>  
    <property>  
       <name>yarn.resourcemanager.ha.enabled</name>  
       <value>true</value>  
    </property>  
    <property>  
       <name>yarn.resourcemanager.cluster-id</name>  
       <value>clusteryarn</value>  
    </property>  
    <property>  
       <name>yarn.resourcemanager.ha.rm-ids</name>  
       <value>rm1,rm2</value>  
    </property>  
    <property>  
       <name>yarn.resourcemanager.hostname.rm1</name>  
       <value>hadoop1</value>  
    </property>  
    <property>  
       <name>yarn.resourcemanager.hostname.rm2</name>  
       <value>hadoop2</value>  
    </property>
    <property>
       <name>yarn.log-aggregation-enable</name>
       <value>true</value>
    </property>
    <property>  
       <name>yarn.resourcemanager.zk-address</name>  
       <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>  
    </property>
    <property>
       <name>yarn.nodemanager.aux-services</name>  
       <value>mapreduce_shuffle</value>  
    </property>
    <property>
         <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
         <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property> 
      <name>yarn.resourcemanager.connect.retry-interval.ms</name>
      <value>5000</value>
    </property>
    <property> 
       <name>yarn.nodemanager.resource.memory-mb</name>
       <value>3072</value>
    </property> 
    <property> 
      <name>yarn.nodemanager.vmem-pmem-ratio</name>
      <value>4</value>
    </property>
    <property>
       <name>yarn.nodemanager.resource.cpu-vcores</name>
       <value>2</value>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>512</value>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>2048</value>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-vcores</name>
        <value>1</value>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-vcores</name>
        <value>2</value>
    </property>
</configuration>  

3.3.10. Copying to the Other Two Servers

Copy the configuration files to the same directory on hadoop2 and hadoop3:

scp /opt/cloud/bin/hadoop/etc/hadoop/* hadoop2:/opt/cloud/bin/hadoop/etc/hadoop/
scp /opt/cloud/bin/hadoop/etc/hadoop/* hadoop3:/opt/cloud/bin/hadoop/etc/hadoop/

3.4. Starting HDFS for the First Time

  • Start the JournalNode cluster:
cexec 'hadoop-daemon.sh start journalnode'


Note that only the first startup needs this; from then on, starting HDFS also starts the JournalNodes.

  • Format the first NameNode:
ssh hadoop1 'hdfs namenode -format -clusterId mycluster'


Formatting succeeded if the last part of the output contains these two lines:

INFO common.Storage: Storage directory /opt/cloud/hdfs/name has been successfully formatted.

...

INFO util.ExitUtil: Exiting with status 0

  • Start the first NameNode:
ssh hadoop1 'hadoop-daemon.sh start namenode'


  • Format the second NameNode:
ssh hadoop2 'hdfs namenode -bootstrapStandby'

Formatting succeeded if the last part of the output contains these two lines:

INFO common.Storage: Storage directory /opt/cloud/hdfs/name has been successfully formatted.

...

INFO util.ExitUtil: Exiting with status 0

  • Start the second NameNode:
ssh hadoop2 'hadoop-daemon.sh start namenode'
  • Format the ZooKeeper failover state:
ssh hadoop1 'hdfs zkfc -formatZK'


Formatting succeeded if the output contains:

INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/mycluster in ZK.

  • Start the two ZKFCs:
ssh hadoop1 'hadoop-daemon.sh start zkfc'
ssh hadoop2 'hadoop-daemon.sh start zkfc'
  • Start all the DataNodes:
ssh hadoop1 'hadoop-daemons.sh start datanode'


Use a browser to visit http://hadoop1:50070 and http://hadoop2:50070 to check the status.

One NameNode should be active and the other standby; on the active node's page, the three QJM servers should show the same Written txid.
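The same check can be done from the command line with hdfs haadmin, using the NameNode IDs defined in hdfs-site.xml (these commands query the live cluster, so they only work once both NameNodes are up):

```shell
# Each command prints "active" or "standby" for the given NameNode ID.
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
```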

3.5. Regular Startup of HDFS and YARN

Run on hadoop1:

start-dfs.sh
start-yarn.sh

Then start the standby ResourceManager on hadoop2:

ssh hadoop2 'yarn-daemon.sh start resourcemanager'

Check the processes with jps:

[hadoop@hadoop1 ~]$ cexec jps
************************* cloud *************************
--------- hadoop1---------
1223 QuorumPeerMain
3757 DFSZKFailoverController
4787 Jps
3872 ResourceManager
3365 NameNode
3578 JournalNode
--------- hadoop2---------
1220 QuorumPeerMain
24240 NodeManager
24545 Jps
24022 JournalNode
24139 DFSZKFailoverController
23847 NameNode
23923 DataNode
24419 ResourceManager
--------- hadoop3---------
23764 Jps
23578 NodeManager
23471 JournalNode
23372 DataNode
1224 QuorumPeerMain


Open the following URLs in a browser to see the graphical monitoring pages:

http://hadoop1:50070/ — HDFS web UI

http://hadoop2:50070/ — HDFS web UI; one of hadoop1 and hadoop2 is active, the other standby

http://hadoop1:8088 — YARN web UI

http://hadoop2:8088 — redirects automatically to http://hadoop1:8088
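With HDFS and YARN both up, the example jar that ships with 2.7.3 makes a convenient end-to-end smoke test (run as the hadoop user on hadoop1):

```shell
# Estimate pi with 2 map tasks x 10 samples each; this exercises HDFS, the
# ResourceManager, an ApplicationMaster, and the NodeManagers in one go.
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 2 10
```

A successful run ends with a line of the form "Estimated value of Pi is ..."; the accuracy is irrelevant here, only that the job completes.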

3.6. Starting HDFS Automatically at Boot

CentOS 7 uses systemd as its service manager, which brings conveniences such as dependency handling. However, systemd does not inherit any context: every service starts with a fresh environment, so a unit file must set every environment variable the service needs, each with a line of the form Environment=name=value. The good news is that Environment may appear on multiple lines; the bad news is that it does not expand previously declared variables, i.e. a value may not reference $name or ${name}.
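As a concrete sketch of that limitation (a hypothetical fragment, not one of the unit files below):

```ini
[Service]
# Fine: multiple Environment= lines, each with a literal value.
Environment=HADOOP_HOME=/opt/cloud/bin/hadoop
Environment=HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
# NOT expanded: systemd would store the "$HADOOP_HOME" text literally.
#Environment=PATH=$HADOOP_HOME/bin:/usr/bin
# Workaround: spell values out in full, or let the shell expand them inside
# ExecStart, where /usr/bin/sh -c '...' does its own variable expansion.
```

This is why the unit files below repeat the full /opt/cloud/... paths in every variable.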

3.6.1.    journalnode service

vi hadoop-journalnode.service

[Unit]
Description=hadoop journalnode service
After= network.target
[Service]
Type=forking
User=hadoop
Group=hadoop
Environment = JAVA_HOME=/usr/lib/jvm/java
Environment = JRE_HOME=/usr/lib/jvm/java/jre
Environment = CLASSPATH=.:/usr/lib/jvm/java/jre/lib/rt.jar:/usr/lib/jvm/java/lib/dt.jar:/usr/lib/jvm/java/lib/tools.jar
Environment = HADOOP_HOME=/opt/cloud/bin/hadoop
Environment = HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
Environment = HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
Environment = HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/

ExecStart=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh start journalnode'
ExecStop=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh stop journalnode'
[Install]
WantedBy=multi-user.target

3.6.2.    namenode service

vi hadoop-namenode.service

[Unit]
Description=hadoop namenode service
After= network.target
[Service]
Type=forking
User=hadoop
Group=hadoop
Environment = JAVA_HOME=/usr/lib/jvm/java
Environment = JRE_HOME=/usr/lib/jvm/java/jre
Environment = CLASSPATH=.:/usr/lib/jvm/java/jre/lib/rt.jar:/usr/lib/jvm/java/lib/dt.jar:/usr/lib/jvm/java/lib/tools.jar
Environment = HADOOP_HOME=/opt/cloud/bin/hadoop
Environment = HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
Environment = HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
Environment = HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/

ExecStart=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh start namenode'
ExecStop=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh stop namenode'
[Install]
WantedBy=multi-user.target

3.6.3.    datanode service

vi hadoop-datanode.service

[Unit]
Description=hadoop datanode service
After= network.target
[Service]
Type=forking
User=hadoop
Group=hadoop
Environment = JAVA_HOME=/usr/lib/jvm/java
Environment = JRE_HOME=/usr/lib/jvm/java/jre
Environment = CLASSPATH=.:/usr/lib/jvm/java/jre/lib/rt.jar:/usr/lib/jvm/java/lib/dt.jar:/usr/lib/jvm/java/lib/tools.jar
Environment = HADOOP_HOME=/opt/cloud/bin/hadoop
Environment = HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
Environment = HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
Environment = HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/

ExecStart=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh start datanode'
ExecStop=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh stop datanode'
[Install]
WantedBy=multi-user.target

3.6.4.    zkfc service

vi hadoop-zkfc.service

[Unit]
Description=hadoop zkfc service
After= network.target
[Service]
Type=forking
User=hadoop
Group=hadoop
Environment = JAVA_HOME=/usr/lib/jvm/java
Environment = JRE_HOME=/usr/lib/jvm/java/jre
Environment = CLASSPATH=.:/usr/lib/jvm/java/jre/lib/rt.jar:/usr/lib/jvm/java/lib/dt.jar:/usr/lib/jvm/java/lib/tools.jar
Environment = HADOOP_HOME=/opt/cloud/bin/hadoop
Environment = HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
Environment = HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
Environment = HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/

ExecStart=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh start zkfc'
ExecStop=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh stop zkfc'
[Install]
WantedBy=multi-user.target

3.6.5.    yarn resource manager service

vi yarn-rm.service

[Unit]
Description=yarn resource manager service
After= network.target
[Service]
Type=forking
User=hadoop
Group=hadoop
Environment = JAVA_HOME=/usr/lib/jvm/java
Environment = JRE_HOME=/usr/lib/jvm/java/jre
Environment = CLASSPATH=.:/usr/lib/jvm/java/jre/lib/rt.jar:/usr/lib/jvm/java/lib/dt.jar:/usr/lib/jvm/java/lib/tools.jar
Environment = HADOOP_HOME=/opt/cloud/bin/hadoop
Environment = HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
Environment = HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
Environment = HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/

ExecStart=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/yarn-daemon.sh start resourcemanager'
ExecStop=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/yarn-daemon.sh stop resourcemanager'
[Install]
WantedBy=multi-user.target

3.6.6.    yarn nodemanager service

vi yarn-nm.service

[Unit]
Description=yarn node manager service
After= network.target
[Service]
Type=forking
User=hadoop
Group=hadoop
Environment = JAVA_HOME=/usr/lib/jvm/java
Environment = JRE_HOME=/usr/lib/jvm/java/jre
Environment = CLASSPATH=.:/usr/lib/jvm/java/jre/lib/rt.jar:/usr/lib/jvm/java/lib/dt.jar:/usr/lib/jvm/java/lib/tools.jar
Environment = HADOOP_HOME=/opt/cloud/bin/hadoop
Environment = HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
Environment = HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
Environment = HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/

ExecStart=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/yarn-daemon.sh start nodemanager'
ExecStop=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/yarn-daemon.sh stop nodemanager'
[Install]
WantedBy=multi-user.target

3.6.7. Testing and Enabling the Services

Write the unit files for the six services and copy each one into /etc/systemd/system on the servers that run it.

hadoop2 (6 services)

systemctl start hadoop-journalnode
systemctl start hadoop-namenode
systemctl start hadoop-datanode
systemctl start hadoop-zkfc
systemctl start yarn-rm
systemctl start yarn-nm

After the tests pass:

systemctl enable hadoop-journalnode
systemctl enable hadoop-namenode
systemctl enable hadoop-datanode
systemctl enable hadoop-zkfc
systemctl enable yarn-rm
systemctl enable yarn-nm

hadoop1 (4 services)

systemctl enable hadoop-journalnode
systemctl enable hadoop-namenode
systemctl enable hadoop-zkfc
systemctl enable yarn-rm

hadoop3 (3 services)

systemctl enable hadoop-journalnode
systemctl enable hadoop-datanode
systemctl enable yarn-nm

Reboot all three servers and run cexec jps to check the system state.

3.7. Uninstalling

  • Stop YARN, then stop DFS:
ssh hadoop1 'stop-yarn.sh'
ssh hadoop2 'yarn-daemon.sh stop resourcemanager'
ssh hadoop1 'stop-dfs.sh'


cexec jps should no longer show any HDFS or YARN processes.

  • Stop and remove the system services

hadoop2 (6 services)

systemctl disable hadoop-journalnode
systemctl disable hadoop-namenode
systemctl disable hadoop-datanode
systemctl disable hadoop-zkfc
systemctl disable yarn-rm
systemctl disable yarn-nm

hadoop1 (4 services)

systemctl disable hadoop-journalnode
systemctl disable hadoop-namenode
systemctl disable hadoop-zkfc
systemctl disable yarn-rm

hadoop3 (3 services)

systemctl disable hadoop-journalnode
systemctl disable hadoop-datanode
systemctl disable yarn-nm
  • Delete the data directories
rm /opt/cloud/hdfs -rf
rm /opt/cloud/logs/hadoop -rf
  • Delete the program directories
rm /opt/cloud/bin/hadoop -rf
rm /opt/cloud/etc/hadoop -rf
rm /opt/cloud/packages/hadoop-2.7.3 -rf
  • Restore the environment variables

vi ~/.bashrc

Delete the Hadoop-related lines.




[1] An important parameter: it sets the RPC timeout between Hadoop services, especially for the HA mechanism between the NodeManager and the ResourceManager.

[2] Sized for 4 GB virtual machines, so all the values are on the low side.

[3] Related to YARN high availability: the NodeManager's retry policy after a connection failure.

[4] The VMs have only 4 GB of memory and 2 cores, so these resource figures are also on the small side.

[5] Because memory is tight, the virtual-to-physical memory ratio was raised from 2.1 to 4.
