(11) Start HDFS and YARN
--Start HDFS
--Run on either hadoop-namenode01 or hadoop-namenode02
[root@hadoop-namenode01 sbin]# pwd
/usr/local/apps/hadoop-2.4.1/sbin
[root@hadoop-namenode01 sbin]# ./start-dfs.sh
Starting namenodes on [hadoop-namenode01 hadoop-namenode02]
--this starts both namenodes
hadoop-namenode01: starting namenode, logging to /usr/local/apps/hadoop-2.4.1/logs/hadoop-root-namenode-hadoop-namenode01.out
hadoop-namenode02: starting namenode, logging to /usr/local/apps/hadoop-2.4.1/logs/hadoop-root-namenode-hadoop-namenode02.out
--this starts the 3 datanodes
hadoop-datanode02: starting datanode, logging to /usr/local/apps/hadoop-2.4.1/logs/hadoop-root-datanode-hadoop-datanode02.out
hadoop-datanode01: starting datanode, logging to /usr/local/apps/hadoop-2.4.1/logs/hadoop-root-datanode-hadoop-datanode01.out
hadoop-datanode03: starting datanode, logging to /usr/local/apps/hadoop-2.4.1/logs/hadoop-root-datanode-hadoop-datanode03.out
--this starts the 3 journalnodes; they are already running here, so the messages below can be ignored
Starting journal nodes [hadoop-zknode01 hadoop-zknode02 hadoop-zknode03]
hadoop-zknode01: journalnode running as process 25652. Stop it first.
hadoop-zknode03: journalnode running as process 4209. Stop it first.
hadoop-zknode02: journalnode running as process 4128. Stop it first.
--this starts the two zkfc processes
Starting ZK Failover Controllers on NN hosts [hadoop-namenode01 hadoop-namenode02]
hadoop-namenode02: starting zkfc, logging to /usr/local/apps/hadoop-2.4.1/logs/hadoop-root-zkfc-hadoop-namenode02.out
hadoop-namenode01: starting zkfc, logging to /usr/local/apps/hadoop-2.4.1/logs/hadoop-root-zkfc-hadoop-namenode01.out
[root@hadoop-namenode01 sbin]# jps
26512 Jps
26415 DFSZKFailoverController
NameNode
[root@hadoop-namenode02 current]# jps
25599 NameNode
25690 DFSZKFailoverController
--Start YARN
--Run on hadoop-resourcemanager01
[root@hadoop-resourcemanager01 ~]# cd /usr/local/apps/hadoop-2.4.1/sbin/
[root@hadoop-resourcemanager01 sbin]# ./start-yarn.sh
[root@hadoop-resourcemanager01 sbin]# jps
25989 Jps
25726 ResourceManager
--Start the resourcemanager process on the second node as well [because of a known limitation, start-yarn.sh on the first node does not start the resourcemanager on the second node]
[root@hadoop-resourcemanager02 sbin]# ./yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /usr/local/apps/hadoop-2.4.1/logs/yarn-root-resourcemanager-hadoop-resourcemanager02.out
[root@hadoop-resourcemanager02 sbin]# jps
25845 Jps
25616 ResourceManager
[root@hadoop-datanode01 ~]# jps
25596 NodeManager
25698 Jps
25434 DataNode
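The jps checks above can be scripted. Below is a minimal sketch; the `check_daemons` helper is hypothetical (not part of Hadoop) and simply compares jps output against a list of expected daemon names:

```shell
# Hypothetical helper: given jps output and a list of expected daemon
# names, print the names that are missing from that output.
check_daemons() {
    local jps_output="$1"
    shift
    local missing=""
    for daemon in "$@"; do
        # jps prints "<pid> <class>"; match the class name exactly
        if ! echo "$jps_output" | awk '{print $2}' | grep -qx "$daemon"; then
            missing="$missing $daemon"
        fi
    done
    echo "$missing"
}

# Example with captured output (PIDs will differ on your hosts):
check_daemons "25596 NodeManager
25698 Jps
25434 DataNode" DataNode NodeManager NameNode
# prints " NameNode" -- the only expected daemon not in the output
```

On a live node you would call it as `check_daemons "$(jps)" DataNode NodeManager`.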
(12) Check the NameNode roles
This shows that hadoop-namenode01 is currently the active node
This shows that hadoop-namenode02 is the backup node, with state standby
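Besides the web UI, the roles can also be queried from the command line with `hdfs haadmin`. The service IDs `nn1`/`nn2` below are assumptions based on a typical `dfs.ha.namenodes.ns1` setting; use the IDs from your own hdfs-site.xml:

```shell
# Query the HA state of each namenode by its service ID
hdfs haadmin -getServiceState nn1    # should print "active" for the active namenode
hdfs haadmin -getServiceState nn2    # should print "standby" for the backup namenode
```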
(14) Check the YARN roles
This shows that hadoop-resourcemanager01 is active
This shows that hadoop-resourcemanager02 is standby
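Similarly, the ResourceManager roles can be queried with `yarn rmadmin`. The RM IDs `rm1`/`rm2` below are assumptions based on a typical `yarn.resourcemanager.ha.rm-ids` setting; use the IDs from your own yarn-site.xml:

```shell
# Query the HA state of each resourcemanager by its RM ID
yarn rmadmin -getServiceState rm1    # should print "active"
yarn rmadmin -getServiceState rm2    # should print "standby"
```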
7. Cluster HA (High Availability) Testing
1. NameNode HA test
----Scenario A:
Step 1: Check which node is Active
The screenshot above shows that the Active node is currently 192.168.1.31, and 192.168.1.32 is Standby
Step 2: Simulate a crash of 192.168.1.31 by directly killing the namenode process on 192.168.1.31
[root@hadoop-namenode01 hadoop]# jps
26979 Jps
26415 DFSZKFailoverController
26132 NameNode
[root@hadoop-namenode01 hadoop]# kill -9 26132
Step 3: Check the state of 192.168.1.32
After the namenode process on 192.168.1.31 is killed, the namenode on 192.168.1.32 becomes active; the failover succeeded.
Step 4: Start the namenode on 192.168.1.31 again
[root@hadoop-namenode01 sbin]# ./hadoop-daemon.sh start namenode
[root@hadoop-namenode01 sbin]# jps
26415 DFSZKFailoverController
27044 NameNode
Once started, it becomes standby
----Scenario B:
Simulate a power failure on the Active node
Step 1: Directly poweroff the currently Active node, 192.168.1.32
Step 2: Observe how long the Standby node takes to become Active
Observation shows that the standby node only changes to Active after a delay, noticeably longer than the switchover after directly killing the namenode process. The reason: after the poweroff, zkfc tries to fence the dead node over ssh but gets no response; only after the 30 seconds specified in the configuration file have elapsed does it run the user-defined shell /bin/true script, which returns success, and only then is the standby node switched to Active.
Even in this case, the Standby node fails over correctly.
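The fencing behaviour just described (ssh fencing with a 30-second timeout, then a shell /bin/true fallback) corresponds to an hdfs-site.xml fragment like the following. This is a sketch matching the behaviour described above; the private-key path is an assumed typical value:

```xml
<!-- Fencing: try sshfence first; if ssh gets no answer (e.g. the node
     is powered off), fall back to shell(/bin/true), which always succeeds -->
<property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence
shell(/bin/true)</value>
</property>
<property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
</property>
<!-- Give up on ssh fencing after 30 seconds (milliseconds) -->
<property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
</property>
```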
package hdfsutil;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HAHdfsTest {
    public static void main(String[] args) throws IOException {
        /*
         * In HA mode, put core-site.xml and hdfs-site.xml in the source
         * (classpath) directory; all parameters in them are loaded
         * automatically. Since the configuration files use host names,
         * the hosts mapping must be configured as well.
         */
        Configuration conf = new Configuration();
        // conf.set("fs.defaultFS", "hdfs://ns1/");
        /*
         * To avoid jar conflict errors, set:
         *   conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
         *   conf.set("fs.file.impl", "org.apache.hadoop.fs.LocalFileSystem");
         * or add to core-site.xml:
         *   <property>
         *     <name>fs.hdfs.impl</name>
         *     <value>org.apache.hadoop.hdfs.DistributedFileSystem</value>
         *   </property>
         *   <property>
         *     <name>fs.file.impl</name>
         *     <value>org.apache.hadoop.fs.LocalFileSystem</value>
         *   </property>
         */
        FileSystem fs = FileSystem.get(conf);
        fs.copyFromLocalFile(new Path(args[0]), new Path(args[1]));
        System.out.println("Upload Complete!");
        fs.close();
    }
}
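For reference, the HA-related client configuration this program expects to find on its classpath looks roughly like the following. This is a sketch: the nameservice `ns1` and the namenode hostnames come from this document, while the service IDs `nn1`/`nn2` and the RPC port 9000 are assumed typical values; use the settings from your cluster's configuration files:

```xml
<!-- core-site.xml: the default filesystem is the logical nameservice, not a single host -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://ns1/</value>
</property>

<!-- hdfs-site.xml: map the nameservice ns1 to the two namenodes -->
<property>
    <name>dfs.nameservices</name>
    <value>ns1</value>
</property>
<property>
    <name>dfs.ha.namenodes.ns1</name>
    <value>nn1,nn2</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.ns1.nn1</name>
    <value>hadoop-namenode01:9000</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.ns1.nn2</name>
    <value>hadoop-namenode02:9000</value>
</property>
<!-- Client-side class that performs the namenode failover -->
<property>
    <name>dfs.client.failover.proxy.provider.ns1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```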
Step 2: While the file upload is running, simulate a crash of the Active node
--Run the file upload
[root@hadoop-zkfcnode01 ~]# java -jar hdfsha.jar "/root/jdk-7u65-linux-i586.tar.gz" "/"
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Upload Complete!
--Kill the namenode that is in the active state
Observation shows that after the active namenode is killed, the standby takes over immediately and becomes active, and the file upload still completes successfully.