Chapter 1  Preparation
1.1 Lab Environment
Operating system: CentOS 6.4, 64-bit
Hadoop version: hadoop-2.2.0.x86_64.tar.gz
ZooKeeper version: zookeeper-3.4.5.tar.gz
JDK version: jdk-7u80-linux-x64.rpm
Six virtual machines are used to build the Hadoop HA cluster.
1.2 Cluster IP and Hostname Plan
Hostname | IP address | Installed software | Running processes |
hadoop1 | 172.16.10.1 | JDK, Hadoop | NameNode, DFSZKFailoverController |
hadoop2 | 172.16.10.2 | JDK, Hadoop | NameNode, DFSZKFailoverController |
hadoop3 | 172.16.10.3 | JDK, Hadoop | ResourceManager |
hadoop4 | 172.16.10.4 | JDK, Hadoop, ZooKeeper | DataNode, NodeManager, JournalNode, QuorumPeerMain |
hadoop5 | 172.16.10.5 | JDK, Hadoop, ZooKeeper | DataNode, NodeManager, JournalNode, QuorumPeerMain |
hadoop6 | 172.16.10.6 | JDK, Hadoop, ZooKeeper | DataNode, NodeManager, JournalNode, QuorumPeerMain |
1.3 Disable the Firewall and SELinux
//Run the following commands on each of the six virtual machines:
# service iptables stop
# service iptables status
iptables: Firewall is not running.
# chkconfig iptables off
# vim /etc/selinux/config
Change SELINUX=enforcing to:
SELINUX=disabled
# setenforce 0
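The per-node checks can be scripted once passwordless SSH is in place (section 1.6). The loop below is only a dry run: it prints the verification command for each host from the plan in section 1.2 instead of executing it; remove the echo to actually run the checks over ssh.

```shell
# Dry run: print the command that would verify firewall/SELinux state on each node.
hosts="hadoop1 hadoop2 hadoop3 hadoop4 hadoop5 hadoop6"
out=$(for h in $hosts; do
  echo "ssh $h 'service iptables status; getenforce'"
done)
printf '%s\n' "$out"
```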
1.4 Configure IP Addresses
On each host, edit ifcfg-eth0. The file is identical across the six machines except for IPADDR (and HWADDR/UUID, which must match each VM's own network card). Using hadoop1 as the example:
# vim /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
HWADDR=00:0C:29:C6:65:9E
TYPE=Ethernet
UUID=b0641ec2-b3d1-4a0a-9c0a-6813744e76bd
ONBOOT=yes
NM_CONTROLLED=yes
BOOTPROTO=static
IPADDR=172.16.10.1
NETMASK=255.255.255.0
On the remaining hosts, set IPADDR according to the plan in section 1.2: 172.16.10.2 on hadoop2, 172.16.10.3 on hadoop3, 172.16.10.4 on hadoop4, 172.16.10.5 on hadoop5, and 172.16.10.6 on hadoop6.
1.5 Set the Hostname and hosts File
//Set the hostname and hosts file on each of the six virtual machines. hadoop4 is shown here as the example; the other hosts are analogous.
# vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=hadoop4
# hostname hadoop4   //takes effect without a reboot
# vim /etc/hosts   //add the following entries to the hosts file
172.16.10.1 hadoop1
172.16.10.2 hadoop2
172.16.10.3 hadoop3
172.16.10.4 hadoop4
172.16.10.5 hadoop5
172.16.10.6 hadoop6
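Because the host number matches the last octet of each IP in this plan, the six entries can be generated rather than typed by hand. A small sketch that builds the list in a variable (append the output to /etc/hosts on each node):

```shell
# Generate the cluster's /etc/hosts entries from the IP plan.
entries=""
for i in 1 2 3 4 5 6; do
  entries="${entries}172.16.10.$i hadoop$i
"
done
printf '%s' "$entries"
```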
1.6 Configure Passwordless SSH Login
#First configure passwordless login from hadoop1 to hadoop2, hadoop3, hadoop4, hadoop5, and hadoop6
#Generate a key pair on hadoop1
# ssh-keygen -t rsa
#Copy the public key to every node, including hadoop1 itself
# ssh-copy-id hadoop1
# ssh-copy-id hadoop2
# ssh-copy-id hadoop3
# ssh-copy-id hadoop4
# ssh-copy-id hadoop5
# ssh-copy-id hadoop6
#Configure passwordless login from hadoop3 to hadoop4, hadoop5, and hadoop6
#Generate a key pair on hadoop3
# ssh-keygen -t rsa
#Copy the public key to the other nodes
# ssh-copy-id hadoop4
# ssh-copy-id hadoop5
# ssh-copy-id hadoop6
#Note: the two NameNodes must be able to ssh to each other without a password. Do not forget hadoop2 -> hadoop1: generate a key pair on hadoop2
# ssh-keygen -t rsa
# ssh-copy-id hadoop1
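To see the whole key-distribution matrix at a glance, the three rounds above can be expressed with one helper. The `distribute` function is purely illustrative (not part of OpenSSH), and `echo` makes this a dry run that prints each required invocation; dropping the echo would run the real ssh-copy-id calls from each source host.

```shell
# Print each required ssh-copy-id invocation as "[source] command".
distribute() {
  src=$1; shift
  for dst in "$@"; do
    echo "[$src] ssh-copy-id $dst"
  done
}
plan=$(
  distribute hadoop1 hadoop1 hadoop2 hadoop3 hadoop4 hadoop5 hadoop6
  distribute hadoop3 hadoop4 hadoop5 hadoop6
  distribute hadoop2 hadoop1
)
printf '%s\n' "$plan"
```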
1.7 Install the JDK
//Install the JDK on all six virtual machines as follows:
1) CentOS ships with OpenJDK. List the preinstalled Java packages:
# rpm -qa | grep java
java-1.6.0-openjdk-1.6.0.0-1.50.1.11.5.el6_3.x86_64
gcc-java-4.4.7-3.el6.x86_64
java_cup-0.10k-5.el6.x86_64
java-1.7.0-openjdk-1.7.0.9-2.3.4.1.el6_3.x86_64
tzdata-java-2012j-1.el6.noarch
java-1.5.0-gcj-1.5.0.0-29.1.el6.x86_64
java-1.6.0-openjdk-devel-1.6.0.0-1.50.1.11.5.el6_3.x86_64
2) Check the bundled JDK version
# java -version
java version "1.7.0_09-icedtea"
OpenJDK Runtime Environment (rhel-2.3.4.1.el6_3-x86_64)
OpenJDK 64-Bit Server VM (build 23.2-b09, mixed mode)
3) Remove the bundled OpenJDK
# rpm -e --nodeps tzdata-java-2012j-1.el6.noarch
# rpm -e --nodeps java-1.5.0-gcj-1.5.0.0-29.1.el6.x86_64
# rpm -e --nodeps java-1.6.0-openjdk-devel-1.6.0.0-1.50.1.11.5.el6_3.x86_64
# rpm -e --nodeps java-1.6.0-openjdk-1.6.0.0-1.50.1.11.5.el6_3.x86_64
# rpm -e --nodeps java-1.7.0-openjdk-1.7.0.9-2.3.4.1.el6_3.x86_64
4) Install the downloaded JDK
# rpm -ivh jdk-7u80-linux-x64.rpm
Preparing...                ####################################### [100%]
   1:jdk                    ####################################### [100%]
Unpacking JAR files...
        rt.jar...
        jsse.jar...
        charsets.jar...
        tools.jar...
        localedata.jar...
        jfxrt.jar...
# java -version
java version "1.7.0_80"
Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
Chapter 2  Install and Configure the ZooKeeper Cluster
2.1 Unpack the ZooKeeper Tarball (on hadoop4 only)
# mkdir /cloud
# tar -zxvf zookeeper-3.4.5.tar.gz -C /cloud/
2.2 Edit the Configuration
# cd /cloud/zookeeper-3.4.5/conf
# cp zoo_sample.cfg zoo.cfg
# vim zoo.cfg
Change the data directory to:
dataDir=/cloud/zookeeper-3.4.5/data
Append at the end of the file:
server.1=hadoop4:2888:3888
server.2=hadoop5:2888:3888
server.3=hadoop6:2888:3888
Save and exit. Then create the data directory:
# mkdir /cloud/zookeeper-3.4.5/data
Create an empty myid file:
# touch /cloud/zookeeper-3.4.5/data/myid
Finally, write this server's ID into it:
# echo 1 > /cloud/zookeeper-3.4.5/data/myid
2.3 Copy the Configured ZooKeeper to the Other Nodes
(First create a /cloud directory on hadoop5 and hadoop6: mkdir /cloud)
On hadoop4, copy ZooKeeper to hadoop5 and hadoop6:
# scp -r /cloud/zookeeper-3.4.5/ root@hadoop5:/cloud/
# scp -r /cloud/zookeeper-3.4.5/ root@hadoop6:/cloud/
Note: update /cloud/zookeeper-3.4.5/data/myid on hadoop5 and hadoop6 to match their server IDs:
hadoop5: echo 2 > /cloud/zookeeper-3.4.5/data/myid
hadoop6: echo 3 > /cloud/zookeeper-3.4.5/data/myid
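The rule being applied here is that each server's myid content must match the N of its own server.N line in zoo.cfg. A local sketch of that mapping, simulated under a temporary directory instead of the three real hosts:

```shell
# Simulate writing each node's myid file (locally, under a temp dir).
base=$(mktemp -d)
id=1
for host in hadoop4 hadoop5 hadoop6; do
  mkdir -p "$base/$host/data"
  echo "$id" > "$base/$host/data/myid"   # myid matches server.$id in zoo.cfg
  id=$((id + 1))
done
cat "$base/hadoop6/data/myid"   # the third server (hadoop6) holds ID 3
```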
2.4 Start ZooKeeper
On hadoop4, run:
# cd /cloud/zookeeper-3.4.5/bin/
[root@hadoop4 bin]# ./zkServer.sh start   //start ZooKeeper
JMX enabled by default
Using config: /cloud/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@hadoop4 bin]# ./zkServer.sh status   //check its status
JMX enabled by default
Using config: /cloud/zookeeper-3.4.5/bin/../conf/zoo.cfg
Error contacting service. It is probably not running.
This error is expected: with only one of the three servers up there is no quorum yet, so the ensemble cannot serve requests.
On hadoop5, run:
[root@hadoop5 data]# cd /cloud/zookeeper-3.4.5/bin/
[root@hadoop5 bin]# ./zkServer.sh start
JMX enabled by default
Using config: /cloud/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
A moment later, check the status on hadoop4 and hadoop5; with two of three servers up a quorum exists and a leader election completes:
[root@hadoop5 bin]# ./zkServer.sh status
JMX enabled by default
Using config: /cloud/zookeeper-3.4.5/bin/../conf/zoo.cfg
Mode: leader
[root@hadoop4 bin]# ./zkServer.sh status
JMX enabled by default
Using config: /cloud/zookeeper-3.4.5/bin/../conf/zoo.cfg
Mode: follower
On hadoop6, run:
[root@hadoop6 ~]# cd /cloud/zookeeper-3.4.5/bin/
[root@hadoop6 bin]# ./zkServer.sh start
JMX enabled by default
Using config: /cloud/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[root@hadoop6 bin]# ./zkServer.sh status
JMX enabled by default
Using config: /cloud/zookeeper-3.4.5/bin/../conf/zoo.cfg
Mode: follower
2.5 Test Data Synchronization in ZooKeeper
On hadoop4:
# ./zkCli.sh
WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper]
[zk: localhost:2181(CONNECTED) 1] get /zookeeper
cZxid = 0x0
ctime = Thu Jan 01 08:00:00 CST 1970
mZxid = 0x0
mtime = Thu Jan 01 08:00:00 CST 1970
pZxid = 0x0
cversion = -1
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 0
numChildren = 1
[zk: localhost:2181(CONNECTED) 3] create /hadoop123 123   //create a znode named hadoop123 with value 123
Created /hadoop123
[zk: localhost:2181(CONNECTED) 4] ls /   #confirm it was created
[hadoop123, zookeeper]
[zk: localhost:2181(CONNECTED) 5] get /hadoop123   #read the znode's value
123
cZxid = 0x200000002
ctime = Fri Mar 25 09:54:20 CST 2016
mZxid = 0x200000002
mtime = Fri Mar 25 09:54:20 CST 2016
pZxid = 0x200000002
cversion = 0
dataVersion = 0
aclVersion = 0
ephemeralOwner = 0x0
dataLength = 3
numChildren = 0
[zk: localhost:2181(CONNECTED) 6]
On hadoop5 and hadoop6, confirm that the znode created on hadoop4 has been replicated:
[root@hadoop5 bin]# ./zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /   #hadoop123 has been replicated
[hadoop123, zookeeper]
[zk: localhost:2181(CONNECTED) 1] get /hadoop123   #read the znode's value
123
cZxid = 0x200000002
(remaining metadata identical to the output on hadoop4)
[root@hadoop6 bin]# ./zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /   #hadoop123 has been replicated
[hadoop123, zookeeper]
[zk: localhost:2181(CONNECTED) 1] get /hadoop123   #read the znode's value
123
cZxid = 0x200000002
(remaining metadata identical to the output on hadoop4)
2.6 Test ZooKeeper Leader Failover
First check the roles on hadoop4, hadoop5, and hadoop6:
[root@hadoop4 bin]# ./zkServer.sh status
JMX enabled by default
Using config: /cloud/zookeeper-3.4.5/bin/../conf/zoo.cfg
Mode: follower
[root@hadoop5 bin]# ./zkServer.sh status
JMX enabled by default
Using config: /cloud/zookeeper-3.4.5/bin/../conf/zoo.cfg
Mode: leader
[root@hadoop6 bin]# ./zkServer.sh status
JMX enabled by default
Using config: /cloud/zookeeper-3.4.5/bin/../conf/zoo.cfg
Mode: follower
Simulate a failure of the leader on hadoop5:
[root@hadoop5 bin]# ./zkServer.sh stop
JMX enabled by default
Using config: /cloud/zookeeper-3.4.5/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
[root@hadoop4 bin]# ./zkServer.sh status
JMX enabled by default
Using config: /cloud/zookeeper-3.4.5/bin/../conf/zoo.cfg
Mode: follower   //hadoop4 remains a follower
[root@hadoop6 bin]# ./zkServer.sh status
JMX enabled by default
Using config: /cloud/zookeeper-3.4.5/bin/../conf/zoo.cfg
Mode: leader   //hadoop6 has been elected the new leader
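This behavior follows from ZooKeeper's majority rule: an ensemble of N servers stays available as long as floor(N/2)+1 of them are alive. With N=3 the quorum is 2, so stopping one server still leaves a working ensemble (which elects a new leader), while losing two would take it down. The arithmetic:

```shell
# Majority quorum arithmetic for a ZooKeeper ensemble.
n=3
quorum=$(( n / 2 + 1 ))       # servers required for a majority
tolerated=$(( n - quorum ))   # failures the ensemble survives
echo "ensemble=$n quorum=$quorum tolerates=$tolerated"
```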
Chapter 3  Install and Configure the Hadoop Cluster
3.1 Unpack the Hadoop Tarball (on hadoop1 only)
# mkdir /cloud
# tar -zxvf hadoop-2.2.0.x86_64.tar.gz -C /cloud/
3.2 Configure Hadoop Environment Variables
# vim /etc/profile   //(do this on all six virtual machines)
Append at the end of the file:
export HADOOP_HOME=/cloud/hadoop-2.2.0
export PATH=$PATH:$HADOOP_HOME/bin
# source /etc/profile
3.3 Configure HDFS
//All Hadoop 2.x configuration files live under $HADOOP_HOME/etc/hadoop
1) Edit hadoop-env.sh
# cd /cloud/hadoop-2.2.0/etc/hadoop
# vim hadoop-env.sh
Change
export JAVA_HOME=${JAVA_HOME}
to:
export JAVA_HOME=/usr/java/jdk1.7.0_80
2) Edit core-site.xml
# vim core-site.xml
<configuration>
  <!-- Name the HDFS nameservice ns1 -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://ns1</value>
  </property>
  <!-- Hadoop temporary directory -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/cloud/hadoop-2.2.0/tmp</value>
  </property>
  <!-- ZooKeeper quorum addresses -->
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop4:2181,hadoop5:2181,hadoop6:2181</value>
  </property>
</configuration>
3) Edit hdfs-site.xml
# vim hdfs-site.xml
<configuration>
  <!-- HDFS nameservice name; must match core-site.xml -->
  <property>
    <name>dfs.nameservices</name>
    <value>ns1</value>
  </property>
  <!-- ns1 has two NameNodes: nn1 and nn2 -->
  <property>
    <name>dfs.ha.namenodes.ns1</name>
    <value>nn1,nn2</value>
  </property>
  <!-- RPC address of nn1 -->
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn1</name>
    <value>hadoop1:9000</value>
  </property>
  <!-- HTTP address of nn1 -->
  <property>
    <name>dfs.namenode.http-address.ns1.nn1</name>
    <value>hadoop1:50070</value>
  </property>
  <!-- RPC address of nn2 -->
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn2</name>
    <value>hadoop2:9000</value>
  </property>
  <!-- HTTP address of nn2 -->
  <property>
    <name>dfs.namenode.http-address.ns1.nn2</name>
    <value>hadoop2:50070</value>
  </property>
  <!-- Where the NameNode edit log is stored on the JournalNodes -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop4:8485;hadoop5:8485;hadoop6:8485/ns1</value>
  </property>
  <!-- Where each JournalNode stores its data on local disk -->
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/cloud/hadoop-2.2.0/journal</value>
  </property>
  <!-- Enable automatic NameNode failover -->
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <!-- Failover proxy provider used by clients -->
  <property>
    <name>dfs.client.failover.proxy.provider.ns1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <!-- Fencing methods; separate multiple methods with newlines, one per line -->
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>
      sshfence
      shell(/bin/true)
    </value>
  </property>
  <!-- sshfence requires passwordless SSH; point it at the private key -->
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <!-- sshfence connect timeout in milliseconds -->
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
</configuration>
4) Edit mapred-site.xml
# mv mapred-site.xml.template mapred-site.xml
# vim mapred-site.xml
<configuration>
  <!-- Run MapReduce on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
5) Edit yarn-site.xml
# vim yarn-site.xml
<configuration>
  <!-- ResourceManager address -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop3</value>
  </property>
  <!-- Auxiliary service the NodeManagers load: the MapReduce shuffle server -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
6) Edit slaves
(The slaves file lists the worker nodes. Because HDFS is started from hadoop1 and YARN from hadoop3, the slaves file on hadoop1 names the DataNodes while the slaves file on hadoop3 names the NodeManagers; in this cluster they are the same three hosts.)
# vim slaves
hadoop4
hadoop5
hadoop6
3.4 Copy the Configured Hadoop to the Other Nodes
# scp -r /cloud/ root@hadoop2:/
# scp -r /cloud/ root@hadoop3:/
# scp -r /cloud/hadoop-2.2.0/ root@hadoop4:/cloud/
# scp -r /cloud/hadoop-2.2.0/ root@hadoop5:/cloud/
# scp -r /cloud/hadoop-2.2.0/ root@hadoop6:/cloud/
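The five copies can be expressed as one loop. This sketch only prints the commands (a dry run): hadoop2 and hadoop3 receive the whole /cloud directory, while the worker nodes, which already have ZooKeeper under /cloud, receive only the hadoop-2.2.0 tree.

```shell
# Print the scp commands that distribute the configured Hadoop tree.
cmds=$(
  for h in hadoop2 hadoop3; do
    echo "scp -r /cloud/ root@$h:/"
  done
  for h in hadoop4 hadoop5 hadoop6; do
    echo "scp -r /cloud/hadoop-2.2.0/ root@$h:/cloud/"
  done
)
printf '%s\n' "$cmds"
```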
3.5 Start the ZooKeeper Cluster (on hadoop4, hadoop5, and hadoop6)
# cd /cloud/zookeeper-3.4.5/bin
# ./zkServer.sh start
JMX enabled by default
Using config: /cloud/zookeeper-3.4.5/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
#Check the status: one leader and two followers
# ./zkServer.sh status
3.6 Start the JournalNodes
Start a JournalNode on each of hadoop4, hadoop5, and hadoop6:
# cd /cloud/hadoop-2.2.0/sbin
# ./hadoop-daemon.sh start journalnode
starting journalnode, logging to /cloud/hadoop-2.2.0/logs/hadoop-root-journalnode-hadoop4.out
//Verify with jps; hadoop4, hadoop5, and hadoop6 now each run a JournalNode process
# jps
1686 QuorumPeerMain
1802 JournalNode
1920 Jps
3.7 Format HDFS
//On hadoop1, run:
# hdfs namenode -format
//Formatting populates the directory given by hadoop.tmp.dir in core-site.xml, here /cloud/hadoop-2.2.0/tmp. Copy that directory to hadoop2 so the standby NameNode starts from the same metadata:
# scp -r tmp/ hadoop2:/cloud/hadoop-2.2.0/
3.8 Format ZKFC State in ZooKeeper (on hadoop1 only)
# hdfs zkfc -formatZK
Afterwards, a new znode is visible from hadoop4, hadoop5, and hadoop6:
[zk: localhost:2181(CONNECTED) 6] ls /
[hadoop-ha, zookeeper]   //a hadoop-ha znode has been created
3.9 Start HDFS (on hadoop1)
# cd /cloud/hadoop-2.2.0/sbin
# ./start-dfs.sh
# jps
2552 DFSZKFailoverController
2605 Jps
2288 NameNode
3.10 Start YARN (on hadoop3)
(Note: run start-yarn.sh on hadoop3. The NameNodes and the ResourceManager are placed on separate machines for performance: both are resource-hungry, so they are split across hosts and must each be started from their own machine.)
# cd /cloud/hadoop-2.2.0/sbin/
# ./start-yarn.sh
# jps
2174 Jps
2104 ResourceManager
Afterwards, hadoop4, hadoop5, and hadoop6 each show a NodeManager process:
# jps
1625 QuorumPeerMain
1917 NodeManager
1697 JournalNode
1791 DataNode
1947 Jps
Hadoop 2.2.0 is now fully configured. The cluster can be checked in a browser via the NameNode web UIs at http://hadoop1:50070 and http://hadoop2:50070 (one active, one standby, per hdfs-site.xml) and the ResourceManager web UI on hadoop3 (port 8088 by default in YARN).