Hadoop 2.X Study Notes -- Cluster Setup

I. Environment List

OS: CentOS 6.5, 64-bit

JDK: jdk1.7.0_71

Hadoop: community release 2.7.2 (hadoop-2.7.2-src.tar.gz)

Hostname   IP               Roles                                          User
master1    192.168.204.202  NameNode; SecondaryNameNode; ResourceManager   hadoop
slave1     192.168.204.203  DataNode; NodeManager                          hadoop
slave2     192.168.204.204  DataNode; NodeManager                          hadoop

II. Operating System Preparation

1. Set the hostname:

vi  /etc/sysconfig/network

2. Disable the firewall:

chkconfig iptables off

service iptables stop

3. Disable SELinux:

vi /etc/sysconfig/selinux

SELINUX=disabled

Verify the settings:

[root@cloud001 Desktop]# hostname

[root@cloud001 Desktop]# ifconfig

[root@cloud001 Desktop]# service iptables status

[root@cloud001 Desktop]# sestatus

4. Install the JDK

Configure the environment variables:

[root@master1 hadoopsolf]# vim /etc/profile

JAVA_HOME=/usr/java/jdk1.7.0_71   (adjust to your actual install path)

CLASSPATH=.:$JAVA_HOME/lib/tools.jar

PATH=$JAVA_HOME/bin:$PATH

export JAVA_HOME CLASSPATH PATH

[root@master1 hadoopsolf]# source /etc/profile
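To confirm the variables took effect, a quick check (values assume the paths above):

java -version       # should report java version "1.7.0_71"
echo $JAVA_HOME     # should print /usr/java/jdk1.7.0_71
which java          # should resolve to $JAVA_HOME/bin/java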

III. Preparing the Hadoop 2.X Build Environment

1. Download the latest source release from http://apache.claz.org/hadoop/common/

2. Prepare the build environment

tar -zxvf hadoop-2.7.2-src.tar.gz produces the hadoop-2.7.2-src directory.

Enter hadoop-2.7.2-src and read BUILDING.txt:

cd  hadoop-2.7.2-src

vim  BUILDING.txt

It lists every library and tool the build requires.

3. JDK

Install the JDK, then add the JDK environment variables to /etc/profile:

export  JAVA_HOME=/usr/java/jdk1.7.0_71

export  CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/tools.jar

export  JRE_HOME=/usr/java/jdk1.7.0_71

export  PATH=$PATH:$JRE_HOME/bin

source  /etc/profile

Run javac -version to confirm the JDK is set up correctly.

4. Install the required libraries

yum -y install svn ncurses-devel gcc*

yum -y install lzo-devel zlib-devel autoconf automake libtool cmake openssl-devel

5. Install protobuf-2.5.0.tar.gz

tar -zxvf protobuf-2.5.0.tar.gz, then enter protobuf-2.5.0 and run, in order:

cd  protobuf-2.5.0

./configure

make

make install

(make install needs root; if protoc later complains about a missing shared library, running ldconfig as root usually fixes it.)

Check the version: [root@master1 protobuf-2.5.0]# protoc --version

6. Install Maven

Download from http://maven.apache.org/download.cgi

tar -zxvf apache-maven-3.3.9-bin.tar.gz  -C  /hadoopsolf

Then add the environment variables to /etc/profile:

export MAVEN_HOME=/hadoopsolf/apache-maven-3.3.9

export MAVEN_OPTS="-Xms256m -Xmx512m"

export PATH=$PATH:$MAVEN_HOME/bin

source /etc/profile

Check the version: [root@master1 protobuf-2.5.0]# mvn -version

7. Install Ant

Download: http://ant.apache.org/bindownload.cgi

tar -zxvf  apache-ant-1.9.6-bin.tar.gz  -C /hadoopsolf

Then add the environment variables to /etc/profile:

export ANT_HOME=/hadoopsolf/apache-ant-1.9.6

export PATH=$ANT_HOME/bin:$PATH

source /etc/profile

Verify: [root@master1 protobuf-2.5.0]# ant -version

8. Install FindBugs

Download: http://findbugs.sourceforge.net/downloads.html

vim /etc/profile and append at the end of the file:

export FINDBUGS_HOME=/hadoopsolf/findbugs-3.0.1

export PATH=$PATH:$FINDBUGS_HOME/bin

source /etc/profile

Verify: [root@master1 protobuf-2.5.0]# findbugs -version

9. Build Hadoop 2.X

From the top of the source tree (hadoop-2.7.2-src), run:

mvn clean package -Pdist,native -DskipTests -Dtar

or

mvn package -Pdist,native -DskipTests -Dtar

Keep the network connection up throughout: the build downloads a large number of dependencies and takes a long time.
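If the build succeeds, the binary distribution lands under hadoop-dist/target. A quick sanity check, assuming a stock 2.7.2 source tree:

cd hadoop-2.7.2-src
ls -lh hadoop-dist/target/hadoop-2.7.2.tar.gz    # binary tarball produced by -Dtar
ls hadoop-dist/target/hadoop-2.7.2/lib/native/   # native libraries built by -Pnative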

IV. Installing and Configuring Hadoop 2.X

1. Set up passwordless SSH login

1.1 Configure /etc/hosts and the hostname on every host

vi  /etc/hosts

192.168.204.202  master1

192.168.204.203  slave1

192.168.204.204  slave2

vi  /etc/sysconfig/network

NETWORKING=yes

HOSTNAME=master1   (use slave1 / slave2 on the respective slave hosts)

1.2 Give each host a static IP and confirm they can all ping one another (a quick check follows).
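A minimal connectivity check, run from any of the three hosts (hostnames as defined in /etc/hosts above):

for h in master1 slave1 slave2; do
    ping -c 1 $h > /dev/null && echo "$h reachable" || echo "$h UNREACHABLE"
done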

1.3 Configure passwordless login from master1 to the two slaves (slave1, slave2); perform the steps in order.

(1) First check whether SSH is already installed on each machine: [root@master1 Desktop]# rpm -qa | grep ssh

Already installed in this case:

openssh-askpass-5.3p1-94.el6.x86_64

libssh2-1.4.2-1.el6.x86_64

openssh-5.3p1-94.el6.x86_64

openssh-server-5.3p1-94.el6.x86_64

openssh-clients-5.3p1-94.el6.x86_64

If they are missing, install them: yum -y install openssh-server openssh-clients

(2) On the master1 host:

As the hadoop user, run ssh-keygen -t rsa and accept the defaults at every prompt until it finishes.

[hadoop@master1 ~]$ cd  /home/hadoop/.ssh/

[hadoop@master1 .ssh]$ ls

id_rsa  id_rsa.pub

[hadoop@master1 .ssh]$ cat id_rsa.pub >> authorized_keys

As root: [root@master1 Desktop]# chmod  600  /home/hadoop/.ssh/authorized_keys

Verify as the hadoop user:

[hadoop@master1 ~]$ ssh  master1

Last login: Mon Feb 22 22:23:16 2016 from master1

[hadoop@master1 ~]$

On each slave machine, as the hadoop user: mkdir -p  /home/hadoop/.ssh

On the master1 host, as the hadoop user, transfer the key:

[hadoop@master1 ~]$ scp /home/hadoop/.ssh/authorized_keys hadoop@slave1:/home/hadoop/.ssh/

[hadoop@master1 ~]$ scp /home/hadoop/.ssh/authorized_keys hadoop@slave2:/home/hadoop/.ssh/

On each slave machine, as root: chmod  600 /home/hadoop/.ssh/authorized_keys

Verify from master1 as the hadoop user:

[hadoop@master1 ~]$ ssh  slave1

[hadoop@master1 ~]$ ssh  slave2
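To check all three logins at once, a small loop like this can be run as hadoop on master1 (BatchMode makes ssh fail instead of prompting, so a missing key shows up as an error rather than a password prompt):

for h in master1 slave1 slave2; do
    ssh -o BatchMode=yes $h hostname   # should print each hostname without asking for a password
done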

2. Install Hadoop 2.X

As the hadoop user, on master1:

2.1 Unpack the compiled Hadoop 2.X into a target directory

(your choice; here /home/hadoop)

[hadoop@master1 ~]$ tar  -zxvf  /hadoopsolf/hadoop-2.7.2.tar.gz  -C  /home/hadoop

Configure the Hadoop 2.X environment variables by editing ~/.bash_profile:

vi  /home/hadoop/.bash_profile

export HADOOP_HOME=/home/hadoop/hadoop-2.7.2

export HADOOP_MAPRED_HOME=${HADOOP_HOME}

export HADOOP_COMMON_HOME=${HADOOP_HOME}

export HADOOP_HDFS_HOME=${HADOOP_HOME}

export YARN_HOME=${HADOOP_HOME}

export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop

export HDFS_CONF_DIR=${HADOOP_HOME}/etc/hadoop

export YARN_CONF_DIR=${HADOOP_HOME}/etc/hadoop

export HADOOP_LOG_DIR=${HADOOP_HOME}/logs

export HADOOP_PID_DIR=/var/hadoop/pids

--- Note

(as root, create /var/hadoop/pids and give the hadoop user ownership:

mkdir  -p /var/hadoop/pids

chown  -R hadoop:hadoop /var/hadoop/pids

)

export PATH=$PATH:$HADOOP_HOME/bin

export JAVA_HOME=/usr/java/jdk1.7.0_71

export  CLASSPATH=.:$JAVA_HOME/lib/tools.jar

export  PATH=$JAVA_HOME/bin:$PATH
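After saving the file, reload it and confirm the hadoop command resolves (paths assume the layout above):

source ~/.bash_profile
hadoop version      # should report Hadoop 2.7.2
echo $HADOOP_HOME   # should print /home/hadoop/hadoop-2.7.2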

2.2 Create the basic Hadoop directories

cd  /home/hadoop/hadoop-2.7.2

$ mkdir -p dfs/name

$ mkdir -p dfs/data

$ mkdir -p tmp

$ cd  etc/hadoop

2.3 Edit the Hadoop configuration files

The files to edit are core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, slaves, hadoop-env.sh, and yarn-env.sh, all located under hadoop-2.7.2/etc/hadoop. The required settings are as follows.

core-site.xml is configured as follows:

<property>
  <name>hadoop.tmp.dir</name>
  <value>file:/home/hadoop/hadoop-2.7.2/tmp</value>
  <description>A base for other temporary directories.</description>
</property>
<!-- RPC endpoint that clients connect to for filesystem metadata -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://master1:9000</value>
</property>
<property>
  <name>io.file.buffer.size</name>
  <value>131072</value>
</property>

hdfs-site.xml is configured as follows:

<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>master1:9001</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/home/hadoop/hadoop-2.7.2/dfs/name</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/home/hadoop/hadoop-2.7.2/dfs/data</value>
</property>
<!-- keep 2 replicas of each block -->
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>

mapred-site.xml is configured as follows (first: cp mapred-site.xml.template  mapred-site.xml):

<!-- run MapReduce on YARN -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>master1:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>master1:19888</value>
</property>

yarn-site.xml is configured as follows:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>yarn.resourcemanager.address</name>
  <value>master1:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>master1:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>master1:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address</name>
  <value>master1:8033</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>master1:8088</value>
</property>

vi slaves:

slave1

slave2

vi  hadoop-env.sh

export JAVA_HOME=/usr/java/jdk1.7.0_71

vi  yarn-env.sh

export JAVA_HOME=/usr/java/jdk1.7.0_71

2.4 Copy to the slaves

Remote-copy the Hadoop directory from master1 to each slaveX machine:

scp -r /home/hadoop/hadoop-2.7.2  slave1:/home/hadoop

scp -r /home/hadoop/hadoop-2.7.2  slave2:/home/hadoop
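The slaves need the same environment as master1. A minimal sketch that pushes ~/.bash_profile to each slave; the PID directory from the note above must also be created on each slave as root:

for h in slave1 slave2; do
    scp /home/hadoop/.bash_profile hadoop@$h:/home/hadoop/   # same env vars on every node
done
# then, as root on each slave:
#   mkdir -p /var/hadoop/pids && chown -R hadoop:hadoop /var/hadoop/pids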

3. Start Hadoop 2.X

Operate on master1:

3.1 Format the HDFS filesystem

bin/hdfs  namenode  -format

3.2 Start the NameNode and DataNode daemons

sbin/start-dfs.sh (this command starts the NameNode, SecondaryNameNode, and DataNodes)

[hadoop@master1 sbin]$ ./start-dfs.sh

Starting namenodes on [master1]

master1: Error: JAVA_HOME is not set and could not be found.

slave2: Error: JAVA_HOME is not set and could not be found.

slave1: Error: JAVA_HOME is not set and could not be found.

Starting secondary namenodes [master1]

master1: Error: JAVA_HOME is not set and could not be found.

Fix:

vi /home/hadoop/hadoop-2.7.2/libexec/hadoop-config.sh
and add: export JAVA_HOME=/usr/java/jdk1.7.0_71

At this point master1 runs: NameNode, SecondaryNameNode

and each slaveX runs: DataNode

./sbin/start-yarn.sh (this command starts the ResourceManager and the NodeManagers)

Now master1 runs: NameNode, SecondaryNameNode, ResourceManager

and each slaveX runs: DataNode, NodeManager

Check the processes on each node:

[hadoop@master1 hadoop-2.7.2]$ jps

8176 Jps

4356 ResourceManager

6277 NameNode

6429 SecondaryNameNode

3.3 Basic status checks

View the help:

[hadoop@master1 hadoop-2.7.2]$ ./bin/hdfs -help

[hadoop@master1 hadoop-2.7.2]$ ./bin/hdfs dfs -help

[hadoop@master1 bin]$ hdfs dfsadmin -help

View cluster status: ./bin/hdfs dfsadmin -report

View file and block composition: ./bin/hdfs fsck / -files -blocks

The HDFS cluster status can be viewed from the web console: http://master1:50070 (hdfs-site.xml)

The ResourceManager runs on the master node master1; view YARN at: http://master1:8088 (yarn-site.xml)

The NodeManagers run on the slave nodes; for example, on slave1: http://slave1:8042/

JobHistory Server (start it first with mr-jobhistory-daemon.sh start historyserver); view it at: http://master1:19888/jobhistory

View the SecondaryNameNode: http://master1:9001 (dfs.namenode.secondary.http-address in hdfs-site.xml)
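As an end-to-end check, a small job can be run against the cluster. A minimal sketch using the examples jar bundled with the 2.7.2 distribution:

cd /home/hadoop/hadoop-2.7.2
bin/hdfs dfs -mkdir -p /user/hadoop                         # HDFS home directory for the hadoop user
bin/hdfs dfs -put etc/hadoop/core-site.xml /user/hadoop/    # upload a small test file
bin/hdfs dfs -ls /user/hadoop                               # confirm the upload
bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar pi 2 10
# the pi job should also appear in the ResourceManager UI at http://master1:8088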
