Hadoop 2.4.1 Deployment (Complete Guide)

Introduction
      In the blink of an eye, Hadoop's stable release has already reached 2.4.1 - the community really is powerful! When will 3.0 be released?
      Today I did some investigation and tried out a distributed deployment of 2.4.1, including NN HA (a 2.2.0 NN HA cluster is already deployed, so the existing ZK and ZKFC are reused). Along the way, following the official documentation at http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/ClusterSetup.html, I sorted out and filled in the key configuration properties and grouped related properties together, so the files are easier to read, modify, and reuse as templates later.
      What follows is a record of the 2.4.1 deployment process, based on the official documentation and past experience.
      Feel free to repost; please credit the source: http://blog.csdn.net/u010967382/article/details/37653177

Notes
     1. This post only records the configuration files. The rest of the deployment process is the same as for 2.2.0; see
     http://blog.csdn.net/u010967382/article/details/20380387
     http://blog.csdn.net/u010967382/article/details/30976935
     2. All paths, IPs, and hostnames in the configuration must be adjusted to your actual environment.

1. Test Environment
A 4-node cluster with 3 ZK nodes. The hosts file and the role assignment for each node are as follows:
hosts
192.168.66.91 master
192.168.66.92 slave1
192.168.66.93 slave2
192.168.66.94 slave3

Role assignment
master: Active NN, JournalNode, Zookeeper, FailoverController
slave1: Standby NN, DN, JournalNode, Zookeeper, FailoverController
slave2: DN, JournalNode, Zookeeper
slave3: DN
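
Before going further, it helps to confirm that every node resolves the others and that passwordless SSH works for the yarn user (assumed to be set up already, as in the 2.2.0 deployment). A minimal sanity check from master:

# Assumes passwordless SSH for the yarn user; each node should print its own hostname
for h in master slave1 slave2 slave3; do
    ssh "$h" hostname
done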


2. hadoop-env.sh (only the following three settings need to be changed)
# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_07

# The directory where pid files are stored. /tmp by default.
# NOTE: this should be set to a directory that can only be written to by the user that will run the hadoop daemons.  Otherwise there is the potential for a symlink attack.
export HADOOP_PID_DIR=/home/yarn/Hadoop/hadoop-2.4.1/hadoop_pid_dir
export HADOOP_SECURE_DN_PID_DIR=/home/yarn/Hadoop/hadoop-2.4.1/hadoop_pid_dir
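
Because HADOOP_PID_DIR points to a non-default location, that directory must exist (and be writable by the daemon user) on every node before the daemons start. A minimal sketch, reusing the path above:

# Run from master as the yarn user; creates the pid directory on every node
for h in master slave1 slave2 slave3; do
    ssh "$h" "mkdir -p /home/yarn/Hadoop/hadoop-2.4.1/hadoop_pid_dir"
done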


3. core-site.xml (complete file)
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Licensed under the Apache License, Version 2.0 (the "License"); you 
    may not use this file except in compliance with the License. You may obtain 
    a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless 
    required by applicable law or agreed to in writing, software distributed 
    under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES 
    OR CONDITIONS OF ANY KIND, either express or implied. See the License for 
    the specific language governing permissions and limitations under the License. 
    See accompanying LICENSE file. -->
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://myhadoop</value>
        <description>NameNode URI, in the form hdfs://host:port/. If the NN
            HA feature is enabled, set this to the logical nameservice name instead; for details see my blog post http://blog.csdn.net/u010967382/article/details/30976935
        </description>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/yarn/Hadoop/hadoop-2.4.1/tmp</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
        <description>Size of read/write buffer used in SequenceFiles.
        </description>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>master:2181,slave1:2181,slave2:2181</value>
        <description>Note: once ZK is configured here, ZK must be started before formatting or starting the NameNode, otherwise a connection error will be reported
        </description>
    </property>
</configuration>  
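
After editing core-site.xml, a quick way to confirm Hadoop is actually reading this file is to query a key back with hdfs getconf, for example:

hdfs getconf -confKey fs.defaultFS          # should print hdfs://myhadoop
hdfs getconf -confKey ha.zookeeper.quorum   # should print master:2181,slave1:2181,slave2:2181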



4. hdfs-site.xml (complete file)
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Licensed under the Apache License, Version 2.0 (the "License"); you 
    may not use this file except in compliance with the License. You may obtain 
    a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless 
    required by applicable law or agreed to in writing, software distributed 
    under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES 
    OR CONDITIONS OF ANY KIND, either express or implied. See the License for 
    the specific language governing permissions and limitations under the License. 
    See accompanying LICENSE file. -->
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <!-- NN HA related configuration **BEGIN** -->
    <property>
        <name>dfs.nameservices</name>
        <value>myhadoop</value>
        <description>
            Comma-separated list of nameservices.
            Must match the logical name used in fs.defaultFS in core-site.xml.
        </description>
    </property>
    <property>
        <name>dfs.ha.namenodes.myhadoop</name>
        <value>nn1,nn2</value>
        <description>
            The prefix for a given nameservice, contains a comma-separated
            list of namenodes for a given nameservice (eg EXAMPLENAMESERVICE).
        </description>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.myhadoop.nn1</name>
        <value>master:8020</value>
        <description>
            RPC address for namenode1 (nn1) of the myhadoop nameservice
        </description>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.myhadoop.nn2</name>
        <value>slave1:8020</value>
        <description>
            RPC address for namenode2 (nn2) of the myhadoop nameservice
        </description>
    </property>
    <property>
        <name>dfs.namenode.http-address.myhadoop.nn1</name>
        <value>master:50070</value>
        <description>
            The address and the base port where the dfs namenode1 web ui will listen
            on.
        </description>
    </property>
    <property>
        <name>dfs.namenode.http-address.myhadoop.nn2</name>
        <value>slave1:50070</value>
        <description>
            The address and the base port where the dfs namenode2 web ui will listen
            on.
        </description>
    </property>
    <property>
        <name>dfs.namenode.servicerpc-address.myhadoop.nn1</name>
        <value>master:53310</value>
    </property>
    <property>
        <name>dfs.namenode.servicerpc-address.myhadoop.nn2</name>
        <value>slave1:53310</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
        <description>
            Whether automatic failover is enabled. See the HDFS High
            Availability documentation for details on automatic HA
            configuration.
        </description>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.myhadoop</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
        </value>
        <description>Configure the name of the Java class which will be used
            by the DFS Client to determine which NameNode is the current Active,
            and therefore which NameNode is currently serving client requests.
            This class is the client-side access proxy and is the key to making the HA feature transparent to clients!
        </description>
    </property>
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
        <description>Fencing method used during an active/standby switch; sshfence logs in to the previous active NN over SSH to make sure it stops serving</description>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/yarn/.ssh/id_rsa</value>
        <description>Location of the SSH private key used by the sshfence method</description>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>1000</value>
    </property>
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/home/yarn/Hadoop/hadoop-2.4.1/hdfs_dir/journal/</value>
    </property>
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://master:8485;slave1:8485;slave2:8485/hadoop-journal
        </value>
        <description>A directory on shared storage between the multiple
            namenodes
            in an HA cluster. This directory will be written by the active and read
            by the standby in order to keep the namespaces synchronized. This
            directory
            does not need to be listed in dfs.namenode.edits.dir above. It should be
            left empty in a non-HA cluster.
        </description>
    </property>
    <!-- NN HA related configuration **END** -->
    <!-- NameNode related configuration **BEGIN** -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///home/yarn/Hadoop/hadoop-2.4.1/hdfs_dir/name</value>
        <description>Path on the local filesystem where the NameNode stores
            the namespace and transaction logs persistently. If this is a
            comma-delimited list of directories then the name table is replicated
            in all of the directories, for redundancy.</description>
    </property>
    <property>
        <name>dfs.blocksize</name>
        <value>1048576</value>
        <description>
        HDFS block size in bytes. The docs suggest 128 MB (134217728) for large file-systems;
        here it is set to 1048576 (1 MB), which is the minimum allowed block size.
        </description>
    </property>
    <property>
        <name>dfs.namenode.handler.count</name>
        <value>10</value>
        <description>More NameNode server threads to handle RPCs from large
            number of DataNodes.</description>
    </property>
    <!-- <property> <name>dfs.namenode.hosts</name> <value>master</value> <description>If 
        necessary, use this to control the list of allowable datanodes.</description> 
        </property> <property> <name>dfs.namenode.hosts.exclude</name> <value>slave1,slave2,slave3</value> 
        <description>If necessary, use this to control the list of exclude datanodes.</description> 
        </property> -->
    <!-- NameNode related configuration **END** -->
    <!-- DataNode related configuration **BEGIN** -->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///home/yarn/Hadoop/hadoop-2.4.1/hdfs_dir/data</value>
        <description>Comma-separated list of paths on the local filesystem of
            a DataNode where it should store its blocks. If this is a
            comma-delimited list of directories, then data will be stored in all
            named directories, typically on different devices.</description>
    </property>
    <!-- DataNode related configuration **END** -->
</configuration>  
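
The local paths referenced above (the name, data, and journal directories under hdfs_dir) should exist and be writable by the yarn user on the nodes that host the corresponding roles; HDFS can create some of them itself, but pre-creating them avoids permission surprises. A sketch following the role assignment table:

# On the NN nodes (master, slave1)
mkdir -p /home/yarn/Hadoop/hadoop-2.4.1/hdfs_dir/name
# On the DN nodes (slave1, slave2, slave3)
mkdir -p /home/yarn/Hadoop/hadoop-2.4.1/hdfs_dir/data
# On the JN nodes (master, slave1, slave2)
mkdir -p /home/yarn/Hadoop/hadoop-2.4.1/hdfs_dir/journal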


5. yarn-site.xml
<?xml version="1.0"?>
<!-- Licensed under the Apache License, Version 2.0 (the "License"); you 
    may not use this file except in compliance with the License. You may obtain 
    a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless 
    required by applicable law or agreed to in writing, software distributed 
    under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES 
    OR CONDITIONS OF ANY KIND, either express or implied. See the License for 
    the specific language governing permissions and limitations under the License. 
    See accompanying LICENSE file. -->
<configuration>
    <!-- ResourceManager and NodeManager related configuration ***BEGIN*** -->
    <property>
        <name>yarn.acl.enable</name>
        <value>false</value>
        <description>Enable ACLs? Defaults to false.</description>
    </property>
    <property>
        <name>yarn.admin.acl</name>
        <value>*</value>
        <description>
        ACL to set admins on the cluster. ACLs are of the form comma-separated-users space comma-separated-groups. 
        Defaults to the special value of *, which means anyone. The special value of just a space means no one has access.
        </description>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>false</value>
        <description>Configuration to enable or disable log aggregation</description>
    </property>
    <!-- ResourceManager and NodeManager related configuration ***END*** -->
    
    <!-- ResourceManager related configuration ***BEGIN*** -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
        <description>The hostname of the RM.</description>
    </property>
    
    <property>
        <name>yarn.resourcemanager.webapp.https.address</name>
        <value>${yarn.resourcemanager.hostname}:8090</value>
        <description>The https address of the RM web application.</description>
    </property>
    
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>${yarn.resourcemanager.hostname}:8032</value>
        <description>ResourceManager host:port for clients to submit jobs.</description>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>${yarn.resourcemanager.hostname}:8030</value>
        <description>ResourceManager host:port for ApplicationMasters to talk to Scheduler to obtain resources.</description>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>${yarn.resourcemanager.hostname}:8031</value>
        <description>ResourceManager host:port for NodeManagers.</description>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>${yarn.resourcemanager.hostname}:8033</value>
        <description>ResourceManager host:port for administrative commands.</description>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>${yarn.resourcemanager.hostname}:8088</value>
        <description>ResourceManager web-ui host:port.</description>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
        <description>
        ResourceManager Scheduler class.
        CapacityScheduler (recommended), FairScheduler (also recommended), or FifoScheduler
        </description>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>512</value>
        <description>
        Minimum limit of memory to allocate to each container request at the Resource Manager.    
        In MBs
        </description>
    </property>
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>2048</value>
        <description>
        Maximum limit of memory to allocate to each container request at the Resource Manager.    
        In MBs.
        Note: in this configuration, yarn.scheduler.maximum-allocation-mb > yarn.nodemanager.resource.memory-mb.
        </description>
    </property>
    
    <!--
    <property>
        <name>yarn.resourcemanager.nodes.include-path</name>
        <value></value>
        <description>
        List of permitted NodeManagers.    
        If necessary, use this to control the list of allowable NodeManagers.
        </description>
    </property>
    <property>
        <name>yarn.resourcemanager.nodes.exclude-path</name>
        <value></value>
        <description>
        List of exclude NodeManagers.    
        If necessary, use this to control the list of exclude NodeManagers.
        </description>
    </property>
    -->
    <!-- ResourceManager related configuration ***END*** -->
    
    <!-- NodeManager related configuration ***BEGIN*** -->
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>1024</value>
        <description>
        Resource i.e. available physical memory, in MB, for given NodeManager.    
        Defines total available resources on the NodeManager to be made available to running containers.
        </description>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-pmem-ratio</name>
        <value>2.1</value>
        <description>
        Ratio between virtual memory to physical memory when setting memory limits for containers. 
        Container allocations are expressed in terms of physical memory, 
        and virtual memory usage is allowed to exceed this allocation by this ratio.
        </description>
    </property>
    <property>
        <name>yarn.nodemanager.local-dirs</name>
        <value>/home/yarn/Hadoop/hadoop-2.4.1/yarn_dir/local</value>
        <description>
        Comma-separated list of paths on the local filesystem where intermediate data is written.
        Multiple paths help spread disk i/o.
        </description>
    </property>    
    <property>
        <name>yarn.nodemanager.log-dirs</name>
        <value>/home/yarn/Hadoop/hadoop-2.4.1/yarn_dir/log</value>
        <description>
        Comma-separated list of paths on the local filesystem where logs are written.    
        Multiple paths help spread disk i/o.
        </description>
    </property>
    <property>
        <name>yarn.nodemanager.log.retain-seconds</name>
        <value>10800</value>
        <description>
        Default time (in seconds) to retain log files on the NodeManager.
        ***Only applicable if log-aggregation is disabled.
        </description>
    </property>
    <property>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>/yarn/log-aggregation</value>
        <description>
        HDFS directory where the application logs are moved on application completion. 
        Need to set appropriate permissions. 
        ***Only applicable if log-aggregation is enabled.
        </description>
    </property>
    <property>
        <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
        <value>logs</value>
        <description>
        Suffix appended to the remote log dir
        Logs will be aggregated to ${yarn.nodemanager.remote-app-log-dir}/${user}/${thisParam}. 
        ***Only applicable if log-aggregation is enabled.
        </description>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
        <description>Shuffle service that needs to be set for Map Reduce applications.</description>
    </property>
    <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>1</value>
        <description>Number of CPU cores that can be allocated for containers.</description>
    </property>
    <!-- NodeManager related configuration ***END*** -->
    
    <!-- History Server related configuration ***BEGIN*** -->
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>-1</value>
        <description>
        How long to keep aggregation logs before deleting them. 
        -1 disables. 
        Be careful, set this too small and you will spam the name node.
        </description>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-check-interval-seconds</name>
        <value>-1</value>
        <description>
        Time between checks for aggregated log retention. 
        If set to 0 or a negative value then the value is computed as one-tenth of the aggregated log retention time. 
        Be careful, set this too small and you will spam the name node.
        </description>
    </property>
    <!-- History Server related configuration ***END*** -->
    
    <property>
        <name>yarn.scheduler.fair.allocation.file</name>
        <value>${yarn.home.dir}/etc/hadoop/fairscheduler.xml</value>
        <description>fairscheduler config file path</description>
        <!-- Surprisingly, this property cannot be found in the official documentation, but it still works! -->
    </property>
</configuration>  
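
Likewise, yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs point at local paths that should be writable on every NodeManager node (slave1-slave3 here, per the slaves file). A minimal sketch:

# Run on each NodeManager node
mkdir -p /home/yarn/Hadoop/hadoop-2.4.1/yarn_dir/local
mkdir -p /home/yarn/Hadoop/hadoop-2.4.1/yarn_dir/log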


6. Create fairscheduler.xml
<?xml version="1.0"?>
<allocations>
<!--
   <queue name="hadooptest">
      <minResources>1024 mb, 1 vcores</minResources>
      <maxResources>2048 mb, 2 vcores</maxResources>
      <maxRunningApps>10</maxRunningApps>
      <weight>2.0</weight>
      <schedulingMode>fair</schedulingMode>
      <aclAdministerApps> hadooptest</aclAdministerApps>
      <aclSubmitApps> hadooptest</aclSubmitApps>
   </queue>
   <queue name="hadoopdev">
      <minResources>1024 mb, 2 vcores</minResources>
      <maxResources>2048 mb, 4 vcores</maxResources>
      <maxRunningApps>20</maxRunningApps>
      <weight>2.0</weight>
      <schedulingMode>fair</schedulingMode>
      <aclAdministerApps> hadoopdev</aclAdministerApps>
      <aclSubmitApps> hadoopdev</aclSubmitApps>
   </queue>
-->
   <user name="yarn">
    <maxRunningApps>30</maxRunningApps>
   </user>
</allocations>  
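
Once YARN is up (step (10) in the startup section below), one way to confirm that the FairScheduler and this allocation file took effect is the ResourceManager scheduler page, or its REST API, for example:

# Should report the fair scheduler and its queues/allocations
curl http://master:8088/ws/v1/cluster/scheduler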


7. mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Licensed under the Apache License, Version 2.0 (the "License"); you 
    may not use this file except in compliance with the License. You may obtain 
    a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless 
    required by applicable law or agreed to in writing, software distributed 
    under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES 
    OR CONDITIONS OF ANY KIND, either express or implied. See the License for 
    the specific language governing permissions and limitations under the License. 
    See accompanying LICENSE file. -->
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <!-- MapReduce Applications related configuration ***BEGIN*** -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
        <description>Execution framework set to Hadoop YARN.</description>
    </property>
    
    <property>
        <name>mapreduce.map.memory.mb</name>
        <value>1024</value>
        <description>Larger resource limit for maps.</description>
    </property>
    <property>
        <name>mapreduce.map.java.opts</name>
        <value>-Xmx1024M</value>
        <description>Larger heap-size for child jvms of maps.</description>
    </property>
    
    <property>
        <name>mapreduce.reduce.memory.mb</name>
        <value>1024</value>
        <description>Larger resource limit for reduces.</description>
    </property>
    <property>
        <name>mapreduce.reduce.java.opts</name>
        <value>-Xmx1024M</value>
        <description>Larger heap-size for child jvms of reduces.</description>
    </property>
    
    <property>
        <name>mapreduce.task.io.sort.mb</name>
        <value>1024</value>
        <description>Higher memory-limit while sorting data for efficiency.</description>
    </property>
    <property>
        <name>mapreduce.task.io.sort.factor</name>
        <value>10</value>
        <description>More streams merged at once while sorting files.</description>
    </property>
    
    <property>
        <name>mapreduce.reduce.shuffle.parallelcopies</name>
        <value>20</value>
        <description>Higher number of parallel copies run by reduces to fetch outputs from very large number of maps.</description>
    </property>
    <!-- MapReduce Applications related configuration ***END*** -->

    <!-- MapReduce JobHistory Server related configuration ***BEGIN*** -->
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>slave1:10020</value>
        <description>MapReduce JobHistory Server host:port.    Default port is 10020.</description>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>slave1:19888</value>
        <description>MapReduce JobHistory Server Web UI host:port. Default port is 19888.</description>
    </property>
    <property>
        <name>mapreduce.jobhistory.intermediate-done-dir</name>
        <value>/home/yarn/Hadoop/hadoop-2.4.1/mr_history/tmp</value>
        <description>Directory where history files are written by MapReduce jobs.</description>
    </property>
    <property>
        <name>mapreduce.jobhistory.done-dir</name>
        <value>/home/yarn/Hadoop/hadoop-2.4.1/mr_history/done</value>
        <description>Directory where history files are managed by the MR JobHistory Server.</description>
    </property>
    <!-- MapReduce JobHistory Server related configuration ***END*** -->
</configuration>
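
Note that mapreduce.jobhistory.intermediate-done-dir and mapreduce.jobhistory.done-dir are resolved against the default filesystem (HDFS here), not the local disk. The JobHistory Server can normally create them itself, but they can also be pre-created once HDFS is up; a sketch:

hdfs dfs -mkdir -p /home/yarn/Hadoop/hadoop-2.4.1/mr_history/tmp
hdfs dfs -mkdir -p /home/yarn/Hadoop/hadoop-2.4.1/mr_history/done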


8. slaves
slave1
slave2
slave3

That covers all the configuration files. Edit them on one node, then (see the sketch after this list):
  1. scp the relevant directories to every machine
  2. Update the environment variables on each machine: add the new HADOOP_HOME and comment out (#) the old one
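
A minimal sketch of both steps, assuming the same install path and the yarn user on every node:

# 1. Copy the configured Hadoop directory to the other nodes
for h in slave1 slave2 slave3; do
    scp -r /home/yarn/Hadoop/hadoop-2.4.1 "$h":/home/yarn/Hadoop/
done

# 2. In each node's shell profile (e.g. ~/.bashrc), point HADOOP_HOME at the new
#    install and comment out the old line:
# export HADOOP_HOME=/home/yarn/Hadoop/hadoop-2.4.1
# export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH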

9. Starting the Cluster
(1) Start ZK
Run the following command on every ZK node:
zkServer.sh start
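
If passwordless SSH is available, the whole ensemble can also be started from one node, for example:

# Assumes zkServer.sh is on the PATH of a non-interactive shell on each ZK node
for h in master slave1 slave2; do
    ssh "$h" zkServer.sh start
done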

Check the leader/follower status of each ZK node:
yarn@master:~$ zkServer.sh status
JMX enabled by default
Using config: /home/yarn/Zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower

yarn@slave1:~$ zkServer.sh status
JMX enabled by default
Using config: /home/yarn/Zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower

yarn@slave2:~$ zkServer.sh status
JMX enabled by default
Using config: /home/yarn/Zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: leader

Note:
Which ZK node becomes the leader is random: in the first run slave2 became the leader, and in the second run slave1 did!

At this point, the ZK process is visible on every node:
yarn@master:~$ jps
3084 QuorumPeerMain
3212 Jps

(2) Format ZK (only needed the first time)
Run on any ZK node:
hdfs zkfc -formatZK

(3) Start ZKFC
The ZookeeperFailoverController monitors NN status and coordinates active/standby NN failover, so it only needs to be started on the two NN nodes:
hadoop-daemon.sh start zkfc

After starting it, we can see the ZKFC process:
yarn@master:~$ jps
3084 QuorumPeerMain
3292 Jps
3247 DFSZKFailoverController

(4) Start the JournalNodes, the shared storage system that synchronizes metadata between the active and standby NNs
Per the role assignment table, start them on each JN node:
hadoop-daemon.sh start journalnode

After starting, the JournalNode process is visible on each JN node:
yarn@master:~$ jps
3084 QuorumPeerMain
3358 Jps
3325 JournalNode
3247 DFSZKFailoverController

(5) Format and start the active NN
Format:
hdfs namenode -format
Note: formatting is only needed the first time the system is brought up. Do not format it again!

On the active NN node, run the command to start the NN:
hadoop-daemon.sh start namenode

After starting, the NN process is visible:
yarn@master:~$ jps
3084 QuorumPeerMain
3480 Jps
3325 JournalNode
3411 NameNode
3247 DFSZKFailoverController


(6) Sync the active NN's metadata to the standby NN
hdfs namenode -bootstrapStandby

The final portion of the log from a successful run looks like this:
Re-format filesystem in Storage Directory /home/yarn/Hadoop/hdfs2.0/name ? (Y or N) Y
14/06/15 10:09:08 INFO common.Storage: Storage directory /home/yarn/Hadoop/hdfs2.0/name has been successfully formatted.
14/06/15 10:09:09 INFO namenode.TransferFsImage: Opening connection to http://master:50070/getimage?getimage=1&txid=935&storageInfo=-47:564636372:0:CID-d899b10e-10c9-4851-b60d-3e158e322a62
14/06/15 10:09:09 INFO namenode.TransferFsImage: Transfer took 0.11s at 63.64 KB/s
14/06/15 10:09:09 INFO namenode.TransferFsImage: Downloaded file fsimage.ckpt_0000000000000000935 size 7545 bytes.
14/06/15 10:09:09 INFO util.ExitUtil: Exiting with status 0
14/06/15 10:09:09 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at slave1/192.168.66.92
************************************************************/

(7) Start the standby NN
On the standby NN, run:
hadoop-daemon.sh start namenode

(8) Set the active NN (this step can be skipped: it belongs to the manual-failover setup, and here ZK has already elected one node as the active NN automatically)
In the manual-failover case, HDFS would not yet know which NN is the active one at this point; the monitoring pages would show both NNs in Standby state.
We would then run the following command on the intended active NN node to activate it:
hdfs haadmin -transitionToActive nn1
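
Either way, the HA state of both NameNodes can be checked at any time with haadmin:

hdfs haadmin -getServiceState nn1   # expected: active
hdfs haadmin -getServiceState nn2   # expected: standby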

(9) Start the DataNodes from the active NN
On [nn1], start all the DataNodes:
hadoop-daemons.sh start datanode

(10) Start YARN
Run on the node where the ResourceManager lives (starting YARN really is convenient!!!):
start-yarn.sh

(11) On slave1, which runs the MR JobHistory Server (MRJS), run the following command to start it:
mr-jobhistory-daemon.sh start historyserver
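
With the history server up, a quick way to confirm that every node is running the daemons expected from the role assignment table in section 1:

# Run from master; assumes passwordless SSH for the yarn user
for h in master slave1 slave2 slave3; do
    echo "== $h =="
    ssh "$h" jps
done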

At this point startup is complete, and the 2.4.1 web UIs can be viewed (the NameNode UI at master:50070 and the ResourceManager UI at master:8088).

10. Stopping the Cluster
On master, which hosts the RM and an NN, run:
Stop YARN:
stop-yarn.sh

Stop HDFS:
stop-dfs.sh

Stop ZooKeeper:
zkServer.sh stop

On slave1, which runs the JobHistoryServer:
Stop the JobHistoryServer:
mr-jobhistory-daemon.sh stop historyserver