Machine Preparation
Assume we have five machines:
192.168.8.10
192.168.8.11
192.168.8.12
192.168.8.13
192.168.8.14
Environment Configuration
Creating the user: run the following commands on all machines
useradd -d /home/hadoop -s /bin/bash -m hadoop # create a new user, hadoop
passwd hadoop # set the hadoop user's password
Generating SSH keys
On the namenode machine, as the hadoop user, run the following commands
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
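Key-based login fails silently when the permissions on ~/.ssh are too loose, so it is worth tightening them right after generating the key. A small sketch of the standard OpenSSH hardening step (run as the hadoop user):

```shell
# OpenSSH ignores authorized_keys when the directory or file is group/world
# writable, so lock the permissions down before testing the passwordless login.
mkdir -p ~/.ssh
chmod 700 ~/.ssh
touch ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```

After this, `ssh localhost` from the hadoop account should log in without a password prompt.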
Editing /etc/hosts
Add the following lines to the hosts file
192.168.8.10 hadoop-master
192.168.8.11 hadoop1
192.168.8.12 hadoop2
192.168.8.13 hadoop3
192.168.8.14 hadoop4
Copy the authorized_keys file from the master to the remaining machines
scp ~/.ssh/authorized_keys [email protected]:/home/hadoop/.ssh/authorized_keys
scp ~/.ssh/authorized_keys [email protected]:/home/hadoop/.ssh/authorized_keys
scp ~/.ssh/authorized_keys [email protected]:/home/hadoop/.ssh/authorized_keys
scp ~/.ssh/authorized_keys [email protected]:/home/hadoop/.ssh/authorized_keys
Copy the hosts file to the corresponding location on the remaining machines in the same way.
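The per-machine copy commands above can be folded into a single loop over the worker hostnames. In the sketch below, `echo` makes it a dry run that only prints the commands; remove it to actually copy (and note that overwriting /etc/hosts on the targets requires root there):

```shell
# Worker hostnames as declared in /etc/hosts above; `echo` prints the scp
# commands instead of running them (remove it for the real copy).
workers="hadoop1 hadoop2 hadoop3 hadoop4"
for host in $workers; do
  echo scp ~/.ssh/authorized_keys "hadoop@${host}:/home/hadoop/.ssh/authorized_keys"
  echo scp /etc/hosts "hadoop@${host}:/etc/hosts"
done
```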
Hadoop Installation
Download and extract Hadoop 2.2:
wget http://apache.mirrors.lucidnetworks.net/hadoop/common/stable2/hadoop-2.2.0.tar.gz
tar -xvf hadoop-2.2.0.tar.gz
mv hadoop-2.2.0 /home/hadoop/
Configuring Hadoop environment variables
Switch to root, edit /etc/profile, and append the following lines
export HADOOP_HOME=/home/hadoop/hadoop-2.2.0
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/lib
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
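After appending the exports, reload the profile with `source /etc/profile`. The fragment below mirrors the PATH change from above and checks that the Hadoop bin directory actually landed on PATH, which is a quick sanity check before running `hadoop version` on the real machine:

```shell
# Mirror the exports from /etc/profile and verify the PATH entry took effect.
export HADOOP_HOME=/home/hadoop/hadoop-2.2.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) path_ok=yes ;;
  *)                      path_ok=no  ;;
esac
echo "hadoop bin on PATH: $path_ok"
```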
Configuring Hadoop
Go to $HADOOP_HOME/etc/hadoop
Edit slaves
Add the following lines
hadoop1
hadoop2
hadoop3
hadoop4
Edit hadoop-env.sh
Add the following line
export JAVA_HOME=/usr/share/jdk1.7.0_51 # otherwise later steps may complain that JAVA_HOME cannot be found
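The startup scripts abort with a "JAVA_HOME is not set" style error when this path is wrong, so it is cheap to check that a java binary really exists there. The JDK path below is the one assumed in this guide; substitute your own:

```shell
# Check that the configured JDK path actually contains a java executable.
# /usr/share/jdk1.7.0_51 is this guide's assumption - substitute your own path.
JAVA_HOME=${JAVA_HOME:-/usr/share/jdk1.7.0_51}
if [ -x "$JAVA_HOME/bin/java" ]; then
  java_ok=yes
else
  java_ok=no
fi
echo "java found under JAVA_HOME: $java_ok"
```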
Edit core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop-master:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop-2.2.0/mytmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>hadoop-master</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
</configuration>
Edit hdfs-site.xml, which holds the HDFS-specific properties (the directory paths and replication factor below are illustrative; adjust them to your layout)
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/hadoop-2.2.0/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/hadoop-2.2.0/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
Copy mapred-site.xml.template to mapred-site.xml (Hadoop reads only the latter), then edit mapred-site.xml
cp mapred-site.xml.template mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop-master:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop-master:19888</value>
</property>
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>/mr-history/tmp</value>
</property>
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>/mr-history/done</value>
</property>
</configuration>
Edit yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.address</name>
<value>hadoop-master:18040</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>hadoop-master:18030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>hadoop-master:18025</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>hadoop-master:18041</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>hadoop-master:8088</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/home/hadoop/mynode/my</value>
</property>
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>/home/hadoop/mynode/logs</value>
</property>
<property>
<name>yarn.nodemanager.log.retain-seconds</name>
<value>10800</value>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/logs</value>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir-suffix</name>
<value>logs</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>-1</value>
</property>
<property>
<name>yarn.log-aggregation.retain-check-interval-seconds</name>
<value>-1</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
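A malformed *-site.xml makes the daemons fail at startup with a parse error, so it is cheap insurance to check well-formedness before distributing the files. The sketch below validates a sample fragment with Python's stdlib parser; in practice, point it at the real files under $HADOOP_CONF_DIR:

```shell
# Write a sample fragment and parse it; any of the real *-site.xml files can
# be checked the same way before they are copied out to the workers.
cat > /tmp/sample-site.xml <<'EOF'
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
root_tag=$(python3 -c "import xml.etree.ElementTree as ET; print(ET.parse('/tmp/sample-site.xml').getroot().tag)")
echo "root element: $root_tag"
```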
Use scp to copy core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml from the namenode to the corresponding locations on each datanode.
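One way to push all four edited files to every worker is a nested loop; as before, `echo` keeps this a dry run that only prints the commands (remove it to copy for real):

```shell
# Push the four edited config files to every worker in one loop.
conf_dir=/home/hadoop/hadoop-2.2.0/etc/hadoop
conf_files="core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml"
for host in hadoop1 hadoop2 hadoop3 hadoop4; do
  for f in $conf_files; do
    echo scp "$conf_dir/$f" "hadoop@${host}:$conf_dir/$f"
  done
done
```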
At this point, Hadoop is fully configured.
Using Hadoop
Starting Hadoop
hadoop namenode -format # format the namenode (first run only)
$HADOOP_HOME/sbin/start-all.sh # start Hadoop
Run jps on the namenode; you should see output like the following
27825 ResourceManager
28080 Jps
27667 SecondaryNameNode
27406 NameNode
Run jps on a datanode; you should see output like the following
6079 NodeManager
5908 DataNode
6388 Jps
To stop Hadoop, use the following command
$HADOOP_HOME/sbin/stop-all.sh