1. Linux system (Ubuntu 11.10 is used as the example here)
2. Required software packages
Ubuntu installation image
JDK installer (jdk-6u30-linux-i586 is used as the example here)
hadoop-0.20.203.0 package (download: http://apache.etoak.com//hadoop/core/hadoop-0.20.203.0/)
Eclipse installer
3. Environment setup steps
3.1 Install Ubuntu
Installing Ubuntu is not covered here; it is straightforward and not much different from installing ordinary software.
3.2 Install and configure the JDK
(1) Create a java directory under /usr/local
Command: sudo mkdir /usr/local/java
(2) Copy the JDK package into the java directory just created
Command: sudo cp <path-to-jdk> /usr/local/java
(3) Install the JDK inside the java directory
Switch to the java directory:
Command: cd /usr/local/java
(4) Make the installer executable
Command: sudo chmod u+x jdk-6u30-linux-i586.bin
(5) Install jdk-6u30-linux-i586.bin
Command: sudo ./jdk-6u30-linux-i586.bin
(6) Configure the JDK environment
Command: sudo gedit /etc/profile
Append the following at the end of the file:
#set java environment
export JAVA_HOME=/usr/local/java/jdk1.6.0_30
export JRE_HOME=/usr/local/java/jdk1.6.0_30/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
(7) Reload the profile (run source /etc/profile, or log out and back in), then test whether the JDK installed successfully
Command: java -version
java version "1.6.0_23"
Java(TM) SE Runtime Environment (build 1.6.0_23-b05)
Java HotSpot(TM) Server VM (build 19.0-b09, mixed mode)
(The version printed should match the JDK you actually installed; with 6u30 it should read 1.6.0_30.)
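If the machine already has another Java on its PATH, Ubuntu's alternatives system can also be pointed at the new JDK. This is an optional extra step not in the original write-up, and the paths assume the install location used above:

```shell
# Optional: register the new JDK with update-alternatives so that
# /usr/bin/java resolves to it (paths assume the layout used above).
sudo update-alternatives --install /usr/bin/java java /usr/local/java/jdk1.6.0_30/bin/java 300
sudo update-alternatives --install /usr/bin/javac javac /usr/local/java/jdk1.6.0_30/bin/javac 300
# Then choose it interactively if more than one JDK is registered:
sudo update-alternatives --config java
```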
3.3 Install and configure SSH
(1) Install openssh-server
Command: sudo apt-get install openssh-server
(2) Create a passwordless DSA ssh key and authorize it
Command: ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
(3) Verify the configuration
Command: ssh localhost
Welcome to Ubuntu 11.10 (GNU/Linux 3.0.0-14-generic i686)
* Documentation: https://help.ubuntu.com/
108 packages can be updated. 38 updates are security updates.
Last login: Sun Feb 5 10:45:19 2012 from localhost
If you are not prompted for a password, passwordless login is working.
3.4 Install and configure Hadoop
(1) Copy the Hadoop archive into /usr/local
Command: sudo cp <path-to-hadoop-archive> /usr/local
(2) Unpack the Hadoop archive
Command: cd /usr/local
sudo tar -xzf hadoop-0.20.203.0rc1.tar.gz
(3) Rename the unpacked directory to hadoop
Command: sudo mv hadoop-0.20.203.0 hadoop
(4) Create a hadoop user group
Command: sudo addgroup hadoop
(5) Create a hadoop user and place it in the hadoop group
Command: sudo adduser --ingroup hadoop hadoop
(6) Give the hadoop user sudo rights
Open the sudoers file
Command: sudo gedit /etc/sudoers
Below the line root ALL=(ALL) ALL, add:
hadoop ALL=(ALL) ALL
(Editing /etc/sudoers directly is risky; sudo visudo is the safer way to make this change.)
4. Configure Hadoop
(1) Open conf/hadoop-env.sh
Command: cd /usr/local/hadoop
sudo gedit conf/hadoop-env.sh
In conf/hadoop-env.sh, find the line #export JAVA_HOME=..., remove the leading #, and set it to this machine's JDK path (see Figure 15), e.g.:
export JAVA_HOME=/usr/local/java/jdk1.6.0_30
(2) Open conf/core-site.xml
Command: cd /usr/local/hadoop
sudo gedit conf/core-site.xml
Configure it with the following content (dfs.replication conventionally lives in conf/hdfs-site.xml, but the daemons also read it from core-site.xml):
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/tmp</value>
  </property>
</configuration>
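Since hadoop.tmp.dir is set to /home/hadoop/tmp above, that directory must exist and be writable by the hadoop user before the daemons start. A sketch of that preparation step, assuming the user and group created in 3.4:

```shell
# Create the directory named by hadoop.tmp.dir and hand it to the hadoop user.
sudo mkdir -p /home/hadoop/tmp
sudo chown -R hadoop:hadoop /home/hadoop/tmp
```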
(3) Open mapred-site.xml in the conf directory
Command: cd /usr/local/hadoop
sudo gedit conf/mapred-site.xml
Configure it with the following content:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
5. Testing Hadoop
(1) Switch to the hadoop user and format the namenode
Command: su hadoop
cd /usr/local/hadoop
./bin/hadoop namenode -format
(2) Start Hadoop
Command: sudo chown -R hadoop:hadoop /usr/local/hadoop
./bin/start-all.sh
(3) Test whether startup succeeded
Command: jps
If NameNode, JobTracker, SecondaryNameNode, TaskTracker, DataNode and Jps are all printed, Hadoop has started successfully.
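The jps check can also be scripted. This is a sketch, assuming the daemon names printed by jps on Hadoop 0.20.x are exactly those listed above:

```shell
# Check that every expected Hadoop daemon name appears in a saved `jps` listing.
check_daemons() {
    # $1: a file containing the output of `jps`
    for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        grep -q "$d" "$1" || { echo "$d MISSING"; return 1; }
    done
    echo "all daemons up"
}

# Usage on the running cluster:
#   jps > /tmp/jps.out && check_daemons /tmp/jps.out
```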
(4) Run the bundled wordcount example
First prepare two local text files:
mkdir -p /tmp/testin
gedit /tmp/testin/test1.txt
gedit /tmp/testin/test2.txt
and type a few words into each.
Create a directory in HDFS:
./bin/hadoop dfs -mkdir test-in
Upload the local files into the HDFS directory:
./bin/hadoop dfs -copyFromLocal /tmp/testin/test*.txt test-in
Run wordcount:
./bin/hadoop jar hadoop-examples-0.20.203.0.jar wordcount test-in test-out
View the result:
./bin/hadoop dfs -cat test-out/part-r-00000
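The input files can also be seeded non-interactively, and the result WordCount should produce for them (tab-separated word/count pairs, one per line in part-r-00000) can be previewed locally with coreutils. The sample words here are arbitrary:

```shell
# Seed two small input files with sample words.
mkdir -p /tmp/testin
printf 'hello hadoop\nhello world\n' > /tmp/testin/test1.txt
printf 'hadoop world\n' > /tmp/testin/test2.txt

# Preview the counts WordCount should produce for these files:
# word<TAB>count, sorted by word, matching the part-r-00000 format.
cat /tmp/testin/*.txt | tr -s ' ' '\n' | sort | uniq -c | awk '{print $2 "\t" $1}'
```

For these sample files the preview prints hadoop 2, hello 2, world 2 (tab-separated), which is what the HDFS output file should contain after the job runs.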