Configuring a Three-Machine Hadoop Cluster and Running wordcount

1. Install three CentOS machines and configure the network so that all three can reach each other
This walkthrough uses VirtualBox VMs. After installing the three CentOS 6.5 guests, each machine generally needs two network adapters so that it can talk to the host and the other guests on the internal network while still reaching the Internet: configure one adapter in NAT mode and the other in host-only mode.
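As a minimal sketch of the host-only side (the adapter name eth1 and all addresses here are assumptions; adjust them to your host-only subnet), each guest can be given a static address in /etc/sysconfig/network-scripts/ifcfg-eth1 and the network restarted:

# /etc/sysconfig/network-scripts/ifcfg-eth1 (hypothetical values)
DEVICE=eth1
BOOTPROTO=static
IPADDR=192.168.56.101
NETMASK=255.255.255.0
ONBOOT=yes

service network restart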

2. Change the hostnames
In /etc/sysconfig/network, set HOSTNAME=master, HOSTNAME=slave1, and HOSTNAME=slave2 on the three machines respectively.

Configure name resolution by appending the following to /etc/hosts on all three machines (the IP address comes first on each line):
ip1 master
ip2 slave1
ip3 slave2
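For example, with the hypothetical host-only addresses used above, the appended block would read:
192.168.56.101 master
192.168.56.102 slave1
192.168.56.103 slave2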

Reboot all three machines, then test connectivity with ping master, ping slave1, and ping slave2.

3. Set up passwordless SSH among the three machines
a. Generate a public/private key pair on each machine
On master:
cd /root/.ssh
ssh-keygen -t rsa -P ""
ssh slave1
cd /root/.ssh
ssh-keygen -t rsa -P ""
ssh slave2
cd /root/.ssh
ssh-keygen -t rsa -P ""
b. On slave1 and slave2, make a renamed copy of the public key:
slave1: cp id_rsa.pub id1
slave2: cp id_rsa.pub id2
c. Transfer the renamed public keys to /home on master. On master, create /root/.ssh/authorized_keys, append the two transferred public keys plus master's own public key to it, and finally distribute authorized_keys to /root/.ssh on slave1 and slave2.
slave1:
scp id1 master:/home
slave2:
scp id2 master:/home
master:
vi /root/.ssh/authorized_keys
cat /home/id1 >> /root/.ssh/authorized_keys
cat /home/id2 >> /root/.ssh/authorized_keys
cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys

scp authorized_keys slave1:/root/.ssh
scp authorized_keys slave2:/root/.ssh
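One caveat: with the default StrictModes setting, sshd ignores authorized_keys if its permissions are too open, so tighten them on all three machines and then confirm each hop is now passwordless. A quick check from master (repeat the chmod commands on slave1 and slave2):

chmod 700 /root/.ssh
chmod 600 /root/.ssh/authorized_keys
ssh slave1 hostname
ssh slave2 hostname

Each ssh command should print the slave's hostname without prompting for a password.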

4. Install the JDK
Install it on master, then copy the installation directly to the two slaves; the environment-variable settings in /etc/profile can be pushed over the same way.
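A minimal sketch of these steps, assuming the JDK is unpacked to /usr/jdk1.7.0_79 (the same path hadoop-env.sh points at below):

append to /etc/profile on master:
export JAVA_HOME=/usr/jdk1.7.0_79
export PATH=$JAVA_HOME/bin:$PATH

then push the JDK and the profile to the slaves and verify:
scp -r /usr/jdk1.7.0_79 slave1:/usr
scp -r /usr/jdk1.7.0_79 slave2:/usr
scp /etc/profile slave1:/etc/profile
scp /etc/profile slave2:/etc/profile
source /etc/profile
java -version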

5. Install Hadoop
Unpack the tarball, then edit the configuration files: masters, slaves, hadoop-env.sh, core-site.xml, hdfs-site.xml, and mapred-site.xml.

In hadoop-env.sh:
export JAVA_HOME=/usr/jdk1.7.0_79
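The masters and slaves files are plain lists of hostnames, one per line; in Hadoop 1.x, masters names the host running the SecondaryNameNode and slaves names the DataNode/TaskTracker hosts, so for this cluster:

masters:
master

slaves:
slave1
slave2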

In core-site.xml:
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>
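Note: /tmp is usually cleaned on reboot, and by default HDFS keeps its NameNode and DataNode data under hadoop.tmp.dir, so a persistent path (e.g. /usr/hadoop-1.0.1/tmp) would be safer in practice; /tmp does, however, keep the re-format step described below simple.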

In hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>

In mapred-site.xml:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:9001</value>
  </property>
</configuration>
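After editing, the whole Hadoop directory should be copied to the slaves so all three machines share the same installation and configuration (the path /usr/hadoop-1.0.1 matches the one used in the transcript below):

scp -r /usr/hadoop-1.0.1 slave1:/usr
scp -r /usr/hadoop-1.0.1 slave2:/usr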

Disable the firewall on all machines (stop it now, keep it off across reboots, then check):
service iptables stop
chkconfig iptables off
service iptables status

bin/hadoop namenode -format
bin/start-all.sh
If you need to change the configuration later, first run bin/stop-all.sh, edit the files, delete the /tmp directory (the hadoop.tmp.dir set above) on every machine, and then re-run bin/hadoop namenode -format.
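After start-all.sh, a quick way to confirm the cluster is healthy is to run jps on each machine; for this layout the expected daemons are:

on master: jps should list NameNode, SecondaryNameNode, and JobTracker
on slave1/slave2: jps should list DataNode and TaskTracker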

6. Run the wordcount example from the Hadoop shell:
[root@master home]# mkdir hadooplocalfile
[root@master home]# ls
hadooplocalfile
[root@master home]# cd hadooplocalfile
[root@master hadooplocalfile]# echo "Hello world" > file1.txt
[root@master hadooplocalfile]# echo "Hello hadoop" > file2.txt
[root@master hadooplocalfile]# ls
file1.txt file2.txt
[root@master hadooplocalfile]# more file1.txt
Hello world
[root@master hadooplocalfile]# cd
[root@master ~]# cd /usr/hadoop-1.0.1
[root@master hadoop-1.0.1]# bin/hadoop fs -mkdir input
[root@master hadoop-1.0.1]# bin/hadoop fs -ls /
Found 3 items
drwxr-xr-x - root supergroup 0 2015-12-21 10:57 /test
drwxr-xr-x - root supergroup 0 2015-12-21 10:40 /tmp
drwxr-xr-x - root supergroup 0 2015-12-21 11:14 /user
[root@master hadoop-1.0.1]# bin/hadoop fs -ls /user
Found 1 items
drwxr-xr-x - root supergroup 0 2015-12-21 11:14 /user/root
[root@master hadoop-1.0.1]# bin/hadoop fs -ls /user/root
Found 1 items
drwxr-xr-x - root supergroup 0 2015-12-21 11:14 /user/root/input
[root@master hadoop-1.0.1]# bin/hadoop fs -put /home/hadooplocalfile/file*.txt input
[root@master hadoop-1.0.1]# bin/hadoop -ls input
Unrecognized option: -ls
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
(The fs subcommand was omitted, so -ls was treated as a JVM option; the correct form follows.)
[root@master hadoop-1.0.1]# bin/hadoop fs -ls input
Found 2 items
-rw-r--r-- 2 root supergroup 12 2015-12-21 11:16 /user/root/input/file1.txt
-rw-r--r-- 2 root supergroup 13 2015-12-21 11:16 /user/root/input/file2.txt
[root@master hadoop-1.0.1]# bin/hadoop jar /usr/hadoop-1.0.1/hadoop-examples-1.0.1.jar wordcount input output
hdfs://master:9000/user/root/input
15/12/21 11:18:01 INFO input.FileInputFormat: Total input paths to process : 2
15/12/21 11:18:02 INFO mapred.JobClient: Running job: job_201512211045_0001
15/12/21 11:18:03 INFO mapred.JobClient: map 0% reduce 0%
15/12/21 11:18:23 INFO mapred.JobClient: map 50% reduce 0%
15/12/21 11:18:26 INFO mapred.JobClient: map 100% reduce 0%
15/12/21 11:18:38 INFO mapred.JobClient: map 100% reduce 100%
15/12/21 11:18:43 INFO mapred.JobClient: Job complete: job_201512211045_0001
15/12/21 11:18:43 INFO mapred.JobClient: Counters: 29
15/12/21 11:18:43 INFO mapred.JobClient: Job Counters
15/12/21 11:18:43 INFO mapred.JobClient: Launched reduce tasks=1
15/12/21 11:18:43 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=27147
15/12/21 11:18:43 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
15/12/21 11:18:43 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
15/12/21 11:18:43 INFO mapred.JobClient: Launched map tasks=2
15/12/21 11:18:43 INFO mapred.JobClient: Data-local map tasks=2
15/12/21 11:18:43 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=13091
15/12/21 11:18:43 INFO mapred.JobClient: File Output Format Counters
15/12/21 11:18:43 INFO mapred.JobClient: Bytes Written=25
15/12/21 11:18:43 INFO mapred.JobClient: FileSystemCounters
15/12/21 11:18:43 INFO mapred.JobClient: FILE_BYTES_READ=55
15/12/21 11:18:43 INFO mapred.JobClient: HDFS_BYTES_READ=243
15/12/21 11:18:43 INFO mapred.JobClient: FILE_BYTES_WRITTEN=64327
15/12/21 11:18:43 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=25
15/12/21 11:18:43 INFO mapred.JobClient: File Input Format Counters
15/12/21 11:18:43 INFO mapred.JobClient: Bytes Read=25
15/12/21 11:18:43 INFO mapred.JobClient: Map-Reduce Framework
15/12/21 11:18:43 INFO mapred.JobClient: Map output materialized bytes=61
15/12/21 11:18:43 INFO mapred.JobClient: Map input records=2
15/12/21 11:18:43 INFO mapred.JobClient: Reduce shuffle bytes=61
15/12/21 11:18:43 INFO mapred.JobClient: Spilled Records=8
15/12/21 11:18:43 INFO mapred.JobClient: Map output bytes=41
15/12/21 11:18:43 INFO mapred.JobClient: Total committed heap usage (bytes)=234758144
15/12/21 11:18:43 INFO mapred.JobClient: CPU time spent (ms)=3220
15/12/21 11:18:43 INFO mapred.JobClient: Combine input records=4
15/12/21 11:18:43 INFO mapred.JobClient: SPLIT_RAW_BYTES=218
15/12/21 11:18:43 INFO mapred.JobClient: Reduce input records=4
15/12/21 11:18:43 INFO mapred.JobClient: Reduce input groups=3
15/12/21 11:18:43 INFO mapred.JobClient: Combine output records=4
15/12/21 11:18:43 INFO mapred.JobClient: Physical memory (bytes) snapshot=403873792
15/12/21 11:18:43 INFO mapred.JobClient: Reduce output records=3
15/12/21 11:18:43 INFO mapred.JobClient: Virtual memory (bytes) snapshot=3180650496
15/12/21 11:18:43 INFO mapred.JobClient: Map output records=4
[root@master hadoop-1.0.1]# bin/hadoop fs -ls output
Found 3 items
-rw-r--r-- 2 root supergroup 0 2015-12-21 11:18 /user/root/output/_SUCCESS
drwxr-xr-x - root supergroup 0 2015-12-21 11:18 /user/root/output/_logs
-rw-r--r-- 2 root supergroup 25 2015-12-21 11:18 /user/root/output/part-r-00000

[root@master hadoop-1.0.1]# bin/hadoop fs -cat output/part-r-00000
Hello 2
hadoop 1
world 1
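To pull the result back to the local filesystem, the fs shell's -get subcommand works, e.g.:

bin/hadoop fs -get output/part-r-00000 /home/hadooplocalfile/wordcount.txt

Note that re-running the job requires removing the output directory first (bin/hadoop fs -rmr output), since MapReduce refuses to overwrite an existing output path.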
