Set up VMware Fusion
Configure Static IP Address
You will only need to edit the settings for:
- DNS
- GATEWAY
- PREFIX
- IPADDR
$>su root
$>cd /etc/sysconfig/network-scripts
$>vi ifcfg-eno16777736
ONBOOT=yes
IPADDR=192.168.XXX.201
PREFIX=24
GATEWAY=192.168.XXX.2
DNS=192.168.XXX.2
At this point we can ping IP addresses but not hostnames, because name resolution is not configured yet. To fix it:
$>vi /etc/resolv.conf
nameserver 192.168.168.2
$>su root
$>service network restart
$>ping www.baidu.com
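As a side note on the PREFIX=24 setting above: the prefix is just the netmask in CIDR form. A small helper (hypothetical, not part of the setup itself) that converts a prefix to its dotted netmask makes the relationship concrete:

```shell
# Convert a CIDR prefix (e.g. PREFIX=24 from ifcfg above) to the
# dotted-quad netmask it implies.
prefix_to_netmask() {
  local prefix="$1" mask="" i
  for i in 1 2 3 4; do
    if [ "$prefix" -ge 8 ]; then
      # A full octet of ones.
      mask="$mask.255"; prefix=$((prefix - 8))
    else
      # Partial octet: top `prefix` bits set.
      mask="$mask.$((256 - (1 << (8 - prefix))))"; prefix=0
    fi
  done
  echo "${mask#.}"
}

prefix_to_netmask 24   # -> 255.255.255.0
```

So PREFIX=24 is equivalent to the older NETMASK=255.255.255.0 style.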
Switch to the Aliyun yum repository (ali.repo)
[centos@localhost ~]$ cd /etc/yum.repos.d/
[centos@localhost yum.repos.d]$ su root
Password:
[root@localhost yum.repos.d]# rename .repo .repo.backup *
[root@localhost yum.repos.d]# curl -o /etc/yum.repos.d/ali.repo http://mirrors.aliyun.com/repo/Centos-7.repo
[root@localhost yum.repos.d]# yum clean all
[root@localhost yum.repos.d]# yum makecache
[root@localhost yum.repos.d]# yum -y install nano
[root@localhost yum.repos.d]# yum search ifconfig
[root@localhost yum.repos.d]# yum -y install net-tools
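The `rename` call above relies on the util-linux syntax shipped with CentOS. A plain shell loop does the same backup portably; this sketch runs in a scratch directory with fabricated file names so it is self-contained:

```shell
# Back up every .repo file by appending .backup, as the rename
# command above does. Scratch directory and file names are made up
# for illustration.
repodir=$(mktemp -d)
touch "$repodir/CentOS-Base.repo" "$repodir/CentOS-Debuginfo.repo"

for f in "$repodir"/*.repo; do
  mv "$f" "$f.backup"
done

ls "$repodir"
```

Running it inside /etc/yum.repos.d (as root) would have the same effect as the `rename` line.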
Grant sudo privileges
[root@localhost yum.repos.d]# nano /etc/sudoers
root ALL=(ALL) ALL
centos ALL=(ALL) ALL
(Editing /etc/sudoers directly is risky; `visudo` is the safer route, since it validates the syntax before saving.)
Change hostname
[centos@localhost yum.repos.d]$ sudo nano /etc/hostname
s201
IP mapping
[centos@localhost yum.repos.d]$ sudo nano /etc/hosts
127.0.0.1 localhost
192.168.168.201 s201
Hadoop Environment Setup
1. JDK
[centos@localhost ~]$ cd ~
[centos@localhost ~]$ mkdir downloads
[centos@localhost ~]$ cd downloads/
[centos@localhost downloads]$ tar -xzvf jdk-8u65-linux-x64.tar.gz
[centos@localhost ~]$ sudo mkdir /soft
[centos@localhost ~]$ sudo chown centos:centos /soft
[centos@localhost ~]$ cd /home/centos/downloads/
[centos@localhost downloads]$ mv jdk1.8.0_65/ /soft/
[centos@localhost downloads]$ cd /soft
[centos@localhost soft]$ ln -s jdk1.8.0_65 jdk
[centos@localhost soft]$ cd /soft/jdk/bin
[centos@localhost bin]$ ./java -version
java version "1.8.0_65"
Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.65-b01, mixed mode)
[centos@localhost bin]$
JDK ok now
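The layout used here — a versioned directory plus a stable `jdk` symlink — is worth noting, because it lets later configuration refer to /soft/jdk forever. A sketch in a scratch directory (directory names are illustrative):

```shell
# Reproduce the /soft layout: versioned install directories with a
# stable symlink naming the active one.
soft=$(mktemp -d)
mkdir "$soft/jdk1.8.0_65"
ln -s jdk1.8.0_65 "$soft/jdk"
readlink "$soft/jdk"        # -> jdk1.8.0_65

# A later upgrade only re-points the link (-n: replace the link
# itself rather than descending into it):
mkdir "$soft/jdk1.8.0_131"
ln -sfn jdk1.8.0_131 "$soft/jdk"
readlink "$soft/jdk"        # -> jdk1.8.0_131
```

Nothing that references /soft/jdk (such as /etc/profile below) has to change when the JDK is upgraded.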
2. Configure JDK Environment Variable
[centos@localhost bin]$ sudo nano /etc/profile
At the end of the file, add:
export JAVA_HOME=/soft/jdk
export PATH=$PATH:$JAVA_HOME/bin
[centos@localhost bin]$ source /etc/profile
[centos@localhost bin]$ java -version
java version "1.8.0_65"
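If `java -version` still fails after sourcing /etc/profile, a quick check (not from the original notes, just a common debugging pattern) is to confirm the export actually landed on PATH:

```shell
# Verify that JAVA_HOME is set and its bin directory is on PATH,
# mirroring the two export lines added to /etc/profile above.
JAVA_HOME=/soft/jdk
PATH=$PATH:$JAVA_HOME/bin

case ":$PATH:" in
  *":$JAVA_HOME/bin:"*) echo "JAVA_HOME/bin is on PATH" ;;
  *)                    echo "JAVA_HOME/bin missing from PATH" ;;
esac
```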
3. Hadoop Setup
[centos@localhost bin]$ cd ~
[centos@localhost ~]$ cd downloads/
[centos@localhost downloads]$ tar -xzvf hadoop-2.7.3.tar.gz
[centos@s201 downloads]$ mv ~/downloads/hadoop-2.7.3 /soft/
[centos@s201 downloads]$ cd /soft
[centos@s201 soft]$ ln -s hadoop-2.7.3 hadoop
[centos@s201 soft]$ cd hadoop/bin
[centos@s201 bin]$ ./hadoop version
Hadoop 2.7.3
OK now
4. Configure Hadoop Environment Variables
[centos@s201 soft]$ sudo nano /etc/profile
export HADOOP_HOME=/soft/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
[centos@s201 bin]$ source /etc/profile
[centos@s201 bin]$ cd ~
[centos@s201 ~]$ hadoop version
Hadoop 2.7.3
OK now
Setup Pseudo-Distributed mode
[centos@s201 ~]$ cd /soft/hadoop/etc
[centos@s201 etc]$ mkdir alone
[centos@s201 etc]$ mkdir pseudo
[centos@s201 etc]$ mkdir full
[centos@s201 etc]$ cp hadoop/* alone/
[centos@s201 etc]$ cp hadoop/* pseudo/
[centos@s201 etc]$ cp hadoop/* full/
[centos@s201 etc]$ rm -rf hadoop/
[centos@s201 etc]$ ln -s pseudo hadoop
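The point of the three copies is that `hadoop` is now a symlink naming the active configuration set, so switching modes is a one-line operation. A self-contained sketch of the trick in a scratch directory:

```shell
# Three parallel config sets, with `hadoop` as a symlink selecting
# the active one -- the same structure built in /soft/hadoop/etc above.
etc=$(mktemp -d)
mkdir "$etc/alone" "$etc/pseudo" "$etc/full"
ln -s pseudo "$etc/hadoop"
readlink "$etc/hadoop"      # -> pseudo

# Moving to fully-distributed mode later is just re-pointing the link:
ln -sfn full "$etc/hadoop"
readlink "$etc/hadoop"      # -> full
```

Hadoop always reads etc/hadoop, so it picks up whichever set the link points at.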
Then you need to edit 4 XML files:
- core-site.xml
- hdfs-site.xml
- yarn-site.xml
- mapred-site.xml
[centos@s201 etc]$ cd hadoop
[centos@s201 hadoop]$ sudo nano core-site.xml
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost/</value>
</property>
</configuration>
[centos@s201 hadoop]$ sudo nano hdfs-site.xml
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
[centos@s201 hadoop]$ sudo nano yarn-site.xml
yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>localhost</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
Note: mapred-site.xml does not exist by default; copy it from the template first:
[centos@s201 hadoop]$ cp mapred-site.xml.template mapred-site.xml
[centos@s201 hadoop]$ sudo nano mapred-site.xml
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
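A quick grep is enough to confirm each file ends up with the property it should. This hedged sketch fabricates mapred-site.xml in a scratch directory so it runs anywhere; against a real install you would run the grep from /soft/hadoop/etc/hadoop:

```shell
# Sanity-check that mapred-site.xml selects the yarn framework,
# matching the <value>yarn</value> entry written above.
conf=$(mktemp -d)
cat > "$conf/mapred-site.xml" <<'EOF'
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF

grep -q '<value>yarn</value>' "$conf/mapred-site.xml" && echo "mapred-site.xml OK"
```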
Configure ssh
[centos@s201 ~]$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
[centos@s201 ~]$ cd ~/.ssh
[centos@s201 .ssh]$ cat id_rsa.pub >> authorized_keys
[centos@s201 .ssh]$ chmod 644 authorized_keys
[centos@s201 .ssh]$ ll
total 16
-rw-r--r--. 1 centos centos 393 Jul 1 22:37 authorized_keys
-rw-------. 1 centos centos 1679 Jul 1 22:36 id_rsa
-rw-r--r--. 1 centos centos 393 Jul 1 22:36 id_rsa.pub
-rw-r--r--. 1 centos centos 182 Jul 1 22:33 known_hosts
[centos@s201 .ssh]$ ssh s201
Last login: Sat Jul 1 22:33:47 2017 from s201
[centos@s201 ~]$
sshd refuses to use an authorized_keys file that is writable by group (g) or others (o), which is why we chmod it to 644.
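The permission requirement can be sketched in a scratch directory (`stat -c` is GNU coreutils, as on CentOS):

```shell
# Show that chmod 644 yields rw-r--r--: owner can write, but group
# and others cannot -- the mode sshd expects for authorized_keys.
sshdir=$(mktemp -d)
touch "$sshdir/authorized_keys"
chmod 644 "$sshdir/authorized_keys"
stat -c '%a' "$sshdir/authorized_keys"   # -> 644
```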
Start Pseudo-Distributed mode
Format HDFS
hadoop namenode -format        (deprecated but equivalent to: hdfs namenode -format)
[centos@s201 ~]$ cd /soft/hadoop/etc/hadoop
[centos@s201 hadoop]$ nano hadoop-env.sh
# The java implementation to use.
export JAVA_HOME=/soft/jdk
[centos@s201 hadoop]$ start-all.sh
[centos@s201 hadoop]$ jps
19331 Jps
19092 NodeManager
18695 DataNode
18841 SecondaryNameNode
18986 ResourceManager
[centos@s201 hadoop]$
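Notice that the `jps` output above lists only four daemons. A healthy pseudo-distributed start should show five; a small hypothetical helper (not a Hadoop tool) makes the check explicit:

```shell
# Given a jps-style listing, report which of the five expected
# daemons are absent. grep -w avoids matching "NameNode" inside
# "SecondaryNameNode".
check_daemons() {
  local out="$1" d missing=""
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    if ! printf '%s\n' "$out" | grep -qw "$d"; then
      missing="$missing $d"
    fi
  done
  if [ -n "$missing" ]; then echo "missing:$missing"; else echo "all daemons up"; fi
}

# Sample output copied from the session above -- NameNode is absent:
check_daemons "$(printf '19092 NodeManager\n18695 DataNode\n18841 SecondaryNameNode\n18986 ResourceManager\n')"
# -> missing: NameNode
```

A missing NameNode is exactly the symptom addressed by the format-and-restart note below.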
Now open http://192.168.168.201:50070/ in Chrome. If it does not load, the firewall is most likely blocking the port, so stop and disable it:
$>sudo systemctl disable firewalld.service //disable
$>sudo systemctl stop firewalld.service //stop
$>sudo systemctl status firewalld.service //check
If you get the error:
Failed to stop firewalld.service: Unit firewalld.service not loaded.
check whether the service exists at all:
[centos@s201 hadoop]$ sudo systemctl status firewalld.service
● firewalld.service
Loaded: not-found (Reason: No such file or directory)
Active: inactive (dead)
You can also list every service with:
systemctl list-units --type=service
If firewalld is simply not installed, the error is harmless; install it only if you actually want a firewall:
[centos@s201 hadoop]$ sudo yum install firewalld
Note: if jps shows no NameNode, the NameNode was never formatted. Stop everything, format it, and restart:
[centos@s201 hadoop]$ stop-all.sh
[centos@s201 hadoop]$ hadoop namenode -format        (deprecated but equivalent to: hdfs namenode -format)
[centos@s201 hadoop]$ start-all.sh
Wordcount example on Hadoop
[centos@s201 ~]$ pwd
/home/centos
[centos@s201 ~]$ mkdir input
[centos@s201 ~]$ cd input/
[centos@s201 input]$ echo "hello world" > file1.txt
[centos@s201 input]$ echo "hello hadoop" > file2.txt
[centos@s201 input]$ echo "hello mapreduce" >> file2.txt
[centos@s201 input]$ ls
file1.txt file2.txt
[centos@s201 input]$ hadoop fs -mkdir /wc_input
[centos@s201 input]$ hadoop fs -lsr /
lsr: DEPRECATED: Please use 'ls -R' instead.
drwxr-xr-x - centos supergroup 0 2017-07-01 23:53 /wc_input
[centos@s201 input]$ hadoop fs -put ../input/* /wc_input
[centos@s201 input]$ hadoop fs -lsr /
lsr: DEPRECATED: Please use 'ls -R' instead.
drwxr-xr-x - centos supergroup 0 2017-07-01 23:54 /wc_input
-rw-r--r-- 1 centos supergroup 12 2017-07-01 23:54 /wc_input/file1.txt
-rw-r--r-- 1 centos supergroup 29 2017-07-01 23:54 /wc_input/file2.txt
[centos@s201 ~]$ cd /soft/hadoop/share/hadoop/mapreduce
[centos@s201 mapreduce]$ hadoop jar hadoop-mapreduce-examples-2.7.3.jar wordcount /wc_input /wc_output
[centos@s201 mapreduce]$ hadoop fs -cat /wc_output/part-r-00000
hadoop 1
hello 3
mapreduce 1
world 1
[centos@s201 mapreduce]$
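The result can be cross-checked locally with plain shell, feeding the same three lines created in ~/input through a word-count pipeline:

```shell
# Split on spaces, sort, count duplicates, and print "word count"
# in the same format as the MapReduce output above.
printf 'hello world\nhello hadoop\nhello mapreduce\n' \
  | tr ' ' '\n' | sort | uniq -c | awk '{print $2, $1}'
# hadoop 1
# hello 3
# mapreduce 1
# world 1
```

The counts match part-r-00000, confirming the job processed both input files.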