Hadoop Environment Setup (VMware Fusion, CentOS 7)

Set up VMware Fusion

Configure Static IP Address

You will only need to edit the settings for:

  • DNS
  • GATEWAY
  • PREFIX
  • IPADDR
$>su root
$>cd /etc/sysconfig/network-scripts
$>vi ifcfg-eno16777736
ONBOOT=yes
IPADDR=192.168.XXX.201
PREFIX=24
GATEWAY=192.168.XXX.2
DNS=192.168.XXX.2

Static IP

Now we can ping an IP address but cannot resolve hostnames. To fix it, set the nameserver:

$>vi /etc/resolv.conf
nameserver 192.168.168.2
$>su root
$>service network restart
$>ping www.baidu.com

ali.repo

[centos@localhost ~]$ cd /etc/yum.repos.d/
[centos@localhost yum.repos.d]$ su root
Password: 
[root@localhost yum.repos.d]# rename .repo .repo.backup *
[root@localhost yum.repos.d]# curl -o /etc/yum.repos.d/ali.repo http://mirrors.aliyun.com/repo/Centos-7.repo
[root@localhost yum.repos.d]# yum clean all
[root@localhost yum.repos.d]# yum makecache
[root@localhost yum.repos.d]# yum -y install nano
[root@localhost yum.repos.d]# yum search ifconfig
[root@localhost yum.repos.d]# yum -y install net-tools
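
net-tools provides the classic ifconfig command. As a quick sanity check afterwards (a minimal sketch, assuming the interface name eno16777736 from above):

$>ifconfig eno16777736            # classic tool from net-tools
$>ip addr show eno16777736        # iproute2 equivalent that ships with CentOS 7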

sudo

[root@localhost yum.repos.d]# nano /etc/sudoers
root    ALL=(ALL)       ALL
centos  ALL=(ALL)       ALL
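
Editing /etc/sudoers directly with nano works, but a syntax error there can lock you out of sudo entirely. A safer sketch using the stock visudo tool, which validates the file before saving:

$>su root
$>visudo                          # opens /etc/sudoers and rejects invalid syntax on save
# then add under the existing root entry:
# centos  ALL=(ALL)       ALL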

Change hostname

[centos@localhost yum.repos.d]$ sudo nano /etc/hostname
s201
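
On CentOS 7 the same change can also be made with hostnamectl, which rewrites /etc/hostname for you; a minimal alternative sketch:

$>sudo hostnamectl set-hostname s201
$>hostname                        # verify; new shells will show the updated prompt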

IP mapping

[centos@localhost yum.repos.d]$ sudo nano /etc/hosts
127.0.0.1 localhost
192.168.168.201 s201
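
A quick check that the mapping is picked up (a minimal sketch, run on the same machine):

$>getent hosts s201               # should print the /etc/hosts entry
$>ping -c 3 s201                  # should resolve to 192.168.168.201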

Hadoop Environment Setup

1. JDK

[centos@localhost ~]$ cd ~
[centos@localhost ~]$ mkdir downloads
[centos@localhost ~]$ cd downloads/
[centos@localhost downloads]$ tar -xzvf jdk-8u65-linux-x64.tar.gz
[centos@localhost ~]$ sudo mkdir /soft
[centos@localhost ~]$ sudo chown centos:centos /soft
[centos@localhost ~]$ cd /home/centos/downloads/
[centos@localhost downloads]$ mv jdk1.8.0_65/ /soft/
[centos@localhost downloads]$ cd /soft
[centos@localhost soft]$ ln -s jdk1.8.0_65 jdk
[centos@localhost soft]$ cd /soft/jdk/bin
[centos@localhost bin]$ ./java  -version
java version "1.8.0_65"
Java(TM) SE Runtime Environment (build 1.8.0_65-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.65-b01, mixed mode)
[centos@localhost bin]$ 

JDK ok now

2. Configure JDK Environment Variable

[centos@localhost bin]$ sudo nano /etc/profile

At the end of the file, add:

export  JAVA_HOME=/soft/jdk
export  PATH=$PATH:$JAVA_HOME/bin 
[centos@localhost bin]$ source /etc/profile
[centos@localhost bin]$ java -version
java version "1.8.0_65"

3. Hadoop Setup

[centos@localhost bin]$ cd ~
[centos@localhost ~]$ cd downloads/
[centos@localhost downloads]$ tar -xzvf hadoop-2.7.3.tar.gz
[centos@s201 downloads]$ mv ~/downloads/hadoop-2.7.3 /soft/
[centos@s201 downloads]$ cd /soft
[centos@s201 soft]$ ln -s hadoop-2.7.3 hadoop
[centos@s201 soft]$ cd hadoop/bin
[centos@s201 bin]$ ./hadoop version
Hadoop 2.7.3

OK now

4. Configure Hadoop Environment Variables

[centos@s201 soft]$ sudo nano /etc/profile
export HADOOP_HOME=/soft/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin     
[centos@s201 bin]$ source /etc/profile
[centos@s201 bin]$ cd ~
[centos@s201 ~]$ hadoop version
Hadoop 2.7.3

OK now

Set up Pseudo-Distributed Mode

[centos@s201 ~]$ cd /soft/hadoop/etc
[centos@s201 etc]$ mkdir alone
[centos@s201 etc]$ mkdir pseudo
[centos@s201 etc]$ mkdir full
[centos@s201 etc]$ cp hadoop/* alone/
[centos@s201 etc]$ cp hadoop/* pseudo/
[centos@s201 etc]$ cp hadoop/* full/
[centos@s201 etc]$ rm -rf hadoop/
[centos@s201 etc]$ ln -s pseudo hadoop
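
The symlink is what lets you switch between the three configurations later without moving files. A hedged sketch of pointing hadoop at the full/ config instead (its XML files would still need their own edits):

$>cd /soft/hadoop/etc
$>ln -sfT full hadoop             # -T replaces the link itself instead of descending into it
$>ls -l hadoop                    # confirm where the link now points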

Then you need to edit four XML files:

  • core-site.xml
  • hdfs-site.xml
  • yarn-site.xml
  • mapred-site.xml
[centos@s201 etc]$ cd hadoop
[centos@s201 hadoop]$ sudo nano core-site.xml
core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost/</value>
    </property>
</configuration>

[centos@s201 hadoop]$ sudo nano hdfs-site.xml
hdfs-site.xml

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
[centos@s201 hadoop]$ sudo nano yarn-site.xml
yarn-site.xml

<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>localhost</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

Note: mapred-site.xml does not exist by default, so copy it from the template first.

[centos@s201 hadoop]$ cp mapred-site.xml.template mapred-site.xml
[centos@s201 hadoop]$ sudo nano mapred-site.xml
mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

Configure ssh

[centos@s201 ~]$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
[centos@s201 ~]$ cd ~/.ssh
[centos@s201 .ssh]$ cat id_rsa.pub >> authorized_keys
[centos@s201 .ssh]$ chmod 644 authorized_keys
[centos@s201 .ssh]$ ll
total 16
-rw-r--r--. 1 centos centos  393 Jul  1 22:37 authorized_keys
-rw-------. 1 centos centos 1679 Jul  1 22:36 id_rsa
-rw-r--r--. 1 centos centos  393 Jul  1 22:36 id_rsa.pub
-rw-r--r--. 1 centos centos  182 Jul  1 22:33 known_hosts
[centos@s201 .ssh]$ ssh s201
Last login: Sat Jul  1 22:33:47 2017 from s201
[centos@s201 ~]$ 

Group and others must not have write permission on authorized_keys, otherwise sshd rejects the key, which is why we chmod it to 644.
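
If ssh s201 still asks for a password, the usual cause is permissions; a minimal sketch of the full set sshd expects with the default StrictModes:

$>chmod 700 ~/.ssh
$>chmod 644 ~/.ssh/authorized_keys    # 600 also works; no write bit for group/others
$>chmod 600 ~/.ssh/id_rsa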

Start Pseudo-Distributed mode

Format HDFS

hadoop namenode -format      # equivalent to: hdfs namenode -format
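
With no hadoop.tmp.dir set in core-site.xml, Hadoop 2.7 keeps the namenode metadata under /tmp/hadoop-${USER}. A quick way to confirm the format succeeded (the path shown is the default; adjust it if you configured hadoop.tmp.dir differently):

$>ls /tmp/hadoop-centos/dfs/name/current/     # should contain VERSION and an fsimage file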

[centos@s201 ~]$ cd /soft/hadoop/etc/hadoop
[centos@s201 hadoop]$ nano hadoop-env.sh
# The java implementation to use.
export JAVA_HOME=/soft/jdk 
[centos@s201 hadoop]$ start-all.sh
[centos@s201 hadoop]$ jps
19331 Jps
19092 NodeManager
18695 DataNode
18841 SecondaryNameNode
18986 ResourceManager
[centos@s201 hadoop]$ 

Now try http://192.168.168.201:50070/ in Chrome. It will not work because of the firewall, so we need to stop it.

$>sudo systemctl disable firewalld.service     //disable       
$>sudo systemctl stop firewalld.service        //stop
$>sudo systemctl status firewalld.service      //check

If you get this error:

Failed to stop firewalld.service: Unit firewalld.service not loaded.
[centos@s201 hadoop]$ sudo systemctl status firewalld.service 
● firewalld.service
   Loaded: not-found (Reason: No such file or directory)
   Active: inactive (dead)

then list the loaded services to see what is actually running:

systemctl list-units --type=service

Install firewalld if it is not available:

[centos@s201 hadoop]$ sudo yum install firewalld
[centos@s201 hadoop]$ stop-all.sh

Note: the namenode needs to be formatted before starting again.

[centos@s201 hadoop]$ hadoop namenode -format      # equivalent to: hdfs namenode -format
[centos@s201 hadoop]$ start-all.sh


Wordcount example on Hadoop

[centos@s201 ~]$ pwd
/home/centos
[centos@s201 ~]$ mkdir input
[centos@s201 ~]$ cd input/
[centos@s201 input]$ echo "hello world" > file1.txt
[centos@s201 input]$ echo "hello hadoop" > file2.txt
[centos@s201 input]$ echo "hello mapreduce" >> file2.txt
[centos@s201 input]$ ls
file1.txt  file2.txt
[centos@s201 input]$ hadoop fs -mkdir /wc_input
[centos@s201 input]$ hadoop fs -lsr /
lsr: DEPRECATED: Please use 'ls -R' instead.
drwxr-xr-x   - centos supergroup          0 2017-07-01 23:53 /wc_input
[centos@s201 input]$ hadoop fs -put ../input/* /wc_input
[centos@s201 input]$ hadoop fs -lsr /
lsr: DEPRECATED: Please use 'ls -R' instead.
drwxr-xr-x   - centos supergroup          0 2017-07-01 23:54 /wc_input
-rw-r--r--   1 centos supergroup         12 2017-07-01 23:54 /wc_input/file1.txt
-rw-r--r--   1 centos supergroup         29 2017-07-01 23:54 /wc_input/file2.txt
[centos@s201 ~]$ cd /soft/hadoop/share/hadoop/mapreduce
[centos@s201 mapreduce]$ hadoop jar hadoop-mapreduce-examples-2.7.3.jar wordcount /wc_input /wc_output
[centos@s201 mapreduce]$ hadoop fs -cat /wc_output/part-r-00000
hadoop  1
hello   3
mapreduce       1
world   1
[centos@s201 mapreduce]$ 
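
If you rerun the job, delete the output directory first, since MapReduce refuses to overwrite an existing one; a minimal cleanup sketch (run from the same mapreduce directory as above):

$>hadoop fs -rm -r /wc_output
$>hadoop jar hadoop-mapreduce-examples-2.7.3.jar wordcount /wc_input /wc_output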