Detailed Steps for Installing a Hadoop Cluster on Linux

 

1. Environment requirements (I won't cover installing CentOS 6 in a virtual machine here)

          CentOS 6 + hadoop-0.21.0.tar

2. Server configuration (each of my servers has 2 CPUs, 2 GB of RAM, and a 100 GB disk)

   The server IPs should be static; in other words, every server must be able to reach the others with the ping command.

   I suggest doing this on a company network, where the hardware is available. I configured three datanodes here, named

   Datanode1, Datanode2 and Datanode3 (these are the hostnames of the virtual machines).

   Server name    IP address (your choice)

   Namenode       192.168.16.1

   Datanode1      192.168.16.2

   Datanode2      192.168.16.3

   Datanode3      192.168.16.4

   2.1 To change a server's IP address, edit the interface config:

                            vi  /etc/sysconfig/network-scripts/ifcfg-eth0

                           

                            DEVICE="eth0"

                            # The physical (MAC) address of your network card

                            # This entry is already present when you open the file; no change needed

                            HWADDR="00:0C:29:95:1D:A5"

                            BOOTPROTO="static"

                            ONBOOT="yes"

                            # The IP address; it must be unique. I assign addresses downward from 253, and an address already in use must not be reused.

                            IPADDR=172.16.101.245

                            NETMASK=255.255.255.0

                            NETWORK=172.16.101.0

                            BROADCAST=172.16.101.255

                            GATEWAY=172.16.101.254

 

                            After saving and exiting, run the following commands so the new network settings take effect immediately:

                            shell>> ifdown eth0

                            shell>> ifup eth0

                            shell>> /etc/init.d/network restart
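
                            Once every node has its address, it is worth verifying connectivity from the Namenode before going on. A minimal check (the IP list matches the table above; adjust it if you chose different addresses):

                            for ip in 192.168.16.2 192.168.16.3 192.168.16.4; do
                                ping -c 1 $ip > /dev/null && echo "$ip OK" || echo "$ip UNREACHABLE"
                            done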

3. Install JDK 6 (I used jdk-6u26-linux-x64-rpm.bin)

        3.1. Create the installation directory: mkdir /usr/java/

        3.2. Move jdk-6u26-linux-x64-rpm.bin to /usr/java/, make it executable if necessary (chmod +x), and run

                          ./jdk-6u26-linux-x64-rpm.bin

                          The installer will prompt you along the way; answer yes and press Enter as asked.

                          When it finishes you will see a directory named jdk1.6.0_26.

        3.3. Set environment variables

                            Add the following to /etc/profile:

                            #config java

                            JAVA_HOME=/usr/java/jdk1.6.0_26

                            CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar

                            PATH=$JAVA_HOME/bin:$HOME/bin:$PATH

                            export PATH JAVA_HOME CLASSPATH

        3.4. Apply the settings: source /etc/profile
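
                            A quick sanity check that the new variables took effect:

                            java -version          # should report java version "1.6.0_26"
                            echo $JAVA_HOME        # should print /usr/java/jdk1.6.0_26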

4. Install the SSH server and client

        a. yum search ssh

        b. Find the server package to install (here, openssh-server.x86_64)

        c. Install the server: yum install openssh-server.x86_64

        d. Find the client package (here, openssh-clients.x86_64)

        e. Install the client: yum install openssh-clients.x86_64
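
        On CentOS 6 the sshd service is normally started automatically after installation, but it does no harm to make sure it is running and enabled at boot:

        service sshd start          # start the SSH daemon now
        chkconfig sshd on           # have it start automatically at boot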

5. Set up passwordless SSH between the Namenode and the Datanodes

        a. Create a public/private key pair on the local host with ssh-keygen

                            [root@Namenode ~]# ssh-keygen -t  rsa

                            Enter file in which to save the key (/root/.ssh/id_rsa): [Press Enter]

                            Enter passphrase (empty for no passphrase): [Press Enter]

                            Enter same passphrase again: [Press Enter]

                            Your identification has been saved in /root/.ssh/id_rsa.

                            Your public key has been saved in /root/.ssh/id_rsa.pub.

                            The key fingerprint is: 33:b3:fe:af:95:95:18:11:31:d5:de:96:2f:f2:35:f9

                            root@Namenode

        b. Copy the public key to the remote host with ssh-copy-id

                            [root@Namenode ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub  root@Datanode1

                            root@Datanode1's password:

                            Now try logging into the machine, with "ssh 'root@Datanode1'", and check in:

                            .ssh/authorized_keys to make sure we haven't added extra keys that you weren't expecting.

                            [Note: ssh-copy-id appends the public key to .ssh/authorized_keys on the remote host.]

   c. Log in to the remote host directly

                            [root@Namenode ~]# ssh Datanode1

                            Last login: Sun Nov 16 17:22:33 2008 from 192.168.1.2

                            [Note: SSH does not ask for a password.]

                            [root@Datanode1 ~]#

                            [Note: you are now logged in on the remote host.]

   d. Note: all of these commands are run on the Namenode, and the Namenode also needs passwordless access to itself, i.e. also run

      [root@Namenode ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub  root@Namenode

      Then repeat steps a-c for Datanode2 and Datanode3 (a verification loop is sketched below).

      Passwordless access must work for every node; otherwise the Hadoop cluster setup is certain to fail.
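
      A quick way to confirm that every login really is passwordless (BatchMode makes ssh fail instead of prompting, so any leftover password prompt shows up as an error):

      for host in Namenode Datanode1 Datanode2 Datanode3; do
          ssh -o BatchMode=yes root@$host hostname || echo "passwordless login to $host FAILED"
      done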

6. Install Hadoop (the JDK and Hadoop installation paths are the same on every server)

        a. Create the installation directory: mkdir /usr/local/hadoop/

        b. Unpack the archive hadoop-0.21.0.tar so that its contents end up under the installation directory:

                            tar -zxvf hadoop-0.21.0.tar

        c. Set environment variables

                            Add the following to /etc/profile:

                            #config hadoop

                            export HADOOP_HOME=/usr/local/hadoop/

                            export PATH=$HADOOP_HOME/bin:$PATH

                            # where Hadoop writes its log files

                            export HADOOP_LOG_DIR=${HADOOP_HOME}/logs

               Apply the settings: source /etc/profile
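
               A quick check that the new variables took effect:

               echo $HADOOP_HOME      # should print /usr/local/hadoop/
               hadoop version         # should report Hadoop 0.21.0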

        d. Configure the master/slave topology

                     /etc/hosts on the Namenode:

                     192.168.16.1        Namenode

                     192.168.16.2        Datanode1

                     192.168.16.3        Datanode2

                     192.168.16.4        Datanode3

  

                     /usr/local/hadoop/conf/masters on the Namenode:

                     Namenode

                    

                     /usr/local/hadoop/conf/slaves on the Namenode:

                     Datanode1

                     Datanode2

                     Datanode3

                    

                    

                     /etc/hosts on Datanode1 (masters and slaves under /usr/local/hadoop/conf/ are the same as on the Namenode):

                     192.168.16.1        Namenode

                     192.168.16.2        Datanode1

                    

                     /etc/hosts on Datanode2 (masters and slaves under /usr/local/hadoop/conf/ are the same as on the Namenode):

                     192.168.16.1        Namenode

                     192.168.16.3        Datanode2

                    

                     /etc/hosts on Datanode3 (masters and slaves under /usr/local/hadoop/conf/ are the same as on the Namenode):

                     192.168.16.1        Namenode

                     192.168.16.4        Datanode3
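
                     For reference, the Namenode's entries can be appended in one step (adjust if your /etc/hosts already contains some of them):

                     echo "192.168.16.1        Namenode"  >> /etc/hosts
                     echo "192.168.16.2        Datanode1" >> /etc/hosts
                     echo "192.168.16.3        Datanode2" >> /etc/hosts
                     echo "192.168.16.4        Datanode3" >> /etc/hosts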

   e. Edit the configuration file /usr/local/hadoop/conf/hadoop-env.sh

      Set JAVA_HOME to the JDK installation path:

                            # The java implementation to use.  Required.

                            export JAVA_HOME=/usr/java/jdk1.6.0_26/

   f. Edit the configuration file core-site.xml as follows:

                            <configuration>

                               <property>

                                 <name>fs.default.name</name>

                                 <value>hdfs://Namenode:9000/</value>

                               </property>

                              <property>

                                <name>hadoop.tmp.dir</name>

                                <value>/usr/local/hadoop/tmp/</value>

                              </property>

                            </configuration>
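
                            hadoop.tmp.dir above points at a directory that does not exist yet; it does no harm to create it up front on every node:

                            mkdir -p /usr/local/hadoop/tmp/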

   g. Edit the configuration file hdfs-site.xml as follows:

                            <configuration>

                              <property>

                                <name>dfs.replication</name>

                                <!-- number of replicas kept for each block -->

                                <value>1</value>

                              </property>

                            </configuration>

   h. Edit the configuration file mapred-site.xml as follows:

                            <configuration>

                              <property>

                                <name>mapred.job.tracker</name>

                                <!-- the JobTracker is usually placed on the same machine as the NameNode, but it can also run on a separate node -->

                                <value>Namenode:9001</value>

                              </property>

                            </configuration>

   i. Note: all of the configuration above is done on the Namenode; just copy these configuration files to every Datanode (a copy loop is sketched below).
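
      A sketch of the copy step, relying on the passwordless SSH from section 5 (I copy the whole conf/ directory so that hadoop-env.sh, masters and slaves stay in sync as well):

      for node in Datanode1 Datanode2 Datanode3; do
          scp /usr/local/hadoop/conf/* root@$node:/usr/local/hadoop/conf/
      done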

   j. Initialize Hadoop: cd /usr/local/hadoop/

               ./bin/hadoop namenode -format

               Output similar to the following appears; it must not contain any ERROR lines. (Careful: as the sample below shows, the re-format prompt accepts only an uppercase Y; answering with a lowercase y aborts the format.)

                            [... long STARTUP_MSG classpath listing omitted ...]

                            STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.21 -r 985326; compiled by 'tomwhite' on Tue Aug 17 01:02:28 EDT 2010

                            ************************************************************/

                            Re-format filesystem in /usr/local/hadoop/tmp/dfs/name ? (Y or N) y

                            Format aborted in /usr/local/hadoop/tmp/dfs/name

                            11/06/16 13:04:17 INFO namenode.NameNode: SHUTDOWN_MSG:

                            /************************************************************

                            SHUTDOWN_MSG: Shutting down NameNode at namenode/172.16.101.251

                            ************************************************************/

   k. Start Hadoop: ./bin/start-all.sh

                            This script is Deprecated. Instead use start-dfs.sh and start-mapred.sh

                            starting namenode, logging to /usr/local/hadoop//logs/hadoop-root-namenode-namenode.out

                            datanode1: starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-root-datanode-datanode1.out

                            datanode2: starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-root-datanode-datanode2.out

                            datanode3: starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-root-datanode-datanode3.out

                            namenode: starting secondarynamenode, logging to /usr/local/hadoop/bin/../logs/hadoop-root-secondarynamenode-namenode.out

                            starting jobtracker, logging to /usr/local/hadoop//logs/hadoop-root-jobtracker-namenode.out

                            datanode3: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-root-tasktracker-datanode3.out

                            datanode2: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-root-tasktracker-datanode2.out

                            datanode1: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-root-tasktracker-datanode1.out

      After it starts, check with the jps command; on the Namenode the result looks like this:

                            [root@namenode hadoop]# jps

                            1806 Jps

                            1368 NameNode

                            1694 JobTracker

                            1587 SecondaryNameNode

                            Then go to Datanode1/2/3 and run jps there; the result looks like this:

                            [root@datanode2 hadoop]# jps

                            1440 Jps

                            1382 TaskTracker

                            1303 DataNode


                            This confirms that your Hadoop cluster installation succeeded.
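
                            Another quick confirmation that all three datanodes registered with the NameNode (run on the Namenode):

                            ./bin/hadoop dfsadmin -report      # should list 3 live datanodes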

7. HDFS operations

          Run the hadoop command in the bin/ directory to see every operation Hadoop supports and its usage. A few simple operations follow as examples.

Create a directory

          [root@namenode hadoop]# ./bin/hadoop  dfs  -mkdir  testdir

          This creates a directory named testdir in HDFS (relative paths land in the current user's HDFS home directory).

Copy a file

          [root@namenode hadoop]# ./bin/hadoop  dfs  -put  /home/dbrg/large.zip  testfile.zip

          This copies the local file /home/dbrg/large.zip into the current user's home directory in HDFS under the name testfile.zip (for root, that is /user/root/testfile.zip).

List existing files

          [root@namenode hadoop]# ./bin/hadoop  dfs  -ls
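
Two more everyday commands in the same pattern (the paths assume the testfile.zip and testdir created above):

          [root@namenode hadoop]# ./bin/hadoop  dfs  -get  testfile.zip  /tmp/large-copy.zip

          Copies testfile.zip from HDFS back to the local disk as /tmp/large-copy.zip.

          [root@namenode hadoop]# ./bin/hadoop  dfs  -rmr  testdir

          Removes the testdir directory from HDFS.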
