HBase cluster setup

This post focuses on the hands-on operations, a record of one server deployment. Corrections for anything wrong are very welcome!
Preparation:
Four machines, referred to here as 151 (master), 100, 101, and 102.
OS:
CentOS release 6.5 (Final)
Check the version with:
# cat /etc/issue
Software:
jdk1.7.0_79
hadoop-2.6.0
hbase-1.0.1.1
zookeeper-3.4.6
For the software downloads, contact [email protected]

Important notes:
    1. Since this is a cluster, every machine gets the same configuration, so we configure everything on master and then scp (copy) it to the other machines. Exceptions are called out where they occur.
    2. All operations are performed as root.
    3. The config files are finicky; copy-pasting can introduce stray spaces or invisible characters.

Steps:
1. Extract the archives
2. Machine configuration
3. Environment variables
4. Configure hadoop
5. Configure zookeeper
6. Configure hbase
7. Done

I. Extract the downloaded archives into /home/hadoop/software/ (or a directory of your choice):

II. Machine configuration
1. IP configuration (on all four machines)
Give the NIC a static IP (NAT mode)
# ifconfig # check the NIC's current IP
# vi /etc/sysconfig/network-scripts/ifcfg-eth0
ONBOOT=yes # bring the NIC up at boot
BOOTPROTO=static # change DHCP to static
IPADDR=192.168.17.129
NETMASK=255.255.255.0
GATEWAY=192.168.17.2
DNS1=192.168.17.2 # set the first DNS to the gateway address
DNS2=202.96.209.5
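The keys above are easy to get wrong when typed by hand. As a sanity check, a small script can verify that an ifcfg file defines everything a static setup needs (a sketch; `check_ifcfg` is a helper name of my own, not part of CentOS):

```shell
# check_ifcfg FILE - verify the ifcfg file defines the keys a static IP needs
check_ifcfg() {
  for key in ONBOOT BOOTPROTO IPADDR NETMASK GATEWAY; do
    grep -q "^$key=" "$1" || { echo "missing: $key"; return 1; }
  done
  echo ok
}
```

Usage: `check_ifcfg /etc/sysconfig/network-scripts/ifcfg-eth0`.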

2. Hostnames (on all four machines)
Suggested names:
machine 1: master (the master)
machine 2: master-slave01
machine 3: master-slave02
machine 4: master-slave03
# vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=master

3. hosts mapping file (on all four machines)
# vim /etc/hosts
192.168.xx.99  master
192.168.xx.100  master-slave01
192.168.xx.101  master-slave02
192.168.xx.102  master-slave03

4. Disable iptables and SELinux    [root user] [all four machines]
# service iptables stop     -- stop the firewall service
# chkconfig iptables off    -- keep iptables from starting at boot

# vi /etc/sysconfig/selinux
SELINUX=disabled
If security is a concern, open the required ports individually instead.

5. Passwordless ssh login
    [all four machines]
    # ssh-keygen -t rsa     -- press Enter at every prompt to generate a key pair
        ** /root/.ssh/
        id_rsa  id_rsa.pub
        -- copy your public key to the other three servers
    # ssh-copy-id master-slave01
    # ssh-copy-id master-slave02
    # ssh-copy-id master-slave03

6. Time synchronization
    [master]
    1. Sync the time
    # ntpdate cn.pool.ntp.org   -- sync this server's clock now
    25 Aug 14:47:41 ntpdate[10105]: step time server 202.112.29.82 offset -9.341897 sec

    2. Check the package
    # rpm -qa | grep ntp        -- check whether ntp is already installed
    ntp-4.2.4p8-3.el6.centos.x86_64
    # yum -y install ntp        -- install it if missing

    3. Edit the ntp config file
    # vi /etc/ntp.conf
    #### uncomment the line below and change the network to your own
    restrict 192.168.17.0 mask 255.255.255.0 nomodify notrap

    #### comment out the following lines
    #server 0.centos.pool.ntp.org
    #server 1.centos.pool.ntp.org
    #server 2.centos.pool.ntp.org

    #### uncomment the two lines below; add them by hand if they are missing
    server  127.127.1.0     # local clock
    fudge   127.127.1.0 stratum 10

    4. Restart the ntp service
    # service ntpd start
    # chkconfig ntpd on


    [master-slave01, master-slave02, master-slave03]
    ** sync against master, the time server we just set up
    # service ntpd stop
    # chkconfig ntpd off

    # ntpdate master   -- sync time from the first server
    25 Aug 15:16:47 ntpdate[2092]: adjust time server 192.168.17.129 offset 0.311666 sec

    Set up a cron job to resync periodically
    # crontab -e
    */10 * * * * /usr/sbin/ntpdate master
    [minute  hour  day-of-month  month  weekday]
    # service crond restart

III. Environment variables
For unified management, put all the environment variables in /root/.bashrc
# vim /root/.bashrc
Add the following:
export SOFT_HOME=/home/hadoop/software/

    export JAVA_HOME=$SOFT_HOME/jdk1.7.0_79/
    export JAVA_JRE=$JAVA_HOME/jre;
    export PATH=$PATH:$JAVA_HOME/bin;

    export HADOOP_HOME=$SOFT_HOME/hadoop-2.6.0/
    export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin;

    export HBASE_HOME=$SOFT_HOME/hbase-1.0.1.1/
    export PATH=$PATH:$HBASE_HOME/bin;

    export ZOOKEEPER_HOME=$SOFT_HOME/zookeeper-3.4.6/
    export PATH=$PATH:$ZOOKEEPER_HOME/bin;

Make the variables take effect:
# source /root/.bashrc 
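After sourcing, you can confirm each *_HOME actually points at an existing directory. A small helper (a sketch; `check_home` is my own name, and you pass whichever variables you exported):

```shell
# check_home NAME PATH - warn when an exported *_HOME does not exist on disk
check_home() {
  if [ -d "$2" ]; then
    echo "$1 ok"
  else
    echo "$1 missing: $2"
  fi
}

# e.g.
# check_home JAVA_HOME "$JAVA_HOME"
# check_home HADOOP_HOME "$HADOOP_HOME"
```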

IV. Configure hadoop
The hadoop configuration lives under hadoop-2.6.0/etc/hadoop/.
The files involved:
core-site.xml
hdfs-site.xml
mapred-site.xml (does not exist by default; copy it from mapred-site.xml.template)
yarn-site.xml
slaves
yarn-env.sh
hadoop-env.sh
1. core-site.xml
<property>
       <name>fs.trash.interval</name>
       <value>10080</value>
 </property>
 <property>
     <name>fs.defaultFS</name>
     <value>hdfs://master:9000</value>
     <description></description>
 </property>
 <property>
     <name>hadoop.tmp.dir</name>
     <value>/home/hadoop/temp/hadoop/</value>
     <description>a base for other temporary directories.</description>
 </property>
 <property>
     <name>ha.zookeeper.quorum</name>
     <value>master-slave03,master-slave02,master-slave01</value>
     <description></description>
 </property>
<property>
   <name>io.compression.codecs</name>
   <value>org.apache.hadoop.io.compress.BZip2Codec</value>
 </property>
2. hdfs-site.xml
<property>
     <name>dfs.namenode.name.dir</name>
     <value>/home/hadoop/software/hadoop-2.6.0/name</value>
     <description></description>
 </property>
 <property>
     <name>dfs.datanode.data.dir</name>
     <value>/data1,/data2,/data3</value>
     <description></description>
 </property>

 <!--######-->
<property>
     <name>dfs.namenode.secondary.http-address</name>
     <value>master:50090</value>
 </property>

<!--
 <property>
     <name>dfs.namenode.shared.edits.dir</name>
     <value>qjournal://master-slave03:8485;master-slave02:8485;master-slave01:8485/cluster1</value>
     <description></description>
 </property>
 -->
 <property>
     <name>dfs.journalnode.edits.dir</name>
     <value>/home/hadoop/temp/journal</value>
     <description></description>
 </property>



 <property>
     <name>dfs.client.failover.proxy.provider.cluster1</name>
     <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
     <description></description>
 </property>

 <property>
     <name>dfs.ha.fencing.methods</name>
     <value>
         sshfence
         <!--shell-->
     </value>
     <description></description>
 </property>

 <property>
     <name>dfs.ha.fencing.ssh.private-key-files</name>
     <value>/root/.ssh/id_rsa</value>
     <description></description>
 </property>

 <property>
     <name>dfs.ha.fencing.ssh.connect-timeout</name>
     <value>30000</value>
     <description></description>
 </property>

 <!--######-->
 <property>
     <name>dfs.permissions.enabled</name>
     <value>false</value>
     <description></description>
 </property>

 <property>
     <name>dfs.datanode.max.transfer.threads</name>
     <value>8192</value>
     <description></description>
 </property>
 <property>
    <name>dfs.block.size</name>
    <value>256m</value>
    <description>Block size in bytes. The suffixes k, m, g, t, p, e may be used (e.g. 1g; the default is 128m).</description>
 </property>

 <property>
     <name>dfs.replication</name>
     <value>3</value>
     <description></description>
 </property>
 <property>
     <name>dfs.namenode.replication.min</name>
     <value>1</value>
     <description></description>
 </property>
 <property>
    <name>dfs.replication.max</name>
    <value>30</value>
    <description>Maximum number of block replicas; do not exceed the total number of nodes.</description>
 </property>

 <property>
     <name>dfs.webhdfs.enabled</name>
     <value>true</value>
     <description></description>
 </property>
 <property>
     <name>dfs.datanode.socket.write.timeout</name>
     <value>3000000</value>
 </property>
 <property>
     <name>dfs.socket.timeout</name>
     <value>3000000</value>
 </property>
 <property>
     <name>dfs.datanode.handler.count</name>
     <value>20</value>
 </property>
 <property>
     <name>dfs.namenode.handler.count</name>
     <value>60</value>
</property>
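dfs.block.size accepts suffixes like 256m; the conversion to bytes is plain powers of 1024. A quick conversion helper as a shell sketch (`to_bytes` is my own name, not a Hadoop tool):

```shell
# to_bytes N{k|m|g} - convert a size with a binary suffix to bytes
to_bytes() {
  n=${1%?}; s=${1#"$n"}
  case $s in
    k) echo $((n * 1024)) ;;
    m) echo $((n * 1024 * 1024)) ;;
    g) echo $((n * 1024 * 1024 * 1024)) ;;
    *) echo "$1" ;;        # no recognized suffix: already bytes
  esac
}

to_bytes 256m   # prints 268435456
```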
3. mapred-site.xml (does not exist by default; copy it from mapred-site.xml.template)
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
</property>
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
</property>
4. yarn-site.xml
<property>
   <name>yarn.resourcemanager.hostname</name>
   <value>master</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
  <description>shuffle service for moving data between nodemanager nodes</description>
</property>

<property>
 <name>yarn.nodemanager.local-dirs</name>
 <value>/home/hadoop/software/hadoop-2.6.0/hadoop/yarn/local</value>
 <description>where intermediate results are stored</description>
</property>
<property>
 <name>yarn.nodemanager.log-dirs</name>
 <value>/home/hadoop/software/hadoop-2.6.0/hadoop/yarn/logs</value>
 <description>where logs are stored</description>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>98304</value>
<description>memory available per node, in MB (containers * RAM-per-container)</description>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>2.1</value>
<description>maximum virtual memory per MB of physical memory used by a task; the default is 2.1</description>
</property>
<property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>6144</value>
    <description>2 * RAM-per-container</description>
</property>
<property>
    <name>yarn.app.mapreduce.am.command-opts</name>
    <value>-Xmx4906m</value>
    <description>0.8 * 2 * RAM-per-container</description>
</property>
<property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>3072</value>
    <description>minimum memory a single task may request; the default is 1024 MB (RAM-per-container)</description>
</property>
<property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>40960</value>
    <description>maximum memory a single task may request; the default is 8192 MB (containers * RAM-per-container)</description>
</property>
<!-- Note: the mapreduce.* properties below normally belong in mapred-site.xml -->
<property>
    <name>mapreduce.map.memory.mb</name>
    <value>3072</value>
    <description>physical memory limit for each Map task</description>
</property>
<property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx2450m</value>
</property>
<property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>6144</value>
    <description>physical memory limit for each Reduce task</description>
</property>
<property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx4906m</value>
</property>

<property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>600</value>
    <description>size of the in-task sort buffer</description>
</property>

<property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
    <description>minimum number of virtual CPUs a single task may request; the default is 1</description>
</property>
<property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>20</value>
    <description>maximum number of virtual CPUs a single task may request; the default is 32</description>
</property>
<property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>24</value>
    <description>number of virtual CPUs YARN may use; the default is 8. Currently it is recommended to set this to the number of physical CPU cores</description>
</property>
<property>
    <name>yarn.nodemanager.resource.percentage-physical-cpu-limit</name>
    <value>80</value>
    <description>Percentage of CPU that can be allocated for containers. This setting allows users to limit the amount of CPU that YARN containers use. Currently functional only on Linux using cgroups. Default: 100</description>
</property>
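The memory values above all derive from one base figure, RAM-per-container = 3072 MB (my reading of the comments, not stated explicitly), with heap opts set to roughly 0.8 of the container size (the file rounds them down slightly, to 2450m and 4906m). The arithmetic as a shell sketch:

```shell
ram_per_container=3072                  # MB, assumed base figure
am_mb=$((2 * ram_per_container))        # yarn.app.mapreduce.am.resource.mb -> 6144
map_mb=$ram_per_container               # mapreduce.map.memory.mb -> 3072
reduce_mb=$((2 * ram_per_container))    # mapreduce.reduce.memory.mb -> 6144
map_heap=$((map_mb * 8 / 10))           # ~0.8 * container for map -Xmx -> 2457
echo "am=$am_mb map=$map_mb reduce=$reduce_mb map_heap=${map_heap}m"
```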
5. slaves
master-slave01
master-slave02
master-slave03
6. yarn-env.sh
    Set JAVA_HOME:
    export JAVA_HOME=/home/hadoop/software/jdk1.7.0_79/
7. hadoop-env.sh
    Set JAVA_HOME:
    export JAVA_HOME=/home/hadoop/software/jdk1.7.0_79/
8. Configuration done
    Copy the configured directory to the other machines (note -r, since it is a directory)
    scp -r /home/hadoop/software/hadoop-2.6.0 master-slave01:/home/hadoop/software/
    scp -r /home/hadoop/software/hadoop-2.6.0 master-slave02:/home/hadoop/software/
    scp -r /home/hadoop/software/hadoop-2.6.0 master-slave03:/home/hadoop/software/
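The three scp commands can also be looped. A dry-run sketch (`distribute` is my own name; the echo prints each command instead of running it, drop it to execute for real):

```shell
# distribute DIR - print the scp command for every slave (dry run)
distribute() {
  for h in master-slave01 master-slave02 master-slave03; do
    echo scp -r "$1" "$h:/home/hadoop/software/"
  done
}

# distribute /home/hadoop/software/hadoop-2.6.0
```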
9. Start the cluster
    Format the namenode
    [root@master hadoop-2.6.0]# bin/hdfs namenode -format
    Start hdfs
    [root@master hadoop-2.6.0]# sbin/start-dfs.sh

    Check that everything came up:
    [root@master hadoop-2.6.0]# jps
    jps
    NameNode
    SecondaryNameNode

    [root@master-slave01 hadoop-2.6.0]# jps
    jps
    DataNode

    [root@master-slave02 hadoop-2.6.0]# jps
    jps
    DataNode

    [root@master-slave03 hadoop-2.6.0]# jps
    jps
    DataNode

    hadoop is now up.
    Open http://192.168.xx.99:50070/ in a browser.
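Eyeballing jps output across four machines is error-prone. A tiny helper that tests whether a daemon name appears in jps-style output (a sketch; `has_daemon` is my own name):

```shell
# has_daemon "JPS_OUTPUT" NAME - succeed when NAME appears as a whole word
has_daemon() {
  printf '%s\n' "$1" | grep -qw "$2"
}

# e.g. on master:
# has_daemon "$(jps)" NameNode && echo "NameNode running"
```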

V. Configure zookeeper
/home/hadoop/software/zookeeper-3.4.6/conf
$ cp -a zoo_sample.cfg zoo.cfg
Edit the config file:
dataDir=/home/hadoop/software/zkData

server.1=master-slave01:2888:3888
server.2=master-slave02:2888:3888
server.3=master-slave03:2888:3888
$ mkdir /home/hadoop/software/zkData
Create the myid file
    ** be sure to create it on Linux (a Windows editor can add CRLF line endings)
    $ vi zkData/myid
    1
    Each machine's myid value must match its entry:
        server.1
        server.2
        server.3
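Since each machine's myid must match its server.N line, a small mapping helper avoids mixing them up (a sketch; `myid_for` is my own name, and the hostnames are the ones used in this post):

```shell
# myid_for HOST - the myid value that belongs on HOST per zoo.cfg
myid_for() {
  case $1 in
    master-slave01) echo 1 ;;
    master-slave02) echo 2 ;;
    master-slave03) echo 3 ;;
    *) echo "no server.N entry for $1" >&2; return 1 ;;
  esac
}

# on each zookeeper machine:
# myid_for "$(hostname)" > /home/hadoop/software/zkData/myid
```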
Copy the configured directory to the other machines
scp -r /home/hadoop/software/zookeeper-3.4.6 master-slave01:/home/hadoop/software/
scp -r /home/hadoop/software/zookeeper-3.4.6 master-slave02:/home/hadoop/software/
scp -r /home/hadoop/software/zookeeper-3.4.6 master-slave03:/home/hadoop/software/

Start:
    On each of the machines configured as
        server.1=master-slave01
        server.2=master-slave02
        server.3=master-slave03
    run
    [root@master-slave01 zookeeper-3.4.6]#  bin/zkServer.sh start
    [root@master-slave02 zookeeper-3.4.6]#  bin/zkServer.sh start
    [root@master-slave03 zookeeper-3.4.6]#  bin/zkServer.sh start
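Once all three are started, bin/zkServer.sh status reports a "Mode:" line (leader or follower). A helper to pull it out of the status output (a sketch; `zk_mode` is my own name):

```shell
# zk_mode "STATUS_OUTPUT" - extract the value of the Mode: line
zk_mode() {
  printf '%s\n' "$1" | sed -n 's/^Mode: //p'
}

# e.g.
# zk_mode "$(bin/zkServer.sh status 2>&1)"
```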

VI. Configure hbase
hbase needs 2 config files
[root@master hbase-1.0.1.1]# vim conf/hbase-site.xml

<!--master process server and port-->
<property>
    <name>hbase.master.port</name>
    <value>60000</value>
</property>

<!--maximum allowed clock skew, in ms-->
<property>
    <name>hbase.master.maxclockskew</name>
    <value>180000</value>
</property>

<!--hdfs path for hbase data-->
<property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:9000/hbase</value>
</property>

<!--run mode is distributed-->
<property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
</property>

<!--zookeeper quorum server list; the count must be odd-->
<property> 
    <name>hbase.zookeeper.quorum</name>
    <value>master-slave03,master-slave02,master-slave01</value>
</property>

<!--must match dataDir in zoo.cfg-->
<property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/hadoop/software/zkData</value>
</property>

<!--data replication num-->
<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>

<!--self define hbase coprocessor-->
<!--property>
    <name>hbase.coprocessor.region.classes</name>
    <value>org.apache.hbase.kora.coprocessor.RegionObserverExample</value>
</property-->
<property>
    <name>hbase.master.info.port</name>
    <value>60800</value>
    <description>The port for the HBase Master web UI.
        Set to -1 if you do not want a UI instance run.
    </description>
</property>
<property>
    <name>hbase.wal.provider</name>
    <value>multiwal</value>
</property>
<property>
    <name>hbase.master.distributed.log.splitting</name>
    <value>false</value>
    <description></description>
</property>
<property>
    <name>hbase.hlog.split.skip.errors</name>
    <value>true</value>
</property>
<property>
    <name>hbase.regionserver.handler.count</name>
    <value>200</value>
    <description></description>
</property>
<property>
    <name>hfile.block.cache.size</name>
    <value>0.25</value>
    <description></description>
</property>
<property>
    <name>hbase.bucketcache.ioengine</name>
    <value>offheap</value>
</property>
<property>
    <name>hbase.bucketcache.size</name>
    <value>3072</value>
</property>
<property>
    <name>hbase.hregion.max.filesize</name>
    <value>107374182400</value>
</property>
<property>
    <name>hbase.hregion.memstore.flush.size</name>
    <value>268435456</value>
</property>
<property>
    <name>dfs.datanode.socket.write.timeout</name>
    <value>3000000</value>
</property>
<property>
    <name>dfs.socket.timeout</name>
    <value>180000</value>
</property>
<property>
    <name>hbase.lease.recovery.timeout</name>
    <value>1800000</value>
</property>
<property>
    <name>hbase.lease.recovery.dfs.timeout</name>
    <value>128000</value>
</property>
<property>
    <name>fail.fast.expired.active.master</name>
    <value>true</value>
    <description>If abort immediately for the expired master without trying
  to recover its zk session.</description>
</property>
<property>
    <name>hbase.master.wait.on.regionservers.mintostart</name>
    <value>3</value>
    <description></description>
</property>
<property>
    <name>hbase.hstore.compactionThreshold</name>
    <value>6</value>
    <description></description>
</property>
<property>
    <name>hbase.regionserver.thread.compaction.small</name>
    <value>5</value>
</property>
<property>
    <name>hbase.regionserver.thread.compaction.large</name>
    <value>5</value>
</property>
<property>
    <name>hbase.hstore.blockingStoreFiles</name>
    <value>100</value>
</property>
<property>
    <name>hbase.hregion.majorcompaction</name>
    <value>0</value>
</property>
<!--property>
      <name>hbase.regionserver.codecs</name>
      <value>snappy</value>
</property-->
 <property>
    <name>hbase.block.data.cachecompressed</name>
    <value>true</value>
</property>
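The large byte values above are easier to audit as arithmetic: hbase.hregion.max.filesize is 100 GiB and hbase.hregion.memstore.flush.size is 256 MiB. A quick check as a shell sketch:

```shell
gib=$((1024 * 1024 * 1024))
echo $((100 * gib))           # hbase.hregion.max.filesize -> 107374182400
echo $((256 * 1024 * 1024))   # hbase.hregion.memstore.flush.size -> 268435456
```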
[root@master hbase-1.0.1.1]#  vim conf/regionservers
master-slave01
master-slave02
master-slave03
Start:
    [root@master hbase-1.0.1.1]#  bin/start-hbase.sh 

VII. Done
