[Apache Hadoop Series] Distributed Deployment of Hadoop 1.0.4 + ZooKeeper 3.4.5 + HBase 0.94.5

1. Hadoop Installation and Configuration
1.1 Preparing the Servers

Four Red Hat servers:

192.168.130.170 master
192.168.130.168 dd1
192.168.130.162 dd2
192.168.130.248 dd3
1.2 Installing and Configuring the JDK
Install JDK 1.6 and set the environment variables in /etc/profile.
Steps:
1.2.1 Download:
http://www.oracle.com/technetwork/java/javase/downloads/index.html (pick whichever version you need; I installed JDK 1.6)
After installing, run java -version to verify the installation:
[hadoop@master conf]$ java -version
java version "1.6.0_37"
Java(TM) SE Runtime Environment (build 1.6.0_37-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.12-b01, mixed mode)
1.2.2 Set the environment variables
vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.6.0_37
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
export JAVA_HOME CLASSPATH PATH
1.2.3 After saving /etc/profile, run the following command so the changes take effect immediately:
source /etc/profile
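The java -version check above can be scripted. Below is a minimal sketch; check_java_env is my own helper (not a stock command) for validating JAVA_HOME after sourcing /etc/profile:

```shell
# Sanity-check the JDK setup after `source /etc/profile`.
# check_java_env is a hypothetical helper, not part of the JDK or Hadoop.
check_java_env() {
    jh="$1"
    if [ -z "$jh" ]; then
        echo "JAVA_HOME is not set"
        return 1
    fi
    if [ ! -x "$jh/bin/java" ]; then
        echo "no java binary under $jh/bin"
        return 1
    fi
    echo "JAVA_HOME OK: $jh"
}
# On a node: check_java_env "$JAVA_HOME"
```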
1.3 Software Downloads
Hadoop 1.0.4: http://mirrors.cnnic.cn/apache/hadoop/common/hadoop-1.0.4/hadoop-1.0.4.tar.gz
ZooKeeper 3.4.5: http://apache.dataguru.cn/zookeeper/zookeeper-3.4.5/zookeeper-3.4.5.tar.gz
HBase 0.94.5: http://mirrors.tuna.tsinghua.edu.cn/apache/hbase/hbase-0.94.5/hbase-0.94.5.tar.gz
1.4 Create the hadoop User and Group on All Four Servers and Set Up Passwordless SSH
1.4.1 Create the user and group:
groupadd -g 2000 hadoop
useradd -u 2000 -g hadoop hadoop
passwd hadoop
Enter the password twice; I used hadoop here.
1.4.2 Generate an SSH key pair:
ssh-keygen -t rsa (press Enter three times to accept the defaults)
1.4.3 Append the public key to the authorized keys file:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
1.4.4 Set the permissions of authorized_keys to 644:
chmod 644 ~/.ssh/authorized_keys (make sure authorized_keys is mode 644 on every server)
1.4.5 Verify that passwordless SSH login works:
ssh localhost
On the first login you will be asked whether to continue connecting; answer yes.
1.4.6 Set up passwordless SSH between the servers
Use scp to copy this server's public key to the others:
scp ~/.ssh/id_rsa.pub hadoop@dd1:~/
scp ~/.ssh/id_rsa.pub hadoop@dd2:~/
scp ~/.ssh/id_rsa.pub hadoop@dd3:~/
You will be prompted whether to continue connecting (answer yes) and then for the hadoop user's password (hadoop).
Then run the following on each target server:
cat id_rsa.pub >> ~/.ssh/authorized_keys
rm id_rsa.pub (delete the copy after appending it, so later copies from other servers do not collide or cause errors)

Repeat this for every other pair of servers.
1.4.7 Verify SSH connectivity between the servers
ssh dd1 (answer yes at the first-connection prompt); if you land directly on the target node, the setup works.
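With more than a few nodes, the per-host scp/append/remove dance in 1.4.6 is easier as a loop. A sketch that only prints the commands (dry run) so they can be reviewed before piping to sh; print_key_distribution is my own helper, and the hostnames are the ones used in this article:

```shell
# Print the key-distribution commands for each slave (dry run).
# Review the output, then pipe it to sh to execute for real.
print_key_distribution() {
    user="$1"; shift
    for host in "$@"; do
        echo "scp ~/.ssh/id_rsa.pub ${user}@${host}:~/"
        echo "ssh ${user}@${host} 'cat ~/id_rsa.pub >> ~/.ssh/authorized_keys && rm ~/id_rsa.pub'"
    done
}
# print_key_distribution hadoop dd1 dd2 dd3        # inspect
# print_key_distribution hadoop dd1 dd2 dd3 | sh  # execute
```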
1.5 Edit /etc/hosts
Add every server's hostname to the hosts file on all four machines.

192.168.130.170 master
192.168.130.168 dd1
192.168.130.162 dd2
192.168.130.248 dd3
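Editing /etc/hosts by hand on four machines is error-prone. The fragment below (my own sketch) appends the entries idempotently; the file is a parameter so you can rehearse on a copy before touching /etc/hosts as root:

```shell
# Append the cluster's IP/hostname mappings to a hosts file,
# skipping entries that are already present.
add_cluster_hosts() {
    hosts_file="$1"
    while read -r ip name; do
        grep -q "[[:space:]]$name\$" "$hosts_file" || echo "$ip $name" >> "$hosts_file"
    done <<EOF
192.168.130.170 master
192.168.130.168 dd1
192.168.130.162 dd2
192.168.130.248 dd3
EOF
}
# As root on each server: add_cluster_hosts /etc/hosts
```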
1.6 Configuring Hadoop
As the hadoop user, copy the hadoop, zookeeper, and hbase tarballs to ~/ (for example over FTP).
1.6.1 Unpack the Hadoop tarball
tar -zxvf hadoop-1.0.4.tar.gz
mv hadoop-1.0.4 hadoop
1.6.2 Edit the Hadoop configuration files
(1) conf/hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.6.0_37
(2) conf/core-site.xml
<configuration>
	<property>
		<name>fs.default.name</name>
		<value>hdfs://master:9000</value>
	</property>
	<property>
		<name>hadoop.tmp.dir</name>
		<value>/data/zqhadoop/data</value>
	</property>
</configuration>
(3) conf/hdfs-site.xml
<configuration>
	<property>
		<name>dfs.replication</name>
		<value>3</value>
	</property>
	<property>
		<name>dfs.name.dir</name>
		<value>/home/zqhadoop/HDFS/Namenode</value>
	</property>
	<property>
		<name>dfs.data.dir</name>
		<value>/home/zqhadoop/HDFS/Datanode</value>
	</property>
	<property>
		<name>dfs.permissions</name>
		<value>false</value>
	</property>
	<property>
		<name>dfs.datanode.max.xcievers</name>
		<value>4096</value>
	</property>
</configuration>
(4) conf/mapred-site.xml
<configuration>
	<property>
		<name>mapred.job.tracker</name>
		<value>master:9001</value>
	</property>
</configuration>
(5) The masters and slaves files
masters:
master
slaves:
dd1
dd2
dd3
1.6.3 Copy the configured hadoop directory to the other servers with scp
scp -r ~/hadoop hadoop@dd1:~/
scp -r ~/hadoop hadoop@dd2:~/
scp -r ~/hadoop hadoop@dd3:~/
1.6.4 Add the Hadoop environment variables to /etc/profile on every server
export HADOOP_HOME=/home/hadoop/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
1.6.5 Format the Hadoop NameNode
From the hadoop directory, run:
bin/hadoop namenode -format
1.6.6 Start the Hadoop cluster
bin/start-all.sh
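Section 2.1 below verifies the daemons by reading jps output by eye; the same check can be scripted. A sketch (check_daemons is my own helper) that takes jps output on stdin:

```shell
# Fail if any expected Hadoop daemon is missing from `jps` output.
#   master: jps | check_daemons NameNode SecondaryNameNode JobTracker
#   slave:  jps | check_daemons DataNode TaskTracker
check_daemons() {
    out=$(cat)
    for d in "$@"; do
        if ! printf '%s\n' "$out" | grep -qw "$d"; then
            echo "MISSING: $d"
            return 1
        fi
    done
    echo "all daemons running"
}
```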
2. Verifying the Hadoop Cluster
2.1 Use jps (shipped with the JDK) to check the daemons

master:
[zqhadoop@master bin]$ jps
763 SecondaryNameNode
866 JobTracker
8526 Jps
554 NameNode
slave:
[zqhadoop@dd1 ~]$ jps
19422 Jps
14051 DataNode
14159 TaskTracker
2.2 Run hadoop dfsadmin -report to view a cluster-wide overview
[zqhadoop@master bin]$ ./hadoop dfsadmin -report
Warning: $HADOOP_HOME is deprecated.

Configured Capacity: 95862853632 (89.28 GB)
Present Capacity: 83737403392 (77.99 GB)
DFS Remaining: 83736358912 (77.99 GB)
DFS Used: 1044480 (1020 KB)
DFS Used%: 0%
Under replicated blocks: 1
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 3 (3 total, 0 dead)

Name: 192.168.130.248:50010
Decommission Status : Normal
Configured Capacity: 31954284544 (29.76 GB)
DFS Used: 348160 (340 KB)
Non DFS Used: 3638841344 (3.39 GB)
DFS Remaining: 28315095040(26.37 GB)
DFS Used%: 0%
DFS Remaining%: 88.61%
Last contact: Tue Feb 19 10:42:48 CST 2013

Name: 192.168.130.168:50010
Decommission Status : Normal
Configured Capacity: 31954284544 (29.76 GB)
DFS Used: 348160 (340 KB)
Non DFS Used: 3718184960 (3.46 GB)
DFS Remaining: 28235751424(26.3 GB)
DFS Used%: 0%
DFS Remaining%: 88.36%
Last contact: Tue Feb 19 10:42:49 CST 2013

Name: 192.168.130.162:50010
Decommission Status : Normal
Configured Capacity: 31954284544 (29.76 GB)
DFS Used: 348160 (340 KB)
Non DFS Used: 4768423936 (4.44 GB)
DFS Remaining: 27185512448(25.32 GB)
DFS Used%: 0%
DFS Remaining%: 85.08%
Last contact: Tue Feb 19 10:42:48 CST 2013
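To assert that every slave registered, the "Datanodes available" line of the report can be extracted; a small sketch (live_datanodes is my own helper):

```shell
# Print the live-datanode count from `hadoop dfsadmin -report`
# output supplied on stdin.
live_datanodes() {
    sed -n 's/^Datanodes available: \([0-9]*\).*/\1/p'
}
# bin/hadoop dfsadmin -report | live_datanodes   # expect 3 for this cluster
```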
2.3 Verify Hadoop via the web UIs
The cluster and datanode status pages are available at the following addresses, where master-IP is the NameNode host's IP address:

http://master-IP:50030 (JobTracker: MapReduce job status)

http://master-IP:50070 (NameNode: overall HDFS status, plus browsing the distributed file system and its logs)



3. ZooKeeper Installation and Configuration
3.1 Unpack the ZooKeeper tarball
tar -zxvf zookeeper-3.4.5.tar.gz
mv zookeeper-3.4.5 zookeeper
3.2 Configure zoo.cfg
From the zookeeper/conf directory, run:
cp zoo_sample.cfg zoo.cfg
Edit zoo.cfg as follows:

dataDir=/home/zqhadoop/zookeeper_data
dataLogDir=/home/zqhadoop/zookeeper_log
clientPort=2181
initLimit=10
syncLimit=5
tickTime=2000
server.1=192.168.130.170:2888:3888
server.2=192.168.130.168:2888:3888
server.3=192.168.130.162:2888:3888
server.4=192.168.130.248:2888:3888
3.3 Create the dataDir directory ("/home/zqhadoop/zookeeper_data", as set in zoo.cfg above) and a file named myid inside it
mkdir /home/zqhadoop/zookeeper_data
cd /home/zqhadoop/zookeeper_data
3.4 Edit the myid file on each machine so it holds that machine's server number from zoo.cfg:
on 192.168.130.170 the file contains 1, on 192.168.130.168 it contains 2, and so on.
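The IP-to-myid mapping is easy to get wrong across four machines. A sketch that derives the id for a given IP straight from zoo.cfg (myid_for is my own helper):

```shell
# Print the server number N from the `server.N=<ip>:...` line in
# zoo.cfg that matches the given IP; write it to $dataDir/myid.
myid_for() {
    ip="$1"; cfg="$2"
    sed -n "s/^server\.\([0-9]*\)=${ip}:.*/\1/p" "$cfg"
}
# myid_for 192.168.130.168 ~/zookeeper/conf/zoo.cfg > /home/zqhadoop/zookeeper_data/myid
```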
3.5 Copy the configured zookeeper directory to the other servers with scp
scp -r zookeeper hadoop@dd1:~/
scp -r zookeeper hadoop@dd2:~/
scp -r zookeeper hadoop@dd3:~/

scp -r zookeeper_data hadoop@dd1:~/
scp -r zookeeper_data hadoop@dd2:~/
scp -r zookeeper_data hadoop@dd3:~/
Then edit the myid file in each server's zookeeper_data directory so it matches zoo.cfg.
3.6 Start ZooKeeper
ZooKeeper must be started on every server individually:
/home/hadoop/zookeeper/bin/zkServer.sh start
4. Verifying the ZooKeeper Installation
4.1 Use jps to check that the service started

[hadoop@master conf]$ jps
763 SecondaryNameNode
866 JobTracker
9203 Jps
1382 QuorumPeerMain
554 NameNode
Note the extra QuorumPeerMain process in the jps output.
4.2 Run zkServer.sh status (in zookeeper/bin) to check the service state
[zqhadoop@master bin]$ ./zkServer.sh status
JMX enabled by default
Using config: /home/zqhadoop/zookeeper/bin/../conf/zoo.cfg
Mode: follower
5. HBase Installation and Configuration
5.1 Unpack the HBase tarball

tar -zxvf hbase-0.94.5.tar.gz
mv hbase-0.94.5 hbase
5.2 Edit the HBase configuration files
(1) conf/hbase-site.xml

<configuration>
	<property>
		<name>hbase.rootdir</name>
		<value>hdfs://master:9000/hbase</value>
	</property>
	<property>
		<name>hbase.cluster.distributed</name>
		<value>true</value>
	</property>
	<property>
		<name>hbase.zookeeper.quorum</name>
		<value>dd1,dd2,dd3</value>
	</property>
	<property>
		<name>hbase.zookeeper.session.timeout</name>
		<value>60000</value>
	</property>
	<property>
		<name>hbase.zookeeper.property.clientPort</name>
		<value>2181</value>
	</property>
	<property>
		<name>hbase.master</name>
		<value>master</value>
	</property>
	<property>
		<name>hbase.regionserver.lease.period</name>
		<value>60000</value>
	</property>
	<property>
		<name>hbase.rpc.timeout</name>
		<value>60000</value>
	</property>
	<property>
		<name>hbase.master.maxclockskew</name>
		<value>180000</value>
	</property>
</configuration>

  • hbase.rootdir: HBase's root directory on HDFS; the hostname must be that of the HDFS NameNode
  • hbase.cluster.distributed: true means a fully distributed HBase cluster
  • hbase.zookeeper.quorum: the ZooKeeper hosts; an odd number of hosts is recommended
  • hbase.zookeeper.property.dataDir: the ZooKeeper data path
  • hbase.master.maxclockskew: the maximum allowed clock skew between servers
  • hbase.zookeeper.session.timeout: the ZooKeeper session timeout
  • hbase.rpc.timeout: the HBase RPC timeout
  • hbase.zookeeper.property.clientPort: the ZooKeeper client port
(2) conf/regionservers
dd1
dd2
dd3
(3) conf/hbase-env.sh
export JAVA_HOME=/usr/java/jdk1.6.0_37
export HBASE_MANAGES_ZK=false
(4) Add the following to Hadoop's conf/hdfs-site.xml:
<property>
	<name>dfs.datanode.max.xcievers</name>
	<value>4096</value>
</property>
This parameter caps the number of simultaneous send and receive threads a datanode may run. The default is 256, and hadoop-defaults.xml normally leaves it unset. In practice that limit is too low: under heavy load, DFSClient throws "could not read from stream" exceptions while writing data.
An Hadoop HDFS datanode has an upper bound on the number of files that it will serve at any one time. The upper bound parameter is called xcievers (yes, this is misspelled).
Not having this configuration in place makes for strange looking failures. Eventually you’ll see a complain in the datanode logs complaining about the xcievers exceeded, but on the run up to this one manifestation is complaint about missing blocks. For example: 10/12/08 20:10:31 INFO hdfs.DFSClient: Could not obtain block blk_XXXXXXXXXXXXXXXXXXXXXX_YYYYYYYY from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry…
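A quick grep-based check (my own sketch, assuming the property name and value sit on adjacent lines as in the snippet above) confirms the limit was actually raised:

```shell
# Return success if hdfs-site.xml sets dfs.datanode.max.xcievers >= 4096.
xcievers_ok() {
    cfg="$1"
    v=$(grep -A1 '<name>dfs.datanode.max.xcievers</name>' "$cfg" \
        | sed -n 's/.*<value>\([0-9]*\)<\/value>.*/\1/p')
    [ -n "$v" ] && [ "$v" -ge 4096 ]
}
# xcievers_ok ~/hadoop/conf/hdfs-site.xml && echo "xcievers OK"
```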
5.3 Delete hadoop-core-1.0.X.jar and commons-collections-3.2.1.jar from hbase/lib, then copy the hadoop-core jar and commons-collections-3.2.1.jar from the hadoop installation into hbase/lib. This keeps the Hadoop and HBase jar versions in sync and avoids version-incompatibility problems.
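The jar swap above can likewise be expressed as a dry-run script; print_jar_sync is my own helper, and the paths assume this article's layout (~/hadoop, ~/hbase):

```shell
# Print the commands that replace HBase's bundled hadoop-core and
# commons-collections jars with the ones from the running Hadoop.
# Inspect the output first, then pipe it to sh to execute.
print_jar_sync() {
    hadoop_home="$1"; hbase_home="$2"
    echo "rm -f ${hbase_home}/lib/hadoop-core-*.jar ${hbase_home}/lib/commons-collections-*.jar"
    echo "cp ${hadoop_home}/hadoop-core-*.jar ${hbase_home}/lib/"
    echo "cp ${hadoop_home}/lib/commons-collections-*.jar ${hbase_home}/lib/"
}
# print_jar_sync ~/hadoop ~/hbase | sh
```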
5.4 Check whether /etc/hosts binds the hostname to 127.0.0.1, and comment that entry out if it does.
The official documentation explains:
Before we proceed, make sure you are good on the below loopback prerequisite.
Loopback IP
HBase expects the loopback IP address to be 127.0.0.1. Ubuntu and some other distributions, for example, will default to 127.0.1.1 and this will cause problems for you.
/etc/hosts should look something like this:
127.0.0.1 localhost
127.0.0.1 ubuntu.ubuntu-domain ubuntu
As the documentation says, HBase expects the loopback address to be 127.0.0.1, with /etc/hosts laid out as above (note the added "127.0.0.1 localhost" entry). In my case there was no need to add anything: simply commenting out the offending line solved the problem.
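The fix can be applied with one sed command. A sketch (comment_loopback is my own helper) that comments out every 127.0.0.1 line, which is what the error record below does by hand; it operates on the file given, so you can rehearse on a copy of /etc/hosts first:

```shell
# Prefix every IPv4 loopback line in a hosts file with '#',
# keeping the original as <file>.bak.
comment_loopback() {
    sed -i.bak 's/^127\.0\.0\.1/#&/' "$1"
}
# As root: comment_loopback /etc/hosts
```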

Error log:
Case 1:
2012-12-19 22:11:46,018 INFO org.apache.zookeeper.client.ZooKeeperSaslClient: Client will not SASL-authenticate because the default JAAS configuration section ‘Client’ could not
be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
2012-12-19 22:11:46,024 WARN org.apache.zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
Fix:
Edit /etc/hosts
and comment out the first line shown below:

#127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.130.170 master
192.168.130.168 dd1
192.168.130.162 dd2
192.168.130.248 dd3
192.168.130.164 dd4
6. Verifying HBase
6.1 Check the running processes with jps

masters:
[zqhadoop@master hbase]$ jps
763 SecondaryNameNode
9326 Jps
866 JobTracker
3207 HMaster
1382 QuorumPeerMain
554 NameNode
slaves:
[zqhadoop@dd1 ~]$ jps
15065 HRegionServer
14051 DataNode
14159 TaskTracker
14447 QuorumPeerMain
20157 Jps
6.2 Run hbase shell to check that the connection works
[zqhadoop@master bin]$ ./hbase shell
HBase Shell; enter 'help' for list of supported commands.
Type "exit" to leave the HBase Shell
Version 0.94.5, r1443843, Fri Feb 8 05:51:25 UTC 2013

hbase(main):001:0>
6.3 Verify HBase via its web UI
