Hadoop 2.2.0 + HBase 0.96.1.1 Deployment Notes

Hadoop 2.2.0 Deployment Guide


I. Preparation:

1. Machines:

IP (alias)              user/passwd       hostname                  role

*172.16.16.31 (蕭何)    lscm/izenexxxx    B5M-0169                  nn/snn/rm

*172.16.16.29 (大哥)    lscm/izenexxxx    oscarshan-OptiPlex-990    dn/nm

*172.16.16.30 (張清)    lscm/izenexxxx    Caliph                    dn/nm

*172.16.16.47 (慕容)    lscm/izenexxxx    B5M-0213                  dn/nm

nn: NameNode;

snn: SecondaryNameNode;

rm: ResourceManager;

dn: DataNode;

nm: NodeManager;

The hostname can be changed in the /etc/hostname file. These machines are borrowed workstations, so I left the names alone to avoid affecting their owners; in a real deployment, at least for readability, rename them to something you like, such as Cloud1, Cloud2, Cloud3;

Create users: as things stand, 172.16.16.31 and 172.16.16.47 already have an lscm account with the same password and administrator rights, so use those; on 172.16.16.29 and 172.16.16.30, create a new lscm account with the same password and administrator rights. PS: a user created with the useradd command has no home directory, so do not use it; create users with the adduser command instead;

Grant privileges: edit the /etc/sudoers file (command: sudo vi /etc/sudoers) and add the line lscm ALL=(ALL:ALL) ALL;

Bind hostnames: edit the /etc/hosts file and add the following 4 lines:

172.16.16.31    B5M-0169

172.16.16.29    oscarshan-OptiPlex-990

172.16.16.30    Caliph

172.16.16.47    B5M-0213

Set up passwordless SSH: enable mutual passwordless login among the 4 machines as follows:

1). Install the ssh tools: sudo apt-get install openssh-server and sudo apt-get install ssh;

2). Run the following two commands in order:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

3). Merge the contents of the ~/.ssh/authorized_keys files from all 4 machines into a single file, then put that merged file back in place on each machine;

4). Edit the /etc/ssh/ssh_config file and append two lines:

StrictHostKeyChecking no

UserKnownHostsFile /dev/null

5). With that, the 4 machines can log in to each other without passwords;
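Step 3) above (pushing the merged authorized_keys file back out) can be scripted. A minimal sketch, using this cluster's hostnames and the lscm account as assumptions; the loop only prints the scp commands, delete the leading "echo" to actually run them:

```shell
# This cluster's hosts and account (assumptions from the table above; adjust).
HOSTS="B5M-0169 oscarshan-OptiPlex-990 Caliph B5M-0213"
HUSER=lscm

# Print the scp command that would push the merged authorized_keys to each
# machine; delete the leading "echo" to actually run them.
for h in $HOSTS; do
  echo scp ~/.ssh/authorized_keys "$HUSER@$h:~/.ssh/authorized_keys"
done
```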

2. JDK installation: omitted;

II. Installing and Configuring Hadoop 2.2.0

1. Download:

Visit http://hadoop.apache.org/ -> Releases (left side) -> Download (right side) -> "Download a release now!" -> "suggested mirror" -> stable/ -> hadoop-2.2.0.tar.gz, and download the package; the source lives in the same directory and can be downloaded alongside for study;

2. Install:

Extract the downloaded hadoop-2.2.0.tar.gz into ~/hadoop220, so HADOOP_HOME looks like this: ~/hadoop220/hadoop-2.2.0;

Create three directories to hold future data:

~/hadoop220/dfs/name

~/hadoop220/dfs/data

~/hadoop220/temp

3. Configure:

7 configuration files need to be changed in total:

~/hadoop220/hadoop-2.2.0/etc/hadoop/hadoop-env.sh

~/hadoop220/hadoop-2.2.0/etc/hadoop/yarn-env.sh

~/hadoop220/hadoop-2.2.0/etc/hadoop/slaves

~/hadoop220/hadoop-2.2.0/etc/hadoop/core-site.xml

~/hadoop220/hadoop-2.2.0/etc/hadoop/hdfs-site.xml

~/hadoop220/hadoop-2.2.0/etc/hadoop/mapred-site.xml

~/hadoop220/hadoop-2.2.0/etc/hadoop/yarn-site.xml

PS: some of the .xml files do not exist yet; copy them from the corresponding .template files;
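That template-copy step can be captured in a tiny helper; a sketch, where the HADOOP_CONF default is this guide's install path (an assumption, adjust to yours):

```shell
# Assumed conf directory from this guide; override via the environment.
HADOOP_CONF="${HADOOP_CONF:-$HOME/hadoop220/hadoop-2.2.0/etc/hadoop}"

# Create name.xml from name.xml.template, but only if name.xml is missing,
# so an already-edited file is never clobbered.
make_from_template() {
  [ -f "$1.xml" ] || cp "$1.xml.template" "$1.xml"
}

# e.g.: (cd "$HADOOP_CONF" && make_from_template mapred-site)
```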

1). Configuration file hadoop-env.sh:

Set the JAVA_HOME value: export JAVA_HOME=/home/lscm/installedprogrames/jdk1630/jdk1.6.0_30

2). Configuration file yarn-env.sh:

Set the JAVA_HOME value: export JAVA_HOME=/home/lscm/installedprogrames/jdk1630/jdk1.6.0_30

3). Configuration file slaves:

Write the following content:

oscarshan-OptiPlex-990

Caliph

B5M-0213

4). Configuration file core-site.xml:

<configuration>

<property>

<name>fs.defaultFS</name>

<value>hdfs://B5M-0169:9000</value>

</property>

<property>

<name>io.file.buffer.size</name>

<value>131072</value>

</property>

<property>

<name>hadoop.tmp.dir</name>

<value>file:/home/lscm/tmp</value>

<description>A base for other temporary directories.</description>

</property>

<property>

<name>hadoop.proxyuser.lscm.hosts</name>

<value>*</value>

</property>

<property>

<name>hadoop.proxyuser.lscm.groups</name>

<value>*</value>

</property>

</configuration>

5). Configuration file hdfs-site.xml:

<configuration>

<property>

<name>dfs.namenode.secondary.http-address</name>

<value>B5M-0169:9001</value>

</property>

<property>

<name>dfs.namenode.name.dir</name>

<value>file:/home/lscm/hadoop220/dfs/name</value>

</property>

<property>

<name>dfs.datanode.data.dir</name>

<value>file:/home/lscm/hadoop220/dfs/data</value>

</property>

<property>

<name>dfs.replication</name>

<value>3</value>

</property>

<property>

<name>dfs.webhdfs.enabled</name>

<value>true</value>

</property>

<property>

<name>dfs.datanode.max.xcievers</name>

<value>4096</value>

</property>

</configuration>

6). Configuration file mapred-site.xml:

<configuration>

<property>

<name>mapreduce.framework.name</name>

<value>yarn</value>

</property>

<property>

<name>mapreduce.jobhistory.address</name>

<value>B5M-0169:10020</value>

</property>

<property>

<name>mapreduce.jobhistory.webapp.address</name>

<value>B5M-0169:19888</value>

</property>

</configuration>

7). Configuration file yarn-site.xml:

<configuration>

<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>

<property>

<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>

<value>org.apache.hadoop.mapred.ShuffleHandler</value>

</property>

<property>

<name>yarn.resourcemanager.address</name>

<value>B5M-0169:8032</value>

</property>

<property>

<name>yarn.resourcemanager.scheduler.address</name>

<value>B5M-0169:8030</value>

</property>

<property>

<name>yarn.resourcemanager.resource-tracker.address</name>

<value>B5M-0169:8031</value>

</property>

<property>

<name>yarn.resourcemanager.admin.address</name>

<value>B5M-0169:8033</value>

</property>

<property>

<name>yarn.resourcemanager.webapp.address</name>

<value>B5M-0169:8088</value>

</property>

</configuration>

Copy the above 7 configuration files to the corresponding path on every other node. PS: one point worth noting: if all machines share the same environment paths, the 7 files are identical everywhere, so you can configure once and copy wholesale; if the environments differ, e.g. different JAVA_HOME values, then hadoop-env.sh and yarn-env.sh must be configured separately on each machine;
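The wholesale copy can be done with a loop over the slaves. A sketch under the assumption of identical install paths and the lscm account; it prints the scp commands, delete the leading "echo" to actually run them:

```shell
# Assumed conf directory and slave list from this guide; adjust as needed.
CONF_DIR="$HOME/hadoop220/hadoop-2.2.0/etc/hadoop"
SLAVES="oscarshan-OptiPlex-990 Caliph B5M-0213"

# Print the scp command that would push the whole conf directory to each
# slave; delete the leading "echo" to actually run them.
for h in $SLAVES; do
  echo scp -r "$CONF_DIR" "lscm@$h:$HOME/hadoop220/hadoop-2.2.0/etc/"
done
```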

III. Starting and Using:

Enter the HADOOP_HOME directory: cd ~/hadoop220/hadoop-2.2.0/

1). Format the namenode: ./bin/hdfs namenode -format

2). Start HDFS: ./sbin/start-dfs.sh

After a successful start, the processes running on B5M-0169 are NameNode and SecondaryNameNode; the process running on oscarshan-OptiPlex-990, Caliph, and B5M-0213 is DataNode;

3). Start YARN: ./sbin/start-yarn.sh

Now the processes running on B5M-0169 are NameNode, SecondaryNameNode, and ResourceManager; the processes running on oscarshan-OptiPlex-990, Caliph, and B5M-0213 are DataNode and NodeManager;
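Checking those process lists by eye with jps gets tedious across four machines; a small helper (a sketch, not part of Hadoop) can assert the expected daemons are present:

```shell
# Read `jps` output from stdin and check each daemon name given as an
# argument appears in it; reports the first missing one and fails.
check_daemons() {
  out=$(cat)                        # the jps output
  for d in "$@"; do
    echo "$out" | grep -q "$d" || { echo "missing: $d"; return 1; }
  done
  echo "all daemons present"
}

# e.g. on the master:  jps | check_daemons NameNode SecondaryNameNode ResourceManager
# e.g. on a slave:     jps | check_daemons DataNode NodeManager
```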

Check cluster status: ./bin/hdfs dfsadmin -report
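The report is long; to watch just the number of live DataNodes, a small filter helps. A sketch, assuming the report contains a "Datanodes available: N (...)" line as in Hadoop 2.2.0:

```shell
# Pull the live-DataNode count out of `hdfs dfsadmin -report` (reads stdin),
# by matching the "Datanodes available: N" prefix and keeping the number.
count_live_dn() {
  grep -o 'Datanodes available: [0-9]*' | grep -o '[0-9]*$'
}

# e.g.: ./bin/hdfs dfsadmin -report | count_live_dn   # this cluster should report 3
```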

View HDFS: http://172.16.16.31:50070

View the ResourceManager: http://172.16.16.31:8088

Appendix: common hdfs commands:

./hadoop fs -ls

./hadoop fs -ls input

./hadoop fs -lsr input

./hadoop fs -mkdir input/data

./hadoop fs -put test.txt input/data

./hadoop fs -cat input/data/test.txt

./hadoop fs -get input/data/test.txt ./

./hadoop fs -tail input/data/test.txt

./hadoop fs -rm input/data/test.txt

./hadoop fs -help ls








HBase 0.96.1.1 Deployment Guide:

I. Preparation: same as the Hadoop 2.2.0 section;

II. Installing and Configuring HBase 0.96.1.1

1. Download:

Visit http://hbase.apache.org -> Downloads (left side) -> "suggested mirror" -> hbase-0.96.1.1 -> hbase-0.96.1.1-hadoop2-bin.tar.gz, and download the package; the source ships in the same archive;

2. Install

Extract the downloaded hbase-0.96.1.1-hadoop2-bin.tar.gz into ~/hbase09611, so HBASE_HOME looks like this: ~/hbase09611/hbase-0.96.1.1-hadoop2;

3. Configure

A Hadoop HDFS DataNode has an upper limit on the number of files it can serve at the same time. Before configuring HBase, confirm that the xceivers value in Hadoop's etc/hadoop/hdfs-site.xml is at least 4096:

<property>

<name>dfs.datanode.max.xcievers</name>

<value>4096</value>

</property>

Then restart the Hadoop HDFS system;

3 HBase configuration files need to be changed in total:

~/hbase09611/hbase-0.96.1.1-hadoop2/conf/hbase-env.sh

~/hbase09611/hbase-0.96.1.1-hadoop2/conf/hbase-site.xml

~/hbase09611/hbase-0.96.1.1-hadoop2/conf/regionservers

1). Configuration file hbase-env.sh:

export JAVA_HOME=/home/lscm/jdk1630/jdk1.6.0_30

export HBASE_MANAGES_ZK=true (this makes HBase manage ZooKeeper itself)

2). Configuration file hbase-site.xml:

<configuration>

<property>

<name>hbase.rootdir</name>

<value>hdfs://B5M-0169:9000/hbase</value>

</property>

<property>

<name>hbase.cluster.distributed</name>

<value>true</value>

</property>

<property>

<name>hbase.zookeeper.property.clientPort</name>

<value>2222</value>

</property>

<property>

<name>hbase.zookeeper.quorum</name>

<value>oscarshan-OptiPlex-990,Caliph,B5M-0213</value>

</property>

<property>

<name>hbase.zookeeper.property.dataDir</name>

<value>/home/lscm/hbase09611/zookeeper</value>

</property>

</configuration>

3). Configuration file regionservers:

B5M-0169

oscarshan-OptiPlex-990

Caliph

B5M-0213

III. Starting and Using:

First confirm that the HDFS system is running; if not, start it with ./start-dfs.sh under ~/hadoop220/hadoop-2.2.0/sbin/;

1). Start HBase:

Run the ./start-hbase.sh script under ~/hbase09611/hbase-0.96.1.1-hadoop2/bin;

Now the process running on B5M-0169 is HMaster; the processes running on oscarshan-OptiPlex-990, Caliph, and B5M-0213 are HRegionServer and HQuorumPeer;

Visit the status page http://172.16.16.31:60010 to see HBase's parameters and properties;

You can also run ./hbase shell under ~/hbase09611/hbase-0.96.1.1-hadoop2/bin to enter HBase's interactive command line;

Appendix: common HBase commands:

create 'test2', 'cf'

list

list 'test2'

put 'test2', 'row1', 'cf:a', 'value1'

scan 'test2'

disable 'test2'

drop 'test2'

Appendix: the most basic Java code for operating the database:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class Demo {

    @SuppressWarnings("deprecation")
    public static void main(String args[]) {
        try {
            // Load the HBase configuration
            Configuration config = HBaseConfiguration.create();
            config.set("hbase.master", "172.16.16.31");
            config.set("hbase.master.port", "60010");
            config.set("hbase.zookeeper.quorum", "172.16.16.29,172.16.16.30,172.16.16.47");
            config.set("hbase.zookeeper.property.clientPort", "2222");

            HBaseAdmin admin = new HBaseAdmin(config); // create an admin client
            if (admin.tableExists("info")) {           // does the target table already exist?
                System.out.println("Target table exists, dropping it...");
                admin.disableTable("info");            // disable the table
                admin.deleteTable("info");             // delete the table
            }

            System.out.println("create table ------------------------------------------------");
            // Build a descriptor for the new "info" table
            HTableDescriptor tableDescripter = new HTableDescriptor("info".getBytes());
            tableDescripter.addFamily(new HColumnDescriptor("details")); // add a column family
            admin.createTable(tableDescripter);        // create the table from the descriptor

            HTable htable = new HTable(config, "info"); // handle on the "info" table
            htable.setAutoFlush(true, true);
            htable.setWriteBufferSize(1024 * 1024 * 100);

            byte[] rowkey = Bytes.toBytes("rowkey");
            Put put = new Put(rowkey);
            put.add(Bytes.toBytes("details"), Bytes.toBytes("name"), Bytes.toBytes("liuyue"));
            put.add(Bytes.toBytes("details"), Bytes.toBytes("num"), Bytes.toBytes("24"));
            put.add(Bytes.toBytes("details"), Bytes.toBytes("time"), Bytes.toBytes("2011"));
            htable.put(put);
            htable.flushCommits();

            Get get = new Get(Bytes.toBytes("rowkey")); // fetch one row by its key
            Result r = htable.get(get);                 // the returned row
            byte[] value1 = r.getValue(Bytes.toBytes("details"), Bytes.toBytes("name"));
            byte[] value2 = r.getValue(Bytes.toBytes("details"), Bytes.toBytes("num"));
            byte[] value3 = r.getValue(Bytes.toBytes("details"), Bytes.toBytes("time"));
            System.out.println(new String(value1));
            System.out.println(new String(value2));
            System.out.println(new String(value3));

            htable.close();
            admin.close();
        }
        catch (Exception e) {
            e.printStackTrace();
        }
    }
}


hbase(main):010:0> scan 'info'

ROW                 COLUMN+CELL

 rowkey             column=details:name, timestamp=1390706177556, value=liuyue

 rowkey             column=details:num, timestamp=1390706177556, value=24

 rowkey             column=details:time, timestamp=1390706177556, value=2011

1 row(s) in 0.0090 seconds
