Setting up an Eclipse HBase development environment on Ubuntu Linux in a virtual machine

Reposted from: http://blog.csdn.net/linhx/article/details/6965154

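Requirements: the Hadoop/HBase cluster runs in a remote data center; the development environment is set up on Ubuntu inside a local virtual machine.

1. Create an Ubuntu Linux guest in a virtual machine product such as VMware, Virtual PC, or similar.

2. Download the Eclipse JEE edition. Helios is a good choice. The JEE edition is strongly recommended; once you start real development you will find it saves a lot of effort.

3. Configure the Hadoop development environment

4. Configure the HBase environment

4.1 Create a new project

4.2 Put the hbase, hadoop, log4j, commons-logging, commons-lang, and ZooKeeper jars on the classpath. The usual way is to create a lib folder in the project at the same level as src, copy the jars into it, and then add them to the Java Build Path.

For example: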

hadoop-0.20.205-core.jar

log4j-1.2.16.jar

commons-logging-1.1.1.jar

hbase-0.90.4.jar

hbase-0.90.4-tests.jar

zookeeper-3.2.2.jar

commons-lang-2.5.jar
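A quick way to confirm that the jars really ended up on the build path is to try loading one well-known class from each of them. The following is a minimal sketch, not part of the original article: the class name `ClasspathCheck` is hypothetical, and the six class names are one representative class per jar in the list above.

```java
import java.util.ArrayList;
import java.util.List;

/** Reports which of the expected classes cannot be loaded from the classpath. */
class ClasspathCheck {

    /** Returns true if the named class is visible to this class's class loader. */
    static boolean isPresent(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns the subset of classNames that could not be found, in input order. */
    static List<String> missing(String... classNames) {
        List<String> result = new ArrayList<>();
        for (String name : classNames) {
            if (!isPresent(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> missing = missing(
                "org.apache.hadoop.conf.Configuration",        // hadoop core jar
                "org.apache.hadoop.hbase.HBaseConfiguration",  // hbase jar
                "org.apache.zookeeper.ZooKeeper",              // zookeeper jar
                "org.apache.log4j.Logger",                     // log4j jar
                "org.apache.commons.logging.LogFactory",       // commons-logging jar
                "org.apache.commons.lang.StringUtils");        // commons-lang 2.x jar
        System.out.println(missing.isEmpty()
                ? "All expected classes found."
                : "Missing from classpath: " + missing);
    }
}
```

Run it from inside the Eclipse project (Run As -> Java Application) so that it sees the same build path as the HBase code.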

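4.3 Create a conf folder in the project at the same level as src to hold the contents of HBase's conf directory, then add it to the Java Build Path via Add Class Folder.

Of course, another option is to put both the lib from 4.2 and the conf from 4.3 into a referenced library and add that library to the build path, which keeps the project looking clean and tidy. To do this, first define a user library in Preferences, then import it via the project's Properties -> Java Build Path -> Libraries -> Add Library.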
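Whichever approach is used, what matters is that hbase-site.xml ends up on the classpath, because that is where HBaseConfiguration.create() looks for it. A minimal sketch for checking this, using only the JDK (the class name `ConfOnClasspath` is hypothetical, and the three file names are assumptions about what a typical HBase 0.90 setup provides):

```java
import java.net.URL;

/** Checks whether configuration files are visible as classpath resources. */
class ConfOnClasspath {

    /** Returns the URL of the named resource, or null if it is not on the classpath. */
    static URL locate(String resourceName) {
        return ConfOnClasspath.class.getClassLoader().getResource(resourceName);
    }

    public static void main(String[] args) {
        // hbase-site.xml and log4j.properties normally come from the conf folder;
        // hbase-default.xml is expected inside the hbase jar itself.
        String[] names = {"hbase-site.xml", "hbase-default.xml", "log4j.properties"};
        for (String name : names) {
            URL url = locate(name);
            System.out.println(name + " -> " + (url == null ? "NOT FOUND on classpath" : url));
        }
    }
}
```

If hbase-site.xml is reported as not found, the client will probably fall back to default settings and try to reach ZooKeeper on localhost rather than on the remote cluster.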

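See the figures below:

The first figure shows both approaches at once for the sake of illustration; they can be told apart by the width of the boxes.

The second figure comes from a fellow developer's article: http://www.sujee.net/tech/articles/hbase-map-reduce-freq-counter/ . He put conf into the library as well. Whether that is good or bad is for you to judge.

Incidentally, the order shown under Order and Export matters, although it is often overlooked. When several classes share the same name, Eclipse uses this order for its import suggestions, so when many libraries are on the build path it is worth spending a minute thinking about their order. This is especially true if, like me, you are fond of Ctrl+Shift+O.

Order is the order in which classes with the same name are picked up; Export means that the libraries and projects this project uses are published along with it.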

Say you have junit.jar in the build path of project A, and project B depends on project A.

Now you write a JUnit test in project B. If project A exports junit.jar, project B can use it at compile time with no further action. If A does not export it, B does not know about it, and you will have to put it into B's build path explicitly as well.
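When two jars on the build path contain a class with the same name, it can help to see which jar a class was actually loaded from at run time. The following is a small sketch using only the JDK; the class name `WhereFrom` is hypothetical and not from the original article.

```java
import java.security.CodeSource;

/** Prints the jar or folder a class was loaded from. */
class WhereFrom {

    static final String JDK_CLASS = "(JDK class, no code source)";

    /** Returns the location a class was loaded from, or a placeholder for core JDK classes. */
    static String originOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        return source == null ? JDK_CLASS : String.valueOf(source.getLocation());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass fully qualified class names as arguments,
        // for example org.apache.commons.logging.LogFactory.
        for (String name : args) {
            System.out.println(name + " <- " + originOf(Class.forName(name)));
        }
    }
}
```

If the printed location is not the jar you expected, reordering the entries under Order and Export is one thing to try.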

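4.4 Create an HBase operation class. If it runs successfully, the corresponding table is created in HBase, and the console prints output you can use for analysis.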


package com.ibm.bi.hbase;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class TableOperation {

    /**
     * Creates a table, then runs a put, a get, and a scan against it.
     *
     * @param args not used
     */
    public static void main(String[] args) throws Exception {
        // Reads hbase-default.xml and hbase-site.xml from the classpath (see 4.3).
        Configuration config = HBaseConfiguration.create();

        // Create table "test" with a single column family "data".
        HBaseAdmin admin = new HBaseAdmin(config);
        HTableDescriptor htd = new HTableDescriptor("test");
        HColumnDescriptor hcd = new HColumnDescriptor("data");
        htd.addFamily(hcd);
        admin.createTable(htd);
        byte[] tablename = htd.getName();

        // Verify the table now exists. The original check
        // (tables.length != 1 && Bytes.equals(...)) could never detect a failure
        // and assumed the cluster holds no other tables.
        if (!admin.tableExists(tablename)) {
            throw new IOException("Failed create of table");
        }

        // Run some operations -- a put, a get, and a scan -- against the table.
        HTable table = new HTable(config, tablename);
        byte[] row1 = Bytes.toBytes("row1");
        Put p1 = new Put(row1);
        byte[] databytes = Bytes.toBytes("data");
        // Column family "data", qualifier "1", value "value1".
        p1.add(databytes, Bytes.toBytes("1"), Bytes.toBytes("value1"));
        table.put(p1);

        Get g = new Get(row1);
        Result result = table.get(g);
        System.out.println("Get: " + result);

        Scan scan = new Scan();
        ResultScanner scanner = table.getScanner(scan);
        try {
            for (Result scannerResult : scanner) {
                System.out.println("Scan: " + scannerResult);
            }
        } finally {
            // Always release the scanner's server-side resources.
            scanner.close();
        }

        // Drop the table (uncomment so the program can be run again;
        // createTable fails if "test" already exists).
        // admin.disableTable(tablename);
        // admin.deleteTable(tablename);
    }
}

4.5 Troubleshooting

(1) An error similar to the following:

ERROR: org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to connect to ZooKeeper but the connection closes immediately. This could be a sign that the server has too many connections (30 is the default). Consider inspecting your ZK server logs for that error and then make sure you are reusing HBaseConfiguration as often as you can. See HTable's javadoc for more information.

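This message is only the symptom. A careful look at the console log output shows that the real problem is a connection timeout, so the next step was to check whether the virtual machine could reach the HBase master at all. The cause turned out to be hostname resolution: the remote servers have two network interfaces, so the hostnames in the Hadoop and HBase configuration files cannot be reached from the external network. There are two ways to fix this: 1) change the hostname of the remote servers to one that is also valid externally. This is troublesome and carries a high risk of office politics, since your boss and the system administrators will certainly challenge you on why you are doing it, so have your answer ready. 2) Add hostname-to-IP mappings to the local /etc/hosts so that the connection succeeds.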
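Before touching any configuration, it is worth confirming from inside the virtual machine that the cluster's hostnames resolve and that the port is reachable. The sketch below uses only the JDK. The class name `HostCheck`, the hostname hbase-master.example.com, and the address 192.0.2.10 are placeholders rather than values from the original setup; 2181 is ZooKeeper's default client port.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;

/** Checks name resolution and TCP reachability of a cluster host. */
class HostCheck {

    /** Resolves a hostname to an IP string, or returns null if it cannot be resolved. */
    static String resolve(String host) {
        try {
            return InetAddress.getByName(host).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** Formats one /etc/hosts entry: IP address, a tab, then the hostname. */
    static String hostsLine(String ip, String host) {
        return ip + "\t" + host;
    }

    public static void main(String[] args) {
        // Placeholder defaults; pass the real hostname and port as arguments.
        String host = args.length > 0 ? args[0] : "hbase-master.example.com";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 2181;

        String ip = resolve(host);
        if (ip == null) {
            System.out.println(host + " does not resolve. An /etc/hosts entry would look like:");
            System.out.println(hostsLine("192.0.2.10", host)); // placeholder address
            return;
        }
        System.out.println(host + " resolves to " + ip);
        System.out.println("TCP connect to " + host + ":" + port + " -> "
                + (canConnect(host, port, 3000) ? "ok" : "FAILED"));
    }
}
```

Run the check once for every hostname that appears in the cluster's configuration files, not only for the master, since the client may be handed any of those names.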

迷失的小書童
