Source Code Reading (1): HBase Client Source Code

Source Code Reading (1): HBase Client Source Code http://aperise.iteye.com/blog/2372350
Source Code Reading (2): hbase-examples BufferedMutator Example http://aperise.iteye.com/blog/2372505
Source Code Reading (3): hbase-examples MultiThreadedClientExample http://aperise.iteye.com/blog/2372534

1. Using the HBase client

    1.1 Add the HBase client jar to your Maven project

		<!-- hbase -->
		<dependency>
			<groupId>org.apache.hbase</groupId>
			<artifactId>hbase-client</artifactId>
			<version>1.2.1</version>
		</dependency>

 

    1.2 Recommended ways to create the HBase client

    Recommended usage, option 1:

Configuration configuration = HBaseConfiguration.create();
configuration.set("hbase.zookeeper.property.clientPort", "2181");
configuration.set("hbase.client.write.buffer", "2097152");
configuration.set("hbase.zookeeper.quorum","192.168.199.31,192.168.199.32,192.168.199.33,192.168.199.34,192.168.199.35");
//the default Connection implementation is org.apache.hadoop.hbase.client.ConnectionManager.HConnectionImplementation
Connection connection = ConnectionFactory.createConnection(configuration);
//the default Table implementation is org.apache.hadoop.hbase.client.HTable
Table table = connection.getTable(TableName.valueOf("tableName"));

//3177 is not arbitrary: it is 2 * hbase.client.write.buffer / put.heapSize()
int bestBatchPutSize = 3177;

try {
  // Use the table as needed, for a single operation and a single thread
  // construct List<Put> putLists
  List<Put> putLists = new ArrayList<Put>();
  for(int count=0;count<100000;count++){
    String rowkey = "row" + count; // example row key
    Put put = new Put(rowkey.getBytes());
    put.addImmutable("columnFamily1".getBytes(), "columnName1".getBytes(), "columnValue1".getBytes());
    put.addImmutable("columnFamily1".getBytes(), "columnName2".getBytes(), "columnValue2".getBytes());
    put.addImmutable("columnFamily1".getBytes(), "columnName3".getBytes(), "columnValue3".getBytes());
    put.setDurability(Durability.SKIP_WAL);
    putLists.add(put);

    if(putLists.size()==bestBatchPutSize){
      //reached the best batch size, submit right away
      table.put(putLists);
      putLists.clear();
    }
  }
  //one last submit for whatever remains unsubmitted
  table.put(putLists);
} finally {
  table.close();
  connection.close();
}
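
    The 3177 above need not be hard-coded. A minimal sketch of deriving it at runtime from a representative Put (same three-column rows as in the example; the sample row contents are illustrative):

//derive the batch size as 2 * hbase.client.write.buffer / put.heapSize()
long writeBuffer = configuration.getLong("hbase.client.write.buffer", 2097152);
Put sample = new Put("sampleRow".getBytes());
sample.addImmutable("columnFamily1".getBytes(), "columnName1".getBytes(), "columnValue1".getBytes());
sample.addImmutable("columnFamily1".getBytes(), "columnName2".getBytes(), "columnValue2".getBytes());
sample.addImmutable("columnFamily1".getBytes(), "columnName3".getBytes(), "columnValue3".getBytes());
//e.g. 2 * 2097152 / 1320 = 3177
int bestBatchPutSize = (int) (2 * writeBuffer / sample.heapSize());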

    Recommended usage, option 2:

Configuration configuration = HBaseConfiguration.create();
configuration.set("hbase.zookeeper.property.clientPort", "2181");
configuration.set("hbase.client.write.buffer", "2097152");
configuration.set("hbase.zookeeper.quorum","192.168.199.31,192.168.199.32,192.168.199.33,192.168.199.34,192.168.199.35");

BufferedMutatorParams params = new BufferedMutatorParams(TableName.valueOf("tableName"));

//3177 is not arbitrary: it is 2 * hbase.client.write.buffer / put.heapSize()
int bestBatchPutSize = 3177;

//this uses the JDK 1.7 try-with-resources feature: try (resources implementing java.io.Closeable) {...} catch (Exception e) {...}
//it acts like a finally block, calling close() on every resource, i.e. conn.close() and mutator.close()
try(
  //the default Connection implementation is org.apache.hadoop.hbase.client.ConnectionManager.HConnectionImplementation
  Connection conn = ConnectionFactory.createConnection(configuration);
  //the default BufferedMutator implementation is org.apache.hadoop.hbase.client.BufferedMutatorImpl
  BufferedMutator mutator = conn.getBufferedMutator(params);
){
  List<Put> putLists = new ArrayList<Put>();
  for(int count=0;count<100000;count++){
    String rowkey = "row" + count; // example row key
    Put put = new Put(rowkey.getBytes());
    put.addImmutable("columnFamily1".getBytes(), "columnName1".getBytes(), "columnValue1".getBytes());
    put.addImmutable("columnFamily1".getBytes(), "columnName2".getBytes(), "columnValue2".getBytes());
    put.addImmutable("columnFamily1".getBytes(), "columnName3".getBytes(), "columnValue3".getBytes());
    put.setDurability(Durability.SKIP_WAL);
    putLists.add(put);

    if(putLists.size()==bestBatchPutSize){
      //reached the best batch size, submit right away
      mutator.mutate(putLists);
      mutator.flush();
      putLists.clear();
    }
  }
  //one last submit for whatever remains unsubmitted
  mutator.mutate(putLists);
  mutator.flush();
}catch(IOException e) {
  LOG.info("exception while creating/destroying Connection or BufferedMutator", e);
}
    A comparison of the two approaches:

Table.put(List<Put>): essentially a wrapper around BufferedMutator.mutate(List<Put>) plus an autoFlush flag. It first calls BufferedMutator.mutate(List<Put>), which submits in chunks sized by hbase.client.write.buffer (default 2 MB); and because autoFlush defaults to true, every call ends with a flush.

BufferedMutator.mutate(List<Put>): computes the heap size of the given list; whenever the buffered size exceeds hbase.client.write.buffer (default 2 MB) it submits, otherwise it keeps buffering, and whatever is still buffered is submitted when the mutator is flushed or closed before the table shuts down.
 

    1.3 Deprecated HBase client usage

     Deprecated approach 1: construct an HTable directly via HTable(Configuration conf, final String tableName)

Configuration configuration = HBaseConfiguration.create();
configuration.set("hbase.zookeeper.property.clientPort", "2181");
configuration.set("hbase.client.write.buffer", "2097152");
configuration.set("hbase.zookeeper.quorum","192.168.199.31,192.168.199.32,192.168.199.33,192.168.199.34,192.168.199.35");
Table table = new HTable(configuration, "tableName");

//3177 is not arbitrary: it is 2 * hbase.client.write.buffer / put.heapSize()
int bestBatchPutSize = 3177;

try {
  // Use the table as needed, for a single operation and a single thread
  // construct List<Put> putLists
  List<Put> putLists = new ArrayList<Put>();
  for(int count=0;count<100000;count++){
    String rowkey = "row" + count; // example row key
    Put put = new Put(rowkey.getBytes());
    put.addImmutable("columnFamily1".getBytes(), "columnName1".getBytes(), "columnValue1".getBytes());
    put.addImmutable("columnFamily1".getBytes(), "columnName2".getBytes(), "columnValue2".getBytes());
    put.addImmutable("columnFamily1".getBytes(), "columnName3".getBytes(), "columnValue3".getBytes());
    put.setDurability(Durability.SKIP_WAL);
    putLists.add(put);

    if(putLists.size()==bestBatchPutSize){
      //reached the best batch size, submit right away
      table.put(putLists);
      putLists.clear();
    }
  }
  //one last submit for whatever remains unsubmitted
  table.put(putLists);
} finally {
  //an HTable built this way manages its own connection; close() releases it
  table.close();
}

        Deprecated approach 2: obtain an HTableInterface via HConnectionManager.createConnection(Configuration conf)

Configuration configuration = HBaseConfiguration.create();
configuration.set("hbase.zookeeper.property.clientPort", "2181");
configuration.set("hbase.client.write.buffer", "2097152");
configuration.set("hbase.zookeeper.quorum","192.168.199.31,192.168.199.32,192.168.199.33,192.168.199.34,192.168.199.35");
HConnection connection = HConnectionManager.createConnection(configuration);
HTableInterface table = connection.getTable(TableName.valueOf("tableName"));

//3177 is not arbitrary: it is 2 * hbase.client.write.buffer / put.heapSize()
int bestBatchPutSize = 3177;

try {
  // Use the table as needed, for a single operation and a single thread
  // construct List<Put> putLists
  List<Put> putLists = new ArrayList<Put>();
  for(int count=0;count<100000;count++){
    String rowkey = "row" + count; // example row key
    Put put = new Put(rowkey.getBytes());
    put.addImmutable("columnFamily1".getBytes(), "columnName1".getBytes(), "columnValue1".getBytes());
    put.addImmutable("columnFamily1".getBytes(), "columnName2".getBytes(), "columnValue2".getBytes());
    put.addImmutable("columnFamily1".getBytes(), "columnName3".getBytes(), "columnValue3".getBytes());
    put.setDurability(Durability.SKIP_WAL);
    putLists.add(put);

    if(putLists.size()==bestBatchPutSize){
      //reached the best batch size, submit right away
      table.put(putLists);
      putLists.clear();
    }
  }
  //one last submit for whatever remains unsubmitted
  table.put(putLists);
} finally {
  table.close();
  connection.close();
}

 

2. Reading the HBase client source

    As noted earlier, the recommended way to use the HBase client is:

Connection connection = ConnectionFactory.createConnection(configuration);  
Table table = connection.getTable(TableName.valueOf("tableName"));  

    The source walk-through starts from these two lines; first, ConnectionFactory.createConnection(configuration).

 

    2.1 ConnectionFactory.createConnection(Configuration conf)

    First, the source of createConnection(Configuration conf):

  public static Connection createConnection(Configuration conf) throws IOException {
    return createConnection(conf, null, null);
  }

    It takes the Configuration we built and delegates to ConnectionFactory.createConnection(Configuration conf, ExecutorService pool, User user), whose source is:

  public static Connection createConnection(Configuration conf, ExecutorService pool, User user)
  throws IOException {
    //user is passed in as null above, so this block does run and resolves the current user
    if (user == null) {
      UserProvider provider = UserProvider.instantiate(conf);
      user = provider.getCurrent();
    }

    return createConnection(conf, false, pool, user);
  }

    This in turn calls ConnectionFactory.createConnection(final Configuration conf, final boolean managed, final ExecutorService pool, final User user):

static Connection createConnection(final Configuration conf, final boolean managed, final ExecutorService pool, final User user)
  throws IOException {
    //HBASE_CLIENT_CONNECTION_IMPL = "hbase.client.connection.impl"
    //hbase.client.connection.impl lets users supply their own Connection implementation class
    //HBase already ships an implementation, so normally nothing is configured here and the default
    //ConnectionManager.HConnectionImplementation.class.getName() is used
    String className = conf.get(HConnection.HBASE_CLIENT_CONNECTION_IMPL,ConnectionManager.HConnectionImplementation.class.getName());
    Class<?> clazz = null;
    try {
      clazz = Class.forName(className);
    } catch (ClassNotFoundException e) {
      throw new IOException(e);
    }
    try {
      // Default HCM#HCI is not accessible; make it so before invoking.
      //this invokes the constructor HConnectionImplementation(Configuration conf, boolean managed, ExecutorService pool, User user)
      Constructor<?> constructor = clazz.getDeclaredConstructor(Configuration.class, boolean.class, ExecutorService.class, User.class);
      constructor.setAccessible(true);
      return (Connection) constructor.newInstance(conf, managed, pool, user);
    } catch (Exception e) {
      throw new IOException(e);
    }
  }
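
    As the comments above note, hbase.client.connection.impl exists so that users can plug in their own Connection implementation, which the factory constructs reflectively through the (Configuration, boolean, ExecutorService, User) constructor. A hypothetical sketch (com.example.MyConnection is an illustrative name, not a real class):

//hypothetical: route the factory to a custom Connection implementation
configuration.set("hbase.client.connection.impl", "com.example.MyConnection");
//the factory will now reflectively instantiate com.example.MyConnection
Connection connection = ConnectionFactory.createConnection(configuration);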

    By default the code above instantiates ConnectionManager.HConnectionImplementation and returns it as the Connection. Next, the constructor HConnectionImplementation(Configuration conf, boolean managed, ExecutorService pool, User user):

HConnectionImplementation(Configuration conf, boolean managed, ExecutorService pool, User user) throws IOException {
      //this line deserves the closest attention
      this(conf);
      //here this.user = null
      this.user = user;
      //here this.batchPool = null
      this.batchPool = pool;
      //here this.managed = false
      this.managed = managed;
      //setupRegistry() reads hbase.client.registry.impl for a user-supplied registry class; if none is configured it creates a ZooKeeperRegistry. This class is very important: all client interaction with ZooKeeper goes through it
      this.registry = setupRegistry();
      //by default the ZooKeeperRegistry reads the cluster's clusterId from ZooKeeper
      retrieveClusterId();

       //if hbase.rpc.client.impl is not configured, a RpcClientImpl is created and assigned to this.rpcClient
      this.rpcClient = RpcClientFactory.createClient(this.conf, this.clusterId, this.metrics);
      this.rpcControllerFactory = RpcControllerFactory.instantiate(conf);

      // Do we publish the status?
      //if hbase.status.published is not configured, shouldListen defaults to false
      boolean shouldListen = conf.getBoolean(HConstants.STATUS_PUBLISHED, HConstants.STATUS_PUBLISHED_DEFAULT);
          
      //if hbase.status.listener.class is not configured, listenerClass defaults to MulticastListener
      Class<? extends ClusterStatusListener.Listener> listenerClass = conf.getClass(ClusterStatusListener.STATUS_LISTENER_CLASS, ClusterStatusListener.DEFAULT_STATUS_LISTENER_CLASS, ClusterStatusListener.Listener.class);
      if (shouldListen) {
        if (listenerClass == null) {
          LOG.warn(HConstants.STATUS_PUBLISHED + " is true, but " + ClusterStatusListener.STATUS_LISTENER_CLASS + " is not set - not listening status");
        } else {
          //a cluster status listener watches server-side events; when a region server dies, rpcClient.cancelConnections closes its connections
          clusterStatusListener = new ClusterStatusListener(
              new ClusterStatusListener.DeadServerHandler() {
                @Override
                public void newDead(ServerName sn) {
                  clearCaches(sn);
                  rpcClient.cancelConnections(sn);
                }
              }, conf, listenerClass);
        }
      }
    }

    The main line to focus on above is this(conf); the other is setupRegistry(), which defaults to org.apache.hadoop.hbase.client.ZooKeeperRegistry and is analyzed further below. The rest of the code is straightforward and annotated above. Continuing with this(conf):

protected HConnectionImplementation(Configuration conf) {
      //store the caller's Configuration in this.conf
      this.conf = conf;
      //HConnectionImplementation builds its own configuration object, this.connectionConfig, on top of the caller's Configuration
      this.connectionConfig = new ConnectionConfiguration(conf);
      this.closed = false;
      //if the caller's Configuration does not set hbase.client.pause, default this.pause=100
      this.pause = conf.getLong(HConstants.HBASE_CLIENT_PAUSE, HConstants.DEFAULT_HBASE_CLIENT_PAUSE);
      //if the caller's Configuration does not set hbase.meta.replicas.use, default this.useMetaReplicas=false
      this.useMetaReplicas = conf.getBoolean(HConstants.USE_META_REPLICAS, HConstants.DEFAULT_USE_META_REPLICAS);
      //taken from this.connectionConfig; if the caller's Configuration does not set hbase.client.retries.number, default this.numTries=31
      this.numTries = connectionConfig.getRetriesNumber();
      //if the caller's Configuration does not set hbase.rpc.timeout, default this.rpcTimeout=60000 ms
      this.rpcTimeout = conf.getInt(HConstants.HBASE_RPC_TIMEOUT_KEY, HConstants.DEFAULT_HBASE_RPC_TIMEOUT);
      if (conf.getBoolean(CLIENT_NONCES_ENABLED_KEY, true)) {
        synchronized (nonceGeneratorCreateLock) {
          if (ConnectionManager.nonceGenerator == null) {
            ConnectionManager.nonceGenerator = new PerClientRandomNonceGenerator();
          }
          this.nonceGenerator = ConnectionManager.nonceGenerator;
        }
      } else {
        this.nonceGenerator = new NoNonceGenerator();
      }
      //tracks per-region-server statistics
      stats = ServerStatisticTracker.create(conf);
      //the HBase client's asynchronous operation handler
      this.asyncProcess = createAsyncProcess(this.conf);
      this.interceptor = (new RetryingCallerInterceptorFactory(conf)).build();
      this.rpcCallerFactory = RpcRetryingCallerFactory.instantiate(conf, interceptor, this.stats);
      this.backoffPolicy = ClientBackoffPolicyFactory.create(conf);
      if (conf.getBoolean(CLIENT_SIDE_METRICS_ENABLED_KEY, false)) {
        this.metrics = new MetricsConnection(this);
      } else {
        this.metrics = null;
      }
      
      this.hostnamesCanChange = conf.getBoolean(RESOLVE_HOSTNAME_ON_FAIL_KEY, true);
      this.metaCache = new MetaCache(this.metrics);
    }

    The important point above: although the caller passes in a Configuration, HConnectionImplementation does not use it directly. The caller's Configuration typically sets only a few values and leaves the rest unspecified, so HConnectionImplementation builds its own configuration object: it starts from the built-in defaults and overlays whatever the caller did set, using the defaults for everything else. Here is this.connectionConfig = new ConnectionConfiguration(conf):

ConnectionConfiguration(Configuration conf) {
    //if the caller's Configuration does not set hbase.client.write.buffer, default this.writeBufferSize=2097152
    this.writeBufferSize = conf.getLong(WRITE_BUFFER_SIZE_KEY, WRITE_BUFFER_SIZE_DEFAULT);
    
    //if the caller's Configuration does not set hbase.client.meta.operation.timeout, default this.metaOperationTimeout=1200000
    this.metaOperationTimeout = conf.getInt(HConstants.HBASE_CLIENT_META_OPERATION_TIMEOUT, HConstants.DEFAULT_HBASE_CLIENT_OPERATION_TIMEOUT);

    //if the caller's Configuration does not set hbase.client.operation.timeout, default this.operationTimeout=1200000
    this.operationTimeout = conf.getInt(HConstants.HBASE_CLIENT_OPERATION_TIMEOUT, HConstants.DEFAULT_HBASE_CLIENT_OPERATION_TIMEOUT);

    //if the caller's Configuration does not set hbase.client.scanner.caching, default this.scannerCaching=Integer.MAX_VALUE
    this.scannerCaching = conf.getInt(HConstants.HBASE_CLIENT_SCANNER_CACHING, HConstants.DEFAULT_HBASE_CLIENT_SCANNER_CACHING);

    //if the caller's Configuration does not set hbase.client.scanner.max.result.size, default this.scannerMaxResultSize=2*1024*1024
    this.scannerMaxResultSize = conf.getLong(HConstants.HBASE_CLIENT_SCANNER_MAX_RESULT_SIZE_KEY, HConstants.DEFAULT_HBASE_CLIENT_SCANNER_MAX_RESULT_SIZE);

    //if the caller's Configuration does not set hbase.client.primaryCallTimeout.get, default this.primaryCallTimeoutMicroSecond=10000
    this.primaryCallTimeoutMicroSecond = conf.getInt("hbase.client.primaryCallTimeout.get", 10000); // 10000ms

    //if the caller's Configuration does not set hbase.client.replicaCallTimeout.scan, default this.replicaCallTimeoutMicroSecondScan=1000000
    this.replicaCallTimeoutMicroSecondScan = conf.getInt("hbase.client.replicaCallTimeout.scan", 1000000); // 1000000ms

    //if the caller's Configuration does not set hbase.client.retries.number, default this.retries=31
    this.retries = conf.getInt(HConstants.HBASE_CLIENT_RETRIES_NUMBER, HConstants.DEFAULT_HBASE_CLIENT_RETRIES_NUMBER);

    //if the caller's Configuration does not set hbase.client.keyvalue.maxsize, default this.maxKeyValueSize=-1
    this.maxKeyValueSize = conf.getInt(MAX_KEYVALUE_SIZE_KEY, MAX_KEYVALUE_SIZE_DEFAULT);
  }

    上面的代碼主要是初始化HConnectionImplementation自己的Configuration類型屬性this.connectionConfig,默認客戶端不設置屬性值,這裏創建的this.connectionConfig就使用默認值,這裏將hbase客戶端默認值抽取如下:

  • hbase.client.write.buffer               default 2097152 bytes, i.e. 2MB
  • hbase.client.meta.operation.timeout     default 1200000 ms
  • hbase.client.operation.timeout          default 1200000 ms
  • hbase.client.scanner.caching            default Integer.MAX_VALUE
  • hbase.client.scanner.max.result.size    default 2MB
  • hbase.client.primaryCallTimeout.get     default 10000 (microseconds, per the field name)
  • hbase.client.replicaCallTimeout.scan    default 1000000 (microseconds, per the field name)
  • hbase.client.retries.number             default 31 attempts
  • hbase.client.keyvalue.maxsize           default -1, unlimited
  • hbase.client.ipc.pool.type
  • hbase.client.ipc.pool.size
  • hbase.client.pause                      100
  • hbase.client.max.total.tasks            100
  • hbase.client.max.perserver.tasks        2
  • hbase.client.max.perregion.tasks        1
  • hbase.client.instance.id
  • hbase.client.scanner.timeout.period     60000
  • hbase.client.rpc.codec
  • hbase.regionserver.lease.period         superseded by hbase.client.scanner.timeout.period, 60000
  • hbase.client.fast.fail.mode.enabled     FALSE
  • hbase.client.fastfail.threshold         60000
  • hbase.client.fast.fail.cleanup.duration 600000
  • hbase.client.fast.fail.interceptor.impl
  • hbase.client.backpressure.enabled       false
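
    A minimal sketch of overriding a few of these defaults before creating the connection (the chosen values are illustrative only):

Configuration configuration = HBaseConfiguration.create();
//illustrative overrides; anything not set keeps the defaults listed above
configuration.set("hbase.client.write.buffer", "4194304");    //4MB client write buffer instead of 2MB
configuration.set("hbase.client.retries.number", "10");       //retry 10 times instead of 31
configuration.set("hbase.client.scanner.caching", "1000");    //rows fetched per scanner RPC
Connection connection = ConnectionFactory.createConnection(configuration);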

 

    2.2 ZooKeeperRegistry, the class that talks to ZooKeeper

    The analysis above showed that only the values the caller explicitly sets take effect on the client, with everything else falling back to defaults. The other crucial piece is the ZooKeeper interaction class just mentioned, org.apache.hadoop.hbase.client.ZooKeeperRegistry:

package org.apache.hadoop.hbase.client;

import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.List;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.HRegionLocation;
import org.apache.hadoop.hbase.RegionLocations;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.zookeeper.MetaTableLocator;
import org.apache.hadoop.hbase.zookeeper.ZKClusterId;
import org.apache.hadoop.hbase.zookeeper.ZKTableStateClientSideReader;
import org.apache.hadoop.hbase.zookeeper.ZKUtil;
import org.apache.zookeeper.KeeperException;

/**
 * A cluster registry that stores to zookeeper.
 */
class ZooKeeperRegistry implements Registry {
  private static final Log LOG = LogFactory.getLog(ZooKeeperRegistry.class);
  // the HBase connection; assigned in init()
  ConnectionManager.HConnectionImplementation hci;

  @Override
  public void init(Connection connection) {
    if (!(connection instanceof ConnectionManager.HConnectionImplementation)) {
      throw new RuntimeException("This registry depends on HConnectionImplementation");
    }
    //store the HBase connection
    this.hci = (ConnectionManager.HConnectionImplementation)connection;
  }

  @Override
  public RegionLocations getMetaRegionLocation() throws IOException {
    //obtain the ZooKeeperKeepAliveConnection used to talk to ZooKeeper (the quorum address comes from the connection's Configuration)
    ZooKeeperKeepAliveConnection zkw = hci.getKeepAliveZooKeeperWatcher();

    try {
      if (LOG.isTraceEnabled()) {
        LOG.trace("Looking up meta region location in ZK," + " connection=" + this);
      }
      //read from ZooKeeper the server(s) hosting the hbase:meta region, blocking until they are available
      List<ServerName> servers = new MetaTableLocator().blockUntilAvailable(zkw, hci.rpcTimeout, hci.getConfiguration());
      if (LOG.isTraceEnabled()) {
        if (servers == null) {
          LOG.trace("Looked up meta region location, connection=" + this + "; servers = null");
        } else {
          StringBuilder str = new StringBuilder();
          for (ServerName s : servers) {
            str.append(s.toString());
            str.append(" ");
          }
          LOG.trace("Looked up meta region location, connection=" + this + "; servers = " + str.toString());
        }
      }
      if (servers == null) return null;
      
      //assemble the HRegionLocation array and return it as RegionLocations
      HRegionLocation[] locs = new HRegionLocation[servers.size()];
      int i = 0;
      for (ServerName server : servers) {
        HRegionInfo h = RegionReplicaUtil.getRegionInfoForReplica(HRegionInfo.FIRST_META_REGIONINFO, i);
        if (server == null) locs[i++] = null;
        else locs[i++] = new HRegionLocation(h, server, 0);
      }
      return new RegionLocations(locs);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return null;
    } finally {
      zkw.close();
    }
  }

  private String clusterId = null;

  @Override
  public String getClusterId() {
    if (this.clusterId != null) return this.clusterId;
    // No synchronized here, worse case we will retrieve it twice, that's
    //  not an issue.
    ZooKeeperKeepAliveConnection zkw = null;
    try {
      zkw = hci.getKeepAliveZooKeeperWatcher();
      this.clusterId = ZKClusterId.readClusterIdZNode(zkw);
      if (this.clusterId == null) {
        LOG.info("ClusterId read in ZooKeeper is null");
      }
    } catch (KeeperException e) {
      LOG.warn("Can't retrieve clusterId from Zookeeper", e);
    } catch (IOException e) {
      LOG.warn("Can't retrieve clusterId from Zookeeper", e);
    } finally {
      if (zkw != null) zkw.close();
    }
    return this.clusterId;
  }

  @Override
  public boolean isTableOnlineState(TableName tableName, boolean enabled)
  throws IOException {
    ZooKeeperKeepAliveConnection zkw = hci.getKeepAliveZooKeeperWatcher();
    try {
      if (enabled) {
        return ZKTableStateClientSideReader.isEnabledTable(zkw, tableName);
      }
      return ZKTableStateClientSideReader.isDisabledTable(zkw, tableName);
    } catch (KeeperException e) {
      throw new IOException("Enable/Disable failed", e);
    } catch (InterruptedException e) {
      throw new InterruptedIOException();
    } finally {
       zkw.close();
    }
  }

  @Override
  public int getCurrentNrHRS() throws IOException {
    ZooKeeperKeepAliveConnection zkw = hci.getKeepAliveZooKeeperWatcher();
    try {
      // We go to zk rather than to master to get count of regions to avoid
      // HTable having a Master dependency.  See HBase-2828
      return ZKUtil.getNumberOfChildren(zkw, zkw.rsZNode);
    } catch (KeeperException ke) {
      throw new IOException("Unexpected ZooKeeper exception", ke);
    } finally {
        zkw.close();
    }
  }
}

    This class matters a great deal: every interaction with ZooKeeper goes through it.

 

    2.3 HConnectionImplementation.getTable(TableName tableName)

    As noted earlier, the recommended way to use the HBase client is:

Connection connection = ConnectionFactory.createConnection(configuration);  
Table table = connection.getTable(TableName.valueOf("tableName"));  

    Section 2.1 established that the default connection implementation is HConnectionImplementation, so we continue with HConnectionImplementation.getTable(TableName tableName):

    public HTableInterface getTable(TableName tableName) throws IOException {
      return getTable(tableName, getBatchPool());
    }

   Next, HConnectionImplementation.getTable(TableName tableName, ExecutorService pool):

    public HTableInterface getTable(TableName tableName, ExecutorService pool) throws IOException {
      //managed defaults to false
      if (managed) {
        throw new NeedUnmanagedConnectionException();
      }
      return new HTable(tableName, this, connectionConfig, rpcCallerFactory, rpcControllerFactory, pool);
    }

    Then the HTable constructor HTable(TableName tableName, final ClusterConnection connection, final ConnectionConfiguration tableConfig, final RpcRetryingCallerFactory rpcCallerFactory, final RpcControllerFactory rpcControllerFactory, final ExecutorService pool):

public HTable(TableName tableName, final ClusterConnection connection, final ConnectionConfiguration tableConfig, final RpcRetryingCallerFactory rpcCallerFactory, final RpcControllerFactory rpcControllerFactory, final ExecutorService pool) throws IOException {
    if (connection == null || connection.isClosed()) {
      throw new IllegalArgumentException("Connection is null or closed.");
    }
    //the HBase table name
    this.tableName = tableName;
    //do not close the connection on close(). This matters: by default table.close() will NOT close the connection that created the table; see table.close() below
    this.cleanupConnectionOnClose = false;
    //this.connection is the connection created by HConnectionImplementation
    this.connection = connection;
    //the caller's Configuration, obtained via HConnectionImplementation
    this.configuration = connection.getConfiguration();
    //the ConnectionConfiguration that HConnectionImplementation built on top of the caller's Configuration
    this.connConfiguration = tableConfig;
    //the pool handed over by HConnectionImplementation, whose default is this.batchPool = getThreadPool(conf.getInt("hbase.hconnection.threads.max", 256), ...)
    this.pool = pool;
    if (pool == null) {
      this.pool = getDefaultExecutor(this.configuration);
      this.cleanupPoolOnClose = true;
    } else {
      //HConnectionImplementation already initialized this.batchPool, so a pool was passed in and cleanupPoolOnClose=false: by default the pool is not shut down
      this.cleanupPoolOnClose = false;
    }

    this.rpcCallerFactory = rpcCallerFactory;
    this.rpcControllerFactory = rpcControllerFactory;

    //examined in detail below; initializes HTable's fields from the caller's Configuration
    this.finishSetup();
  }

    Two properties in the annotated code deserve attention. First, cleanupConnectionOnClose defaults to false: table.close() closes only the table, while the connection behind it stays open. Second, cleanupPoolOnClose: although we passed no thread pool, HConnectionImplementation hands over its own (this.batchPool = getThreadPool(conf.getInt("hbase.hconnection.threads.max", 256), ...)), so cleanupPoolOnClose is set to false and table.close() does not shut the pool down either. Now for this.finishSetup() at the end of the constructor:

private void finishSetup() throws IOException {
    //if HTable's connConfiguration is null, build a new one from the caller's Configuration
    if (connConfiguration == null) {
      connConfiguration = new ConnectionConfiguration(configuration);
    }

    //HTable field setup
    this.operationTimeout = tableName.isSystemTable() ? connConfiguration.getMetaOperationTimeout() : connConfiguration.getOperationTimeout();
    this.scannerCaching = connConfiguration.getScannerCaching();
    this.scannerMaxResultSize = connConfiguration.getScannerMaxResultSize();
    if (this.rpcCallerFactory == null) {
      this.rpcCallerFactory = connection.getNewRpcRetryingCallerFactory(configuration);
    }
    if (this.rpcControllerFactory == null) {
      this.rpcControllerFactory = RpcControllerFactory.instantiate(configuration);
    }

    // puts need to track errors globally due to how the APIs currently work.
    //the HBase asynchronous operation handler
    multiAp = this.connection.getAsyncProcess();

    this.closed = false;
    //helper class for this table's region lookups
    this.locator = new HRegionLocator(tableName, connection);
  }
    Given the above, table.close() is worth a look:
public void close() throws IOException {
    //if already closed, return immediately
    if (this.closed) {
      return;
    }
    //one last flush before closing
    flushCommits();
    //cleanupPoolOnClose=false by default from the HTable constructor, so the thread pool is not shut down here
    if (cleanupPoolOnClose) {
      this.pool.shutdown();
      try {
        boolean terminated = false;
        do {
          // wait until the pool has terminated
          terminated = this.pool.awaitTermination(60, TimeUnit.SECONDS);
        } while (!terminated);
      } catch (InterruptedException e) {
        this.pool.shutdownNow();
        LOG.warn("waitForTermination interrupted");
      }
    }
    //cleanupConnectionOnClose=false by default from the HTable constructor, so the table's underlying connection is not closed here
    if (cleanupConnectionOnClose) {
      if (this.connection != null) {
        this.connection.close();
      }
    }
    this.closed = true;
  }
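
    Because cleanupConnectionOnClose defaults to false, table.close() leaves the Connection open and the caller owns its lifecycle. A minimal sketch of managing both with try-with-resources:

try (Connection connection = ConnectionFactory.createConnection(configuration);
     Table table = connection.getTable(TableName.valueOf("tableName"))) {
  // use the table; on exit the table closes first, then the connection
} catch (IOException e) {
  LOG.info("exception while using table", e);
}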

 

    2.4 HTable.put(final List<Put> puts)

    By now, the following code:

Connection connection = ConnectionFactory.createConnection(configuration);  
Table table = connection.getTable(TableName.valueOf("tableName"));

    has created the connection (default implementation org.apache.hadoop.hbase.client.ConnectionManager.HConnectionImplementation) and the table (default implementation org.apache.hadoop.hbase.client.HTable). Next comes the client's batch submit method, HTable.put(final List<Put> puts):

  public void put(final List<Put> puts) throws IOException {
    //batch-submits whenever the configured buffer threshold is reached
    getBufferedMutator().mutate(puts);
    //autoFlush defaults to true, so a final submit happens regardless of how much data is left
    if (autoFlush) {
      flushCommits();
    }
  }

    First, HTable.getBufferedMutator():

  BufferedMutator getBufferedMutator() throws IOException {
    if (mutator == null) {
      //uses the pool from HConnectionImplementation, whose default is this.batchPool = getThreadPool(conf.getInt("hbase.hconnection.threads.max", 256), ...)
      //the buffer is sized by hbase.client.write.buffer, default 2MB
      this.mutator = (BufferedMutatorImpl) connection.getBufferedMutator(
          new BufferedMutatorParams(tableName)
              .pool(pool)
              .writeBufferSize(connConfiguration.getWriteBufferSize())
              .maxKeyValueSize(connConfiguration.getMaxKeyValueSize())
      );
    }
    return mutator;
  }

    The code above lazily constructs and returns a BufferedMutatorImpl. Next, its mutate(List<? extends Mutation> ms) method:

public void mutate(List<? extends Mutation> ms) throws InterruptedIOException, RetriesExhaustedWithDetailsException {
    //if the BufferedMutator is already closed, fail immediately
    if (closed) {
      throw new IllegalStateException("Cannot put when the BufferedMutator is closed.");
    }

    //first accumulate the heap size of the submitted mutations into toAddSize
    long toAddSize = 0;
    for (Mutation m : ms) {
      if (m instanceof Put) {
        validatePut((Put) m);
      }
      toAddSize += m.heapSize();
    }

    // This behavior is highly non-intuitive... it does not protect us against
    // 94-incompatible behavior, which is a timing issue because hasError, the below code
    // and setter of hasError are not synchronized. Perhaps it should be removed.
    if (ap.hasError()) {
      //add toAddSize to the mutator's running buffer size
      currentWriteBufferSize.addAndGet(toAddSize);
      //append the submitted mutations to the writeAsyncBuffer cache; they stay there until submitted
      writeAsyncBuffer.addAll(ms);
      //an error was already seen: flush synchronously before continuing
      backgroundFlushCommits(true);
    } else {
      //add toAddSize to the mutator's running buffer size
      currentWriteBufferSize.addAndGet(toAddSize);
      //append the submitted mutations to the writeAsyncBuffer cache; they stay there until submitted
      writeAsyncBuffer.addAll(ms);
    }

    // Now try and queue what needs to be queued.
    // if the buffered size now exceeds hbase.client.write.buffer (default 2MB), call backgroundFlushCommits right away
    // if it does not, do nothing and return
    while (currentWriteBufferSize.get() > writeBufferSize) {
      backgroundFlushCommits(false);
    }
  }

    The code keeps a running total of the heap size of the submitted List<Put>: once it exceeds hbase.client.write.buffer, backgroundFlushCommits(false) is called right away, otherwise nothing happens; if an error was seen, backgroundFlushCommits(true) runs once. So BufferedMutatorImpl.backgroundFlushCommits(boolean synchronous) deserves a close look:

private void backgroundFlushCommits(boolean synchronous) throws InterruptedIOException, RetriesExhaustedWithDetailsException {
    LinkedList<Mutation> buffer = new LinkedList<>();
    // Keep track of the size so that this thread doesn't spin forever
    long dequeuedSize = 0;

    try {
      //walk the submitted mutations (Put implements Mutation)
      Mutation m;
      //while (writeBufferSize <= 0 || dequeuedSize < writeBufferSize * 2 || synchronous) and writeAsyncBuffer still holds Mutations,
      //keep dequeuing: accumulate each Mutation's heap size into dequeuedSize
      //and decrement currentWriteBufferSize by the same amount
      while ((writeBufferSize <= 0 || dequeuedSize < (writeBufferSize * 2) || synchronous) && (m = writeAsyncBuffer.poll()) != null) {
        buffer.add(m);
        long size = m.heapSize();
        dequeuedSize += size;
        currentWriteBufferSize.addAndGet(-size);
      }

      //for backgroundFlushCommits(false) with nothing dequeued, return without doing anything
      if (!synchronous && dequeuedSize == 0) {
        return;
      }

      //for backgroundFlushCommits(false) this branch runs and does not wait for the results
      if (!synchronous) {
        //submit without waiting for the result
        ap.submit(tableName, buffer, true, null, false);
        if (ap.hasError()) {
          LOG.debug(tableName + ": One or more of the operations have failed -"
              + " waiting for all operation in progress to finish (successfully or not)");
        }
      }
      //for backgroundFlushCommits(true) this branch runs and waits for the results
      if (synchronous || ap.hasError()) {
        while (!buffer.isEmpty()) {
          ap.submit(tableName, buffer, true, null, false);
        }
        //wait for the results
        RetriesExhaustedWithDetailsException error = ap.waitForAllPreviousOpsAndReset(null);
        if (error != null) {
          if (listener == null) {
            throw error;
          } else {
            this.listener.onException(error, this);
          }
        }
      }
    } finally {
      //anything left over goes back into the buffer for the caller's final submit
      for (Mutation mut : buffer) {
        long size = mut.heapSize();
        currentWriteBufferSize.addAndGet(size);
        dequeuedSize -= size;
        writeAsyncBuffer.add(mut);
      }
    }
  }

    ap.submit(tableName, buffer, true, null, false) submits directly without waiting for the result; it delegates to AsyncProcess.submit(ExecutorService pool, TableName tableName, List<? extends Row> rows, boolean atLeastOne, Batch.Callback<CResult> callback, boolean needResults), whose source is:

  public <CResult> AsyncRequestFuture submit(TableName tableName, List<? extends Row> rows,
      boolean atLeastOne, Batch.Callback<CResult> callback, boolean needResults)
      throws InterruptedIOException {
    return submit(null, tableName, rows, atLeastOne, callback, needResults);
  }
public <CResult> AsyncRequestFuture submit(ExecutorService pool, TableName tableName, List<? extends Row> rows, boolean atLeastOne, Batch.Callback<CResult> callback, boolean needResults) throws InterruptedIOException {
    //if there is nothing to submit, return NO_REQS_RESULT
    if (rows.isEmpty()) {
      return NO_REQS_RESULT;
    }

    Map<ServerName, MultiAction<Row>> actionsByServer = new HashMap<ServerName, MultiAction<Row>>();
    //size retainedActions by the number of submitted rows
    List<Action<Row>> retainedActions = new ArrayList<Action<Row>>(rows.size());

    NonceGenerator ng = this.connection.getNonceGenerator();
    long nonceGroup = ng.getNonceGroup(); // Currently, nonce group is per entire client.

    // Location errors that happen before we decide what requests to take.
    List<Exception> locationErrors = null;
    List<Integer> locationErrorRows = null;
    //keep looping while retainedActions is still empty (when atLeastOne is requested and no location error occurred)
    do {
      // Wait until there is at least one slot for a new task.
      // at most maxTotalConcurrentTasks=100 concurrent tasks by default; wait if they are all busy
      waitForMaximumCurrentTasks(maxTotalConcurrentTasks - 1);

      // Remember the previous decisions about regions or region servers we put in the
      //  final multi.
      // record which regions and region servers this submission touches
      Map<HRegionInfo, Boolean> regionIncluded = new HashMap<HRegionInfo, Boolean>();
      Map<ServerName, Boolean> serverIncluded = new HashMap<ServerName, Boolean>();

      int posInList = -1;
      Iterator<? extends Row> it = rows.iterator();
      while (it.hasNext()) {
        //each Row here is actually a Put, since Put implements Row
        Row r = it.next();
        //loc will hold the location metadata of the region this Put belongs to
        HRegionLocation loc;
        try {
          if (r == null) {
            throw new IllegalArgumentException("#" + id + ", row cannot be null");
          }
          // Make sure we get 0-s replica.
          //look up all replica locations of the Put's region; the first call finds nothing cached, goes through ZooKeeper and hbase:meta, and caches the result for subsequent calls
          RegionLocations locs = connection.locateRegion(
              tableName, r.getRow(), true, true, RegionReplicaUtil.DEFAULT_REPLICA_ID);
          if (locs == null || locs.isEmpty() || locs.getDefaultRegionLocation() == null) {
            throw new IOException("#" + id + ", no location found, aborting submit for"
                + " tableName=" + tableName + " rowkey=" + Bytes.toStringBinary(r.getRow()));
          }
          //take the first (default) replica from the region's location array
          loc = locs.getDefaultRegionLocation();
        } catch (IOException ex) {
          locationErrors = new ArrayList<Exception>();
          locationErrorRows = new ArrayList<Integer>();
          LOG.error("Failed to get region location ", ex);
          // This action failed before creating ars. Retain it, but do not add to submit list.
          // We will then add it to ars in an already-failed state.
          retainedActions.add(new Action<Row>(r, ++posInList));
          locationErrors.add(ex);
          locationErrorRows.add(posInList);
          it.remove();
          break; // Backward compat: we stop considering actions on location error.
        }

        //check per-region/per-server task limits to decide whether this action can be sent now; wait if everything is busy
        if (canTakeOperation(loc, regionIncluded, serverIncluded)) {
          Action<Row> action = new Action<Row>(r, ++posInList);
          setNonce(ng, r, action);
          retainedActions.add(action);
          // TODO: replica-get is not supported on this path
          byte[] regionName = loc.getRegionInfo().getRegionName();
          //group actions going to the same region server; at this point we only resolve which region and region server each action targets, the actual submit happens after the loop
          addAction(loc.getServerName(), regionName, action, actionsByServer, nonceGroup);
          it.remove();
        }
      }
    } while (retainedActions.isEmpty() && atLeastOne && (locationErrors == null));

    if (retainedActions.isEmpty()) return NO_REQS_RESULT;

    // every action now knows its region and region server: submit them in batch
    return submitMultiActions(tableName, retainedActions, nonceGroup, callback, null, needResults, locationErrors, locationErrorRows, actionsByServer, pool);
  }

    The code above resolves, for each Put in the submitted List<Put>, which region and which region server it belongs to, and then submits in batch. Note hbase.client.max.total.tasks (default 100, the client's maximum number of concurrent tasks): if more than 100 of these lookups/submissions are in flight, the client waits. Back to the original batch submit entry point:

  public void put(final List<Put> puts) throws IOException {
    //batch-submits whenever the configured buffer threshold is reached
    getBufferedMutator().mutate(puts);
    //autoFlush defaults to true, so a final submit happens regardless of how much data is left
    if (autoFlush) {
      flushCommits();
    }
  }

    Summing up the analysis, the client handles the submitted List<Put> differently depending on its heap size:

  • heap size of List<Put> < hbase.client.write.buffer: getBufferedMutator().mutate(puts) returns without submitting anything, and flushCommits() does the submit
  • hbase.client.write.buffer < heap size of List<Put> < 2 * hbase.client.write.buffer: getBufferedMutator().mutate(puts) runs backgroundFlushCommits(false), then flushCommits() follows
  • 2 * hbase.client.write.buffer < heap size of List<Put>: getBufferedMutator().mutate(puts) runs backgroundFlushCommits(false), keeps the excess unsubmitted, then flushCommits() follows

    Finally, since HTable's autoFlush defaults to true, whatever remains is submitted to the server in one last round regardless of its size: flushCommits() calls getBufferedMutator().flush(), which calls BufferedMutatorImpl.backgroundFlushCommits(true), which in turn calls ap.submit(tableName, buffer, true, null, false) and waits on ap.waitForAllPreviousOpsAndReset(null) for the results. That completes the source walk-through of the client's batch submit path.
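
    Since backgroundFlushCommits(true) hands any RetriesExhaustedWithDetailsException to a listener when one is registered, callers who use a BufferedMutator directly can install one through BufferedMutatorParams. A minimal sketch (the logging is illustrative):

BufferedMutatorParams params = new BufferedMutatorParams(TableName.valueOf("tableName"))
    .listener(new BufferedMutator.ExceptionListener() {
      @Override
      public void onException(RetriesExhaustedWithDetailsException e, BufferedMutator mutator)
          throws RetriesExhaustedWithDetailsException {
        //log each failed mutation's row; rethrow instead if the caller must fail
        for (int i = 0; i < e.getNumExceptions(); i++) {
          LOG.warn("failed to send put for row " + Bytes.toStringBinary(e.getRow(i).getRow()));
        }
      }
    });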

    

    2.5 HConnectionImplementation.locateRegionInMeta

    The HTable.put(final List<Put> puts) analysis holds one more important detail: inside org.apache.hadoop.hbase.client.AsyncProcess's method public <CResult> AsyncRequestFuture submit(TableName tableName, List<? extends Row> rows, boolean atLeastOne, Batch.Callback<CResult> callback, boolean needResults) sits this line:

          // look up the region locations of our table
          RegionLocations locs = connection.locateRegion(tableName,r.getRow(), true, true, RegionReplicaUtil.DEFAULT_REPLICA_ID);

    It is really a call into org.apache.hadoop.hbase.client.ConnectionManager.HConnectionImplementation's method public RegionLocations locateRegion(final TableName tableName, final byte [] row, boolean useCache, boolean retry, int replicaId), which loads the region information of our table; annotated below:

public RegionLocations locateRegion(final TableName tableName, final byte [] row, boolean useCache, boolean retry, int replicaId) throws IOException {
      //if this connection is already closed, throw
      if (this.closed) throw new IOException(toString() + " closed");
      //if the table name is null or empty, throw
      if (tableName== null || tableName.getName().length == 0) {
        throw new IllegalArgumentException("table name cannot be null or zero length");
      }
      //TableName.META_TABLE_NAME = hbase:meta (namespace hbase, table meta)
      //we passed in our own table name, not hbase:meta, so this branch is not taken
      if (tableName.equals(TableName.META_TABLE_NAME)) {
        return locateMeta(tableName, useCache, replicaId);
      } else {
        // this branch is taken:
        // look up our table's region information in the hbase:meta catalog table, by the given table name and row key
        return locateRegionInMeta(tableName, row, useCache, retry, replicaId);
      }
    }

    On to locateRegionInMeta(tableName, row, useCache, retry, replicaId), annotated below:

    /*
      * Looks up our table's region information in the hbase:meta catalog table, by the given table name and row key.
      */
    private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, boolean useCache, boolean retry, int replicaId) throws IOException {
      // useCache=true was passed in, so this branch is taken
      if (useCache) {
      //entered, but the first time around the cache has nothing for our table
        RegionLocations locations = getCachedLocation(tableName, row);
        if (locations != null && locations.getRegionLocation(replicaId) != null) {
          return locations;
        }
      }

      //we are about to query the hbase:meta catalog table, so build the probe row key
      // metaKey = tableName + "," + the caller's row key + ",99999999999999"
      byte[] metaKey = HRegionInfo.createRegionName(tableName, row, HConstants.NINES, false);

      //build the scan over the hbase:meta catalog table
      Scan s = new Scan();
      s.setReversed(true);
      s.setStartRow(metaKey);
      s.setSmall(true);
      s.setCaching(1);
      if (this.useMetaReplicas) {
        s.setConsistency(Consistency.TIMELINE);
      }

      //numTries defaults to 31: if hbase:meta cannot be read, keep retrying up to 31 times
      int localNumRetries = (retry ? numTries : 1);

      for (int tries = 0; true; tries++) {
        if (tries >= localNumRetries) {
          throw new NoServerForRegionException("Unable to find region for " + Bytes.toStringBinary(row) + " in " + tableName + " after " + localNumRetries + " tries.");
        }
        if (useCache) { //entered since useCache=true, but the first time around the cache still has nothing
          RegionLocations locations = getCachedLocation(tableName, row);
          if (locations != null && locations.getRegionLocation(replicaId) != null) {
            return locations;
          }
        } else {
          // If we are not supposed to be using the cache, delete any existing cached location
          // so it won't interfere.
          metaCache.clearCache(tableName, row);
        }

        
        // the cache had nothing, so read the region information from the hbase:meta catalog table
        try {
          Result regionInfoRow = null;
          ReversedClientScanner rcs = null;
          try {
            //important: the scan built above runs against TableName.META_TABLE_NAME, i.e. hbase:meta
            rcs = new ClientSmallReversedScanner(conf, s, TableName.META_TABLE_NAME, this, rpcCallerFactory, rpcControllerFactory, getMetaLookupPool(), 0);
            //this yields regionInfoRow, the row of hbase:meta describing our table's region
            regionInfoRow = rcs.next();
          } finally {
            if (rcs != null) {
              rcs.close();
            }
          }

          if (regionInfoRow == null) {
            throw new TableNotFoundException(tableName);
          }

          // convert regionInfoRow into the RegionLocations we need
          RegionLocations locations = MetaTableAccessor.getRegionLocations(regionInfoRow);
          if (locations == null || locations.getRegionLocation(replicaId) == null) {
            throw new IOException("HRegionInfo was null in " + tableName + ", row=" + regionInfoRow);
          }
          
          //we now have our table's HRegionLocation; sanity-check it in case the server died, the region split, or we got the wrong one
          HRegionInfo regionInfo = locations.getRegionLocation(replicaId).getRegionInfo();
          if (regionInfo == null) {
            throw new IOException("HRegionInfo was null or empty in " + TableName.META_TABLE_NAME + ", row=" + regionInfoRow);
          }
          if (!regionInfo.getTable().equals(tableName)) {
            throw new TableNotFoundException( "Table '" + tableName + "' was not found, got: " + regionInfo.getTable() + ".");
          }
          if (regionInfo.isSplit()) {
            throw new RegionOfflineException("the only available region for" + " the required row is a split parent," + " the daughters should be online soon: " + regionInfo.getRegionNameAsString());
          }
          if (regionInfo.isOffline()) {
            throw new RegionOfflineException("the region is offline, could" + " be caused by a disable table call: " + regionInfo.getRegionNameAsString());
          }
          ServerName serverName = locations.getRegionLocation(replicaId).getServerName();
          if (serverName == null) {
            throw new NoServerForRegionException("No server address listed " + "in " + TableName.META_TABLE_NAME + " for region " + regionInfo.getRegionNameAsString() + " containing row " + Bytes.toStringBinary(row));
          }
          if (isDeadServer(serverName)){
            throw new RegionServerStoppedException("hbase:meta says the region "+ regionInfo.getRegionNameAsString()+" is managed by the server " + serverName + ", but it is dead.");
          }
          
          // all checks passed; cache the result so the next lookup is fast
          cacheLocation(tableName, locations);
          // return the region locations
          return locations;
        } catch (TableNotFoundException e) {
          // if we got this error, probably means the table just plain doesn't
          // exist. rethrow the error immediately. this should always be coming
          // from the HTable constructor.
          throw e;
        } catch (IOException e) {
          ExceptionUtil.rethrowIfInterrupt(e);

          if (e instanceof RemoteException) {
            e = ((RemoteException)e).unwrapRemoteException();
          }
          if (tries < localNumRetries - 1) {
            if (LOG.isDebugEnabled()) {
              LOG.debug("locateRegionInMeta parentTable=" + TableName.META_TABLE_NAME + ", metaLocation=" + ", attempt=" + tries + " of " + localNumRetries + " failed; retrying after sleep of " + ConnectionUtils.getPauseTime(this.pause, tries) + " because: " + e.getMessage());
            }
          } else {
            throw e;
          }
          // Only relocate the parent region if necessary
          if(!(e instanceof RegionOfflineException || e instanceof NoServerForRegionException)) {
            relocateRegion(TableName.META_TABLE_NAME, metaKey, replicaId);
          }
        }
        //not found: sleep for a backoff interval and, while under 31 attempts, loop and try again; beyond 31 an exception is thrown
        try{
          Thread.sleep(ConnectionUtils.getPauseTime(this.pause, tries));
        } catch (InterruptedException e) {
          throw new InterruptedIOException("Giving up trying to location region in " + "meta: thread is interrupted.");
        }
      }
    }

    The above shows how org.apache.hadoop.hbase.client.ConnectionManager.HConnectionImplementation first loads the information of the table we need: HBase keeps a catalog table hbase:meta (hbase is the namespace, meta the table name) that stores the metadata of every user-created table. The first lookup goes to hbase:meta and the result is cached; from then on, the table's region information comes straight from the cache.
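
    The same cached lookup is reachable through the public API as well; a minimal sketch of checking where a row lives (table name and row key are illustrative):

try (RegionLocator locator = connection.getRegionLocator(TableName.valueOf("tableName"))) {
  //the first call scans hbase:meta; later calls are served from the client-side MetaCache
  HRegionLocation location = locator.getRegionLocation("someRowKey".getBytes());
  System.out.println("region=" + location.getRegionInfo().getRegionNameAsString()
      + " server=" + location.getServerName());
}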

 

3. HBase client optimization lessons learned from the source

  • Raise hbase.client.write.buffer (default 2 MB) in the client configuration to enlarge the client-side submit buffer;
  • Submit in batches: the best List<Put> size = 2 * hbase.client.write.buffer / size of one Put, where the Put size comes from put.heapSize(). With hbase.client.write.buffer=2097152 and put.heapSize()=1320, the best batch size = 2*2097152/1320 = 3177;
  • Write from multiple client threads concurrently (see the sketch below)
  • Run the client on a capable machine, otherwise throughput suffers
  • If losing the WAL is acceptable, disable it (Durability.SKIP_WAL) for a further speedup
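
    A minimal multi-threaded sketch, sharing one Connection and one BufferedMutator (documented as thread-safe) across writer threads; the thread count and row layout are illustrative:

ExecutorService workers = Executors.newFixedThreadPool(8); //illustrative thread count
try (Connection conn = ConnectionFactory.createConnection(configuration);
     BufferedMutator mutator = conn.getBufferedMutator(TableName.valueOf("tableName"))) {
  for (int t = 0; t < 8; t++) {
    final int id = t;
    workers.submit(new Runnable() {
      @Override
      public void run() {
        try {
          for (int i = 0; i < 10000; i++) {
            Put put = new Put(("row-" + id + "-" + i).getBytes());
            put.addImmutable("columnFamily1".getBytes(), "columnName1".getBytes(), "columnValue1".getBytes());
            mutator.mutate(put); //buffered; flushed whenever the write buffer fills
          }
        } catch (IOException e) {
          LOG.warn("write failed", e);
        }
      }
    });
  }
  workers.shutdown();
  workers.awaitTermination(10, TimeUnit.MINUTES);
} catch (IOException | InterruptedException e) {
  LOG.warn("exception while writing", e);
}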

4. HBase performance research: write speed test notes



 

 
