HBase表设计 - 盐表Salted Table

本文的主线 概念 => 分区 => 原理 => 优点 => 缺点 => 表格存储

本文基于Phoenix搭建

概念

  • HBase sequential write may suffer from region server hotspotting if your row key is monotonically increasing

  • Salting the row key provides a way to mitigate the problem

  • Phoenix provides a way to transparently salt the row key with a salting byte for a particular table. You need to specify this in table creation time by specifying a table property “SALT_BUCKETS” with a value from 1 to 256

分区

CREATE TABLE IF NOT EXISTS t_normal (
    id VARCHAR PRIMARY KEY,
    name VARCHAR,
    age INTEGER,
    address VARCHAR
);
UPSERT INTO t_normal VALUES('id1', 'XiaoWang', 22, 'London');

UPSERT INTO t_normal VALUES('id2', 'XiaoWeng', 18, 'New York');
./hbase-2.0.0/bin/hbase shell

scan 'T_NORMAL'
ROW         COLUMN+CELL
 id1        column=0:\x00\x00\x00\x00, timestamp=1609143024854, value=x
 id1        column=0:\x80\x0B, timestamp=1609143024854, value=XiaoWang
 id1        column=0:\x80\x0C, timestamp=1609143024854, value=\x80\x00\x00\x16
 id1        column=0:\x80\x0D, timestamp=1609143024854, value=London
 id2        column=0:\x00\x00\x00\x00, timestamp=1609143028213, value=x
 id2        column=0:\x80\x0B, timestamp=1609143028213, value=XiaoWeng
 id2        column=0:\x80\x0C, timestamp=1609143028213, value=\x80\x00\x00\x12
 id2        column=0:\x80\x0D, timestamp=1609143028213, value=New York
2 row(s)
Took 0.1361 seconds
CREATE TABLE IF NOT EXISTS t_salt (
    id VARCHAR PRIMARY KEY,
    name VARCHAR,
    age INTEGER,
    address VARCHAR
) SALT_BUCKETS = 5;
UPSERT INTO t_salt VALUES('id1', 'XiaoWang', 22, 'London');

UPSERT INTO t_salt VALUES('id2', 'XiaoWeng', 18, 'New York');
./hbase-2.0.0/bin/hbase shell

scan 'T_SALT'
ROW         COLUMN+CELL
 \x00id1    column=0:\x00\x00\x00\x00, timestamp=1609143085641, value=x
 \x00id1    column=0:\x80\x0B, timestamp=1609143085641, value=XiaoWang
 \x00id1    column=0:\x80\x0C, timestamp=1609143085641, value=\x80\x00\x00\x16
 \x00id1    column=0:\x80\x0D, timestamp=1609143085641, value=London
 \x01id2    column=0:\x00\x00\x00\x00, timestamp=1609143089175, value=x
 \x01id2    column=0:\x80\x0B, timestamp=1609143089175, value=XiaoWeng
 \x01id2    column=0:\x80\x0C, timestamp=1609143089175, value=\x80\x00\x00\x12
 \x01id2    column=0:\x80\x0D, timestamp=1609143089175, value=New York
2 row(s)
Took 0.5924 seconds

原理

new_row_key = (++index % BUCKETS_NUMBER) + original_row_key

优点

  • Using salted table with pre-split would help uniformly distribute write workload across all the region servers, thus improves the write performance

  • Reading from salted table can also reap benefits from the more uniform distribution of data

缺点

  • When doing a parallel scan across all region servers, we can take advantage of this properties to perform a merge sort of the client side

表格存储

  • 主键是数据表中每一行的唯一标识 主键由1到4个主键列组成

  • 组成主键的第一个主键列又称为分区键

表格存储会根据数据表中每一行分区键的值所属的范围自动将一行数据分配到对应的分区和机器上 以达到负载均衡的目的

参考

發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章