HBase Table Design - Salted Table

Outline of this article: Concept => Partitioning => Principle => Advantages => Disadvantages => Tablestore

The examples in this article are built on Phoenix.

Concept

  • Sequential writes in HBase may suffer from region server hotspotting if the row key is monotonically increasing

  • Salting the row key provides a way to mitigate the problem

  • Phoenix can transparently salt the row key of a particular table with a salting byte. You specify this at table creation time by setting the table property “SALT_BUCKETS” to a value from 1 to 256
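The hotspotting problem described above can be illustrated with a small simulation. This is only a sketch: the split points, the zero-padded key format and the helper `region_for` are hypothetical and are not taken from HBase itself. It shows that when regions are defined by key ranges, monotonically increasing row keys all land in the last region.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical split points: region 0 holds keys < "2000",
# region 1 holds keys in ["2000", "4000"), and so on.
SPLIT_POINTS = ["2000", "4000", "6000", "8000"]


def region_for(row_key: str) -> int:
    """Return the index of the region whose key range contains row_key."""
    return bisect_right(SPLIT_POINTS, row_key)


# Monotonically increasing row keys, here a zero-padded sequence number.
row_keys = [f"{n:04d}" for n in range(9000, 9100)]

# Count how many writes each region receives.
writes_per_region = Counter(region_for(key) for key in row_keys)
print(dict(writes_per_region))
```

All 100 writes go to region 4, the region that owns the top of the key range, while the other four regions stay idle.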

Partitioning

CREATE TABLE IF NOT EXISTS t_normal (
    id VARCHAR PRIMARY KEY,
    name VARCHAR,
    age INTEGER,
    address VARCHAR
);
UPSERT INTO t_normal VALUES('id1', 'XiaoWang', 22, 'London');
UPSERT INTO t_normal VALUES('id2', 'XiaoWeng', 18, 'New York');

./hbase-2.0.0/bin/hbase shell

scan 'T_NORMAL'
ROW         COLUMN+CELL
 id1        column=0:\x00\x00\x00\x00, timestamp=1609143024854, value=x
 id1        column=0:\x80\x0B, timestamp=1609143024854, value=XiaoWang
 id1        column=0:\x80\x0C, timestamp=1609143024854, value=\x80\x00\x00\x16
 id1        column=0:\x80\x0D, timestamp=1609143024854, value=London
 id2        column=0:\x00\x00\x00\x00, timestamp=1609143028213, value=x
 id2        column=0:\x80\x0B, timestamp=1609143028213, value=XiaoWeng
 id2        column=0:\x80\x0C, timestamp=1609143028213, value=\x80\x00\x00\x12
 id2        column=0:\x80\x0D, timestamp=1609143028213, value=New York
2 row(s)
Took 0.1361 seconds

CREATE TABLE IF NOT EXISTS t_salt (
    id VARCHAR PRIMARY KEY,
    name VARCHAR,
    age INTEGER,
    address VARCHAR
) SALT_BUCKETS = 5;
UPSERT INTO t_salt VALUES('id1', 'XiaoWang', 22, 'London');
UPSERT INTO t_salt VALUES('id2', 'XiaoWeng', 18, 'New York');

./hbase-2.0.0/bin/hbase shell

scan 'T_SALT'
ROW         COLUMN+CELL
 \x00id1    column=0:\x00\x00\x00\x00, timestamp=1609143085641, value=x
 \x00id1    column=0:\x80\x0B, timestamp=1609143085641, value=XiaoWang
 \x00id1    column=0:\x80\x0C, timestamp=1609143085641, value=\x80\x00\x00\x16
 \x00id1    column=0:\x80\x0D, timestamp=1609143085641, value=London
 \x01id2    column=0:\x00\x00\x00\x00, timestamp=1609143089175, value=x
 \x01id2    column=0:\x80\x0B, timestamp=1609143089175, value=XiaoWeng
 \x01id2    column=0:\x80\x0C, timestamp=1609143089175, value=\x80\x00\x00\x12
 \x01id2    column=0:\x80\x0D, timestamp=1609143089175, value=New York
2 row(s)
Took 0.5924 seconds

Principle

new_row_key = (++index % BUCKETS_NUMBER) + original_row_key
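The formula above expresses the idea with a running counter. For a reader to find a row again, the salt must be derivable from the row key itself, so a hash of the key is the natural substitute for the counter. The sketch below assumes a simple 31-multiplier hash over the key bytes; the function names `salt_byte` and `salted_row_key` are hypothetical, and this is not presented as Phoenix's exact implementation.

```python
def salt_byte(row_key: bytes, buckets: int) -> int:
    """Hash the key bytes and reduce the hash to a bucket number."""
    h = 1
    for b in row_key:
        # Assumed hash: multiply by 31 and add each byte, kept to 32 bits.
        h = (31 * h + b) & 0xFFFFFFFF
    return h % buckets


def salted_row_key(row_key: bytes, buckets: int) -> bytes:
    """Prepend the one-byte salt to the original row key."""
    return bytes([salt_byte(row_key, buckets)]) + row_key


print(salted_row_key(b"id1", 5))
print(salted_row_key(b"id2", 5))
```

With 5 buckets, this sketch gives `id1` the salt byte `\x00` and `id2` the salt byte `\x01`, which agrees with the `T_SALT` scan output shown earlier.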

Advantages

  • Using a salted table with pre-splitting helps distribute the write workload uniformly across all region servers, which improves write performance

  • Reads from a salted table can also benefit from the more uniform distribution of data

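Disadvantages

  • Rows in a salted table are not stored in their natural row key order, so a strictly sequential scan does not return the data sorted

  • When doing a parallel scan across all region servers, the client can take advantage of the fact that each bucket is sorted internally and merge-sort the results on the client side, so they are still returned in order, at the cost of scanning every bucket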
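The client-side merge sort mentioned above can be sketched as a k-way merge. This is only an illustration under stated assumptions: the `bucket_scans` data is made up, and `strip_salt` and `merged_scan` are hypothetical helpers, not Phoenix APIs. Each bucket is assumed to return its rows already sorted.

```python
import heapq

# Hypothetical per-bucket scan results. Within a bucket, rows are sorted by
# salted key, so they are also sorted by original key once the salt is removed.
bucket_scans = [
    [b"\x00id1", b"\x00id6"],
    [b"\x01id2", b"\x01id7"],
    [b"\x02id3"],
]


def strip_salt(salted_key: bytes) -> bytes:
    """Drop the leading one-byte salt to recover the original row key."""
    return salted_key[1:]


def merged_scan(scans):
    """Lazily merge already-sorted bucket streams into one ordered stream."""
    streams = [map(strip_salt, scan) for scan in scans]
    return heapq.merge(*streams)


result = list(merged_scan(bucket_scans))
print(result)
```

Because every input stream is already sorted, the merge only needs to compare the current head of each stream, so the client never has to buffer and re-sort the full result set.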

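Tablestore

  • The primary key uniquely identifies each row in a table. It consists of one to four primary key columns

  • The first primary key column is also called the partition key

Tablestore automatically assigns each row to a partition and machine according to the range that the row's partition key value falls into, in order to balance load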
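As a rough illustration of range-based assignment on a composite primary key, the sketch below routes a row using only its first primary key column. The boundaries and the function `partition_for` are hypothetical and are not part of any Tablestore SDK.

```python
from bisect import bisect_right

# Hypothetical partition boundaries on the partition key:
# partition 0 holds keys < "g", partition 1 holds ["g", "n"), and so on.
BOUNDARIES = ["g", "n", "t"]


def partition_for(primary_key: tuple) -> int:
    """Pick a partition from the first primary key column only."""
    partition_key = primary_key[0]
    return bisect_right(BOUNDARIES, partition_key)


print(partition_for(("alice", 1)))
print(partition_for(("alice", 99)))
print(partition_for(("zoe", 1)))
```

Rows that share a partition key always land in the same partition, whatever the remaining primary key columns contain, so the choice of the first primary key column determines how evenly the load is spread.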
