本文的主線 概念 => 分區 => 原理 => 優點 => 缺點 => 表格存儲
本文基於Phoenix搭建
概念
HBase sequential write may suffer from region server hotspotting if your row key is monotonically increasing
Salting the row key provides a way to mitigate the problem
Phoenix provides a way to transparently salt the row key with a salting byte for a particular table. You need to specify this in table creation time by specifying a table property “SALT_BUCKETS” with a value from 1 to 256
分區
CREATE TABLE IF NOT EXISTS t_normal (
id VARCHAR PRIMARY KEY,
name VARCHAR,
age INTEGER,
address VARCHAR
);
UPSERT INTO t_normal VALUES('id1', 'XiaoWang', 22, 'London');
UPSERT INTO t_normal VALUES('id2', 'XiaoWeng', 18, 'New York');
./hbase-2.0.0/bin/hbase shell
scan 'T_NORMAL'
ROW COLUMN+CELL
id1 column=0:\x00\x00\x00\x00, timestamp=1609143024854, value=x
id1 column=0:\x80\x0B, timestamp=1609143024854, value=XiaoWang
id1 column=0:\x80\x0C, timestamp=1609143024854, value=\x80\x00\x00\x16
id1 column=0:\x80\x0D, timestamp=1609143024854, value=London
id2 column=0:\x00\x00\x00\x00, timestamp=1609143028213, value=x
id2 column=0:\x80\x0B, timestamp=1609143028213, value=XiaoWeng
id2 column=0:\x80\x0C, timestamp=1609143028213, value=\x80\x00\x00\x12
id2 column=0:\x80\x0D, timestamp=1609143028213, value=New York
2 row(s)
Took 0.1361 seconds
CREATE TABLE IF NOT EXISTS t_salt (
id VARCHAR PRIMARY KEY,
name VARCHAR,
age INTEGER,
address VARCHAR
) SALT_BUCKETS = 5;
UPSERT INTO t_salt VALUES('id1', 'XiaoWang', 22, 'London');
UPSERT INTO t_salt VALUES('id2', 'XiaoWeng', 18, 'New York');
./hbase-2.0.0/bin/hbase shell
scan 'T_SALT'
ROW COLUMN+CELL
\x00id1 column=0:\x00\x00\x00\x00, timestamp=1609143085641, value=x
\x00id1 column=0:\x80\x0B, timestamp=1609143085641, value=XiaoWang
\x00id1 column=0:\x80\x0C, timestamp=1609143085641, value=\x80\x00\x00\x16
\x00id1 column=0:\x80\x0D, timestamp=1609143085641, value=London
\x01id2 column=0:\x00\x00\x00\x00, timestamp=1609143089175, value=x
\x01id2 column=0:\x80\x0B, timestamp=1609143089175, value=XiaoWeng
\x01id2 column=0:\x80\x0C, timestamp=1609143089175, value=\x80\x00\x00\x12
\x01id2 column=0:\x80\x0D, timestamp=1609143089175, value=New York
2 row(s)
Took 0.5924 seconds
原理
new_row_key = (++index % BUCKETS_NUMBER) + original_row_key
優點
Using salted table with pre-split would help uniformly distribute write workload across all the region servers, thus improves the write performance
Reading from salted table can also reap benefits from the more uniform distribution of data
缺點
- When doing a parallel scan across all region servers, we can take advantage of this properties to perform a merge sort of the client side
表格存儲
主鍵是數據表中每一行的唯一標識 主鍵由1到4個主鍵列組成
組成主鍵的第一個主鍵列又稱爲分區鍵
表格存儲會根據數據表中每一行分區鍵的值所屬的範圍自動將一行數據分配到對應的分區和機器上 以達到負載均衡的目的