MySQL Partitioned Tables (Theory + Practice)

References

https://dev.mysql.com/doc/refman/5.7/en/create-table.html#create-table-partitioning

Partition types

1.HASH

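Hashes one or more columns to create a key for placing and locating rows. expr is an expression using one or more table columns. It can be any valid MySQL expression (including MySQL functions) that yields a single integer value. For example, both of the following are valid CREATE TABLE statements using PARTITION BY HASH (they are alternatives that define the same table name t1, so run only one of them):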

CREATE TABLE t1 (col1 INT, col2 CHAR(5))
    PARTITION BY HASH(col1);

CREATE TABLE t1 (col1 INT, col2 CHAR(5), col3 DATETIME)
    PARTITION BY HASH ( YEAR(col3) );
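To make the placement rule concrete, here is a minimal sketch. It assumes a hypothetical table named t_hash_demo (not part of the original examples) and relies on the documented behaviour that plain HASH partitioning stores a row in partition number MOD(expr, number_of_partitions), with partitions named p0, p1, and so on by default.

```sql
-- Hypothetical demo table; the name t_hash_demo is an assumption for illustration.
DROP TABLE IF EXISTS t_hash_demo;
CREATE TABLE t_hash_demo (col1 INT, col2 CHAR(5))
    PARTITION BY HASH(col1)
    PARTITIONS 4;   -- without a PARTITIONS clause the table gets a single partition

INSERT INTO t_hash_demo VALUES (8, 'a'), (9, 'b'), (10, 'c'), (11, 'd');

-- Compute the partition number each row is expected to land in.
SELECT col1, MOD(col1, 4) AS expected_partition
FROM t_hash_demo
ORDER BY col1;

-- MOD(10, 4) = 2, so reading only p2 should return the row with col1 = 10.
-- Explicit partition selection is available from MySQL 5.6 onward.
SELECT * FROM t_hash_demo PARTITION (p2);
```

Note that the two statements in the text omit the PARTITIONS clause, so each of them creates a table with just one partition.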

2.KEY

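This is similar to HASH, except that MySQL supplies the hashing function so as to guarantee an even data distribution. The column_list argument is simply a list of 1 or more table columns (maximum: 16). The following example shows a simple table partitioned by key, with 4 partitions: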

CREATE TABLE tk (col1 INT, col2 CHAR(5), col3 DATE)
    PARTITION BY KEY(col3)
    PARTITIONS 4;
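As a small variation, the column list may be left empty, in which case MySQL partitions on the table's primary key. The sketch below assumes a hypothetical table tk_pk_demo and then inspects how the rows were spread; table_rows in information_schema is only an estimate for InnoDB, so treat the counts as approximate.

```sql
-- Hypothetical demo table; the name tk_pk_demo is an assumption for illustration.
DROP TABLE IF EXISTS tk_pk_demo;
CREATE TABLE tk_pk_demo (
    id   INT NOT NULL PRIMARY KEY,
    name VARCHAR(20)
)
PARTITION BY KEY()      -- empty column list: the primary key (id) is used
PARTITIONS 2;

INSERT INTO tk_pk_demo VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');

-- Show the (approximate) number of rows stored in each partition.
SELECT partition_name, table_rows
FROM information_schema.partitions
WHERE table_schema = DATABASE()
  AND table_name = 'tk_pk_demo';
```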

3.RANGE

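In this case, expr defines ranges of values using a set of VALUES LESS THAN operators. When using range partitioning, you must define at least one partition with VALUES LESS THAN. VALUES IN cannot be used with range partitioning.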

CREATE TABLE t1 (
    year_col  INT,
    some_data INT
)
PARTITION BY RANGE (year_col) (
    PARTITION p0 VALUES LESS THAN (1991),
    PARTITION p1 VALUES LESS THAN (1995),
    PARTITION p2 VALUES LESS THAN (1999),
    PARTITION p3 VALUES LESS THAN (2002),
    PARTITION p4 VALUES LESS THAN (2006),
    PARTITION p5 VALUES LESS THAN MAXVALUE
);
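The upper bound in VALUES LESS THAN is exclusive, which is easy to get wrong at the boundaries. A minimal sketch, assuming a hypothetical table t_range_demo with fewer partitions than the example above:

```sql
-- Hypothetical demo table; the name t_range_demo is an assumption for illustration.
DROP TABLE IF EXISTS t_range_demo;
CREATE TABLE t_range_demo (
    year_col  INT,
    some_data INT
)
PARTITION BY RANGE (year_col) (
    PARTITION p0 VALUES LESS THAN (1991),
    PARTITION p1 VALUES LESS THAN (1995),
    PARTITION p2 VALUES LESS THAN MAXVALUE
);

INSERT INTO t_range_demo VALUES (1990, 1), (1991, 2), (2010, 3);

SELECT * FROM t_range_demo PARTITION (p0);  -- the 1990 row
SELECT * FROM t_range_demo PARTITION (p1);  -- the 1991 row: 1991 is not LESS THAN 1991
SELECT * FROM t_range_demo PARTITION (p2);  -- the 2010 row, caught by MAXVALUE
```

Without the MAXVALUE partition, the insert of 2010 would be rejected because no partition could hold it.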

4.LIST

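This is useful when assigning partitions based on a table column with a restricted set of possible values, such as a state or country code. In such a case, all rows pertaining to a certain state or country can be assigned to a single partition, or a partition can be reserved for a certain set of states or countries. It is similar to RANGE, except that only VALUES IN may be used to specify the permissible values for each partition. VALUES IN is used with a list of values to be matched. For example, you could create a partitioning scheme such as the following: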

CREATE TABLE client_firms (
    id   INT,
    name VARCHAR(35)
)
PARTITION BY LIST (id) (
    PARTITION r0 VALUES IN (1, 5, 9, 13, 17, 21),
    PARTITION r1 VALUES IN (2, 6, 10, 14, 18, 22),
    PARTITION r2 VALUES IN (3, 7, 11, 15, 19, 23),
    PARTITION r3 VALUES IN (4, 8, 12, 16, 20, 24)
);
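LIST partitioning has no catch-all equivalent of MAXVALUE in this scheme, so a value that appears in none of the lists is rejected. A minimal sketch, assuming a hypothetical table list_demo; the exact error text quoted in the comment is from memory and may differ between versions:

```sql
-- Hypothetical demo table; the name list_demo is an assumption for illustration.
DROP TABLE IF EXISTS list_demo;
CREATE TABLE list_demo (
    id   INT,
    name VARCHAR(35)
)
PARTITION BY LIST (id) (
    PARTITION r0 VALUES IN (1, 5, 9),
    PARTITION r1 VALUES IN (2, 6, 10)
);

-- 5 is listed under r0, so this row is accepted.
INSERT INTO list_demo VALUES (5, 'stored in r0');
SELECT * FROM list_demo PARTITION (r0);

-- 25 is in no list: this statement is expected to fail with an error along the
-- lines of "Table has no partition for value 25" (error 1526).
INSERT INTO list_demo VALUES (25, 'rejected');
```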

Note

In production systems, tables are most often partitioned by time, so the rest of this article covers only RANGE partitioning in detail.

Hands-on

Prepare a test table and test data

drop table if exists test_table;
create table if not exists test_table (
    data_date date,
    json_str json
);
insert into test_table values('2019-01-01','{"name":"xavier","age":"18","gender":"male"}');
insert into test_table values('2019-01-02','{"name":"xavier","age":"19","gender":"male"}');
insert into test_table values('2019-01-02','{"name":"xavier","age":"20","gender":"male"}');
insert into test_table values('2019-01-03','{"name":"xavier","age":"21","gender":"male"}');
insert into test_table values('2019-01-03','{"name":"xavier","age":"21","gender":"male"}');
insert into test_table values('2019-01-03','{"name":"xavier","age":"22","gender":"male"}');
insert into test_table values('2019-01-04','{"name":"xavier","age":"23","gender":"male"}');
insert into test_table values('2019-01-04','{"name":"xavier","age":"24","gender":"male"}');
insert into test_table values('2019-01-04','{"name":"xavier","age":"25","gender":"male"}');
insert into test_table values('2019-01-04','{"name":"xavier","age":"26","gender":"male"}');
insert into test_table values('2019-01-05','{"name":"xavier","age":"27","gender":"male"}');
insert into test_table values('2019-01-05','{"name":"xavier","age":"28","gender":"male"}');
insert into test_table values('2019-01-05','{"name":"xavier","age":"29","gender":"male"}');
insert into test_table values('2019-01-05','{"name":"xavier","age":"30","gender":"male"}');
insert into test_table values('2019-01-05','{"name":"xavier","age":"31","gender":"male"}');

Check the execution plan of the ordinary table

explain select * from test_table;

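The plan shows no partition information and a full table scan. Next, create a table partitioned by date and load the test data (note: the partitions must be defined in advance):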

drop table if exists range_table;
create table if not exists range_table (
    data_date datetime,
    json_str json
)
partition by range (to_days(data_date)) (
    partition p_20190101 values less than (to_days('2019-01-02')),
    partition p_20190102 values less than (to_days('2019-01-03')),
    partition p_20190103 values less than (to_days('2019-01-04')),
    partition p_20190104 values less than (to_days('2019-01-05')),
    partition p_maxvalue values less than maxvalue  -- catch-all for dates beyond the defined ranges
);
insert into range_table values('2019-01-01','{"name":"xavier","age":"18","gender":"male"}');
insert into range_table values('2019-01-02','{"name":"xavier","age":"19","gender":"male"}');
insert into range_table values('2019-01-02','{"name":"xavier","age":"20","gender":"male"}');
insert into range_table values('2019-01-03','{"name":"xavier","age":"21","gender":"male"}');
insert into range_table values('2019-01-03','{"name":"xavier","age":"21","gender":"male"}');
insert into range_table values('2019-01-03','{"name":"xavier","age":"22","gender":"male"}');
insert into range_table values('2019-01-04','{"name":"xavier","age":"23","gender":"male"}');
insert into range_table values('2019-01-04','{"name":"xavier","age":"24","gender":"male"}');
insert into range_table values('2019-01-04','{"name":"xavier","age":"25","gender":"male"}');
insert into range_table values('2019-01-04','{"name":"xavier","age":"26","gender":"male"}');

Check the execution plan of the partitioned table

explain select * from range_table;

The plan now includes partition information, which shows that partitioning is in effect. Next, query the data for '2019-01-03':

explain select * from range_table where data_date='2019-01-03';

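The plan goes straight to partition p_20190103, and the rows column (an estimate) shows 3. The benefit of partitioning is already clear: the query no longer needs to scan the whole table.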
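The pruning result can be cross-checked, and its limits explored, with a few more statements. This is a sketch that assumes range_table and the rows inserted above; the remark about function-wrapped columns reflects the general rule that pruning needs a comparison on the partitioning column itself, so compare the partitions column of each plan on your own server rather than taking it as given.

```sql
-- Read one partition directly: the three 2019-01-03 rows inserted above live here.
select count(*) from range_table partition (p_20190103);

-- A range predicate on the bare column is still eligible for pruning.
explain select * from range_table
where data_date >= '2019-01-03' and data_date < '2019-01-04';

-- Wrapping the column in an arbitrary function generally defeats pruning,
-- so the partitions column is expected to list every partition.
explain select * from range_table
where date_format(data_date, '%Y-%m-%d') = '2019-01-03';
```

On MySQL 5.6 the partitions column only appears with explain partitions; from 5.7 onward plain explain includes it.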

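Addendum

Because the partitions are defined in advance and are not added dynamically, a dedicated SQL statement is needed to add new ones. Here the catch-all partition p_maxvalue is split so that a new daily partition is inserted before it: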

alter table range_table reorganize partition p_maxvalue into (
     partition p_20190105 values less than (to_days('2019-01-06'))
    ,partition p_maxvalue values less than (maxvalue)
);
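The reason for using reorganize partition rather than add partition is that new RANGE partitions can only be appended after the last one, and the last one here already covers MAXVALUE. The sketch below assumes the reorganize statement above has been run; the error described in the final comment is quoted from memory and should be treated as an assumption.

```sql
-- A row dated 2019-01-05 should now land in the new partition.
insert into range_table values('2019-01-05','{"name":"xavier","age":"27","gender":"male"}');
select count(*) from range_table partition (p_20190105);

-- Appending a partition after p_maxvalue is expected to be rejected, with an
-- error along the lines of "MAXVALUE can only be used in last partition definition".
alter table range_table add partition (
    partition p_20190106 values less than (to_days('2019-01-07'))
);
```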

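More detailed partition information can be queried from information_schema.partitions (the query below assumes the table lives in a schema named test):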

select table_name,partition_name,subpartition_name,partition_ordinal_position,partition_method,partition_expression,partition_description,table_rows,avg_row_length,data_length,create_time,update_time
from information_schema.partitions where table_schema='test' and table_name = 'range_table';
