MySQL Partitioning (Part 1)

Contents

I. Partitioning Overview

II. Partition Types

1. Range Partitioning

2. List Partitioning

3. Columns Partitioning

4. Hash Partitioning

5. Key Partitioning

III. Notes


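I. Partitioning Overview

Partitioning means that the database splits a table into multiple smaller, more manageable parts according to a set of rules. It makes very large tables easier to manage.

The main advantages of MySQL partitioning fall into four areas:

  • A partitioned table can store more data than fits on a single disk or file system;
  • Queries can be optimized. When the WHERE clause contains the partitioning condition, only the matching partitions are scanned, which narrows the search. Aggregates such as COUNT() and SUM() are also, in principle, easy to process in parallel across partitions, although whether this actually happens depends on the server;
  • Expired or unneeded data can be removed quickly by dropping the partition that holds it;
  • Data seeks can be spread across multiple disks for greater query throughput.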
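The pruning and fast-delete benefits listed above can be sketched as follows. The table orders_demo and its columns are hypothetical names used only for illustration. On MySQL 5.5 and 5.6 the partitions column appears only with EXPLAIN PARTITIONS; later versions show it with plain EXPLAIN.

```sql
-- Hypothetical table, partitioned by order year (names are illustrative only).
CREATE TABLE orders_demo (
    order_id INT NOT NULL,
    order_year INT NOT NULL,
    amount DECIMAL(10,2) NOT NULL
)
PARTITION BY RANGE (order_year) (
    PARTITION p2021 VALUES LESS THAN (2022),
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);

-- Partition pruning: the WHERE clause uses the partitioning column, so the
-- "partitions" column of the plan should list only p2022.
EXPLAIN SELECT SUM(amount) FROM orders_demo WHERE order_year = 2022;

-- Fast removal of expired data: dropping a partition discards its rows
-- without a row-by-row DELETE.
ALTER TABLE orders_demo DROP PARTITION p2021;
```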

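To check whether the current server supports partitioning, run the following statement (on MySQL 5.5 to 5.7, look for a row named partition with status ACTIVE):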

SHOW PLUGINS;
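As an alternative to scanning the full plugin list, the same information can be queried from INFORMATION_SCHEMA. This is a small sketch that assumes a MySQL 5.5 to 5.7 server. As far as I know, MySQL 8.0 no longer lists a separate partition plugin because partitioning is provided by the storage engine, so an empty result there does not mean partitioning is unavailable.

```sql
-- On MySQL 5.5 to 5.7 this returns one row with PLUGIN_STATUS = 'ACTIVE'
-- when partitioning is available.
SELECT PLUGIN_NAME, PLUGIN_VERSION, PLUGIN_STATUS
FROM INFORMATION_SCHEMA.PLUGINS
WHERE PLUGIN_NAME = 'partition';
```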

For details, see the official MySQL partitioning documentation.

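II. Partition Types

Since MySQL 5.5 there are five main kinds of partitioning:

  • RANGE partitioning: a row belongs to the partition whose range contains its column value;
  • LIST partitioning: similar to RANGE, except that each partition is defined by a set of discrete values rather than a range; a row belongs to the partition whose set contains its column value;
  • COLUMNS partitioning: comes in two forms, RANGE COLUMNS and LIST COLUMNS, which extend RANGE and LIST partitioning respectively;
  • HASH partitioning: rows are distributed over a given number of partitions according to the value of a user-supplied expression;
  • KEY partitioning: similar to HASH partitioning, but the hashing function is supplied by MySQL.

These concepts are fairly abstract; the examples below make them easier to understand.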

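1. Range Partitioning

Syntax:

partition by range (expr) (
    partition pName values less than (val),
    .....
)

Here expr is a column, or an expression based on a column, whose value must be of an integer type [TINYINT, SMALLINT, MEDIUMINT, INT (INTEGER), BIGINT].

pName is the partition name; you can choose it freely.

less than (val): the partition holds the rows whose expr value is strictly less than val and not less than the bound of the previous partition.

val is the boundary value: an integer, or an expression that evaluates to an integer.

Example: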

CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT NOT NULL,
    store_id INT NOT NULL
)
PARTITION BY RANGE (store_id) (
    PARTITION p0 VALUES LESS THAN (6),
    PARTITION p1 VALUES LESS THAN (11),
    PARTITION p2 VALUES LESS THAN (16),
    PARTITION p3 VALUES LESS THAN MAXVALUE
);

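MAXVALUE stands for a value that is always greater than any possible integer value, so the last partition catches every row not covered by the earlier ones. (MAXVALUE is used to represent the least upper bound for the type of integer in question. -MAXVALUE represents the greatest lower bound.)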
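One practical consequence of a MAXVALUE catch-all is that no higher range can be appended after it, since partition bounds must be strictly increasing. A common workaround is to split the last partition instead. The statement below is a sketch against the store_id-partitioned employees table from the example above; the new bound 21 and the partition name p4 are illustrative choices.

```sql
-- Assumes the employees table partitioned by RANGE (store_id) shown above.
-- Split the catch-all partition p3 so that stores 16 to 20 get their own
-- partition while MAXVALUE still catches everything larger.
ALTER TABLE employees REORGANIZE PARTITION p3 INTO (
    PARTITION p3 VALUES LESS THAN (21),
    PARTITION p4 VALUES LESS THAN MAXVALUE
);
```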

Example (partitioning on an expression, here the year of the separated date):

CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY RANGE ( YEAR(separated) ) (
    PARTITION p0 VALUES LESS THAN (1991),
    PARTITION p1 VALUES LESS THAN (1996),
    PARTITION p2 VALUES LESS THAN (2001),
    PARTITION p3 VALUES LESS THAN MAXVALUE
);
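To see how the boundaries behave, the following sketch inserts three rows into the YEAR(separated) table above and reads one partition back. The sample names and dates are made up for illustration, and the PARTITION (...) clause in SELECT assumes MySQL 5.6 or later.

```sql
-- Assumes the employees table partitioned by RANGE ( YEAR(separated) ) above.
INSERT INTO employees (id, fname, lname, hired, separated, job_code, store_id) VALUES
    (1, 'Ann', 'Lee',  '1985-03-01', '1990-12-31', 10, 1),  -- YEAR = 1990 -> p0
    (2, 'Bob', 'Ray',  '1986-07-15', '1991-01-01', 10, 2),  -- YEAR = 1991 -> p1
    (3, 'Cid', 'Moss', '2005-02-20', '9999-12-31', 20, 3);  -- YEAR = 9999 -> p3

-- Explicit partition selection (MySQL 5.6 and later): read only p1.
SELECT id, separated FROM employees PARTITION (p1);  -- returns the row with id = 2
```

The second row shows that a bound is exclusive: 1991 is not less than 1991, so the row falls into p1 rather than p0.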

Example (partitioning on a TIMESTAMP column):

CREATE TABLE quarterly_report_status (
    report_id INT NOT NULL,
    report_status VARCHAR(20) NOT NULL,
    report_updated TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
)
PARTITION BY RANGE ( UNIX_TIMESTAMP(report_updated) ) (
    PARTITION p0 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-01-01 00:00:00') ),
    PARTITION p1 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-04-01 00:00:00') ),
    PARTITION p2 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-07-01 00:00:00') ),
    PARTITION p3 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-10-01 00:00:00') ),
    PARTITION p4 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-01-01 00:00:00') ),
    PARTITION p5 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-04-01 00:00:00') ),
    PARTITION p6 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-07-01 00:00:00') ),
    PARTITION p7 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-10-01 00:00:00') ),
    PARTITION p8 VALUES LESS THAN ( UNIX_TIMESTAMP('2010-01-01 00:00:00') ),
    PARTITION p9 VALUES LESS THAN (MAXVALUE)
);

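Because RANGE partitioning accepts only integer values, UNIX_TIMESTAMP() is used here to convert the TIMESTAMP column, and each boundary, to an integer.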
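UNIX_TIMESTAMP() is the conversion meant for TIMESTAMP columns. For DATE or DATETIME columns, a commonly used alternative is TO_DAYS(), which maps a date to a day number. The sketch below assumes a hypothetical table access_log_demo; all names in it are illustrative.

```sql
-- Hypothetical table; access_log_demo and its columns are illustrative names.
CREATE TABLE access_log_demo (
    log_id INT NOT NULL,
    logged_at DATETIME NOT NULL,
    message VARCHAR(200)
)
PARTITION BY RANGE ( TO_DAYS(logged_at) ) (
    PARTITION p2009h1 VALUES LESS THAN ( TO_DAYS('2009-07-01') ),
    PARTITION p2009h2 VALUES LESS THAN ( TO_DAYS('2010-01-01') ),
    PARTITION pmax    VALUES LESS THAN MAXVALUE
);
```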

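2. List Partitioning

Syntax:

partition by list(expr) (
    partition pName values in (val1,val2,...,valx),
    .....
)

The main difference between LIST and RANGE partitioning is that each LIST partition is defined by a list of discrete values, whereas each RANGE partition is defined by a contiguous range.

Example: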

CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY LIST(store_id) (
    PARTITION pNorth VALUES IN (3,5,6,9,17),
    PARTITION pEast VALUES IN (1,2,10,11,19,20),
    PARTITION pWest VALUES IN (4,12,13,14,18),
    PARTITION pCentral VALUES IN (7,8,15,16)
);

Note that if an inserted row cannot be assigned to any partition, the insert fails.

Example:

mysql> CREATE TABLE h2 (
    ->   c1 INT,
    ->   c2 INT
    -> )
    -> PARTITION BY LIST(c1) (
    ->   PARTITION p0 VALUES IN (1, 4, 7),
    ->   PARTITION p1 VALUES IN (2, 5, 8)
    -> );
Query OK, 0 rows affected (0.11 sec)

mysql> INSERT INTO h2 VALUES (3, 5);
ERROR 1525 (HY000): Table has no partition for value 3

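For a transactional storage engine such as InnoDB, this error rolls back the whole statement, so take care with bulk inserts: a single unmatched value means no rows are inserted. NULL is a common cause, because a NULL that is not in any value list also makes the insert fail.

In this situation you can use IGNORE, which skips the rows that match no partition and inserts the rest: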

mysql> INSERT IGNORE INTO h2 VALUES (2, 5), (6, 10), (7, 5), (3, 1), (1, 9);
Query OK, 3 rows affected (0.00 sec)
Records: 5  Duplicates: 2  Warnings: 0

mysql> SELECT * FROM h2;
+------+------+
| c1   | c2   |
+------+------+
|    7 |    5 |
|    1 |    9 |
|    2 |    5 |
+------+------+
3 rows in set (0.00 sec)
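INSERT IGNORE silently discards the unmatched rows, which is not always what you want. If NULL keys are legitimate data, another option is to list NULL explicitly in one partition. The sketch below uses a hypothetical variant of h2; the table name h2_with_null and the partition name pnull are illustrative.

```sql
-- Hypothetical variant of h2 with a dedicated partition for NULL keys.
CREATE TABLE h2_with_null (
    c1 INT,
    c2 INT
)
PARTITION BY LIST(c1) (
    PARTITION p0 VALUES IN (1, 4, 7),
    PARTITION p1 VALUES IN (2, 5, 8),
    PARTITION pnull VALUES IN (NULL)
);

-- Accepted, because NULL now appears in a value list.
INSERT INTO h2_with_null VALUES (NULL, 5);
```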

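3. Columns Partitioning

COLUMNS partitioning comes in two forms, RANGE COLUMNS and LIST COLUMNS, which are variants of RANGE and LIST partitioning respectively. COLUMNS partitioning allows multiple columns in the partitioning key. It also supports non-integer columns. The supported data types are:

  • all integer types: TINYINT, SMALLINT, MEDIUMINT, INT (INTEGER) and BIGINT;
  • DATE and DATETIME;
  • the string types CHAR, VARCHAR, BINARY and VARBINARY.

3.1 Range Columns Partitioning

Syntax:

partition by range columns( column_name [,column_name] [,...] ) (
    partition pName values less than ( val [,val] [,...] ) ,
    ....
)

RANGE COLUMNS partitioning differs from RANGE partitioning in these ways:

  • RANGE COLUMNS does not accept expressions, only column names (column_name);
  • RANGE COLUMNS accepts a list of one or more columns;
  • RANGE COLUMNS is not limited to integer columns; it also supports string, DATE and DATETIME columns.

Example: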

CREATE TABLE rcx (
       a INT,
       b INT,
       c CHAR(3),
       d INT
)
PARTITION BY RANGE COLUMNS(a,d,c) (
       PARTITION p0 VALUES LESS THAN (5,10,'ggg'),
       PARTITION p1 VALUES LESS THAN (10,20,'mmm'),
       PARTITION p2 VALUES LESS THAN (15,30,'sss'),
       PARTITION p3 VALUES LESS THAN (MAXVALUE,MAXVALUE,MAXVALUE)
);

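Note: the number of column_name entries must equal the number of val entries, and they correspond position by position. The comparison is between whole tuples, not between individual values.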

CREATE TABLE rc1 (
    a INT,
    b INT
)
PARTITION BY RANGE COLUMNS(a, b) (
    PARTITION p0 VALUES LESS THAN (5, 12),
    PARTITION p1 VALUES LESS THAN (MAXVALUE, MAXVALUE)
);

mysql> INSERT INTO rc1 VALUES (5,10), (5,11), (5,12);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> SELECT TABLE_SCHEMA,PARTITION_NAME,TABLE_ROWS
    ->     FROM INFORMATION_SCHEMA.PARTITIONS
    ->     WHERE TABLE_NAME = 'rc1';
+--------------+----------------+------------+
| TABLE_SCHEMA | PARTITION_NAME | TABLE_ROWS |
+--------------+----------------+------------+
| p            | p0             |          2 |
| p            | p1             |          1 |
+--------------+----------------+------------+
2 rows in set (0.00 sec)

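As the output shows, (5,10) and (5,11) went into p0 and (5,12) went into p1. (The PARTITIONS table in information_schema holds partition metadata for every table.) Comparing the tuples directly confirms this: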

mysql> SELECT (5,10) < (5,12), (5,11) < (5,12), (5,12) < (5,12);
+-----------------+-----------------+-----------------+
| (5,10) < (5,12) | (5,11) < (5,12) | (5,12) < (5,12) |
+-----------------+-----------------+-----------------+
|               1 |               1 |               0 |
+-----------------+-----------------+-----------------+
1 row in set (0.00 sec)
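The tuple comparison is decided by the first column that differs, so a large value in a later column does not matter once an earlier column has settled the result. A few extra comparisons, chosen here purely for illustration, make this visible:

```sql
-- Row (tuple) comparison is decided by the first column that differs.
SELECT (4,100) < (5,12) AS first_col_smaller,  -- 1: 4 < 5, so 100 vs 12 is never consulted
       (5,11)  < (5,12) AS tie_then_second,    -- 1: first columns tie, 11 < 12
       (6,0)   < (5,12) AS first_col_larger;   -- 0: 6 > 5

-- Equivalent expansion of (a,b) < (x,y) for the first case:
SELECT (4 < 5) OR (4 = 5 AND 100 < 12) AS expanded;  -- 1
```

In the rc1 table above, a row (4,100) would therefore land in p0 and a row (6,0) in p1.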

Example (RANGE COLUMNS on a DATE column):

CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT NOT NULL,
    store_id INT NOT NULL
);
ALTER TABLE employees PARTITION BY RANGE COLUMNS (hired)  (
    PARTITION p0 VALUES LESS THAN ('1970-01-01'),
    PARTITION p1 VALUES LESS THAN ('1980-01-01'),
    PARTITION p2 VALUES LESS THAN ('1990-01-01'),
    PARTITION p3 VALUES LESS THAN ('2000-01-01'),
    PARTITION p4 VALUES LESS THAN ('2010-01-01'),
    PARTITION p5 VALUES LESS THAN (MAXVALUE)
);

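3.2 List Columns Partitioning

LIST COLUMNS partitioning is a variant of LIST partitioning; the main difference is that it supports more column types, as well as multiple columns.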

CREATE TABLE customers_1 (
    first_name VARCHAR(25),
    last_name VARCHAR(25),
    street_1 VARCHAR(30),
    street_2 VARCHAR(30),
    city VARCHAR(15),
    renewal DATE
)
PARTITION BY LIST COLUMNS(city) (
    PARTITION pRegion_1 VALUES IN('Oskarshamn', 'Högsby', 'Mönsterås'),
    PARTITION pRegion_2 VALUES IN('Vimmerby', 'Hultsfred', 'Västervik'),
    PARTITION pRegion_3 VALUES IN('Nässjö', 'Eksjö', 'Vetlanda'),
    PARTITION pRegion_4 VALUES IN('Uppvidinge', 'Alvesta', 'Växjo')
);

The same table layout can instead be partitioned on the DATE column renewal:

CREATE TABLE customers_2 (
    first_name VARCHAR(25),
    last_name VARCHAR(25),
    street_1 VARCHAR(30),
    street_2 VARCHAR(30),
    city VARCHAR(15),
    renewal DATE
)
PARTITION BY LIST COLUMNS(renewal) (
    PARTITION pWeek_1 VALUES IN('2010-02-01', '2010-02-02', '2010-02-03',
        '2010-02-04', '2010-02-05', '2010-02-06', '2010-02-07'),
    PARTITION pWeek_2 VALUES IN('2010-02-08', '2010-02-09', '2010-02-10',
        '2010-02-11', '2010-02-12', '2010-02-13', '2010-02-14'),
    PARTITION pWeek_3 VALUES IN('2010-02-15', '2010-02-16', '2010-02-17',
        '2010-02-18', '2010-02-19', '2010-02-20', '2010-02-21'),
    PARTITION pWeek_4 VALUES IN('2010-02-22', '2010-02-23', '2010-02-24',
        '2010-02-25', '2010-02-26', '2010-02-27', '2010-02-28')
);
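LIST COLUMNS also accepts more than one column, in which case every entry of a value list is itself a tuple. The following is a minimal sketch; the table accounts_demo, its columns and the partition names are hypothetical.

```sql
-- Hypothetical table; each value-list entry is a (region_id, tier) tuple.
CREATE TABLE accounts_demo (
    account_id INT NOT NULL,
    region_id INT NOT NULL,
    tier INT NOT NULL
)
PARTITION BY LIST COLUMNS(region_id, tier) (
    PARTITION pNorthBasic VALUES IN ( (1,1), (2,1) ),
    PARTITION pNorthPlus  VALUES IN ( (1,2), (2,2) ),
    PARTITION pSouth      VALUES IN ( (3,1), (3,2) )
);

-- (2,2) matches an entry of pNorthPlus; a pair such as (4,1) appears in no
-- list and would be rejected, just as with plain LIST partitioning.
INSERT INTO accounts_demo VALUES (100, 2, 2);
```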

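4. Hash Partitioning

Syntax:

partition by hash (expr) partitions num

Here expr is an integer column or an expression that returns an integer, and num is a positive integer giving the number of partitions.

Example: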

CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY HASH(store_id)
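PARTITIONS 4;

The partitioning expression can also be a function of a column, for example the year of the hire date:

CREATE TABLE employees (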
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY HASH( YEAR(hired) )
PARTITIONS 4;

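HASH partitioning is mainly used to ensure that data is distributed evenly across the partitions.

The algorithm:

N = MOD(expr, num)

That is, the remainder of expr divided by the number of partitions num gives the number N of the partition that stores the row.

Note: the values of expr should be well spread out, ideally varying evenly and linearly with the column value; otherwise a large share of the rows ends up in a single partition and partitioning brings little benefit.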
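The modulus rule can be checked directly in SQL. The sketch below works through the HASH( YEAR(hired) ) PARTITIONS 4 table above; the sample date and store_id values are illustrative.

```sql
-- Which of the 4 partitions does a row hired on 2005-09-01 go to
-- under PARTITION BY HASH( YEAR(hired) ) PARTITIONS 4?
SELECT MOD(YEAR('2005-09-01'), 4) AS partition_number;  -- 2005 mod 4 = 1

-- A skewed expression fills partitions unevenly: store_id values that are
-- all multiples of 4 would every one map to partition 0.
SELECT MOD(4, 4), MOD(8, 4), MOD(12, 4);  -- 0, 0, 0
```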

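4.1 Linear Hash Partitioning

Syntax:

partition by linear hash (expr) partitions num

Differences from HASH partitioning:

  • The syntax adds the keyword LINEAR;
  • N is no longer computed with a simple modulus but with a linear powers-of-two algorithm; for details, see the LINEAR HASH Partitioning section of the MySQL manual.

Compared with HASH partitioning, the advantage is that adding, dropping, merging and splitting partitions is much faster, which can help with tables holding very large amounts of data (terabytes). The disadvantage is that data is less likely to be evenly distributed across the partitions than with regular hash partitioning.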
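The powers-of-two algorithm can be traced by hand. As described in the MySQL manual, V is the smallest power of two not less than num, N is expr & (V - 1), and while N >= num, V is halved and the mask is applied again. The sketch below walks through it for a hypothetical table with num = 6 partitions and YEAR() of two sample dates as expr.

```sql
-- Sketch of the powers-of-two algorithm for LINEAR HASH with num = 6 partitions.
-- Step 1: V is the next power of two >= num.
SELECT POWER(2, CEILING(LOG(2, 6))) AS V;            -- 8

-- Step 2: N = expr & (V - 1). For YEAR('2003-04-14') = 2003:
SELECT 2003 & (8 - 1) AS N;                          -- 3, and 3 < 6, so partition 3

-- Step 3: if N >= num, halve V and mask again. For YEAR('1998-10-19') = 1998:
SELECT 1998 & (8 - 1) AS N_first_try,                -- 6, not < 6, so retry
       1998 & (4 - 1) AS N_second_try;               -- 2, so partition 2
```

Because only bit masks are involved, changing the number of partitions moves rows between a small, predictable set of partitions, which is where the faster maintenance comes from.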

Example:

CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY LINEAR HASH( YEAR(hired) )
PARTITIONS 4;

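5. Key Partitioning

Syntax:

partition by key( [column_name] [,column_name] [,...] ) partitions num

KEY partitioning is similar to HASH partitioning, with these differences:

  • KEY partitioning takes only columns, not expressions;
  • KEY partitioning supports multiple columns;
  • KEY partitioning supports column types other than integers.

Example: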

CREATE TABLE k1 (
    id INT NOT NULL PRIMARY KEY,
    name VARCHAR(20)
)
PARTITION BY KEY()   -- defaults to the primary key
PARTITIONS 2;

CREATE TABLE k1 (
    id INT NOT NULL,
    name VARCHAR(20),
    UNIQUE KEY (id)
)
PARTITION BY KEY()  -- uses the unique key column
PARTITIONS 2;

CREATE TABLE tm1 (
    s1 CHAR(32) PRIMARY KEY
)
PARTITION BY KEY(s1)
PARTITIONS 10;
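The examples above each use a single column. A multi-column key works the same way, as in the sketch below; the table sessions_demo and its columns are hypothetical, and the follow-up query simply shows one way to inspect how rows are spread.

```sql
-- Hypothetical table; the composite primary key contains both partitioning columns.
CREATE TABLE sessions_demo (
    user_id INT NOT NULL,
    device CHAR(16) NOT NULL,
    started DATETIME,
    PRIMARY KEY (user_id, device)
)
PARTITION BY KEY(user_id, device)
PARTITIONS 8;

-- Approximate row counts per partition (TABLE_ROWS is an estimate for InnoDB).
SELECT PARTITION_NAME, TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'sessions_demo';
```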

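5.1 Linear Key Partitioning

The logic is analogous to LINEAR HASH partitioning: the partition number is derived with the powers-of-two algorithm instead of a modulus.

Example: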

CREATE TABLE tk (
    col1 INT NOT NULL,
    col2 CHAR(5),
    col3 DATE
)
PARTITION BY LINEAR KEY (col1)
PARTITIONS 3;

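III. Notes

  • If the table has a primary key (or any unique key), every column used in the partitioning expression must be part of that key.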
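A short sketch of this rule, using a hypothetical table staff_demo: the first definition is shown commented out because MySQL is expected to reject it, while the second satisfies the rule by adding the partitioning column to the primary key.

```sql
-- Rejected: the primary key (id) does not contain the partitioning column store_id.
-- CREATE TABLE staff_demo (
--     id INT NOT NULL,
--     store_id INT NOT NULL,
--     PRIMARY KEY (id)
-- )
-- PARTITION BY HASH(store_id) PARTITIONS 4;

-- Accepted: store_id is part of the primary key.
CREATE TABLE staff_demo (
    id INT NOT NULL,
    store_id INT NOT NULL,
    PRIMARY KEY (id, store_id)
)
PARTITION BY HASH(store_id)
PARTITIONS 4;
```

The trade-off is that id alone is no longer guaranteed unique by the key, so uniqueness of id has to be ensured some other way if it matters.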


 
