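MySQL Table Partitioning
Preface
When a table grows very large (typically more than 5 million rows), consider either partitioning it or splitting it into multiple tables.
Compared with splitting tables, partitioning is simpler: the table's columns do not have to be split, the application does not need to change, and no sharding middleware has to be introduced.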
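Table Partitioning
Tables are commonly partitioned by user ID, by region, or by date.
Partitioning by Date
Partitioning by day or by month is typical; which one to choose depends on how much data each partition will hold.
Checking Whether the Database Supports Partitioning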
show variables like '%partition%';
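The variable check above mainly applies to older servers: as far as I know, the `have_partitioning` variable was removed in MySQL 5.6. On 5.6 and 5.7, a commonly used alternative is to look for the `partition` plugin, as in the sketch below. On MySQL 8.0 this query is expected to return no rows, because partitioning is provided natively by the storage engine (for example InnoDB) rather than by a separate plugin.

```sql
-- MySQL 5.6 / 5.7: an ACTIVE row for the `partition` plugin means partitioning is available.
-- MySQL 8.0: an empty result is normal; InnoDB handles partitioning natively.
SELECT PLUGIN_NAME, PLUGIN_VERSION, PLUGIN_STATUS
FROM INFORMATION_SCHEMA.PLUGINS
WHERE PLUGIN_NAME = 'partition';
```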
Viewing a Table's Partition Definition
show create table <table>;
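For a per-partition view rather than the full DDL, the `INFORMATION_SCHEMA.PARTITIONS` table can be queried. The sketch below assumes a hypothetical table named `visit_log` in the current database; substitute your own table name. Note that `TABLE_ROWS` is only an estimate for InnoDB tables.

```sql
-- `visit_log` is a hypothetical table name used for illustration.
SELECT PARTITION_NAME,
       PARTITION_METHOD,
       PARTITION_EXPRESSION,
       PARTITION_DESCRIPTION,  -- the VALUES LESS THAN boundary
       TABLE_ROWS              -- approximate row count for InnoDB
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'visit_log'
ORDER BY PARTITION_ORDINAL_POSITION;
```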
Adding Partitions
The partitioning column must be part of the primary key and of every unique index on the table.
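If the existing primary key does not include the date column, it usually has to be rebuilt before the table can be partitioned. A minimal sketch, assuming a hypothetical table `visit_log` whose current primary key is a single `id` column:

```sql
-- `visit_log` and `id` are hypothetical names; `visit_date` is the partitioning column.
-- Rebuild the primary key so that it contains the partitioning column.
-- This rewrites the table, so on a large table it should be planned as a maintenance operation.
ALTER TABLE visit_log
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (id, visit_date);
```

Any other unique index on the table would need the same treatment, since each one must also contain `visit_date`.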
Partition by day (note that converting an existing table uses PARTITION BY, not ADD PARTITION BY):
ALTER TABLE <table> PARTITION BY RANGE (to_days(visit_date))
(
PARTITION p_20200401 VALUES LESS THAN (to_days('2020-04-02')),
PARTITION p_20200402 VALUES LESS THAN (to_days('2020-04-03')),
PARTITION p_20200403 VALUES LESS THAN (to_days('2020-04-04')),
PARTITION p_20200404 VALUES LESS THAN (to_days('2020-04-05')),
PARTITION p_20200405 VALUES LESS THAN (to_days('2020-04-06')),
PARTITION p_20200406 VALUES LESS THAN (to_days('2020-04-07')),
PARTITION p_20200407 VALUES LESS THAN (to_days('2020-04-08')),
PARTITION p_20200408 VALUES LESS THAN (to_days('2020-04-09')),
PARTITION p_20200409 VALUES LESS THAN (to_days('2020-04-10')),
PARTITION p_20200410 VALUES LESS THAN (to_days('2020-04-11')),
PARTITION p_20200411 VALUES LESS THAN (to_days('2020-04-12')),
PARTITION p_20200412 VALUES LESS THAN (to_days('2020-04-13')),
PARTITION p_20200413 VALUES LESS THAN (to_days('2020-04-14')),
PARTITION p_20200414 VALUES LESS THAN (to_days('2020-04-15')),
PARTITION p_20200415 VALUES LESS THAN (to_days('2020-04-16')),
PARTITION p_20200416 VALUES LESS THAN (to_days('2020-04-17')),
PARTITION p_20200417 VALUES LESS THAN (to_days('2020-04-18')),
PARTITION p_20200418 VALUES LESS THAN (to_days('2020-04-19')),
PARTITION p_20200419 VALUES LESS THAN (to_days('2020-04-20')),
PARTITION p_20200420 VALUES LESS THAN (to_days('2020-04-21')),
PARTITION p_20200421 VALUES LESS THAN (to_days('2020-04-22')),
PARTITION p_20200422 VALUES LESS THAN (to_days('2020-04-23')),
PARTITION p_20200423 VALUES LESS THAN (to_days('2020-04-24')),
PARTITION p_20200424 VALUES LESS THAN (to_days('2020-04-25')),
PARTITION p_20200425 VALUES LESS THAN (to_days('2020-04-26')),
PARTITION p_20200426 VALUES LESS THAN (to_days('2020-04-27')),
PARTITION p_20200427 VALUES LESS THAN (to_days('2020-04-28')),
PARTITION p_20200428 VALUES LESS THAN (to_days('2020-04-29')),
PARTITION p_20200429 VALUES LESS THAN (to_days('2020-04-30')),
PARTITION p_other VALUES LESS THAN (MAXVALUE)
);
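Because the definition above ends with a catch-all `p_other` partition bounded by `MAXVALUE`, rows for 2020-04-30 and later land in `p_other`, and a plain `ADD PARTITION` cannot append a new range after it. One common approach is to split `p_other` with `REORGANIZE PARTITION`. The sketch below assumes a hypothetical table name `visit_log`:

```sql
-- `visit_log` is a hypothetical table name.
-- Split the catch-all partition to create a dedicated partition for 2020-04-30.
ALTER TABLE visit_log REORGANIZE PARTITION p_other INTO (
  PARTITION p_20200430 VALUES LESS THAN (TO_DAYS('2020-05-01')),
  PARTITION p_other    VALUES LESS THAN (MAXVALUE)
);

-- Expired data can be removed by dropping a whole partition,
-- which is generally much cheaper than a large DELETE.
ALTER TABLE visit_log DROP PARTITION p_20200401;
```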
Partition by month:
ALTER TABLE <table> PARTITION BY RANGE (to_days(visit_date))
(
PARTITION p_202001 VALUES LESS THAN (to_days('2020-02-01')),
PARTITION p_202002 VALUES LESS THAN (to_days('2020-03-01')),
PARTITION p_202003 VALUES LESS THAN (to_days('2020-04-01')),
PARTITION p_202004 VALUES LESS THAN (to_days('2020-05-01')),
PARTITION p_202005 VALUES LESS THAN (to_days('2020-06-01')),
PARTITION p_202006 VALUES LESS THAN (to_days('2020-07-01')),
PARTITION p_202007 VALUES LESS THAN (to_days('2020-08-01')),
PARTITION p_202008 VALUES LESS THAN (to_days('2020-09-01')),
PARTITION p_202009 VALUES LESS THAN (to_days('2020-10-01')),
PARTITION p_202010 VALUES LESS THAN (to_days('2020-11-01')),
PARTITION p_202011 VALUES LESS THAN (to_days('2020-12-01')),
PARTITION p_202012 VALUES LESS THAN (to_days('2021-01-01')),
PARTITION p_other VALUES LESS THAN (MAXVALUE)
);
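Early MySQL versions only supported integer expressions for range partitioning, so the to_days() function is needed to convert a date into an integer. Newer versions (MySQL 5.5 and later) can partition directly on DATE or DATETIME columns with RANGE COLUMNS. For TIMESTAMP columns, the UNIX_TIMESTAMP() function is still required to convert the timestamp into an integer.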
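The two variants described here can be sketched as follows. The table names `visit_log` and `visit_event` and the column `created_at` are hypothetical, and only a few partitions are shown for brevity.

```sql
-- DATE or DATETIME column: RANGE COLUMNS compares the column value directly,
-- so no TO_DAYS() conversion is needed.
ALTER TABLE visit_log PARTITION BY RANGE COLUMNS (visit_date) (
  PARTITION p_202001 VALUES LESS THAN ('2020-02-01'),
  PARTITION p_202002 VALUES LESS THAN ('2020-03-01'),
  PARTITION p_other  VALUES LESS THAN (MAXVALUE)
);

-- TIMESTAMP column: the value must be converted to an integer with UNIX_TIMESTAMP().
ALTER TABLE visit_event PARTITION BY RANGE (UNIX_TIMESTAMP(created_at)) (
  PARTITION p_202001 VALUES LESS THAN (UNIX_TIMESTAMP('2020-02-01 00:00:00')),
  PARTITION p_202002 VALUES LESS THAN (UNIX_TIMESTAMP('2020-03-01 00:00:00')),
  PARTITION p_other  VALUES LESS THAN (MAXVALUE)
);
```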
Checking Whether the Execution Plan Uses Partitions
EXPLAIN PARTITIONS SELECT count(1) FROM <table> WHERE visit_date = '2020-04-01';
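Check that the partitions column of the output lists only the expected partitions. Note that EXPLAIN PARTITIONS was deprecated in MySQL 5.7 and removed in 8.0; in those versions plain EXPLAIN already includes the partitions column.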
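Partition pruning generally depends on the WHERE clause comparing the partitioning column itself against constants. The following sketch, using a hypothetical table `visit_log` partitioned by day on `visit_date`, contrasts a condition that should allow pruning with one that usually does not; the exact partitions listed may vary by version.

```sql
-- Range condition directly on the column: the partitions column
-- should list only a small number of partitions.
EXPLAIN SELECT COUNT(*) FROM visit_log
WHERE visit_date >= '2020-04-01' AND visit_date < '2020-04-04';

-- Column wrapped in a function: the optimizer usually cannot prune,
-- so the partitions column tends to list every partition.
EXPLAIN SELECT COUNT(*) FROM visit_log
WHERE DATE_FORMAT(visit_date, '%Y-%m-%d') = '2020-04-01';

-- Explicit partition selection (MySQL 5.6 and later) reads only the named partition.
SELECT COUNT(*) FROM visit_log PARTITION (p_20200401);
```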
Viewing the Execution Plan
EXPLAIN SELECT count(1) FROM <table> WHERE visit_date = '2020-04-01';
Notes on the execution plan:
- From best to worst, the type values rank as: system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL
- As a rule, type should be no worse than range, and ideally ref or better
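As an illustration of moving a query up this ranking, the sketch below adds a secondary index on the filtered column. The table `visit_log` and the index name `idx_visit_date` are hypothetical. Without a usable index, the equality query would typically show type ALL within the selected partition; with the index, it would typically show ref.

```sql
-- Hypothetical index on the column used in the WHERE clause.
CREATE INDEX idx_visit_date ON visit_log (visit_date);

-- Re-check the plan: look at the type, key, and rows columns.
EXPLAIN SELECT COUNT(1) FROM visit_log WHERE visit_date = '2020-04-01';
```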