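Partitioning MySQL Tables by Day or by Month

MySQL Table Partitioning

Preface

When a table grows very large (a common rule of thumb is more than 5 million rows), consider either partitioning it or splitting it into multiple tables.

Compared with splitting into multiple tables, partitioning is simpler: the table's columns do not have to be split up, the application does not need to change, and no sharding middleware has to be introduced.

Table partitioning

Tables are commonly partitioned by user ID, by region, or by date.

Partitioning by date

Date-based partitions are usually daily or monthly; which one to choose depends on how much data each partition would hold.

Checking whether the database supports partitioning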

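show variables like '%partition%';

On MySQL 5.1 and 5.5 this returns have_partitioning with the value YES when partitioning is available. The variable was removed in MySQL 5.6, so an empty result on newer versions does not mean partitioning is unsupported.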
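
On servers where the query above returns nothing, the plugin list can be checked instead. The following is a minimal sketch, assuming MySQL 5.6 or 5.7, where partitioning is provided by a plugin named partition. On MySQL 8.0 the plugin is not listed, because partitioning is handled natively by the storage engine (InnoDB and NDB).

```sql
-- Assumes MySQL 5.6 / 5.7: partitioning support is exposed as a plugin.
-- A row with PLUGIN_STATUS = 'ACTIVE' means partitioning is available.
SELECT PLUGIN_NAME, PLUGIN_VERSION, PLUGIN_STATUS
FROM INFORMATION_SCHEMA.PLUGINS
WHERE PLUGIN_NAME = 'partition';
```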

Viewing a table's partition definition

show create table <table>;
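
For per-partition details such as row counts, INFORMATION_SCHEMA.PARTITIONS can be queried as well. This is a sketch, assuming a hypothetical table named visit_log in the current database. For InnoDB tables, TABLE_ROWS is only an estimate.

```sql
-- visit_log is a hypothetical table name used for illustration.
SELECT PARTITION_NAME,
       PARTITION_METHOD,
       PARTITION_EXPRESSION,
       PARTITION_DESCRIPTION,   -- the VALUES LESS THAN boundary
       TABLE_ROWS               -- approximate for InnoDB
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'visit_log'
ORDER BY PARTITION_ORDINAL_POSITION;
```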

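Adding partitions

Every column used in the partitioning expression must be part of the primary key and of every unique index on the table.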
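
The rule above usually means widening the primary key to include the date column. Here is a minimal sketch, assuming a hypothetical table visit_log. Only the column name visit_date comes from the examples in this article; the other names are made up for illustration.

```sql
-- Hypothetical table definition; visit_date is the partitioning column.
CREATE TABLE visit_log (
  id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  user_id    BIGINT UNSIGNED NOT NULL,
  visit_date DATE NOT NULL,
  -- The partitioning column is included in the primary key.
  PRIMARY KEY (id, visit_date)
) ENGINE = InnoDB;

-- For an existing table whose primary key is only (id), widen the key first.
-- This rebuilds the table, so it can be slow on large tables.
ALTER TABLE visit_log
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (id, visit_date);
```

If the key does not include the partitioning column, the partitioning statement is rejected with an error saying that a PRIMARY KEY must include all columns in the table's partitioning function.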

Partitioning by day:

ALTER TABLE <table> PARTITION BY RANGE (to_days(visit_date))
  (
    PARTITION p_20200401 VALUES LESS THAN (to_days('2020-04-02')),
    PARTITION p_20200402 VALUES LESS THAN (to_days('2020-04-03')),
    PARTITION p_20200403 VALUES LESS THAN (to_days('2020-04-04')),
    PARTITION p_20200404 VALUES LESS THAN (to_days('2020-04-05')),
    PARTITION p_20200405 VALUES LESS THAN (to_days('2020-04-06')),
    PARTITION p_20200406 VALUES LESS THAN (to_days('2020-04-07')),
    PARTITION p_20200407 VALUES LESS THAN (to_days('2020-04-08')),
    PARTITION p_20200408 VALUES LESS THAN (to_days('2020-04-09')),
    PARTITION p_20200409 VALUES LESS THAN (to_days('2020-04-10')),
    PARTITION p_20200410 VALUES LESS THAN (to_days('2020-04-11')),
    PARTITION p_20200411 VALUES LESS THAN (to_days('2020-04-12')),
    PARTITION p_20200412 VALUES LESS THAN (to_days('2020-04-13')),
    PARTITION p_20200413 VALUES LESS THAN (to_days('2020-04-14')),
    PARTITION p_20200414 VALUES LESS THAN (to_days('2020-04-15')),
    PARTITION p_20200415 VALUES LESS THAN (to_days('2020-04-16')),
    PARTITION p_20200416 VALUES LESS THAN (to_days('2020-04-17')),
    PARTITION p_20200417 VALUES LESS THAN (to_days('2020-04-18')),
    PARTITION p_20200418 VALUES LESS THAN (to_days('2020-04-19')),
    PARTITION p_20200419 VALUES LESS THAN (to_days('2020-04-20')),
    PARTITION p_20200420 VALUES LESS THAN (to_days('2020-04-21')),
    PARTITION p_20200421 VALUES LESS THAN (to_days('2020-04-22')),
    PARTITION p_20200422 VALUES LESS THAN (to_days('2020-04-23')),
    PARTITION p_20200423 VALUES LESS THAN (to_days('2020-04-24')),
    PARTITION p_20200424 VALUES LESS THAN (to_days('2020-04-25')),
    PARTITION p_20200425 VALUES LESS THAN (to_days('2020-04-26')),
    PARTITION p_20200426 VALUES LESS THAN (to_days('2020-04-27')),
    PARTITION p_20200427 VALUES LESS THAN (to_days('2020-04-28')),
    PARTITION p_20200428 VALUES LESS THAN (to_days('2020-04-29')),
    PARTITION p_20200429 VALUES LESS THAN (to_days('2020-04-30')),
    PARTITION p_other    VALUES LESS THAN (MAXVALUE)
  );

Partitioning by month:

ALTER TABLE <table> PARTITION BY RANGE (to_days(visit_date))
  (
    PARTITION p_202001 VALUES LESS THAN (to_days('2020-02-01')),
    PARTITION p_202002 VALUES LESS THAN (to_days('2020-03-01')),
    PARTITION p_202003 VALUES LESS THAN (to_days('2020-04-01')),
    PARTITION p_202004 VALUES LESS THAN (to_days('2020-05-01')),
    PARTITION p_202005 VALUES LESS THAN (to_days('2020-06-01')),
    PARTITION p_202006 VALUES LESS THAN (to_days('2020-07-01')),
    PARTITION p_202007 VALUES LESS THAN (to_days('2020-08-01')),
    PARTITION p_202008 VALUES LESS THAN (to_days('2020-09-01')),
    PARTITION p_202009 VALUES LESS THAN (to_days('2020-10-01')),
    PARTITION p_202010 VALUES LESS THAN (to_days('2020-11-01')),
    PARTITION p_202011 VALUES LESS THAN (to_days('2020-12-01')),
    PARTITION p_202012 VALUES LESS THAN (to_days('2021-01-01')),
    PARTITION p_other  VALUES LESS THAN (MAXVALUE)
  );
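
Because the last partition is bounded by MAXVALUE, later partitions cannot simply be appended with ADD PARTITION; the catch-all partition has to be split instead. This is a sketch, assuming the monthly layout above on a hypothetical table named visit_log.

```sql
-- Split the catch-all partition so that January 2021 gets its own partition.
ALTER TABLE visit_log REORGANIZE PARTITION p_other INTO (
  PARTITION p_202101 VALUES LESS THAN (TO_DAYS('2021-02-01')),
  PARTITION p_other  VALUES LESS THAN (MAXVALUE)
);

-- Drop a partition that is no longer needed. This deletes its rows.
ALTER TABLE visit_log DROP PARTITION p_202001;
```

Splitting p_other is cheapest while it is still empty, so new partitions are best created ahead of the dates they cover.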

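Early MySQL versions could only range-partition on integer expressions, which is why to_days() is used to convert the date to an integer. Since MySQL 5.5, RANGE COLUMNS partitioning accepts DATE and DATETIME columns directly. TIMESTAMP columns still have to be converted to an integer with UNIX_TIMESTAMP().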
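
Both variants can be sketched as follows. The table names visit_log and visit_event and the column created_at are hypothetical, and the partition lists are shortened for brevity. The unique-key rule still applies to both tables.

```sql
-- DATE or DATETIME column: partition on the column directly (MySQL 5.5 and later).
ALTER TABLE visit_log PARTITION BY RANGE COLUMNS (visit_date)
  (
    PARTITION p_202001 VALUES LESS THAN ('2020-02-01'),
    PARTITION p_202002 VALUES LESS THAN ('2020-03-01'),
    PARTITION p_other  VALUES LESS THAN (MAXVALUE)
  );

-- TIMESTAMP column: convert to an integer with UNIX_TIMESTAMP().
-- visit_event and created_at are hypothetical names.
ALTER TABLE visit_event PARTITION BY RANGE (UNIX_TIMESTAMP(created_at))
  (
    PARTITION p_202001 VALUES LESS THAN (UNIX_TIMESTAMP('2020-02-01 00:00:00')),
    PARTITION p_202002 VALUES LESS THAN (UNIX_TIMESTAMP('2020-03-01 00:00:00')),
    PARTITION p_other  VALUES LESS THAN (MAXVALUE)
  );
```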

Checking whether the execution plan uses partitions

EXPLAIN PARTITIONS SELECT count(1) FROM <table> WHERE visit_date = '2020-04-01';

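Check that the partitions column lists only the partitions you expect. EXPLAIN PARTITIONS is deprecated in MySQL 5.7 and was removed in MySQL 8.0; on those versions plain EXPLAIN already includes the partitions column.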
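
Partition pruning also applies to range conditions, provided the partitioning column is compared directly rather than wrapped in a function. This is a sketch against a hypothetical visit_log table partitioned by day as above. The partitions actually listed depend on the server version and the table definition, so no output is shown.

```sql
-- On MySQL 5.6 and earlier, write EXPLAIN PARTITIONS instead of EXPLAIN.

-- Range on the bare column: the optimizer can prune to the matching daily partitions.
EXPLAIN SELECT count(1) FROM visit_log
WHERE visit_date >= '2020-04-01' AND visit_date < '2020-04-08';

-- Wrapping the column in a function generally prevents pruning,
-- so the partitions column is likely to list every partition.
EXPLAIN SELECT count(1) FROM visit_log
WHERE DATE_FORMAT(visit_date, '%Y-%m') = '2020-04';
```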

Viewing the execution plan

EXPLAIN SELECT count(1) FROM <table> WHERE visit_date = '2020-04-01';

Notes on the execution plan:

  1. From best to worst, the type values rank as: system > const > eq_ref > ref > fulltext > ref_or_null > index_merge > unique_subquery > index_subquery > range > index > ALL
  2. As a rule, type should be no worse than range, and ideally ref or better
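
One common way to move a query from ALL toward ref is to add an index on the filtered columns. This is a sketch, assuming a hypothetical visit_log table with a user_id column. The index name is made up, and the type actually reported depends on the data and on the optimizer.

```sql
-- Without a usable index, a filter on user_id typically shows type = ALL
-- within the partitions that are scanned.
EXPLAIN SELECT count(1) FROM visit_log
WHERE user_id = 42 AND visit_date = '2020-04-01';

-- Add a secondary index covering both filter columns (hypothetical name).
ALTER TABLE visit_log ADD INDEX idx_user_date (user_id, visit_date);

-- Re-run EXPLAIN; an equality lookup on a non-unique index
-- is normally reported as type = ref.
EXPLAIN SELECT count(1) FROM visit_log
WHERE user_id = 42 AND visit_date = '2020-04-01';
```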
