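The problem
Today a colleague from operations reported that a list page in our MIS admin backend would not open. After checking every SQL statement on the page, we found several that took more than a second each. They all look roughly alike, so here is one of them: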
MySQL [ymtprice2]> select count(*) as count from t_price where day_time >= '2020-02-08' and status = 2;
+--------+
| count |
+--------+
| 312516 |
+--------+
1 row in set (1.44 sec)
Let's use count(*) to see how much data the table holds:
MySQL > select count(*) from t_price;
+----------+
| count(*) |
+----------+
| 1345590 |
+----------+
1 row in set (0.19 sec)
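Only a little over a million rows, which is not much. Most likely an index is not being used as intended, so let's check the column order of the composite indexes: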
MySQL > show index from t_price;
+---------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| t_price | 0 | PRIMARY | 1 | id | A | 1252838 | NULL | NULL | | BTREE | | |
| t_price | 1 | product | 1 | day_time | A | 984 | NULL | NULL | | BTREE | | |
| t_price | 1 | product | 2 | status | A | 1954 | NULL | NULL | | BTREE | | |
| t_price | 1 | product | 3 | product_id | A | 250567 | NULL | NULL | | BTREE | | |
| t_price | 1 | market | 1 | day_time | A | 986 | NULL | NULL | | BTREE | | |
| t_price | 1 | market | 2 | status | A | 1792 | NULL | NULL | | BTREE | | |
| t_price | 1 | market | 3 | market_id | A | 73696 | NULL | NULL | | BTREE | | |
| t_price | 1 | day_time | 1 | day_time | A | 964 | NULL | NULL | | BTREE | | |
| t_price | 1 | status | 1 | status | A | 6 | NULL | NULL | | BTREE | | |
| t_price | 1 | province_id | 1 | province_id | A | 62 | NULL | NULL | | BTREE | | |
| t_price | 1 | customer_id | 1 | customer_id | A | 2047 | NULL | NULL | | BTREE | | |
+---------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
11 rows in set (0.00 sec)
Nothing looks wrong there, so let's look at the execution plan of the same query with explain:
+----+-------------+---------+------+--------------------------------+--------+---------+-------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+------+--------------------------------+--------+---------+-------+--------+-------------+
| 1 | SIMPLE | t_price | ref | product,market,day_time,status | status | 1 | const | 626419 | Using where |
+----+-------------+---------+------+--------------------------------+--------+---------+-------+--------+-------------+
1 row in set (0.00 sec)
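There is the problem: the index actually used is status, while we wanted the query to use product. But why does the optimizer pick the single-column index status when the query clearly satisfies the leftmost-prefix rule of the composite index product?
A point from High Performance MySQL came to mind: the optimizer chooses an index based on the number of rows it expects to scan, and when the row counts are close, it picks the index with the lowest cost. Lowest cost here means balancing two factors:
- Fewest rows scanned, to reduce the number of comparisons
- Fewest index pages, to reduce disk I/O
Back to our problem: the query does not fetch any actual columns, it only counts rows, so the optimizer judged that the single-column status index would incur far less disk I/O than the composite product index and did not use product. The fix is simple, just add an index hint: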
MySQL [ymtprice2]> select count(*) as count from t_price force index (product) where day_time >= '2020-02-08' and status = 2;
+--------+
| count |
+--------+
| 312516 |
+--------+
1 row in set (0.11 sec)
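A hint pins the query to one index even if the data later changes, so it may be worth trying a couple of less intrusive options as well. The sketch below reuses the table and query from above. The index name idx_status_day is hypothetical and not part of the original schema, and whether any of these steps changes the plan depends on the MySQL version and the data.

```sql
-- Refresh index statistics; stale cardinality estimates can skew index choice.
ANALYZE TABLE t_price;

-- Rule out only the single-column index instead of forcing a specific one.
EXPLAIN SELECT COUNT(*) AS count
FROM t_price IGNORE INDEX (status)
WHERE day_time >= '2020-02-08' AND status = 2;

-- Hypothetical index (an assumption, not in the original schema):
-- equality column first, range column second, so both conditions
-- can narrow the index range for this query.
ALTER TABLE t_price ADD INDEX idx_status_day (status, day_time);
```

The table already carries many indexes, so adding another one trades faster reads for slower writes and should be weighed against that.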
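After the fix, I remembered having looked up index failure online before, and the articles contradicted one another in places. In the spirit of "truth comes from experiment", I ran some experiments of my own for the different cases and summarize the results here.
Preparing the data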
CREATE TABLE `test` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(50) NOT NULL DEFAULT '',
`age` int(11) NOT NULL DEFAULT 0,
`nick` varchar(200) DEFAULT NULL,
`status` int(11) NOT NULL DEFAULT 0 COMMENT '状态',
PRIMARY KEY (`id`),
KEY `name` (`name`) USING BTREE,
KEY `age` (`age`) USING BTREE,
KEY `nick` (`nick`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8
The nick column is indexed but allows NULL (it is used to check whether a NULL test can use the index).
for ($i = 1; $i <= 30000; $i++){
$tmp = [
'name' => 'abc' . rand(0, 10000),
'age' => rand(0, 10000),
'status' => $i % 2,
];
if ($i % 5 != 0){
$tmp['nick'] = 'xyz' . rand(0, 10000);
}
$table->insert($tmp);
}
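is null and is not null when the column allows NULL
First we set up the data distribution. This ratio was tested twice. In both runs select * did not use the index, with not null rows making up just over 17% of the table, whereas selecting only the nick column, which satisfies the covering-index condition, did use the index:
- With 5136 not null rows (17.12% of the table), the nick is not null query does not use the index
- With 5148 not null rows (17.16% of the table), the nick is not null query does not use the index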
update test set nick = null where id >= 35150;
update test set nick = 'abc' where id < 35150;
mysql> select count(*) from test where nick is null;
+----------+
| count(*) |
+----------+
| 24852 |
+----------+
1 row in set (0.01 sec)
mysql> select count(*) from test where nick is not null;
+----------+
| count(*) |
+----------+
| 5148 |
+----------+
1 row in set (0.03 sec)
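The percentages quoted in this section can be reproduced with a single query rather than by hand. This is a minimal sketch against the test table defined above; the column aliases are made up for illustration.

```sql
-- Share of rows matching nick IS NOT NULL, as a percentage of the table.
-- In MySQL a boolean expression evaluates to 1 or 0, so SUM() counts matches.
SELECT COUNT(*)                                         AS total_rows,
       SUM(nick IS NOT NULL)                            AS not_null_rows,
       ROUND(100 * SUM(nick IS NOT NULL) / COUNT(*), 2) AS not_null_pct
FROM test;
```

With the counts shown above (5148 not null rows out of 30000), not_null_pct comes out as 17.16.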
When selecting all columns, the index is not used:
mysql> explain select * from test where nick is not null;
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| 1 | SIMPLE | test | ALL | nick | NULL | NULL | NULL | 29839 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.01 sec)
Now select only the nick column to try the covering-index case:
mysql> explain select nick from test where nick is not null;
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| 1 | SIMPLE | test | range | nick | nick | 603 | NULL | 5146 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
1 row in set (0.02 sec)
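We then move the boundary by one row and run select * again. In our run the index was used at this point, although the output captured below still shows a full scan (type ALL), so the exact tipping point should be treated as approximate: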
update test set nick = null where id >= 35149;
update test set nick = 'abc' where id < 35149;
mysql> explain select * from test where nick is not null;
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| 1 | SIMPLE | test | ALL | nick | NULL | NULL | NULL | 29905 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.02 sec)
is null is far less controversial online, so we leave the data distribution unchanged and test the is null case directly (selecting only nick here):
mysql> explain select nick from test where nick is null;
+----+-------------+-------+------+---------------+------+---------+-------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+-------+-------+--------------------------+
| 1 | SIMPLE | test | ref | nick | nick | 603 | const | 14952 | Using where; Using index |
+----+-------------+-------+------+---------------+------+---------+-------+-------+--------------------------+
1 row in set (0.01 sec)
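Conclusion: is null can use the index. is not null most likely will not use the index once the matching rows exceed roughly 17% of the table (the ratio fluctuated between tests, hence "most likely"), but it is not true that it never uses the index, as most articles online claim. When the covering-index condition is met, the index is used as well.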
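One way to see why the optimizer flips between the two plans is to compare its own cost estimates. The sketch below relies on the session status variable Last_query_cost, which reports the optimizer's estimated cost of the last compiled query. The numbers are estimates in the optimizer's internal units, not timings, and will differ between versions and data sets.

```sql
-- Plan chosen by the optimizer (a full scan in the experiments above).
SELECT * FROM test WHERE nick IS NOT NULL;
SHOW STATUS LIKE 'Last_query_cost';

-- Same query forced onto the nick index, for comparison.
SELECT * FROM test FORCE INDEX (nick) WHERE nick IS NOT NULL;
SHOW STATUS LIKE 'Last_query_cost';
```

If the forced plan reports the higher cost, that matches the optimizer's decision to skip the index: each matching index entry requires an extra lookup back to the clustered index to fetch the remaining columns.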
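The != condition (pay attention here: many articles online insist that != never uses an index)
The procedure is the same as above, again tested twice, and again a matching ratio within about 17% turned out to be the safe range. Here is one of the runs with its results: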
update test set name = '' where id >= 35150;
update test set name = 'abc' where id < 35150;
mysql> select count(*) from test where name != '';
+----------+
| count(*) |
+----------+
| 5148 |
+----------+
1 row in set (0.04 sec)
mysql> select count(*) from test where name = '';
+----------+
| count(*) |
+----------+
| 24852 |
+----------+
1 row in set (0.01 sec)
mysql> explain select * from test where name != '';
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| 1 | SIMPLE | test | ALL | name | NULL | NULL | NULL | 29839 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.02 sec)
mysql> explain select name from test where name != '';
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| 1 | SIMPLE | test | range | name | name | 152 | NULL | 5148 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
1 row in set (0.01 sec)
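Conclusion: != can use the index. != most likely will not use the index once the matching rows exceed roughly 17% of the table (the ratio fluctuated between tests, hence "most likely"), but it is not true that it never uses the index, as most articles online claim. When the covering-index condition is met, the index is used as well.
The or case
While the status column has no index, the query does a full table scan: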
explain select * from test where age = 2644 or status = 1;
After adding an index on status, the query uses both the age and status indexes:
explain select * from test where age = 2644 or status = 1;
Conclusion: whether or can use an index depends on whether the columns on both sides of the or are indexed.
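The post does not show the statement used to add the index or the resulting plans, so here is a sketch of how the experiment might be repeated. The index name is an assumption, and whether the optimizer actually combines the two indexes depends on the version and on how selective each condition is.

```sql
-- Add the missing index on status (the original table definition has none).
ALTER TABLE test ADD INDEX `status` (`status`);

-- With both columns indexed, the optimizer may combine them (index merge).
EXPLAIN SELECT * FROM test WHERE age = 2644 OR status = 1;

-- Alternative: one branch per index. UNION removes duplicates, and since
-- every row carries a unique id, the result matches the OR query.
EXPLAIN
SELECT * FROM test WHERE age = 2644
UNION
SELECT * FROM test WHERE status = 1;
```

The UNION form makes the per-branch index use explicit, which can help when the optimizer declines to merge indexes on its own.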
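The not in case
We test not in in two scenarios, a primary key index and an ordinary (secondary) index, and in each we query all columns, a column not covered by the index, and a column covered by the index.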
mysql> explain select * from test where age not in (9837);
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| 1 | SIMPLE | test | ALL | age | NULL | NULL | NULL | 29839 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.03 sec)
mysql> explain select name from test where age not in (9837);
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| 1 | SIMPLE | test | ALL | age | NULL | NULL | NULL | 29839 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.01 sec)
mysql> explain select age from test where age not in (9837);
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| 1 | SIMPLE | test | range | age | age | 4 | NULL | 15389 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
1 row in set (0.08 sec)
mysql> explain select * from test where id not in (9837);
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
| 1 | SIMPLE | test | range | PRIMARY | PRIMARY | 4 | NULL | 14920 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
1 row in set (0.01 sec)
mysql> explain select name from test where id not in (9837);
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
| 1 | SIMPLE | test | range | PRIMARY | PRIMARY | 4 | NULL | 14920 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
1 row in set (0.01 sec)
mysql> explain select age from test where id not in (9837);
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
| 1 | SIMPLE | test | range | PRIMARY | PRIMARY | 4 | NULL | 14920 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
1 row in set (0.01 sec)
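Conclusion: with the primary key index, not in can use the index whatever columns are selected; with a secondary index, it can use the index when the query is covered by that index. In both cases the reason is that no extra lookup back to the table is needed.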
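Covering is slightly broader than "select only the indexed column". In InnoDB every secondary index entry also stores the primary key, so a query on the age index that additionally returns id should still count as covered. A small sketch to check this on the test table; the actual plans may vary with version and data.

```sql
-- id is stored in every InnoDB secondary index entry,
-- so the age index alone can answer this query.
EXPLAIN SELECT id, age FROM test WHERE age NOT IN (9837);

-- name is not part of the age index, so each match would need
-- a lookup back to the clustered index.
EXPLAIN SELECT name, age FROM test WHERE age NOT IN (9837);
```

If the first plan shows Using index in the Extra column and the second does not, the difference is exactly the lookup back to the table discussed above.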
The having case
We again test having in two scenarios, one with the primary key index and one with an ordinary index.
With an ordinary index
mysql> explain select name from test where age HAVING (2644);
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| 1 | SIMPLE | test | ALL | NULL | NULL | NULL | NULL | 29839 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.01 sec)
mysql> explain select age from test where age HAVING (2644);
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| 1 | SIMPLE | test | index | NULL | age | 4 | NULL | 29839 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
1 row in set (0.01 sec)
With the primary key index
mysql> explain select status from test where id HAVING (12345);
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| 1 | SIMPLE | test | ALL | NULL | NULL | NULL | NULL | 29839 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.01 sec)
mysql> explain select name from test where id HAVING (12345);
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| 1 | SIMPLE | test | index | NULL | name | 152 | NULL | 29839 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
1 row in set (0.01 sec)
mysql> explain select age from test where id HAVING (12345);
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| 1 | SIMPLE | test | index | NULL | age | 4 | NULL | 29839 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
1 row in set (0.01 sec)
mysql> explain select id from test where id HAVING (12345);
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| 1 | SIMPLE | test | index | NULL | age | 4 | NULL | 29839 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
1 row in set (0.01 sec)
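Conclusion: with having on an ordinary index column, the index is used only when the query is covered by that index; with having on the primary key, the index is used only when the selected column is itself indexed. Note that every plan above that shows an index has type index, that is, a full scan of the index rather than a lookup. MySQL reads where age HAVING (2644) as a truth test on age followed by a constant having condition, so these results say little about having with a real group condition.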
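The queries above put a bare constant after having. For comparison, here is a sketch of a more conventional form, where where filters rows before grouping and having filters the aggregated groups. The threshold values are arbitrary and the plan you get may differ.

```sql
-- Filter on the indexed column in WHERE so the age index can narrow the scan,
-- then let HAVING filter the groups produced by GROUP BY.
EXPLAIN SELECT age, COUNT(*) AS cnt
FROM test
WHERE age >= 2644
GROUP BY age
HAVING cnt > 5;
```

Conditions on plain columns generally belong in where, since having is applied only after the rows have been read and grouped.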
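Other common cases
- Implicit conversion: when the value passed in does not match the column type, the index may not be usable; the typical case is comparing a string column with a numeric literal
- A composite index used without following the leftmost-prefix rule is only partly used or not used at all, though there are special cases
- When an index lookup is expected to return a large share of the table (80% is the figure usually quoted, though the experiments above saw full scans from about 17%), the optimizer chooses not to use the index
- A like pattern with a leading wildcard cannot use the index, because it breaks the leftmost-prefix rule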
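Two of the cases in this list are easy to check on the test table from earlier. The literal values below are arbitrary, and the plans are what one would typically expect rather than guaranteed output.

```sql
-- name is varchar: a numeric literal makes MySQL convert each row's value
-- to a number for comparison, so the name index cannot be used for lookup.
EXPLAIN SELECT * FROM test WHERE name = 123;

-- Same type on both sides: the name index is usable.
EXPLAIN SELECT * FROM test WHERE name = '123';

-- Leading wildcard: no fixed prefix to seek on in the B-tree.
EXPLAIN SELECT * FROM test WHERE name LIKE '%c123';

-- Fixed prefix: can be resolved as a range scan on the name index.
EXPLAIN SELECT * FROM test WHERE name LIKE 'abc123%';
```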
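Conclusion
Index failure really comes down to three causes:
- The leftmost-prefix rule is not satisfied, so the index is cut short
- The query optimizer judges that using the index would cost more than a full table scan
- The query optimizer mistakenly picks another index that it believes is cheaper
Following these points in day-to-day index use will mostly keep indexes from failing:
- Follow the leftmost-prefix rule
- Keep the share of rows matched through the index below the roughly 17% seen above
- Pass condition values in the same type as the column
- Do not perform calculations on the column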
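The last point can be illustrated on the test table. This is a small sketch: the two queries return the same rows, but only the second leaves the indexed column bare.

```sql
-- Arithmetic on the column: MySQL must evaluate age + 1 for every row,
-- so the age index cannot be used to seek.
EXPLAIN SELECT * FROM test WHERE age + 1 = 2645;

-- Arithmetic moved to the constant side: a plain lookup on the age index.
EXPLAIN SELECT * FROM test WHERE age = 2645 - 1;
```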
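As the old verse goes, what you learn on paper always feels shallow; to truly understand something you have to do it yourself. If this was helpful, please give it a like and a follow. More technical articles are on the way!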