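The problem
Today a colleague from operations told us that a list page in our MIS admin backend would not open. I checked each SQL statement behind the page and found several that took more than a second. They all look much the same, so here is one of them: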
MySQL [ymtprice2]> select count(*) as count from t_price where day_time >= '2020-02-08' and status = 2;
+--------+
| count |
+--------+
| 312516 |
+--------+
1 row in set (1.44 sec)
A count(*) shows how much data the table holds:
MySQL > select count(*) from t_price;
+----------+
| count(*) |
+----------+
| 1345590 |
+----------+
1 row in set (0.19 sec)
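Just over 1.3 million rows, which is not much. Most likely an index is not being used, so let's check the column order of the composite indexes: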
MySQL > show index from t_price;
+---------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| t_price | 0 | PRIMARY | 1 | id | A | 1252838 | NULL | NULL | | BTREE | | |
| t_price | 1 | product | 1 | day_time | A | 984 | NULL | NULL | | BTREE | | |
| t_price | 1 | product | 2 | status | A | 1954 | NULL | NULL | | BTREE | | |
| t_price | 1 | product | 3 | product_id | A | 250567 | NULL | NULL | | BTREE | | |
| t_price | 1 | market | 1 | day_time | A | 986 | NULL | NULL | | BTREE | | |
| t_price | 1 | market | 2 | status | A | 1792 | NULL | NULL | | BTREE | | |
| t_price | 1 | market | 3 | market_id | A | 73696 | NULL | NULL | | BTREE | | |
| t_price | 1 | day_time | 1 | day_time | A | 964 | NULL | NULL | | BTREE | | |
| t_price | 1 | status | 1 | status | A | 6 | NULL | NULL | | BTREE | | |
| t_price | 1 | province_id | 1 | province_id | A | 62 | NULL | NULL | | BTREE | | |
| t_price | 1 | customer_id | 1 | customer_id | A | 2047 | NULL | NULL | | BTREE | | |
+---------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
11 rows in set (0.00 sec)
Nothing wrong there. Let's look at the execution plan with explain:
+----+-------------+---------+------+--------------------------------+--------+---------+-------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------+------+--------------------------------+--------+---------+-------+--------+-------------+
| 1 | SIMPLE | t_price | ref | product,market,day_time,status | status | 1 | const | 626419 | Using where |
+----+-------------+---------+------+--------------------------------+--------+---------+-------+--------+-------------+
1 row in set (0.00 sec)
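There is the problem: the index actually used is status, while we expected the query to use product. But why does the optimizer pick the single-column index status when the query clearly satisfies the leftmost-prefix rule of the composite index product?
A line from High Performance MySQL came to mind: the optimizer picks an index based on the number of rows it has to scan, and when the row counts are close it picks the index with the lowest cost. What counts as lowest cost? The total cost is a balance of two factors:
- Fewest rows scanned, to reduce the number of comparisons
- Fewest index pages, to reduce disk I/O
Back to our problem. The query does not fetch any actual columns, it only counts rows, so the optimizer apparently judged that the single-column index status would cost far less disk I/O than the composite index product, and did not use product. The fix is simple, just add an index hint: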
MySQL [ymtprice2]> select count(*) as count from t_price force index (product) where day_time >= '2020-02-08' and status = 2;
+--------+
| count |
+--------+
| 312516 |
+--------+
1 row in set (0.11 sec)
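To check that the hint really changes the plan, the same statement can be run under explain with and without it. The following is a sketch rather than a recorded session. It reuses the t_price table and the product and status index names shown above; the IGNORE INDEX and ANALYZE TABLE lines are alternatives assumed to be worth trying, not steps from the original fix.

```sql
-- Plan without a hint: in the session above the optimizer chose the status index.
EXPLAIN SELECT COUNT(*) AS count
FROM t_price
WHERE day_time >= '2020-02-08' AND status = 2;

-- Plan with the hint used in the fix.
EXPLAIN SELECT COUNT(*) AS count
FROM t_price FORCE INDEX (product)
WHERE day_time >= '2020-02-08' AND status = 2;

-- Alternative (assumption): exclude only the index that was picked by mistake
-- and let the optimizer choose among the remaining ones.
EXPLAIN SELECT COUNT(*) AS count
FROM t_price IGNORE INDEX (status)
WHERE day_time >= '2020-02-08' AND status = 2;

-- Alternative (assumption): refresh the index statistics, then re-check the
-- unhinted plan. This may or may not change the optimizer's choice.
ANALYZE TABLE t_price;
```

A hint pins the plan regardless of how the data changes later, so it is worth re-checking the hinted query whenever the data distribution shifts noticeably.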
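After the fix I remembered that I had searched online about indexes not being used before, but the articles contradicted one another. In the spirit of "experiment is the source of truth", I ran a series of experiments covering different cases and summarize the results here.
Preparing the data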
CREATE TABLE `test` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(50) NOT NULL DEFAULT '',
`age` int(11) NOT NULL DEFAULT 0,
`nick` varchar(200) DEFAULT NULL,
`status` int(11) NOT NULL DEFAULT 0 COMMENT 'status',
PRIMARY KEY (`id`),
KEY `name` (`name`) USING BTREE,
KEY `age` (`age`) USING BTREE,
KEY `nick` (`nick`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The nick column is indexed but allows NULL; it is used to check whether NULL tests can use an index.
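// Insert 30,000 test rows. $table is assumed to be a query-builder object bound to the test table.
for ($i = 1; $i <= 30000; $i++) {
    $tmp = [
        'name' => 'abc' . rand(0, 10000),
        'age' => rand(0, 10000),
        'status' => $i % 2,
    ];
    // Every fifth row omits nick, so that column is left NULL.
    if ($i % 5 != 0) {
        $tmp['nick'] = 'xyz' . rand(0, 10000);
    }
    $table->insert($tmp);
}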
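is null and is not null on a nullable column
First set up the data distribution. I tested this ratio twice. In both runs the not null rows made up slightly more than 17% of the table and select * did not use the index, while querying only the nick column, which makes the index a covering index, did use it:
- With 5136 not null rows, select * ... where nick is not null does not use the index; the not null share is 17.12%
- With 5148 not null rows, select * ... where nick is not null does not use the index; the not null share is 17.16%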
update test set nick = null where id >= 35150;
update test set nick = 'abc' where id < 35150;
mysql> select count(*) from test where nick is null;
+----------+
| count(*) |
+----------+
| 24852 |
+----------+
1 row in set (0.01 sec)
mysql> select count(*) from test where nick is not null;
+----------+
| count(*) |
+----------+
| 5148 |
+----------+
1 row in set (0.03 sec)
When all columns are selected, the index is not used:
mysql> explain select * from test where nick is not null;
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| 1 | SIMPLE | test | ALL | nick | NULL | NULL | NULL | 29839 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.01 sec)
Now select only the nick column to try the covering-index case:
mysql> explain select nick from test where nick is not null;
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| 1 | SIMPLE | test | range | nick | nick | 603 | NULL | 5146 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
1 row in set (0.02 sec)
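Next we move the boundary by one row and run select * again. This time the index can be used; note, however, that the explain output captured below still shows type ALL: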
update test set nick = null where id >= 35149;
update test set nick = 'abc' where id < 35149;
mysql> explain select * from test where nick is not null;
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| 1 | SIMPLE | test | ALL | nick | NULL | NULL | NULL | 29905 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.02 sec)
is null is not controversial online, so we leave the data distribution unchanged and test the is null case directly:
mysql> explain select nick from test where nick is null;
+----+-------------+-------+------+---------------+------+---------+-------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+-------+-------+--------------------------+
| 1 | SIMPLE | test | ref | nick | nick | 603 | const | 14952 | Using where; Using index |
+----+-------------+-------+------+---------------+------+---------+-------+-------+--------------------------+
1 row in set (0.01 sec)
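Conclusion: is null can use the index. is not null will most likely not use the index once the matching rows exceed about 17% of the table ("most likely" because the ratio fluctuated during testing), but it is not true, as most articles online claim, that is not null never uses an index. When the query is covered by the index, the index is used as well.
The != condition (pay attention here: many articles online insist that != never uses an index)
The procedure is the same as above. It was again run twice, with the matching share kept around the same 17% boundary. One of the runs and its results are pasted directly below: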
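The share of matching rows behind this conclusion can be measured directly, and the optimizer's choice can be compared with a forced index. This is a minimal sketch against the test table defined above; the column aliases are made up for illustration, and no particular explain output is claimed.

```sql
-- Share of rows that match "nick is not null" in the test table.
SELECT
    COUNT(*)                               AS total_rows,
    COUNT(nick)                            AS not_null_rows,  -- COUNT(col) skips NULLs
    ROUND(100 * COUNT(nick) / COUNT(*), 2) AS not_null_pct
FROM test;

-- Plan chosen by the optimizer versus the plan when the nick index is forced.
EXPLAIN SELECT * FROM test WHERE nick IS NOT NULL;
EXPLAIN SELECT * FROM test FORCE INDEX (nick) WHERE nick IS NOT NULL;
```

Comparing the two explain results, and then timing the two queries, shows whether the full scan the optimizer picked is actually the cheaper plan at that ratio.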
update test set name = '' where id >= 35150;
update test set name = 'abc' where id < 35150;
mysql> select count(*) from test where name != '';
+----------+
| count(*) |
+----------+
| 5148 |
+----------+
1 row in set (0.04 sec)
mysql> select count(*) from test where name = '';
+----------+
| count(*) |
+----------+
| 24852 |
+----------+
1 row in set (0.01 sec)
mysql> explain select * from test where name != '';
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| 1 | SIMPLE | test | ALL | name | NULL | NULL | NULL | 29839 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.02 sec)
mysql> explain select name from test where name != '';
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| 1 | SIMPLE | test | range | name | name | 152 | NULL | 5148 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
1 row in set (0.01 sec)
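Conclusion: != can use the index. != will most likely not use the index once the matching rows exceed about 17% of the table ("most likely" because the ratio fluctuated during testing), but it is not true, as most articles online claim, that != never uses an index. When the query is covered by the index, the index is used as well.
The or case
When the status column has no index, the query is a full table scan: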
explain select * from test where age = 2644 or status = 1;
After adding an index on status, the query uses both the age and status indexes:
explain select * from test where age = 2644 or status = 1;
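Conclusion: whether or can use indexes depends on whether the columns on both sides of the or are indexed.
The not in case
We test not in in two scenarios, on the primary key and on a secondary index. In each scenario we query all columns, a column not covered by the index, and a column covered by the index.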
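The steps of this experiment can be sketched as follows. The index name idx_status is hypothetical (the original does not say what the index was called), and the UNION rewrite is an alternative formulation added here for comparison, not something the experiment ran.

```sql
-- Add the missing index on status (index name chosen for illustration).
ALTER TABLE test ADD INDEX idx_status (status);

-- Re-check the plan of the or query.
EXPLAIN SELECT * FROM test WHERE age = 2644 OR status = 1;

-- Alternative formulation (assumption): one indexed lookup per branch,
-- combined with UNION, which also removes rows matched by both branches.
EXPLAIN
SELECT * FROM test WHERE age = 2644
UNION
SELECT * FROM test WHERE status = 1;
```

Because every row carries a distinct primary key, UNION returns the same set of rows as the or query.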
mysql> explain select * from test where age not in (9837);
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| 1 | SIMPLE | test | ALL | age | NULL | NULL | NULL | 29839 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.03 sec)
mysql> explain select name from test where age not in (9837);
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| 1 | SIMPLE | test | ALL | age | NULL | NULL | NULL | 29839 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.01 sec)
mysql> explain select age from test where age not in (9837);
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| 1 | SIMPLE | test | range | age | age | 4 | NULL | 15389 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
1 row in set (0.08 sec)
mysql> explain select * from test where id not in (9837);
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
| 1 | SIMPLE | test | range | PRIMARY | PRIMARY | 4 | NULL | 14920 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
1 row in set (0.01 sec)
mysql> explain select name from test where id not in (9837);
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
| 1 | SIMPLE | test | range | PRIMARY | PRIMARY | 4 | NULL | 14920 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
1 row in set (0.01 sec)
mysql> explain select age from test where id not in (9837);
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
| 1 | SIMPLE | test | range | PRIMARY | PRIMARY | 4 | NULL | 14920 | Using where |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+-------------+
1 row in set (0.01 sec)
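Conclusion: with the primary key, not in can use the index whichever columns are selected; with a secondary index, it can use the index when the query is covered. In both cases the reason is that no extra lookup back to the table rows is needed.
The having case
We again test having in two scenarios, one on the primary key and one on a secondary index.
Using a secondary index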
mysql> explain select name from test where age HAVING (2644);
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| 1 | SIMPLE | test | ALL | NULL | NULL | NULL | NULL | 29839 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.01 sec)
mysql> explain select age from test where age HAVING (2644);
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| 1 | SIMPLE | test | index | NULL | age | 4 | NULL | 29839 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
1 row in set (0.01 sec)
Using the primary key
mysql> explain select status from test where id HAVING (12345);
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| 1 | SIMPLE | test | ALL | NULL | NULL | NULL | NULL | 29839 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.01 sec)
mysql> explain select name from test where id HAVING (12345);
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| 1 | SIMPLE | test | index | NULL | name | 152 | NULL | 29839 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
1 row in set (0.01 sec)
mysql> explain select age from test where id HAVING (12345);
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| 1 | SIMPLE | test | index | NULL | age | 4 | NULL | 29839 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
1 row in set (0.01 sec)
mysql> explain select id from test where id HAVING (12345);
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
| 1 | SIMPLE | test | index | NULL | age | 4 | NULL | 29839 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------+
1 row in set (0.01 sec)
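Conclusion: with having on a secondary-index column, the index is used only when the query is covered by it; otherwise it is not. With the primary key column, the index is used only when the returned column is itself indexed; otherwise it is not. In every case above the access type is at best index, that is a full index scan, and possible_keys is NULL. Note that a clause such as where age HAVING (2644) parses as a truth test on age followed by a constant HAVING condition, so there is no comparison from which the optimizer could derive an index range.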
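For contrast with the queries tested above, here is a sketch of an explicit comparison in WHERE and of HAVING in its more usual role after GROUP BY. It uses the same test table; the cnt alias and the threshold of 1 are made up for illustration, and no particular plan is claimed for either statement.

```sql
-- Explicit comparison in WHERE: the optimizer can consider the age index.
EXPLAIN SELECT name FROM test WHERE age = 2644;

-- HAVING in its usual role: filtering groups after aggregation.
EXPLAIN SELECT age, COUNT(*) AS cnt
FROM test
GROUP BY age
HAVING cnt > 1;
```

Running these next to the where ... HAVING (...) queries makes it easier to see which part of the plan comes from the filter condition and which from the selected columns.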
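Other general cases
- Implicit conversion: when the type of the value passed in differs from the column type, the index often cannot be used. The typical case is comparing a string column with a numeric literal
- When a composite index is used without following the leftmost-prefix rule, the index is cut short (only part of it is used) or not used at all, although there are special cases
- When the rows reached through an index make up a large share of the table, the optimizer chooses a full scan instead. 80% is the figure often quoted, while the experiments above saw full scans from about 17%
- With a leading wildcard in like, the index cannot be used, because the match no longer starts from the leftmost characters of the value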
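The cases in this list can be checked with pairs of explain statements like the ones below. This is a sketch using the test and t_price tables from earlier; the literal values, including product_id = 1, are hypothetical, and the resulting plans still depend on data distribution.

```sql
-- Implicit conversion: name is varchar, so a numeric literal forces a
-- conversion of the column values; a quoted literal does not.
EXPLAIN SELECT * FROM test WHERE name = 123;
EXPLAIN SELECT * FROM test WHERE name = '123';

-- Leading wildcard versus prefix match on the indexed name column.
EXPLAIN SELECT * FROM test WHERE name LIKE '%abc';
EXPLAIN SELECT * FROM test WHERE name LIKE 'abc%';

-- Leftmost prefix: the product index is (day_time, status, product_id).
-- The first query skips day_time, the second one starts from it.
EXPLAIN SELECT COUNT(*) FROM t_price WHERE status = 2 AND product_id = 1;
EXPLAIN SELECT COUNT(*) FROM t_price
WHERE day_time = '2020-02-08' AND status = 2 AND product_id = 1;
```

In the test table most name values share the abc prefix, so the prefix query may still be planned as a full scan purely because it matches too many rows.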
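Conclusion
An index ends up unused for essentially three reasons:
- The leftmost-prefix rule is not satisfied, so the index is cut short
- The optimizer estimates that using the index costs more than a full table scan
- The optimizer mistakenly picks another index that it estimates to be cheaper
Following these points in day-to-day work will avoid most cases of an unused index:
- Follow the leftmost-prefix rule
- Keep the share of rows matched through the index below the roughly 17% observed above
- Pass condition values with the same type as the column
- Do not perform calculations on the column
What is learned on paper always feels shallow; to truly understand something you have to do it yourself. If this article helped you, a like and a follow are appreciated. More technical articles are on the way!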