也是很巧合，之前遇到過一次情況，一條SQL，根據時間範圍查數據，但有時候速度很快，有時候速度就慢。

第一反應是沒有設置索引，但開發人員告訴我已經設置了二級索引，查詢的速度依然有快有慢。

通過explain解析，發現同一條SQL，時間範圍不一樣，有的使用了索引，有的全表掃描。

當時我都懵逼了！

經過查詢，發現一種說法是當查詢的數據量超過全表的30%，就不再走索引，而直接全表掃描。

經過排查，我們的情況恰好符合，如果某一週數據約小於全表數據的30%時，走索引，反之全表掃描。

當時臨時給出的解決辦法就是把數據緩存，要慢也就是第一次慢，可以忍受。也想過用覆蓋索引，但是要查詢的字段太多了，不可能搞個那麼大的索引。

但爲什麼產生這種情況，沒找到答案，後來一直斷斷續續地關注這個問題，現在發現了一個能說服我自己的解答：

先說一下，這個閾值不一定是30%，也可以是別的數。

簡單地說，就是MySQL的優化器一旦認爲走索引的代價高於全表掃描，就直接掃全表。這個結論可能反常識，因爲我們一直聽說的就是索引提高查詢性能。

這裏涉及兩個因素：

1、如果走二級索引查詢，步驟就是：二級索引–主鍵索引，這就是回表查詢。

2、硬盤隨機I/O的性能遠低於順序I/O。

所以如果你的查詢數據量大到一定程度，MySQL認爲回表查詢的隨機I/O代價大於順序I/O全表掃的情況下，將放棄走索引。

mysql在掃描情況下會選擇全表掃描而不是走索引

技術分享 | 爲什麼 SELECT 查詢選擇全表掃描，而不走索引？

愛可生開源社區IP屬地: 天津

0.1262021.10.12 13:25:00字數 836閱讀 448

作者：Charizard

愛可生服務團隊成員，主要負責公司數據庫運維產品問題診斷；努力在數據庫和 IT 領域裏摸爬滾打中。

本文來源：原創投稿

*愛可生開源社區出品，原創內容未經授權不得隨意使用，轉載請聯繫小編並註明來源。

1、基本結論

SQL的執行成本（cost）是 MySQL 優化器選擇 SQL 執行計劃時一個重要考量因素。當優化器認爲使用索引的成本高於全表掃描的時候，優化器將會選擇全表掃描，而不是使用索引。

下面通過一個實驗來說明。

2、問題現象

如下結構的一張表，表中約有104w行數據：

CREATE TABLE `test03` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT '自增主鍵',
  `dept` tinyint(4) NOT NULL COMMENT '部門id',
  `name` varchar(30) COLLATE utf8mb4_bin DEFAULT NULL COMMENT '用戶名稱',
  `create_time` datetime NOT NULL COMMENT '註冊時間',
  `last_login_time` datetime DEFAULT NULL COMMENT '最後登錄時間',
  PRIMARY KEY (`id`),
  KEY `ct_index` (`create_time`)
) ENGINE=InnoDB AUTO_INCREMENT=1048577 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin COMMENT='測試表'

查詢1，並未用到ct_index(create_time)索引:

type爲ALL，而不是range
rows行數和全錶行數接近

# 查詢1
mysql> explain select * from test03 where create_time > '2021-10-01 02:04:36';
+----+-------------+--------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+--------+------------+------+---------------+------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | test03 | NULL       | ALL  | ct_index      | NULL | NULL    | NULL | 1045955 |    50.00 | Using where |
+----+-------------+--------+------------+------+---------------+------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

而查詢2，則用到了ct_index(create_time)索引:

# 查詢2
mysql> explain select * from test03 where create_time < '2021-01-01 02:04:36';
+----+-------------+--------+------------+-------+---------------+----------+---------+------+------+----------+-----------------------+
| id | select_type | table  | partitions | type  | possible_keys | key      | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+--------+------------+-------+---------------+----------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | test03 | NULL       | range | ct_index      | ct_index | 5       | NULL |  169 |   100.00 | Using index condition |
+----+-------------+--------+------------+-------+---------------+----------+---------+------+------+----------+-----------------------+

3、獲得SQL優化器處理信息

這裏使用optimizer trace工具，觀察MySQL對SQL的優化處理過程：

# 調大trace的容量，防止被截斷
set global optimizer_trace_max_mem_size = 1048576;

# 開啓optimizer_trace
set optimizer_trace="enabled=on";

# 執行SQL
select * from test03 where create_time > '2021-10-01 02:04:36';

# SQL執行完成之後，查看TRACE
 select TRACE from INFORMATION_SCHEMA.OPTIMIZER_TRACE\G

獲得關於此SQL的詳細優化器處理信息：

mysql> select TRACE from INFORMATION_SCHEMA.OPTIMIZER_TRACE\G                                                                                       
*************************** 1. row ***************************
TRACE: {
  "steps": [
    {
      "join_preparation": {
        "select#": 1,
        "steps": [
          {
            "expanded_query": "/* select#1 */ select `test03`.`id` AS `id`,`test03`.`dept` AS `dept`,`test03`.`name` AS `name`,`test03`.`create_time` AS `create_time`,`test03`.`last_login_time` AS `last_login_time` from `test03` where (`test03`.`create_time` > '2021-10-01 02:04:36')"
          }
        ]
      }
    },
    {
      "join_optimization": {
        "select#": 1,
        "steps": [
          {
            "condition_processing": {
              "condition": "WHERE",
              "original_condition": "(`test03`.`create_time` > '2021-10-01 02:04:36')",
              "steps": [
                {
                  "transformation": "equality_propagation",
                  "resulting_condition": "(`test03`.`create_time` > '2021-10-01 02:04:36')"
                },
                {
                  "transformation": "constant_propagation",
                  "resulting_condition": "(`test03`.`create_time` > '2021-10-01 02:04:36')"
                },
                {
                  "transformation": "trivial_condition_removal",
                  "resulting_condition": "(`test03`.`create_time` > '2021-10-01 02:04:36')"
                }
              ]
            }
          },
          {
            "substitute_generated_columns": {
            }
          },
          {
            "table_dependencies": [
              {
                "table": "`test03`",
                "row_may_be_null": false,
                "map_bit": 0,
                "depends_on_map_bits": [
                ]
              }
            ]
          },
          {
            "ref_optimizer_key_uses": [
            ]
          },
          {
            "rows_estimation": [
              {
                "table": "`test03`",
                "range_analysis": {
                  "table_scan": {
                    "rows": 1045955,
                    "cost": 212430
                  },
                  "potential_range_indexes": [
                    {
                      "index": "PRIMARY",
                      "usable": false,
                      "cause": "not_applicable"
                    },
                    {
                      "index": "ct_index",
                      "usable": true,
                      "key_parts": [
                        "create_time",
                        "id"
                      ]
                    }
                  ],
                  "setup_range_conditions": [
                  ],
                  "group_index_range": {
                    "chosen": false,
                    "cause": "not_group_by_or_distinct"
                  },
                  "analyzing_range_alternatives": {
                    "range_scan_alternatives": [
                      {
                        "index": "ct_index",
                        "ranges": [
                          "0x99aac22124 < create_time"
                        ],
                        "index_dives_for_eq_ranges": true,
                        "rowid_ordered": false,
                        "using_mrr": false,
                        "index_only": false,
                        "rows": 522977,
                        "cost": 627573,
                        "chosen": false,
                        "cause": "cost"
                      }
                    ],
                    "analyzing_roworder_intersect": {
                      "usable": false,
                      "cause": "too_few_roworder_scans"
                    }
                  }
                }
              }
            ]
          },
          {
            "considered_execution_plans": [
              {
                "plan_prefix": [
                ],
                "table": "`test03`",
                "best_access_path": {
                  "considered_access_paths": [
                    {
                      "rows_to_scan": 1045955,
                      "access_type": "scan",
                      "resulting_rows": 1.05e6,
                      "cost": 212428,
                      "chosen": true
                    }
                  ]
                },
                "condition_filtering_pct": 100,
                "rows_for_plan": 1.05e6,
                "cost_for_plan": 212428,
                "chosen": true
              }
            ]
          },
          {
            "attaching_conditions_to_tables": {
              "original_condition": "(`test03`.`create_time` > '2021-10-01 02:04:36')",
              "attached_conditions_computation": [
              ],
              "attached_conditions_summary": [
                {
                  "table": "`test03`",
                  "attached": "(`test03`.`create_time` > '2021-10-01 02:04:36')"
                }
              ]
            }
          },
          {
            "refine_plan": [
              {
                "table": "`test03`"
              }
            ]
          }
        ]
      }
    },
    {
      "join_execution": {
        "select#": 1,
        "steps": [
        ]
      }
    }
  ]
}
1 row in set (0.00 sec)

通過逐行閱讀，發現優化器在join_optimization（SQL優化階段）部分的rows_estimation內容裏：

明確指出了使用索引ct_index(create_time)和全表掃描的成本差異
同時指出了未選擇索引的原因：cost

image

4、爲什麼使用索引的成本比全表掃描還高？

通過觀察優化器的信息，不難發現，使用索引掃描行數約52w行，而全表掃描約爲104w行。爲什麼優化器反而認爲使用索引的成本比全表掃描還高呢？

因爲當ct_index(create_time)這個普通索引並不包括查詢的所有列，因此需要通過ct_index的索引樹找到對應的主鍵id，然後再到id的索引樹進行數據查詢，即回表（通過索引查出主鍵，再去查數據行），這樣成本必然上升。尤其是當回表的數據量比較大的時候，經常會出現MySQL優化器認爲回表查詢代價過高而不選擇索引的情況。

這裏可以回頭看查詢1 和查詢2的數據量佔比：

查詢1的數據量佔整個表的60%，回表成本高，因此優化器選擇了全表掃描
查詢2的數據量佔整個表的0.02%，因此優化器選擇了索引

mysql> select (select count(*) from test03 where create_time > '2021-10-01 02:04:36')/(select count(*) from test03) as '>20211001', (select count(*) from test03 where create_time < '2021-01-01 02:04:36')/(select count(*) from test03) as '<20210101';
+-----------+-----------+
| >20211001 | <20210101 |
+-----------+-----------+
|    0.5997 |    0.0002 |
+-----------+-----------+
1 row in set (0.44 sec)

另外，在MySQL的官方文檔中對此也有簡要的描述：

當優化器任務全表掃描成本更低的時候，就不會使用索引
並沒有一個固定的數據量佔比來決定優化器是否使用全表掃描（曾經是30%）
優化器在選擇的時候會考慮更多的因素，如：表大小，行數量，IO塊大小等

https://dev.mysql.com/doc/refman/5.7/en/where-optimization.html

image

參考文檔：

https://opensource.actionsky.com/20201127-mysql/

https://blog.csdn.net/CSDNcircular/article/details/107253747

參考：https://blog.51cto.com/u_15067227/2603679

技術分享 | 爲什麼 SELECT 查詢選擇全表掃描，而不走索引？

技術分享 | 爲什麼 SELECT 查詢選擇全表掃描，而不走索引？

1、基本結論

2、問題現象

3、獲得SQL優化器處理信息

4、爲什麼使用索引的成本比全表掃描還高？

【面試準備】又一次失敗的面試經歷，題目離譜～資深軟件測試工程師

left join on後面加條件與 where後面加條件的區別

35歲感想-捕蛇者說想到的能爭第一就不當第二 20多歲如此上進的小夥子，過了35歲竟然躺平了

技術分享 | 爲什麼 SELECT 查詢選擇全表掃描，而不走索引？

kafka 可以脫離 zookeeper 單獨使用嗎

Kafka 如何保證消息不丟失？

https://yachay.unat.edu.pe/blog/index.php?comment_area=format_blog&comment_component=blog&comment_co

linux以太網驅動總結