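STRAIGHT_JOIN returns the same result as INNER JOIN; the only difference is that it fixes the join order.
With INNER JOIN, MySQL's optimizer decides on its own which table to read first.
The automatic choice is not always the best one, though, and sometimes the order has to be set by hand. The syntax is:
select ..from tab1 straight_join tab2 where ...
With straight_join, tab1 is always read before tab2.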
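As a rough sketch, MySQL accepts the keyword in two places: between two tables as a join operator, or right after SELECT as a modifier that applies to the whole FROM list. Only tab1 and tab2 come from the text above; the columns id and tab1_id are made up for illustration.

```sql
-- Hypothetical columns: tab1.id and tab2.tab1_id are assumed for illustration.

-- Form 1: join operator. tab1 is read before tab2.
SELECT t1.id, t2.tab1_id
FROM tab1 t1
STRAIGHT_JOIN tab2 t2 ON t2.tab1_id = t1.id;

-- Form 2: SELECT modifier. Tables are joined in the order listed in FROM.
SELECT STRAIGHT_JOIN t1.id, t2.tab1_id
FROM tab1 t1, tab2 t2
WHERE t2.tab1_id = t1.id;
```

Form 1 constrains only the pair it sits between, while Form 2 pins the order of every table in the FROM clause.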
【Symptom】
In production we hit a query that took 1.29s, which exceeded what the business side required, so it needed tuning. The SQL:
select d.instance_no,d.zone_id, d.region_no,d.user_id,d.cores,d.mem,d.disk, d.tx_pub , u.idkp , m.image_no, m.platform, m.image_size
from user u, instance d , image m
where d.region_no = 'cn-cm9002' and
u.user_id = d.user_id and
d.image_id = m.image_id and
d.status != 8 and
d.gmt_create <'2013-08-19 14:00:00';
-------------------------------
3120 rows in set (1.29 sec)
hy@3309 03:09:09>explain select d.instance_no,d.zone_id, d.region_no,d.user_id,d.cores,d.mem,d.disk, d.tx_pub , u.idkp , m.image_no, m.platform, m.image_size
-> from user u, instance d , image m
-> where d.region_no = 'cn-cm9002' and
-> u.user_id = d.user_id and
-> d.image_id = m.image_id and
-> d.status != 8 and
-> d.gmt_create <'2013-08-19 14:00:00';
+----+-------------+-------+--------+-------------------------------------+--------------------+---------+------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-------------------------------------+--------------------+---------+------------------+--------+-------------+
| 1 | SIMPLE | u | index | PRIMARY | idkp | 98 | NULL | 133002 | Using index |
| 1 | SIMPLE | d | ref | image_id,ind_i_uid_hostname,user_id | ind_i_uid_hostname | 4 | hy.u.user_id | 1 | Using where |
| 1 | SIMPLE | m | eq_ref | PRIMARY | PRIMARY | 4 | hy.d.image_id | 1 | |
+----+-------------+-------+--------+-------------------------------------+--------------------+---------+------------------+--------+-------------+
3 rows in set (0.00 sec)
【Solution】
Use straight_join to fix the order in which the tables are joined. The result:
rac1@3309 15:01:55>explain select d.instance_no,d.zone_id, d.region_no,d.user_id,d.cores,d.mem,d.disk, d.tx_pub , u.idkp , m.image_no, m.platform, m.image_size
-> from instance d straight_join user u on u.user_id = d.user_id, image m
-> where d.region_no = 'cn-cm9002' and
-> d.image_id = m.image_id and
-> d.status != 8 and
-> d.gmt_create <'2013-08-19 14:00:00';
+----+-------------+-------+--------+-------------------------------------+---------+---------+------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-------------------------------------+---------+---------+------------------+--------+-------------+
| 1 | SIMPLE | d | ALL | image_id,ind_i_uid_hostname,user_id | NULL | NULL | NULL | 316473 | Using where |
| 1 | SIMPLE | m | eq_ref | PRIMARY | PRIMARY | 4 | hy.d.image_id | 1 | |
| 1 | SIMPLE | u | eq_ref | PRIMARY | PRIMARY | 4 | hy.d.user_id | 1 | |
+----+-------------+-------+--------+-------------------------------------+---------+---------+------------------+--------+-------------+
3 rows in set (0.00 sec)
--------------------------------
3120 rows in set (0.39 sec)
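The rewritten query mixes a straight_join with a comma join, which can be hard to read. One possible equivalent, sketched here with explicit join syntax throughout, is shown below. It uses the same tables and predicates as the query above but was not run against the original data, so the resulting plan is an assumption.

```sql
-- Sketch only: same tables and predicates as the query above,
-- with the comma join to image replaced by an explicit INNER JOIN.
SELECT d.instance_no, d.zone_id, d.region_no, d.user_id, d.cores, d.mem, d.disk, d.tx_pub,
       u.idkp, m.image_no, m.platform, m.image_size
FROM instance d
STRAIGHT_JOIN user u ON u.user_id = d.user_id   -- d is read before u
INNER JOIN image m ON m.image_id = d.image_id   -- no order is forced for m
WHERE d.region_no = 'cn-cm9002'
  AND d.status != 8
  AND d.gmt_create < '2013-08-19 14:00:00';
```

Prefixing the statement with explain, as in the session above, would show whether instance is still the first table read.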
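【Analysis】
As described earlier, MySQL's optimizer only supports nested-loop joins, and for multi-table joins it takes a simple approach: it picks the table with the smaller result set as the driving table.
There are two ways to join instance and user:
A. Use user as the driving table; the optimizer estimates a scan of 133002 rows
B. Use instance as the driving table; the optimizer estimates a scan of 316473 rows
So the optimizer picked the plan that looked right, with user as the driving table. But given the where clause, the right approach is to filter instance by region_no, status and gmt_create first, and then join that result set to user and image.
The chosen plan instead scans every row of user and only then joins to instance and image, so the join order is clearly off. Adding the straight_join hint forces the optimizer to use instance as the driving table, and the query runs with the correct plan.
Row counts of the tables, for reference: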
hy@3309 01:11:17> select count(*) from user;
+----------+
| count(*) |
+----------+
| 134221 |
+----------+
1 row in set (0.02 sec)
hy@3309 01:19:44> select count(*) from instance;
+----------+
| count(*) |
+----------+
| 375732 |
+----------+
1 row in set (0.06 sec)
hy@3309 01:19:54> select count(*) from image;
+----------+
| count(*) |
+----------+
| 18858 |
+----------+
1 row in set (0.00 sec)
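The totals above do not show how selective the filters on instance are, which is what the analysis hinges on. A quick way to check, sketched here and not part of the original session, is to count the instance rows that pass those filters.

```sql
-- Sketch: count the instance rows that survive the WHERE filters
-- before any join to user or image takes place.
SELECT COUNT(*)
FROM instance d
WHERE d.region_no = 'cn-cm9002'
  AND d.status != 8
  AND d.gmt_create < '2013-08-19 14:00:00';
```

Because the full query returns 3120 rows and both joins are primary-key lookups (eq_ref), this count should be at least 3120. The closer it is to that figure, and the further below the 133002 rows estimated for user, the stronger the case for driving from instance.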