七個(gè)常見的SQL慢查詢問題,及其解決方法
得益于摩爾定律,計(jì)算機(jī)性能已大幅提升,加上數(shù)據(jù)庫(kù)的進(jìn)步以及微服務(wù)所倡導(dǎo)的各種反模式設(shè)計(jì)。因此,我們現(xiàn)在編寫復(fù)雜SQL查詢的機(jī)會(huì)越來(lái)越少。業(yè)界(是的,甚至包括谷歌)已經(jīng)開始提倡不要進(jìn)行專門的SQL優(yōu)化,因?yàn)楣?jié)省下來(lái)的資源并不足以抵消員工的工資成本。但是,作為工程師,我們應(yīng)該在技術(shù)上努力追求卓越,成為本領(lǐng)域的頂尖科學(xué)家。
在這里,將介紹7個(gè)常見的SQL慢查詢語(yǔ)句,并解釋如何優(yōu)化它們的性能。希望這對(duì)你有所幫助。
由DALLE-3生成
1. LIMIT語(yǔ)句
分頁(yè)是最常用的方案之一,但也容易出現(xiàn)問題。例如,對(duì)于以下簡(jiǎn)單的語(yǔ)句,DBA通常建議的解決方案是添加一個(gè)包含type、name和create_time字段的復(fù)合索引。這樣,條件和排序就可以有效利用索引,從而顯著提高性能。
SELECT *
FROM operation
WHERE type = 'SQLStats'
AND name = 'SlowLog'
ORDER BY create_time
LIMIT 1000, 10;
這可能會(huì)解決90%以上DBA的問題。但是,當(dāng)LIMIT子句變成“LIMIT 1000000, 10”時(shí),程序員仍會(huì)抱怨“為什么在只查詢10條記錄的時(shí)候,速度還這么慢?” 要知道,數(shù)據(jù)庫(kù)不知道第1000000條記錄從何處開始,所以即使有索引,它仍需要從頭開始計(jì)算。在大多數(shù)情況下,這個(gè)性能問題是由于懶惰編程造成的。
在前端數(shù)據(jù)瀏覽或批量導(dǎo)出大量數(shù)據(jù)的場(chǎng)景中,可以使用上一頁(yè)的最大值作為查詢參數(shù)。SQL可以重新設(shè)計(jì)如下:
SELECT *
FROM operation
WHERE type = 'SQLStats'
AND name = 'SlowLog'
AND create_time > '2017-03-16 14:00:00'
ORDER BY create_time
LIMIT 10;
采用這種新設(shè)計(jì)后,查詢時(shí)間保持不變,不會(huì)隨著數(shù)據(jù)量的增加而變化。
2. 隱式轉(zhuǎn)換
SQL語(yǔ)句中另一個(gè)常見的錯(cuò)誤是查詢變量和字段定義的類型不匹配。以下面的語(yǔ)句為例:
mysql> explain extended SELECT *
> FROM my_balance b
> WHERE b.bpn = 14000000123
> AND b.isverified IS NULL ;
mysql> show warnings;
| Warning | 1739 | Cannot use ref access on index 'bpn' due to type or collation conversion on field 'bpn'
在這種情況下,字段bpn被定義為varchar(20),而MySQL的策略是在比較之前將字符串轉(zhuǎn)換為數(shù)字。這會(huì)導(dǎo)致函數(shù)被應(yīng)用到表字段上,從而使索引失效。
這種情況可能是由應(yīng)用程序框架自動(dòng)填充參數(shù)造成的,而不是程序員的本意。如今,應(yīng)用程序框架通常都很復(fù)雜,雖然它們提供了便利,但也可能帶來(lái)隱患。
3. 連接更新和刪除
盡管MySQL 5.6引入了物化,但它只優(yōu)化了SELECT語(yǔ)句。對(duì)于UPDATE或DELETE語(yǔ)句,需要使用JOIN手動(dòng)重寫。
例如,請(qǐng)看下面的UPDATE語(yǔ)句。MySQL實(shí)際上執(zhí)行了一個(gè)循環(huán)/嵌套子查詢(DEPENDENT SUBQUERY),執(zhí)行時(shí)間可想而知。
UPDATE operation o
SET status = 'applying'
WHERE o.id IN (SELECT id
FROM (SELECT o.id,
o.status
FROM operation o
WHERE o.group = 123
AND o.status NOT IN ( 'done' )
ORDER BY o.parent,
o.id
LIMIT 1) t);
執(zhí)行計(jì)劃如下:
+----+--------------------+-------+-------+---------------+---------+---------+-------+------+-----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+-------+---------------+---------+---------+-------+------+-----------------------------------------------------+
| 1 | PRIMARY | o | index | | PRIMARY | 8 | | 24 | Using where; Using temporary |
| 2 | DEPENDENT SUBQUERY | | | | | | | | Impossible WHERE noticed after reading const tables |
| 3 | DERIVED | o | ref | idx_2,idx_5 | idx_5 | 8 | const | 1 | Using where; Using filesort |
+----+--------------------+-------+-------+---------------+---------+---------+-------+------+-----------------------------------------------------+
將其重寫為JOIN后,子查詢的選擇類型從DEPENDENT SUBQUERY變?yōu)镈ERIVED,執(zhí)行時(shí)間顯著得從7秒縮短到2毫秒。
UPDATE operation o
JOIN (SELECT o.id,
o.status
FROM operation o
WHERE o.group = 123
AND o.status NOT IN ( 'done' )
ORDER BY o.parent,
o.id
LIMIT 1) t
ON o.id = t.id
SET status = 'applying';
簡(jiǎn)化后的執(zhí)行計(jì)劃如下:
+----+-------------+-------+------+---------------+-------+---------+-------+------+-----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+-------+---------+-------+------+-----------------------------------------------------+
| 1 | PRIMARY | | | | | | | | Impossible WHERE noticed after reading const tables |
| 2 | DERIVED | o | ref | idx_2,idx_5 | idx_5 | 8 | const | 1 | Using where; Using filesort |
+----+-------------+-------+------+---------------+-------+---------+-------+------+-----------------------------------------------------+
4. 混合排序
MySQL無(wú)法利用索引進(jìn)行混合排序。但是,在某些場(chǎng)景下,仍然可以使用特殊方法來(lái)提高性能。
SELECT *
FROM my_order o
INNER JOIN my_appraise a ON a.orderid = o.id
ORDER BY a.is_reply ASC,
a.appraise_time DESC
LIMIT 0, 20;
執(zhí)行計(jì)劃顯示的是全表掃描:
+----+-------------+-------+--------+-------------+---------+---------+---------------+---------+-+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
+----+-------------+-------+--------+-------------+---------+---------+---------------+---------+-+
| 1 | SIMPLE | a | ALL | idx_orderid | NULL | NULL | NULL | 1967647 | Using filesort |
| 1 | SIMPLE | o | eq_ref | PRIMARY | PRIMARY | 122 | a.orderid | 1 | NULL |
+----+-------------+-------+--------+---------+---------+---------+-----------------+---------+-+
由于is_reply只有0和1兩種狀態(tài),我們可以將其重寫如下,從而將執(zhí)行時(shí)間從1.58秒縮短到2毫秒:
SELECT *
FROM ((SELECT *
FROM my_order o
INNER JOIN my_appraise a
ON a.orderid = o.id
AND is_reply = 0
ORDER BY appraise_time DESC
LIMIT 0, 20)
UNION ALL
(SELECT *
FROM my_order o
INNER JOIN my_appraise a
ON a.orderid = o.id
AND is_reply = 1
ORDER BY appraise_time DESC
LIMIT 0, 20)) t
ORDER BY is_reply ASC,
appraisetime DESC
LIMIT 20;
5. EXISTS語(yǔ)句
在處理EXISTS子句時(shí),MySQL仍然使用嵌套子查詢進(jìn)行執(zhí)行。以下面的SQL語(yǔ)句為例:
SELECT *
FROM my_neighbor n
LEFT JOIN my_neighbor_apply sra
ON n.id = sra.neighbor_id
AND sra.user_id = 'xxx'
WHERE n.topic_status < 4
AND EXISTS(SELECT 1
FROM message_info m
WHERE n.id = m.neighbor_id
AND m.inuser = 'xxx')
AND n.topic_type <> 5;
+----+--------------------+-------+------+-----+------------------------------------------+---------+-------+---------+ -----+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
+----+--------------------+-------+------+ -----+------------------------------------------+---------+-------+---------+ -----+
| 1 | PRIMARY | n | ALL | | NULL | NULL | NULL | 1086041 | Using where |
| 1 | PRIMARY | sra | ref | | idx_user_id | 123 | const | 1 | Using where |
| 2 | DEPENDENT SUBQUERY | m | ref | | idx_message_info | 122 | const | 1 | Using index condition; Using where |
+----+--------------------+-------+------+ -----+------------------------------------------+---------+-------+---------+ -----+
通過(guò)刪除EXISTS子句并將其更改為JOIN, 我們可以避免嵌套子查詢,并將執(zhí)行時(shí)間從1.93秒減少到1毫秒。
SELECT *
FROM my_neighbor n
INNER JOIN message_info m
ON n.id = m.neighbor_id
AND m.inuser = 'xxx'
LEFT JOIN my_neighbor_apply sra
ON n.id = sra.neighbor_id
AND sra.user_id = 'xxx'
WHERE n.topic_status < 4
AND n.topic_type <> 5;
新的執(zhí)行計(jì)劃如下:
+----+-------------+-------+--------+ -----+------------------------------------------+---------+ -----+------+ -----+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+ -----+------------------------------------------+---------+ -----+------+ -----+
| 1 | SIMPLE | m | ref | | idx_message_info | 122 | const | 1 | Using index condition |
| 1 | SIMPLE | n | eq_ref | | PRIMARY | 122 | ighbor_id | 1 | Using where |
| 1 | SIMPLE | sra | ref | | idx_user_id | 123 | const | 1 | Using where |
+----+-------------+-------+--------+ -----+------------------------------------------+---------+ -----+------+ -----+
6. 條件下推
在某些情況下,外部查詢條件無(wú)法下推到復(fù)雜的視圖或子查詢中:
- 聚合子查詢。
- 帶有LIMIT的子查詢。
- UNION或UNION ALL子查詢。
- 輸出字段中的子查詢。
請(qǐng)看下面的語(yǔ)句,其中的條件會(huì)影響聚合子查詢:
SELECT *
FROM (SELECT target,
Count(*)
FROM operation
GROUP BY target) t
WHERE target = 'rm-xxxx';
+----+-------------+------------+-------+---------------+-------------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+-------------+---------+-------+------+-------------+
| 1 | PRIMARY | n | ALL | NULL | NULL | NULL | NULL | 1086041 | Using where |
| 1 | PRIMARY | sra | ref | NULL | idx_user_id | 123 | const | 1 | Using where |
| 2 | DEPENDENT SUBQUERY | m | ref | NULL | idx_message_info | 122 | const | 1 | Using index condition; Using where |
+----+-------------+------------+-------+---------------+-------------+---------+-------+------+-------------+
通過(guò)刪除EXISTS子句并將其更改為JOIN,我們可以避免嵌套子查詢并將執(zhí)行時(shí)間從1.93秒減少到1毫秒。
SELECT *
FROM my_neighbor n
INNER JOIN message_info m
ON n.id = m.neighbor_id
AND m.inuser = 'xxx'
LEFT JOIN my_neighbor_apply sra
ON n.id = sra.neighbor_id
AND sra.user_id = 'xxx'
WHERE n.topic_status < 4
AND n.topic_type <> 5;
新的執(zhí)行計(jì)劃如下:
+----+-------------+-------+--------+ -----+------------------------------------------+---------+ -----+------+ -----+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+ -----+------------------------------------------+---------+ -----+------+ -----+
| 1 | SIMPLE | m | ref | | idx_message_info | 122 | const | 1 | Using index condition |
| 1 | SIMPLE | n | eq_ref | | PRIMARY | 122 | ighbor_id | 1 | Using where |
| 1 | SIMPLE | sra | ref | | idx_user_id | 123 | const | 1 | Using where |
+----+-------------+-------+--------+ -----+------------------------------------------+---------+ -----+------+ -----+
7. 提前縮小范圍
讓我們看看以下經(jīng)過(guò)部分優(yōu)化的示例(左連接中的主表作為主查詢條件):
SELECT a.*,
c.allocated
FROM (
SELECT resourceid
FROM my_distribute d
WHERE isdelete = 0
AND cusmanagercode = '1234567'
ORDER BY salecode limit 20) a
LEFT JOIN
(
SELECT resourcesid, sum(ifnull(allocation, 0) * 12345) allocated
FROM my_resources
GROUP BY resourcesid) c
ON a.resourceid = c.resourcesid;
這條語(yǔ)句是否還存在其他問題?很明顯,子查詢c是對(duì)整個(gè)表進(jìn)行聚合查詢,在處理大量表時(shí)可能會(huì)導(dǎo)致性能下降。
事實(shí)上,對(duì)于子查詢c,左連接的結(jié)果集只關(guān)心可以與主表的resourceid匹配的數(shù)據(jù)。因此,我們可以將語(yǔ)句重寫如下,將執(zhí)行時(shí)間從2秒減少到2毫秒:
SELECT a.*,
c.allocated
FROM (
SELECT resourceid
FROM my_distribute d
WHERE isdelete = 0
AND cusmanagercode = '1234567'
ORDER BY salecode limit 20) a
LEFT JOIN
(
SELECT resourcesid, sum(ifnull(allocation, 0) * 12345) allocated
FROM my_resources r,
(
SELECT resourceid
FROM my_distribute d
WHERE isdelete = 0
AND cusmanagercode = '1234567'
ORDER BY salecode limit 20) a
WHERE r.resourcesid = a.resourcesid
GROUP BY resourcesid) c
ON a.resourceid = c.resourcesid;
然而,子查詢a在我們的SQL語(yǔ)句中出現(xiàn)了多次。這種方法不僅會(huì)產(chǎn)生額外的成本,而且也會(huì)使語(yǔ)句變得更加復(fù)雜。我們可以使用WITH語(yǔ)句來(lái)簡(jiǎn)化它:
WITH a AS
(
SELECT resourceid
FROM my_distribute d
WHERE isdelete = 0
AND cusmanagercode = '1234567'
ORDER BY salecode limit 20)
SELECT a.*,
c.allocated
FROM a
LEFT JOIN
(
SELECT resourcesid, sum(ifnull(allocation, 0) * 12345) allocated
FROM my_resources r,
a
WHERE r.resourcesid = a.resourcesid
GROUP BY resourcesid) c
ON a.resourceid = c.resourcesid;
結(jié)論
數(shù)據(jù)庫(kù)編譯器生成的執(zhí)行計(jì)劃決定了SQL語(yǔ)句的實(shí)際執(zhí)行方式。但是,編譯器只能盡力提供服務(wù),沒有一個(gè)數(shù)據(jù)庫(kù)編譯器是完美的。上述情況在其他數(shù)據(jù)庫(kù)中也同樣存在。了解了數(shù)據(jù)庫(kù)編譯器的特性,我們就能繞過(guò)它的限制,編寫出高性能的SQL語(yǔ)句。
在設(shè)計(jì)數(shù)據(jù)模型和編寫SQL語(yǔ)句時(shí),將算法思維或算法意識(shí)引入到這個(gè)過(guò)程非常重要。在編寫復(fù)雜的SQL語(yǔ)句時(shí),養(yǎng)成使用WITH語(yǔ)句的習(xí)慣可以簡(jiǎn)化語(yǔ)句,減輕數(shù)據(jù)庫(kù)的負(fù)擔(dān)。
最后,下面是SQL語(yǔ)句的執(zhí)行順序:
FROM
ON
JOIN
WHERE
GROUP BY
HAVING
SELECT
DISTINCT
ORDER BY
LIMIT