MySQL Innodb如何找出阻塞事務源頭SQL
在MySQL數據庫中出現了阻塞問題,如何快速查找定位問題根源?在實驗開始前,我們先梳理一下有什么工具或命令查看MySQL的阻塞,另外,我們也要一一對比其優(yōu)劣,因為有些命令可能在實際環(huán)境下可能并不適用。
-
show engine innodb status
-
Innotop工具
-
INNODB_TRX 等系統(tǒng)表
下面我們理論聯系實際,通過實驗來測試總結一下這個問題。首先構造測試環(huán)境,數據庫測試環(huán)境為( 5.7.21 MySQL Community Server 和5.6.20-enterprise-commercial,這兩個測試環(huán)境我都測試驗證過)
- mysql> use MyDB;
- Reading table information for completion of table and column names
- You can turn off this feature to get a quicker startup with -A
- Database changed
- mysql> create table test_blocking(id int primary key, name varchar(12));
- Query OK, 0 rows affected (0.05 sec)
- mysql> insert into test_blocking
- -> select 1, 'kerry' from dual;
- Query OK, 1 row affected (0.00 sec)
- Records: 1 Duplicates: 0 Warnings: 0
- mysql> insert into test_blocking
- -> select 2, 'jimmy' from dual;
- Query OK, 1 row affected (0.00 sec)
- Records: 1 Duplicates: 0 Warnings: 0
- mysql> insert into test_blocking
- -> select 3, 'kkk' from dual;
- Query OK, 1 row affected (0.00 sec)
- Records: 1 Duplicates: 0 Warnings: 0
準備好測試環(huán)境數據后,那么我們接下來開始實驗,為了實驗效果,我們先將參數innodb_lock_wait_timeout設置為100。
- mysql> show variables like 'innodb_lock_wait_timeout';
- +--------------------------+-------+
- | Variable_name | Value |
- +--------------------------+-------+
- | innodb_lock_wait_timeout | 50 |
- +--------------------------+-------+
- 1 row in set (0.00 sec)
- mysql> set global innodb_lock_wait_timeout=100 ;
- Query OK, 0 rows affected (0.00 sec)
- mysql> select connection_id() from dual;
- +-----------------+
- | connection_id() |
- +-----------------+
- | 8 |
- +-----------------+
- 1 row in set (0.00 sec)
- mysql> set session autocommit=0;
- Query OK, 0 rows affected (0.00 sec)
- mysql> select * from test_blocking where id=1 for update;
- +----+-------+
- | id | name |
- +----+-------+
- | 1 | kerry |
- +----+-------+
- 1 row in set (0.00 sec)
然后在第二個連接會話中執(zhí)行更新腳本,構造被阻塞的案例
- mysql> select connection_id() from dual;
- +-----------------+
- | connection_id() |
- +-----------------+
- | 9 |
- +-----------------+
- 1 row in set (0.00 sec)
- mysql> update test_blocking set name='kk' where id=1;
在第三個連接會話執(zhí)行下面命令,查看TRANSACTIONS相關信息:
- mysql> show engine innodb statusG;
使用show engine innodb status命令后,可以查看其輸出的TRANSACTIONS部分信息,如上截圖所示,找到類似TRX HAS BEEN WATING …部分的信息,
通過那部分信息,我們可以看到update test_blocking set name=’kk’ where id=1這個SQL語句被阻塞了14秒,一直在等待獲取X Lock。
- TRANSACTIONS
- ------------
- Trx id counter 148281 #下一個事務ID
- Purge done for trx's n:o < 148273 undo n:o < 0 state: running but idle
- History list length 552
- LIST OF TRANSACTIONS FOR EACH SESSION:
- ---TRANSACTION 0, not started
- MySQL thread id 15, OS thread handle 0x4cc64940, query id 261 localhost root cleaning up
- ---TRANSACTION 0, not started
- MySQL thread id 14, OS thread handle 0x4cbe2940, query id 278 localhost root init
- show engine innodb status
- ---TRANSACTION 148280, ACTIVE 24 sec
- 2 lock struct(s), heap size 360, 1 row lock(s)
- MySQL thread id 8, OS thread handle 0x4cba1940, query id 276 localhost root cleaning up
- ---TRANSACTION 148279, ACTIVE 313 sec starting index read
- mysql tables in use 1, locked 1
- LOCK WAIT 2 lock struct(s), heap size 360, 1 row lock(s)
- MySQL thread id 9, OS thread handle 0x4cc23940, query id 277 localhost root updating #線程ID為9, 操作系統(tǒng)線程句柄為0x4cc23940, 查詢ID為277,賬號為root的UPDATE操作
- update test_blocking set name='kk' where id=1 #具體SQL語句
- ------- TRX HAS BEEN WAITING 14 SEC FOR THIS LOCK TO BE GRANTED: #TRX等待授予鎖已經有14秒了
- RECORD LOCKS space id 337 page no 3 n bits 72 index `PRIMARY` of table `MyDB`.`test_blocking` trx id 148279 lock_mode X locks rec but not gap waiting
- #在space id=337(test_blocking表的表空間),page no=3的頁上,表test_blocking上的主鍵索引在等待X鎖
- Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
- 0: len 4; hex 80000001; asc ;; #***個字段是主鍵,制度按長為4,值為1
- 1: len 6; hex 000000024322; asc C";; #該字段為6個字節(jié)的事務id,這個id表示最近一次被更新的事務id(對應十進制為148258)
- 2: len 7; hex 9a000001f20110; asc ;; #該字段為7個字節(jié)的回滾指針,用于mvcc
- 3: len 5; hex 6b65727279; asc kerry;; #該字段表示的是此記錄的第二個字段,長度為5,值為kerry(如果表有多個字段,那么此處后面還有記錄)
- mysql> select * from information_schema.INNODB_SYS_DATAFILES where space=337;
- +-------+--------------------------+
- | SPACE | PATH |
- +-------+--------------------------+
- | 337 | ./MyDB/test_blocking.ibd |
- +-------+--------------------------+
- 1 row in set (0.00 sec)
- mysql>
但是這種方式也有一些弊端,例如生產環(huán)境很復雜,尤其是有大量事務的情況下。諸多信息根本無法清晰判斷知道誰阻塞了誰;其次一點也不直觀; 另外,這個也無法定位blocker 的SQL語句。這種方式只能作為輔助分析用途,通過查看取鎖的詳細信息,幫助進一步診斷問題。
2: Innotop工具
如下所示,Innotop工具很多情況下也不能定位到阻塞的語句(Blocking Query), 也僅僅能獲取一些鎖相關信息
3:通過查詢information_schema數據庫下與事務相關的幾個系統(tǒng)表
還是構造之前的測試案例,在***個會話中使用SELECT FOR UPDATE鎖定其中一行記錄
- mysql> use MyDB;
- Database changed
- mysql> set session autocommit=0;
- Query OK, 0 rows affected (0.00 sec)
- mysql> select connection_id() from dual;
- +-----------------+
- | connection_id() |
- +-----------------+
- | 17 |
- +-----------------+
- 1 row in set (0.00 sec)
- mysql> select * from test_blocking where id=1 for update;
- +----+-------+
- | id | name |
- +----+-------+
- | 1 | kerry |
- +----+-------+
- 1 row in set (0.00 sec)
- mysql>
然后在第二個連接會話中執(zhí)行更新腳本,構造被阻塞的案例
- mysql> use MyDB;
- Database changed
- mysql> select connection_id() from dual;
- +-----------------+
- | connection_id() |
- +-----------------+
- | 19 |
- +-----------------+
- 1 row in set (0.00 sec)
- mysql> update test_blocking set name='kk' where id=1;
此時阻我們在第三個連接會話查找誰被阻塞了
- SELECT b.trx_mysql_thread_id AS 'blocked_thread_id'
- ,b.trx_query AS 'blocked_sql_text'
- ,c.trx_mysql_thread_id AS 'blocker_thread_id'
- ,c.trx_query AS 'blocker_sql_text'
- ,( Unix_timestamp() - Unix_timestamp(c.trx_started) )
- AS 'blocked_time'
- FROM information_schema.innodb_lock_waits a
- INNER JOIN information_schema.innodb_trx b
- ON a.requesting_trx_id = b.trx_id
- INNER JOIN information_schema.innodb_trx c
- ON a.blocking_trx_id = c.trx_id
- WHERE ( Unix_timestamp() - Unix_timestamp(c.trx_started) ) > 4;
- SELECT a.sql_text,
- c.id,
- d.trx_started
- FROM performance_schema.events_statements_current a
- join performance_schema.threads b
- ON a.thread_id = b.thread_id
- join information_schema.processlist c
- ON b.processlist_id = c.id
- join information_schema.innodb_trx d
- ON c.id = d.trx_mysql_thread_id
- where c.id=17
- ORDER BY d.trx_startedG;
如下截圖所示,***個SQL語句能夠查到線程19 被線程 17阻塞了, 被阻塞的SQL語句為“update test_blocking set name=’kk’ where id=1;”, 能夠查到被阻塞了多長時間,但是無法查到源頭SQL語句。此時就需要第二個SQL語句登場,找到源頭語句。
但是不要太天真的認為第二個SQL語句能夠獲取所有場景下的阻塞源頭SQL語句,實際業(yè)務場景,會話可能在執(zhí)行一個存儲過程或復雜的業(yè)務,有可能它執(zhí)行完阻塞源頭SQL后,繼續(xù)在執(zhí)行其它SQL語句,此時,你抓取的是這個連接會話***執(zhí)行的SQL語句,如下所示,我簡單構造了一個例子。就能構造這樣的一個場景。這個我曾經寫過一篇博客“為什么數據庫有時候不能定位阻塞(Blocker)源頭的SQL語句”,分析SQL Server和ORACLE 定位查找阻塞源頭SQL語句,現在看來真是大道同源,殊途同歸。
- http://www.cnblogs.com/kerrycode/p/5821413.html
- mysql> select * from test_blocking where id=1 for update;
- +----+-------+
- | id | name |
- +----+-------+
- | 1 | kerry |
- +----+-------+
- 1 row in set (0.00 sec)
- mysql> delete from student where stu_id=1001;
- Query OK, 1 row affected (0.00 sec)
- mysql>
總結: 最簡單、方便的還是上面兩個SQL查詢定位blocker的SQL語句,但是需要注意:有時候它也查不到真正阻塞的源頭SQL語句。所以還需結合應用程序代碼與上下文環(huán)境進行整體分析、判斷!