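Error: semaphore wait has lasted > 600 seconds, causing the database instance to restart

A production database instance suddenly restarted one night (a truly miserable thing to have happen).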

1. Environment:
DB version: MariaDB 10.0.28 x64

OS version: CentOS 6.6 x64

kernel: 2.6.32-504.el6.x86_64

System sem: kernel.sem = 1000 40960001000 4096

2. Error log
InnoDB: ###### Diagnostic info printed to the standard error stream
InnoDB: Warning: a long semaphore wait:
--Thread 139562561287936 has waited at row0ins.cc line 2730 for 923.00 seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0x7ef7b756b4a0 '&block->lock'
a writer (thread id 139562561287936) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1, lock_word: ffffffffffffffff
Last time read locked in file row0sel.cc line 4152
Last time write locked in file /home/buildbot/buildbot/build/mariadb-10.0.28/storage/xtradb/row/row0ins.cc line 2730
InnoDB: Warning: a long semaphore wait:
--Thread 139562523481856 has waited at row0ins.cc line 2730 for 921.00 seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0x7ef9c7eb5500 '&block->lock'
a writer (thread id 139562523481856) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1, lock_word: ffffffffffffffff
Last time read locked in file buf0flu.cc line 1069
Last time write locked in file /home/buildbot/buildbot/build/mariadb-10.0.28/storage/xtradb/btr/btr0sea.cc line 979
InnoDB: Warning: a long semaphore wait:
--Thread 139562609211136 has waited at row0ins.cc line 2730 for 911.00 seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0x7ef7a6ef37e0 '&block->lock'
a writer (thread id 139562609211136) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 0, lock_word: ffffffffffffffff
Last time read locked in file buf0flu.cc line 1069
Last time write locked in file /home/buildbot/buildbot/build/mariadb-10.0.28/storage/xtradb/row/row0ins.cc line 2730
InnoDB: Warning: a long semaphore wait:
--Thread 139562551437056 has waited at trx0undo.ic line 171 for 907.00 seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0x7ef9c5d1e4e0 '&block->lock'
a writer (thread id 139562551437056) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 0, lock_word: ffffffffffffffff
Last time read locked in file buf0flu.cc line 1069
Last time write locked in file /home/buildbot/buildbot/build/mariadb-10.0.28/storage/xtradb/include/trx0undo.ic line 171
InnoDB: Warning: a long semaphore wait:
--Thread 139560391223040 has waited at row0ins.cc line 2730 for 906.00 seconds the semaphore:
X-lock on RW-latch at 0x7ef9c7eb5500 '&block->lock'
a writer (thread id 139562523481856) has reserved it in mode  wait exclusive
number of readers 1, waiters flag 1, lock_word: ffffffffffffffff
Last time read locked in file buf0flu.cc line 1069
Last time write locked in file /home/buildbot/buildbot/build/mariadb-10.0.28/storage/xtradb/btr/btr0sea.cc line 979
InnoDB: ###### Diagnostic info printed to the standard error stream
InnoDB: Error: semaphore wait has lasted > 600 seconds
InnoDB: We intentionally crash the server, because it appears to be hung.
2017-03-30 21:11:18 7eee7d9fe700  InnoDB: Assertion failure in thread 139562774947584 in file srv0srv.cc line 2222
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
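To see at a glance which latches the threads were stuck on, the "long semaphore wait" entries can be summarized with a short script. The following is a minimal sketch in Python, assuming the entry layout shown in the log above; the function name parse_waits and the dictionary keys are illustrative choices, not part of any MariaDB tooling.

```python
import re
from collections import Counter

# Header line of each "long semaphore wait" entry.
WAIT_RE = re.compile(
    r"--Thread (\d+) has waited at (\S+) line (\d+) for ([\d.]+) seconds"
)
# Address of the RW-latch being waited on.
LATCH_RE = re.compile(r"RW-latch at (0x[0-9a-f]+)")
# "Last time write locked in file <path> line <n>".
WRITE_RE = re.compile(r"Last time write locked in file (\S+) line (\d+)")


def parse_waits(log_text):
    """Return one dict per long-semaphore-wait entry found in log_text."""
    waits = []
    current = None
    for line in log_text.splitlines():
        m = WAIT_RE.search(line)
        if m:
            current = {
                "thread": int(m.group(1)),
                "waited_at": f"{m.group(2)}:{m.group(3)}",
                "seconds": float(m.group(4)),
                "latch": None,
                "last_writer": None,
            }
            waits.append(current)
            continue
        if current is None:
            continue  # skip lines that precede the first entry
        m = LATCH_RE.search(line)
        if m and current["latch"] is None:
            current["latch"] = m.group(1)
        m = WRITE_RE.search(line)
        if m:
            # Keep only the file name; the build path prefix is noise.
            file_name = m.group(1).rsplit("/", 1)[-1]
            current["last_writer"] = f"{file_name}:{m.group(2)}"
    return waits


if __name__ == "__main__":
    # Two entries copied from the error log above.
    sample = """\
--Thread 139562523481856 has waited at row0ins.cc line 2730 for 921.00 seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0x7ef9c7eb5500 '&block->lock'
a writer (thread id 139562523481856) has reserved it in mode  wait exclusive
Last time write locked in file /home/buildbot/buildbot/build/mariadb-10.0.28/storage/xtradb/btr/btr0sea.cc line 979
--Thread 139560391223040 has waited at row0ins.cc line 2730 for 906.00 seconds the semaphore:
X-lock on RW-latch at 0x7ef9c7eb5500 '&block->lock'
Last time write locked in file /home/buildbot/buildbot/build/mariadb-10.0.28/storage/xtradb/btr/btr0sea.cc line 979
"""
    waits = parse_waits(sample)
    # Count how many waiters share each (latch, last writer) pair.
    for key, n in Counter((w["latch"], w["last_writer"]) for w in waits).items():
        print(key, n)
```

Grouping by latch address and last write-locker makes it easier to spot when several threads are queued behind the same block lock, as with the two entries for 0x7ef9c7eb5500 here.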
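3. Temporary workaround
While reading the manual, I found that the adaptive hash index can cause contention for the RW-latches created in btr0sea.c (btr0sea.cc in the log above), which in turn shows up as SEMAPHORES problems.

Details: https://dev.mysql.com/doc/refman/5.7/en/innodb-adaptive-hash.html

Temporary workaround: set global innodb_adaptive_hash_index=0;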

4. Follow-up
I filed a bug with MariaDB and learned that CentOS 6.6 has a pitfall: on Haswell-based servers, the CentOS 6.6 kernel can cause processes to hang (exactly my environment). Details:
https://www.infoq.com/news/2015/05/redhat-futex
https://groups.google.com/forum/?hl=zh-Cn#!starred/codership-team/Ne6WsTWixH8
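A quick way to check whether a host runs a kernel in the suspect range is to compare its release string against the first fixed build. The sketch below is hedged: the 2.6.32-504.16.2 threshold is an assumption taken from public write-ups of the futex issue and should be verified against the vendor errata, and the function names are illustrative. It only inspects the kernel string and does not check whether the CPU is Haswell-based.

```python
import platform
import re

# ASSUMPTION: the CentOS/RHEL 6.6 kernel series starts at 2.6.32-504 and
# 2.6.32-504.16.2 is the first build carrying the futex fix. Verify both
# values against the vendor errata before relying on this check.
AFFECTED_BASE = (2, 6, 32, 504)
FIRST_FIXED = (2, 6, 32, 504, 16, 2)


def kernel_tuple(release):
    """Turn '2.6.32-504.el6.x86_64' into (2, 6, 32, 504)."""
    head = release.split(".el", 1)[0]  # drop the '.el6.x86_64' suffix
    return tuple(int(p) for p in re.split(r"[.-]", head) if p.isdigit())


def looks_affected(release):
    """True if release is a 504-series kernel older than FIRST_FIXED."""
    t = kernel_tuple(release)
    return t[:4] == AFFECTED_BASE and t < FIRST_FIXED


if __name__ == "__main__":
    current = platform.release()
    print(current, "suspect" if looks_affected(current) else "not in suspect range")
```

The kernel from the environment section, 2.6.32-504.el6.x86_64, falls inside the range this check flags.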
Of course, if the problem shows up again, I hope to capture a gdb dump in time and hand the file to upstream for investigation.
