[ERROR] [FATAL] InnoDB: Over 95 percent of the buffer pool is occupied by lock heaps or the adaptive。
这个问题是发生在mysql 5.7系列5.7.20及之前的版本,8.0系列8.0.3及之前的版本的bug,在5.7.21/8.0.4上已修复。在这个错误发生时,可以通过select * from performance_schema.memory_summary_global_by_event_name where EVENT_NAME like 'memory/innodb%' order by CURRENT_NUMBER_OF_BYTES_USED desc 观察到大量的内存使用。
问题的原因是mysql在处理包含外键约束的update/delete语句时,使用了迭代的方式来处理外键约束关联的表,对更新的每一行都创建一个update node,并从innodb buffer pool内存池堆中分配内存。但是在删除操作结束后并没有将内存释放回操作系统。最后的这句解释了我长久以来的一个问题,当初有一个库,整个库的大小并不大,但是在最终崩溃后的日志里发现,外键相关的锁,锁住了大约1500w记录,这个记录数超过了库中所有表合起来的大小(因为内存没有释放回操作系统?),当日苦思不解,今时豁然开朗。
具体的可以看这里:https://github.com/mysql/mysql-server/commit/7b26dc98a624d5cdf79cd5eee455cb03e3bccb5a
Bug #26191879 FOREIGN KEY CASCADES USE EXCESSIVE MEMORY
PROBLEM
The iterative approach used in the fix of Bug 16244691 was causing
excessive memory usage when doing a cascade operation owing to the
fact that for each row being updated we were allocating a update
node from heap and this memory was not getting released to OS
even after completion of delete.
FIX
We went back to recursive approach of cascading operation
instead of iterative approach .
这个在mysql代码中被归为Bug #26191879。这个问题的引入源于对另外一个Bug #16244691的修复,根据时间线来看,初始的问题源于外键约束的存在,导致对外键引用表的递归操作深度太深,引发了stack overflow问题,为了修复这个问题,将最初的递归方式操作外键引用表改为迭代操作方式,但是修改不彻底,导致了大量内存的使用,并引入了新的Bug #26191879,最终的解决方法仍是将迭代方式回退到递归的方式。具体的时间线如下:
27 Mar 2013 :https://github.com/mysql/mysql-server/commit/ff20c67fe4c6182bbee359d9bdaece09199670c5
28 Mar 2013:https://github.com/mysql/mysql-server/commit/ff20c67fe4c6182bbee359d9bdaece09199670c5
30 Mar 2013 :https://github.com/mysql/mysql-server/commit/b00e1446ddbdb0e28b3fd9190ead98f70b4488b6
22 Apr 2013 :https://github.com/mysql/mysql-server/commit/be7b0558a52b93f820c838928ce665df08f83885
14 May 2013:https://github.com/mysql/mysql-server/commit/377774689bf6a16af74182753fe950d514c2c6dd
此问题在High Memory Consumption With Foreign Key Cascades in MySQL< 5.7.21 (Doc ID 2335142.1)中也有描述。
此问题一旦发生,最终会导致mysql crash掉,还是比较严重的。最终解决方法是升级到5.7.21/8.0.4或以上的版本,临时的解决方法为优化sql,使sql在执行的时候,通过合适的索引进行扫描,从而避免sql扫描太多行记录造成内存占用;或者调大innodb buffer内存池大小。并做好Innodb_buffer_pool_pages_misc指标的监控。
The excessive memory usage is a regression bug in 5.7, and was reported here:
It is fixed as of the 5.7.21, 8.0.4 release. Upgrade to that version to solve this issue.