System hang caused by vm.pagecache_limit_mb
This document (000020418) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Linux Enterprise Server for SAP Applications 11
Situation
The system hangs or becomes non-responsive. The crash utility shows a high load average:
crash> sys | grep LOAD
LOAD AVERAGE: 89.98, 82.02, 82.12
The number of tasks currently waiting for the page cache to shrink:
crash> foreach bt | grep __shrink_page_cache | wc -l
68
Comparing the current page cache size with the configured limit:
crash> kmem -i | grep CACHE
CACHED  42364004  161.6 GB  10% of TOTAL MEM
crash> p vm_pagecache_limit_mb
vm_pagecache_limit_mb = $1 = 31458   # ~30 GB
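The comparison above can be reproduced with a little arithmetic. The following is a minimal sketch, assuming 4 KiB pages (the page size is not stated in the crash output, but it is consistent with the 161.6 GB figure); the helper name `pages_to_mib` is hypothetical.

```python
# Sketch: compare the page cache size from `kmem -i` with the configured limit.
# Assumption: 4 KiB pages (typical on x86_64; not stated in the original output).
PAGE_SIZE_KIB = 4


def pages_to_mib(pages: int, page_size_kib: int = PAGE_SIZE_KIB) -> float:
    """Convert a page count into MiB."""
    return pages * page_size_kib / 1024


cached_pages = 42364004   # CACHED column from `kmem -i`
limit_mib = 31458         # vm_pagecache_limit_mb

cached_mib = pages_to_mib(cached_pages)
excess_mib = cached_mib - limit_mib

print(f"cached: {cached_mib / 1024:.1f} GiB")
print(f"limit:  {limit_mib / 1024:.1f} GiB")
print(f"excess: {excess_mib / 1024:.1f} GiB "
      f"({cached_mib / limit_mib:.1f}x the limit)")
```

With the values from this dump, the page cache is roughly five times larger than the configured limit, so every task that adds a page to the cache is pushed into reclaim.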
The stack trace of the tasks that are reclaiming the page cache may look like this:
#14 [ffffb441bbacbbb8] __shrink_page_cache at ffffffffad1cddad
#15 [ffffb441bbacbc38] add_to_page_cache_lru at ffffffffad1b39a5
#16 [ffffb441bbacbc68] pagecache_get_page at ffffffffad1b518b
#17 [ffffb441bbacbca0] grab_cache_page_write_begin at ffffffffad1b58fc
#18 [ffffb441bbacbcb0] ext4_da_write_begin at ffffffffc096670d [ext4]
#19 [ffffb441bbacbd28] generic_perform_write at ffffffffad1b2f72
#20 [ffffb441bbacbda0] __generic_file_write_iter at ffffffffad1b6454
#21 [ffffb441bbacbde8] ext4_file_write_iter at ffffffffc095427e [ext4]
#22 [ffffb441bbacbe48] __vfs_write at ffffffffad249e2c
#23 [ffffb441bbacbec0] vfs_write at ffffffffad24ae4d
Resolution
We strongly recommend not using the page cache limit option unless it is really necessary, that is, only in the rare cases where the system is continuously swapping excessively and the aggressive swap-out is actually causing performance issues. The page cache limit can be disabled by setting:
# /etc/sysctl.conf
-------------------------------
vm.pagecache_limit_mb = 0
vm.pagecache_limit_ignore_dirty = 1
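To verify that a configuration file really disables the limit, the settings can be parsed and checked. This is only an illustrative sketch: the function names `parse_sysctl` and `pagecache_limit_disabled` are hypothetical, and the parser handles just the simple `key = value` lines and `#` or `;` comments used in sysctl.conf.

```python
# Sketch: check whether sysctl.conf-style text disables the page cache limit.
# parse_sysctl and pagecache_limit_disabled are illustrative names only.

def parse_sysctl(text: str) -> dict:
    """Parse 'key = value' lines, skipping blank lines and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def pagecache_limit_disabled(settings: dict) -> bool:
    """The limit is disabled when vm.pagecache_limit_mb is set to 0."""
    return settings.get("vm.pagecache_limit_mb") == "0"


sample = """
# /etc/sysctl.conf
vm.pagecache_limit_mb = 0
vm.pagecache_limit_ignore_dirty = 1
"""

settings = parse_sysctl(sample)
print(settings)
print("limit disabled:", pagecache_limit_disabled(settings))
```

On a real system the text would come from reading /etc/sysctl.conf; the sample string here only mirrors the settings shown above.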
Cause
# mm/vmscan.c
--------------------------------------------
/*
 * Function to shrink the page cache
 *
 * This function calculates the number of pages (nr_pages) the page
 * cache is over its limit and shrinks the page cache accordingly.
 *
 * The maximum number of pages, the page cache shrinks in one call of
 * this function is limited to SWAP_CLUSTER_MAX pages. Therefore it may
 * require a number of calls to actually reach the vm_pagecache_limit_kb.
 *
 * This function is similar to shrink_all_memory, except that it may never
 * swap out mapped pages and only does two passes.
 */
static void __shrink_page_cache(gfp_t mask)
Shrinking a large page cache is an expensive task, and to make matters worse, many CPUs can hit the page cache reclaim at the same time and compete for resources. This leads to a non-responsive system in which most CPUs are busy reclaiming the page cache while all other tasks sleep until the reclaim completes.
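The per-call cap mentioned in the source comment gives a feel for how much work is involved. The sketch below estimates a lower bound on the number of __shrink_page_cache calls needed to bring the cache from this dump down to the limit. It assumes SWAP_CLUSTER_MAX is 32 pages (its value in mainline kernels) and 4 KiB pages, neither of which is stated in this document, and it ignores pages added to the cache in the meantime and any other accounting the kernel performs. The function name `min_shrink_calls` is hypothetical.

```python
# Sketch: lower bound on reclaim calls needed to get back under the limit.
# Assumptions (not stated in this document): SWAP_CLUSTER_MAX = 32 pages,
# 4 KiB pages (256 pages per MiB), and no new pages added during reclaim.
SWAP_CLUSTER_MAX = 32
PAGES_PER_MIB = 256


def min_shrink_calls(cached_pages: int, limit_mb: int) -> int:
    """Minimum number of calls if each call frees at most SWAP_CLUSTER_MAX pages."""
    excess_pages = max(0, cached_pages - limit_mb * PAGES_PER_MIB)
    # Ceiling division without floating point.
    return -(-excess_pages // SWAP_CLUSTER_MAX)


calls = min_shrink_calls(cached_pages=42364004, limit_mb=31458)
print(f"at least {calls:,} calls")
```

Under these assumptions, more than a million calls are needed, and with 68 tasks in the reclaim path at once they all compete for the same work.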
Status
Additional Information
The recommended value of vm.pagecache_limit_mb for systems with up to 64 GB of memory is 1/16 (~6%) of the amount of RAM, but not less than 512 MB:
< 8 GB:  512 (recommended min. limit)
8 GB:    512 (= 8 * 1024 MB / 16)
16 GB:  1024 (= 16 * 1024 MB / 16)
32 GB:  2048 (= 32 * 1024 MB / 16)
64 GB:  4096 (= 64 * 1024 MB / 16)
For large systems with more than 64 GB of memory, the recommended value is 2% of the amount of RAM, but not less than 4096 MB, for example:
256 GB:   5243 (= 2% of 256 * 1024 MB)
512 GB:  10486 (= 2% of 512 * 1024 MB)
1024 GB: 20972 (= 2% of 1024 * 1024 MB)
2048 GB: 41943 (= 2% of 2048 * 1024 MB)
...
Please be aware that a value smaller than the recommendations can easily lead to a non-responsive system.
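The two sizing rules above can be expressed as a small function. This is a sketch only; the name `recommended_pagecache_limit_mb` is hypothetical, and rounding 2% to the nearest MB is an assumption chosen because it reproduces the example values listed above.

```python
# Sketch: recommended vm.pagecache_limit_mb for a given RAM size in GB.
# Assumption: the 2% rule rounds to the nearest MB, which matches the
# example values in this document.

def recommended_pagecache_limit_mb(ram_gb: int) -> int:
    ram_mb = ram_gb * 1024
    if ram_gb <= 64:
        # 1/16 of RAM, but not less than 512 MB.
        return max(512, ram_mb // 16)
    # 2% of RAM (rounded to nearest MB), but not less than 4096 MB.
    return max(4096, (ram_mb * 2 + 50) // 100)


for ram_gb in (4, 8, 16, 32, 64, 256, 512, 1024, 2048):
    print(f"{ram_gb:>5} GB: {recommended_pagecache_limit_mb(ram_gb)}")
```

For sizes between 64 GB and roughly 200 GB, 2% of RAM falls below 4096 MB, so the 4096 MB minimum applies.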
If the page cache limit is used, it should always be set to a value well above the 'dirty' limit (vm.dirty_ratio / vm.dirty_bytes). As referenced in TID#000019008, a good practice is to set:
# /etc/sysctl.conf
-------------------------------
vm.dirty_bytes = 629145600
vm.dirty_background_bytes = 314572800
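The byte values above correspond to 600 MiB and 300 MiB. The sketch below converts them and reports how far a given page cache limit sits above the dirty limit. This document does not quantify "well above", so the code prints the ratio rather than enforcing a threshold; the function name `dirty_margin` and the example limit of 4096 MB are illustrative choices.

```python
# Sketch: relate vm.pagecache_limit_mb to vm.dirty_bytes.
# "Well above" is not quantified in this document, so only the ratio is reported.
MIB = 1024 * 1024

dirty_bytes = 629145600             # vm.dirty_bytes
dirty_background_bytes = 314572800  # vm.dirty_background_bytes


def dirty_margin(pagecache_limit_mb: int, dirty_limit_bytes: int) -> float:
    """Return how many times larger the page cache limit is than the dirty limit."""
    return pagecache_limit_mb * MIB / dirty_limit_bytes


print(f"vm.dirty_bytes            = {dirty_bytes // MIB} MiB")
print(f"vm.dirty_background_bytes = {dirty_background_bytes // MIB} MiB")

example_limit_mb = 4096  # illustrative: the recommended value for a 64 GB system
print(f"limit is {dirty_margin(example_limit_mb, dirty_bytes):.1f}x the dirty limit")
```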
The page cache limit option has been dropped in SLES 15.
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000020418
- Creation Date: 27-Oct-2021
- Modified Date: 27-Oct-2021
- SUSE Linux Enterprise Server for SAP Applications
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com