Process or system hang due to possible XFS corruption
This document (000021632) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Linux Enterprise Server for SAP Applications 15 SP4
SUSE Linux Enterprise Server for SAP Applications 15 SP5
SUSE Linux Enterprise Server 15 SP4
SUSE Linux Enterprise Server 15 SP5
Situation
Systems running SUSE Linux Enterprise Server 15 SP4 or SP5 might encounter XFS filesystem corruption. Mounting a corrupted filesystem might then hang, with the mount process showing a stack trace similar to:
[<0>] xfs_extent_busy_flush+0x6a/0xb0 [xfs]
[<0>] xfs_alloc_ag_vextent_size+0x149/0x700 [xfs]
[<0>] xfs_alloc_ag_vextent+0x11d/0x140 [xfs]
[<0>] xfs_alloc_fix_freelist+0x23a/0x460 [xfs]
[<0>] xfs_free_extent_fix_freelist+0x61/0xa0 [xfs]
[<0>] __xfs_free_extent+0x6a/0x1c0 [xfs]
[<0>] xfs_trans_free_extent+0x3b/0xe0 [xfs]
[<0>] xfs_efi_item_recover+0x151/0x190 [xfs]
[<0>] xlog_recover_process_intents.isra.29+0x96/0x2c0 [xfs]
[<0>] xlog_recover_finish+0x17/0x100 [xfs]
[<0>] xfs_log_mount_finish+0xd5/0x150 [xfs]
[<0>] xfs_mountfs+0x62d/0x880 [xfs]
[<0>] xfs_fs_fill_super+0x496/0x710 [xfs]
[<0>] get_tree_bdev+0x165/0x260
[<0>] vfs_get_tree+0x22/0xd0
[<0>] path_mount+0x6e4/0x9b0
[<0>] do_mount+0x79/0x90
[<0>] __x64_sys_mount+0x86/0xe0
[<0>] do_syscall_64+0x58/0x80
[<0>] entry_SYSCALL_64_after_hwframe+0x66/0xd0
Alternatively, the kernel might log out-of-memory (OOM) messages similar to:
[597933.872618] G1 Service invoked oom-killer: gfp_mask=0x1100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
[597933.873206] CPU: 8 PID: 117373 Comm: G1 Service Kdump: loaded Not tainted 5.14.21-150500.55.65-default #1 SLE15-SP5 16a69e3ef5df74ca3142d0eb99487f215875bfbb
[597933.873630] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[597933.874061] Call Trace:
[597933.874260] <TASK>
[597933.874469] dump_stack_lvl+0x58/0x7b
[597933.874672] dump_header+0x4a/0x220
[597933.874860] oom_kill_process+0xe8/0x140
[597933.875045] out_of_memory+0x113/0x580
[597933.875226] __alloc_pages_slowpath.constprop.111+0x980/0xc70
[597933.875413] __alloc_pages+0x2d9/0x320
[597933.875594] pagecache_get_page+0x1aa/0x450
[597933.875772] filemap_fault+0x4c7/0xa40
[597933.875944] ? next_uptodate_page+0x11e/0x280
[597933.876115] __xfs_filemap_fault+0x5c/0x280 [xfs cc8a4f493b0740007523be2e6a62dd4c65168ac0]
[597933.876606] __do_fault+0x2e/0xc0
[597933.876777] __handle_mm_fault+0xd64/0x1230
[597933.876944] handle_mm_fault+0xd5/0x290
[597933.877110] do_user_addr_fault+0x1eb/0x730
[597933.877276] exc_page_fault+0x67/0x150
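The frame format of the mount hang trace above ([<0>] function+offset) matches what /proc/<pid>/stack prints, so a stuck mount process can be compared against it. The following is a minimal sketch only: the function name is hypothetical, and the choice of two frames (xfs_extent_busy_flush and xfs_efi_item_recover) as a signature is an assumption made for illustration, not an official detection rule.

```shell
#!/bin/sh
# Return 0 if the given stack dump text contains both frames that appear
# in the mount hang trace shown above. The pair of frames is an
# illustrative choice, not an official signature.
has_xfs_recovery_hang_signature() {
    printf '%s\n' "$1" | grep -q 'xfs_extent_busy_flush' &&
    printf '%s\n' "$1" | grep -q 'xfs_efi_item_recover'
}

# Possible usage (as root), checking every running mount process:
#   for pid in $(pgrep -x mount); do
#       if has_xfs_recovery_hang_signature "$(cat "/proc/$pid/stack")"; then
#           echo "PID $pid matches the XFS log recovery hang trace"
#       fi
#   done
```

A match only indicates that the process is in the same code path as the trace above. It does not prove that the filesystem is corrupted.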
Resolution
To fix this issue, upgrade to SUSE Linux Enterprise Server (for SAP Applications) 15 SP6.
Please note: due to the complexity of the patch set, SUSE decided against backporting the fixes to older releases, to avoid possible regressions.
See the Additional Information section for possible workarounds.
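To decide whether a host falls into the affected releases, the running kernel's release string can be inspected. The sketch below assumes that SP4 and SP5 kernels carry "-150400." and "-150500." in their release string, as in the 5.14.21-150500.55.65-default kernel from the OOM log above. The function name is hypothetical.

```shell
#!/bin/sh
# Classify a kernel release string as printed by `uname -r`.
# Assumption: SLES 15 SP4 kernels contain "-150400." and SP5 kernels
# contain "-150500." in the release string.
classify_sles15_kernel() {
    case $1 in
        *-150400.*) echo "SP4" ;;
        *-150500.*) echo "SP5" ;;
        *)          echo "other" ;;
    esac
}

# Classify the kernel that is currently running.
classify_sles15_kernel "$(uname -r)"
```

A result of SP4 or SP5 means the host runs one of the releases listed under Environment. It says nothing about whether a filesystem on that host is actually corrupted.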
Cause
XFS soft lockups in xfs_extent_busy_trim.
Additional Information
Possible workarounds:
1.) Boot the affected system from the SUSE Linux Enterprise Server 15 SP6 media and start the rescue system. The SP6 kernel will properly mount the XFS filesystem and replay the filesystem log. Once log replay has finished, reboot from the rescue system into the regular installation; the filesystem can then be mounted again using the old kernel version.
2.) Mount the filesystem using the -o norecovery,ro option. This mounts the filesystem read-only without replaying the log, so files can be accessed and copied off to a new filesystem.
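Workaround 2 could be sketched as the command sequence below. This is a dry run: the function only prints the commands for review and executes nothing. The device and directory names are placeholders, the function name is hypothetical, and the use of cp -a for copying is an assumption, since this document only states that files can be copied off to a new filesystem.

```shell
#!/bin/sh
# Print (do not execute) the commands for a read-only rescue copy.
#   $1 = affected XFS block device      (placeholder value below)
#   $2 = empty directory to mount on    (placeholder value below)
#   $3 = directory on a healthy, already mounted filesystem (placeholder)
rescue_copy_commands() {
    echo "mkdir -p $2"
    # norecovery skips log replay; such a mount must be read-only (ro).
    echo "mount -t xfs -o norecovery,ro $1 $2"
    # cp -a is one possible way to copy the data; any copy tool works.
    echo "cp -a $2/. $3/"
    echo "umount $2"
}

rescue_copy_commands /dev/sdX1 /mnt/xfs-rescue /mnt/new-fs
```

Because the log is not replayed, recently written data might be missing or inconsistent in the copy, so it is worth verifying important files after copying them.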
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 000021632
- Creation Date: 02-Dec-2024
- Modified Date: 02-Dec-2024
- SUSE Linux Enterprise Server
- SUSE Linux Enterprise Server for SAP Applications
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com