How to recover from BTRFS errors
This document (7018181) is provided subject to the disclaimer at the end of this document.
Environment
SUSE Linux Enterprise Server 12
SUSE Linux Enterprise Server 11
Situation
Filesystem errors are not uncommon, but they need to be resolved to keep the system safe and stable.
This document concentrates on errors seen with the BTRFS filesystem on SUSE Linux Enterprise.
Note that some of these errors raise two questions:
What actually caused the corruption?
Is there a bug in the btrfs tool or kernel driver that prevents the corruption from being fixed?
Repairing a filesystem does not necessarily recover the data. It fixes the filesystem itself, and not in every case its content.
Let's start with some best practices first.
Whenever a BTRFS filesystem contains errors, this TID is a good starting point.
Let's look at some typical errors seen in the past:
WARNING: CPU: 2 PID: 452 at ../fs/btrfs/extent-tree.c:3731 btrfs_free_reserved_data_space_noquota+0xe8/0x100 [btrfs]()
The good news is that this is a WARNING, not a fatal error.
WARNINGs like this one, e.g. regarding quota, are typically runtime-only conditions that BTRFS fixes after the WARNING is issued. They are not a serious problem.
Yet, such an issue should be reported to SUSE Support for closer examination.
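When reporting such a WARNING, a quick count of the btrfs-related kernel messages can help to describe the situation. The following is a minimal sketch, assuming the kernel log is piped in on standard input (for example from dmesg). The function name btrfs_log_summary is hypothetical and not part of btrfsprogs or any SUSE tool.

```shell
# Hypothetical helper: summarize btrfs-related kernel messages read from stdin.
# Counts WARNING lines that mention btrfs separately from all other lines
# that mention btrfs, so the ratio can be judged at a glance.
btrfs_log_summary() {
    awk '
        /WARNING:/ && /btrfs/ { warnings++; next }
        /BTRFS|btrfs/         { other++ }
        END { printf "warnings=%d other=%d\n", warnings, other }
    '
}

# Example usage (reading the kernel ring buffer may require root):
#   dmesg | btrfs_log_summary
```

The counts are only a rough indicator. The full log is still what Support needs for a closer examination.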
If you see a message like:
BTRFS: Transaction aborted (error -2)
followed by a stack trace which looks like:
[<ffffffffa041277b>] __btrfs_abort_transaction+0x4b/0x120 [btrfs]
[<ffffffffa0445f87>] __btrfs_unlink_inode+0x367/0x3c0 [btrfs]
[<ffffffffa04499e7>] btrfs_unlink_inode+0x17/0x40 [btrfs]
[<ffffffffa0449a76>] btrfs_unlink+0x66/0xb0 [btrfs]
kernel: BTRFS warning (device sdb3): __btrfs_unlink_inode:3802: Aborting unused transaction(No such entry).
try mounting the filesystem read-only with the recovery option:
mount -t btrfs -o recovery,ro /dev/<device_name> /<mount_point>
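The read-only recovery mount can be wrapped so that the exact command is reviewed before it touches the device. This is a minimal sketch. The function name btrfs_ro_recovery_mount and the DRY_RUN variable are hypothetical and exist only for this illustration.

```shell
# Hypothetical wrapper around the read-only recovery mount.
# With DRY_RUN=1 the command is only printed and nothing is mounted.
btrfs_ro_recovery_mount() {
    dev="$1"
    mnt="$2"
    if [ -z "$dev" ] || [ -z "$mnt" ]; then
        echo "usage: btrfs_ro_recovery_mount <device> <mount_point>" >&2
        return 2
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "mount -t btrfs -o recovery,ro $dev $mnt"
        return 0
    fi
    # Real invocation; requires root and an existing mount point.
    mount -t btrfs -o recovery,ro "$dev" "$mnt"
}

# Example usage:
#   DRY_RUN=1 btrfs_ro_recovery_mount /dev/sdb3 /mnt
```

Upstream btrfs documentation lists "recovery" as a deprecated alias of "usebackuproot", so it is worth checking the btrfs(5) manual page of the running system for the option name it accepts.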
WARNING: Using '--repair' can further damage a filesystem instead of helping if it can't fix your particular issue.
It is extremely important to ensure that a backup has been created before invoking '--repair'. If in doubt, open a support request before attempting a repair. Use this flag at your own risk. If you do not perform a backup and then attempt the repair with an old kernel and an old btrfs tool, you may lose your data permanently. Use the latest quarterly update for the latest version of SUSE Linux Enterprise Server. For example, at the time of this writing that would be SLE-15-SP2-Online-x86_64-QU2-Media1.iso or SLE-15-SP2-Full-x86_64-QU2-Media1.iso.
btrfs check --repair /dev/<device_name>
btrfs scrub start -Bf /dev/<device_name>
btrfs rescue zero-log /dev/<device_name>
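The warning above asks for a backup before any repair, and 'btrfs check' is meant for a filesystem that is not mounted. A guard along the following lines can enforce both conditions before '--repair' is invoked. This is a sketch under stated assumptions: the function name btrfs_guarded_repair and the variables BACKUP_DONE, MOUNTS_FILE and DRY_RUN are hypothetical, and the mount check compares device names literally, so it does not resolve symlinks such as /dev/disk/by-uuid entries.

```shell
# Hypothetical guard around 'btrfs check --repair'.
# Refuses to continue unless the caller confirms that a backup exists
# (BACKUP_DONE=yes) and the device does not appear in the mount table.
# MOUNTS_FILE defaults to /proc/mounts and can be overridden for testing.
btrfs_guarded_repair() {
    dev="$1"
    mounts="${MOUNTS_FILE:-/proc/mounts}"
    if [ -z "$dev" ]; then
        echo "usage: btrfs_guarded_repair <device>" >&2
        return 2
    fi
    if [ "${BACKUP_DONE:-no}" != "yes" ]; then
        echo "refusing: set BACKUP_DONE=yes once a backup exists" >&2
        return 1
    fi
    # The first field of each mount table line is the source device.
    # Simplification: literal string comparison, no symlink resolution.
    if awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"; then
        echo "refusing: $dev is mounted" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "btrfs check --repair $dev"
        return 0
    fi
    btrfs check --repair "$dev"
}

# Example usage:
#   BACKUP_DONE=yes DRY_RUN=1 btrfs_guarded_repair /dev/sdb3
```

The guard does not make '--repair' any safer by itself. It only makes it harder to skip the preconditions by accident.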
More serious issues are indicated when the filesystem floods the logs with messages, slows down considerably, or even goes read-only.
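Whether a btrfs filesystem has gone read-only can be read from the mount table. The following sketch assumes input in the /proc/mounts format on standard input. The function name btrfs_readonly_mounts is hypothetical.

```shell
# Hypothetical helper: list btrfs filesystems that are mounted read-only.
# Expects mount table lines in the /proc/mounts format on stdin:
#   <device> <mount_point> <fstype> <options> <dump> <pass>
btrfs_readonly_mounts() {
    awk '
        $3 == "btrfs" {
            # The options field is a comma-separated list; look for "ro".
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) {
                if (opts[i] == "ro") {
                    print $1, $2
                    break
                }
            }
        }
    '
}

# Example usage:
#   btrfs_readonly_mounts < /proc/mounts
```

A filesystem that was deliberately mounted read-only shows up here as well, so the output needs to be compared against the intended mount options.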
Resolution
What to do if:
- A bad tree root is found at mount time: mount with "-o recovery". This attempts to autocorrect the error.
- Weird ENOSPC issues are seen: mount with "-o clear_cache", which clears the btrfs free space cache.
- Quota issues prevent mounting: the latest available btrfsprogs is needed to fix that. See section "Additional Information".
- Quota issues are seen during normal operation: run 'btrfs quota rescan'.
- Only if everything else fails, run 'btrfs check' and research whether a repair could possibly fix the issue.
WARNING: If in doubt, open a case with Support. Running 'btrfs check --repair' with a version that can't fix the particular issue might make things worse.
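The list above can be condensed into a small lookup that prints the matching command for a symptom. This is only a sketch: the function name btrfs_suggest and the symptom keywords are invented for this illustration, and the function prints commands without running them.

```shell
# Hypothetical lookup: print the suggested action for a symptom.
#   $1: symptom keyword (invented for this sketch)
#   $2: btrfs device, $3: mount point
btrfs_suggest() {
    symptom="$1"
    dev="$2"
    mnt="$3"
    case "$symptom" in
        bad-tree-root)
            echo "mount -t btrfs -o recovery $dev $mnt" ;;
        enospc)
            echo "mount -t btrfs -o clear_cache $dev $mnt" ;;
        quota-mount)
            echo "use the latest available btrfsprogs" ;;
        quota-runtime)
            echo "btrfs quota rescan $mnt" ;;
        *)
            # Last resort: check without --repair and review the output.
            echo "btrfs check $dev" ;;
    esac
}

# Example usage:
#   btrfs_suggest enospc /dev/sdb3 /mnt
```

The fallback deliberately omits '--repair', in line with the warning above.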
Additional Information
Example: For SLE15 GA, SP1 or SP2, using the latest SP2 version of btrfsprogs may work.
Download the SP2 QU2 ISO image from the customer center and mount it to the /mnt directory on the system with the broken filesystem.
Extract the btrfs tool from the btrfsprogs RPM:
rpm2cpio /mnt/Module-Basesystem/x86_64/btrfsprogs-4.19.1-8.6.2.x86_64.rpm | cpio -id ./usr/sbin/btrfs
Then use ./usr/sbin/btrfs check --repair /dev/<defective btrfs device>
NOTE: This only works for a btrfs filesystem that is not mounted. It does not work for the root filesystem of a running system.
To repair the root filesystem, reboot the system and boot the rescue system from the latest quarterly update ISO for the latest major release of SUSE Linux Enterprise Server.
Further: as said above, use with care. If in doubt, run "./usr/sbin/btrfs check /dev/<defective btrfs device>" (without --repair) and send the output to SUSE Support for advice.
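The extraction and the read-only check can be prepared as a reviewable plan before anything is executed. The following sketch prints the two commands, with the check output redirected to a file that can be sent to Support. The function name btrfs_check_plan and the default log file name btrfs-check.log are hypothetical.

```shell
# Hypothetical helper: print the commands that extract the btrfs binary from
# a btrfsprogs RPM and run a check without --repair. Nothing is executed.
#   $1: path to the btrfsprogs RPM on the mounted ISO
#   $2: unmounted btrfs device to check
#   $3: file that receives the check output (default: btrfs-check.log)
btrfs_check_plan() {
    rpm="$1"
    dev="$2"
    log="${3:-btrfs-check.log}"
    if [ -z "$rpm" ] || [ -z "$dev" ]; then
        echo "usage: btrfs_check_plan <rpm> <device> [logfile]" >&2
        return 2
    fi
    echo "rpm2cpio $rpm | cpio -id ./usr/sbin/btrfs"
    echo "./usr/sbin/btrfs check $dev > $log 2>&1"
}

# Example usage:
#   btrfs_check_plan /mnt/Module-Basesystem/x86_64/btrfsprogs-4.19.1-8.6.2.x86_64.rpm /dev/sdb3
```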
Disclaimer
This Support Knowledgebase provides a valuable tool for SUSE customers and parties interested in our products and solutions to acquire information, ideas and learn from one another. Materials are provided for informational, personal or non-commercial use within your organization and are presented "AS IS" WITHOUT WARRANTY OF ANY KIND.
- Document ID: 7018181
- Creation Date: 24-Oct-2016
- Modified Date: 06-Nov-2023
- SUSE Linux Enterprise Server
For questions or concerns with the SUSE Knowledgebase please contact: tidfeedback[at]suse.com