
A scrub can fix everything that the btrfs kernel code can recover from. For example, if one disk in a btrfs RAID array is 100% corrupted while online, scrub can restore all of the data, including superblocks, without interrupting application activity on the filesystem. With RAID1/5/6/10 this covers all single-disk failures and non-malicious data corruption from disks. (As of this post, RAID6 does not have a corresponding 3-copy RAID1 profile for metadata, so RAID6 can't always survive 2 disk failures in practice.)

Scrub is very effective at repairing data damage caused by disk failures in RAID arrays, and with DUP metadata on a single-disk filesystem, scrub can often recover from a few random UNC (uncorrectable) sectors. If something happens to the filesystem that scrub can't repair (e.g. damage caused by accidentally overwriting the btrfs partition with another filesystem, host RAM failures writing corrupted data to disks, or hard drive firmware write caching bugs), the other tools usually can't repair it either.
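
As a rough illustration of running a scrub on a mounted filesystem, here is a minimal sketch. The mount point /mnt/data and the `run` dry-run helper are assumptions made for this example, not part of the original post; by default the commands are only printed, and setting DRY_RUN=0 (as root) actually runs them.

```shell
# Dry-run helper (hypothetical): print each command unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }

# Scrub a mounted btrfs filesystem, then show the per-device error counters.
# $1: mount point (the path used below is a placeholder)
scrub_fs() {
    # -B: stay in the foreground until the scrub finishes
    # -d: print statistics for each device separately
    run btrfs scrub start -B -d "$1"
    # Cumulative read/write/flush/corruption/generation error counters
    run btrfs device stats "$1"
}

scrub_fs /mnt/data
```

Because scrub runs on a mounted filesystem, applications can keep using it while the scrub is in progress.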

Always use DUP, RAID1, or RAID10 for metadata. Do not use single, RAID0, RAID5, or RAID6 for metadata: if there is a UNC sector error or data corruption in a metadata page, the filesystem will be broken, and data losses will be somewhere between “medium” and “severe”.
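
One possible way to check and change the metadata profile of an existing filesystem is sketched below. The mount point /mnt/data and the `run` dry-run helper are assumptions for illustration; the list of accepted profiles simply mirrors the ones recommended above. Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
# Dry-run helper (hypothetical): print each command unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }

# Show which profiles data, metadata, and system chunks currently use.
# $1: mount point
show_profiles() { run btrfs filesystem df "$1"; }

# Convert metadata to a redundant profile on a mounted filesystem.
# $1: target profile (dup for one disk; raid1 or raid10 for multiple disks)
# $2: mount point
convert_metadata() {
    case "$1" in
        dup|raid1|raid10)
            run btrfs balance start -mconvert="$1" "$2" ;;
        *)
            # Refuse the profiles the text warns against (single, raid0, raid5, raid6)
            echo "refusing metadata profile '$1'" >&2
            return 1 ;;
    esac
}

show_profiles /mnt/data
convert_metadata dup /mnt/data
```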

The other utilities like 'btrfs check --repair' are in an experimental state of development, and may make a damaged filesystem completely unreadable. They should only be used as a last resort, with expert guidance, after all other data recovery options have been tried, if at all. Often when a filesystem is damaged beyond the ability of scrub to recover, the only practical option is to mkfs and start over. Ask the mailing list first to be sure, though: doing so might help improve the tools so that this is no longer the case.

Usually a damaged btrfs can still be mounted read-only and some data can be recovered. Corrupted data blocks (with non-matching csums) cannot be read through a mounted filesystem; 'btrfs restore' is required to read those.

'btrfs restore' can copy data from a damaged btrfs filesystem that is not mounted. It is able to work in some cases where mounting fails. When 'btrfs restore' copies data, it does not verify csums, so it can be used to recover corrupted data that cannot be read through a mounted filesystem. On the other hand, if you want to avoid copying data with known corruption, you should mount the filesystem read-only and read it that way.
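
The two recovery paths described above might look roughly like this. The device /dev/sdX, the mount point /mnt/rescue, the destination /srv/recovered, and the `run` dry-run helper are all placeholders assumed for this sketch; commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
# Dry-run helper (hypothetical): print each command unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }

# First choice: mount read-only and copy files with ordinary tools.
# Reads of blocks with bad csums fail, so known-corrupt data is not copied.
# $1: btrfs device, $2: mount point
mount_readonly() { run mount -o ro "$1" "$2"; }

# Fallback for a filesystem that is not mounted (or cannot be mounted).
# $1: btrfs device, $2: destination directory on a different, healthy filesystem
restore_files() {
    # -D: dry run, list what would be restored without writing anything
    run btrfs restore -D -v "$1" "$2"
    # -i: keep going after errors; csums are not verified during the copy
    run btrfs restore -i -v "$1" "$2"
}

mount_readonly /dev/sdX /mnt/rescue
restore_files /dev/sdX /srv/recovered
```

Files recovered with 'btrfs restore' may contain corrupted blocks, so it is worth checking them against backups or application-level checksums where those exist.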

Use 'btrfs replace' to replace failed disks in a RAID1/RAID5/RAID6/RAID10 array. It can reconstruct data from other mirror disks quickly by simply copying the mirrored data without changing any of the filesystem structure that references the data. If the replacement disk is smaller than the disk it is meant to replace, use 'btrfs dev add' followed by 'btrfs dev remove' instead. This is much slower, as the data has to be moved one extent at a time and all references to the data must be updated across the array.
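
A minimal sketch of both approaches follows. The device names /dev/sdb (failed) and /dev/sdd (new), the mount point /mnt/data, and the `run` dry-run helper are assumptions for illustration; commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
# Dry-run helper (hypothetical): print each command unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }

# Replacement disk is at least as large as the old one: copy in place.
# $1: failed device (path or devid), $2: new device, $3: mount point
replace_disk() {
    # -B: stay in the foreground until the replace finishes
    run btrfs replace start -B "$1" "$2" "$3"
    # -1: print the status once instead of polling continuously
    run btrfs replace status -1 "$3"
}

# Replacement disk is smaller: add the new device, then remove the old one.
# Slower, because extents are relocated one at a time.
# $1: old device, $2: new device, $3: mount point
replace_with_smaller() {
    run btrfs device add "$2" "$3"
    run btrfs device remove "$1" "$3"
}

replace_disk /dev/sdb /dev/sdd /mnt/data
```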

There appear to be a few low-end hard drive firmwares in the field with severe write caching bugs. The first sign that you have one of those is that btrfs gets an unrecoverable 'parent transid verify failed' error after a few power failures. The only known fix for this is to turn off write caching on such drives (e.g. with 'hdparm -W0'), mkfs, and start over. If you think you have encountered one of these, please post to the mailing list with the drive model and firmware revision.
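
For illustration, turning off the write cache and collecting the drive details to report could be sketched as follows. The device /dev/sdX and the `run` dry-run helper are placeholders assumed for this example; commands are printed rather than executed unless DRY_RUN=0 is set. On some drives the setting may not persist across power cycles, in which case it would need to be reapplied at boot.

```shell
# Dry-run helper (hypothetical): print each command unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }

# $1: the suspect drive (placeholder path below)
disable_write_cache() {
    run hdparm -W "$1"     # query the current write-cache setting
    run hdparm -W0 "$1"    # turn the drive's write cache off
    run hdparm -I "$1"     # identification data, incl. model and firmware revision
}

disable_write_cache /dev/sdX
```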

'btrfs rescue' is useful for fixing bugs caused by old mkfs tools or kernel versions. You are unlikely to ever need it if you mkfs a new btrfs today (though there's always the possibility of new bugs in the future…).

(Extracted from a post by Zygo Blaxell to linux-btrfs@vger.kernel.org, 3 July 2019, 6:37 AM)