
1. Scrub detects and (when using the DUP or RAID1/10/5/6 profiles) corrects errors introduced into the filesystem by failures in the underlying disks. You can run scrub as much or as little as you want, but the longer you go between scrubs, the longer errors can accumulate undetected, and the greater the risk that you'll have uncorrected errors on multiple disks when one of your disks fails and needs to be replaced. In that event, you will lose data or even the entire filesystem.

The ideal frequency for scrubs depends in part on how important your data is. If it's very important that you detect storage failures immediately, you can run scrub once a day. If the data is unimportant (e.g. you have good backups, and you don't care about extended downtime to restore them), then you might not need to run scrub at all.

I run alternating SMART long self-tests and btrfs scrubs every 15 days (i.e. SMART long self-tests every 30 days, btrfs scrub 15 days after every SMART long self-test).
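One way to approximate this alternating schedule is sketched below. It is only an illustration, not the author's actual setup: the function names, the choice of days 1 and 16 of each month, and the idea of calling it daily from cron are all assumptions, and calendar months only approximate a 30-day cycle.

```shell
# Hypothetical helper: map a day of the month (as printed by `date +%d`)
# to the maintenance task due that day.
# Assumption: day 1 = SMART long self-test, day 16 = btrfs scrub.
maintenance_task_for_day() {
    # Strip one leading zero so "01" and "1" are treated the same.
    case "${1#0}" in
        1)  echo smart ;;
        16) echo scrub ;;
        *)  echo none ;;
    esac
}

# Hypothetical wrapper meant to be run once a day (e.g. from cron).
# $1 = mount point, remaining arguments = the block devices to self-test.
run_maintenance() {
    fs=$1
    shift
    case "$(maintenance_task_for_day "$(date +%d)")" in
        smart)
            # Start a long self-test on every listed device.
            for dev in "$@"; do
                smartctl -t long "$dev"
            done
            ;;
        scrub)
            # -B waits for completion, -d prints per-device statistics.
            btrfs scrub start -B -d "$fs"
            ;;
    esac
}
```

A daily cron entry could then call something like `run_maintenance /path/to/fs /dev/sdX /dev/sdY`, where the device names are placeholders.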

Note that after a power failure or unclean shutdown, you should run scrub as soon as possible after rebooting, regardless of the normal maintenance schedule. This is especially important for RAID5/6 profiles, where scrub regenerates parity blocks that may have been damaged by the parity RAID write hole. A post-power-failure scrub can also detect some drive firmware bugs.

Pay attention to the output of scrub, especially the per-device statistics ('btrfs scrub status -d' and 'btrfs dev stat'). Errors reported there indicate which disks should be replaced in RAID arrays.
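As a rough sketch of how the per-device counters could be checked automatically, the filter below reads 'btrfs dev stat' output on standard input and prints only the counters that are not zero. The function name is hypothetical, and the sketch assumes the usual two-column output of the form "[/dev/sdX].corruption_errs  0".

```shell
# Hypothetical filter: print nonzero error counters from `btrfs device stats`
# output and return exit status 1 if any were found, 0 otherwise.
# Assumes each counter line has exactly two fields: "[device].counter value".
nonzero_dev_stats() {
    awk 'NF == 2 && $2 + 0 != 0 { print; bad = 1 }
         END { exit (bad ? 1 : 0) }'
}
```

It might be used as `btrfs dev stat /path/to/fs | nonzero_dev_stats`, with a nonzero exit status prompting a closer look at the devices it lists.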

2. Watch the amount of “unallocated” space on the filesystem. If the “unallocated” space (shown in 'btrfs fi usage') drops below 1GB, you are at risk of running out of metadata space. If you run out of metadata space, btrfs becomes read-only and it can be difficult to recover.

To free some unused data space (convert it from “data” to “unallocated”):

btrfs balance start -dlimit=5 /path/to/fs

This usually doesn't need to be done more than once per day, but it depends on how busy your filesystem is. If you have hundreds of GB of unallocated space then you won't need to do this at all.
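The check and the balance can be combined into a small script. The following is a minimal sketch under stated assumptions, not a definitive implementation: the function names are hypothetical, the 1 GiB default threshold mirrors the 1GB figure above, and the parsing assumes that 'btrfs fi usage -b' prints a "Device unallocated:" line with a raw byte count.

```shell
# Hypothetical parser: read `btrfs filesystem usage -b` output on stdin and
# print the "Device unallocated" value in bytes.
unallocated_bytes() {
    awk '/Device unallocated:/ { print $3; exit }'
}

# Succeed when unallocated space ($1, bytes) is below the threshold
# ($2, bytes; assumed default of 1 GiB).
needs_data_balance() {
    [ "$1" -lt "${2:-1073741824}" ]
}

# Hypothetical wrapper: run the limited data balance only when
# unallocated space is low. $1 = mount point.
reclaim_if_low() {
    fs=$1
    free=$(btrfs filesystem usage -b "$fs" | unallocated_bytes)
    if needs_data_balance "$free"; then
        btrfs balance start -dlimit=5 "$fs"
    fi
}
```

Calling `reclaim_if_low /path/to/fs` from a daily job would then leave filesystems with plenty of unallocated space untouched.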

Never balance metadata (i.e. the -m option of btrfs balance) unless you are converting to a different RAID profile (e.g. -mconvert=raid1,soft). If there is sufficient metadata space allocated, then the filesystem can be filled with data without any problems. Balancing metadata can reduce the amount of space allocated for metadata, which puts the filesystem at risk of going read-only if it fills up.

Normally there will be some overallocation of metadata (roughly 3:2 allocated:used ratio). Leave it alone: if the filesystem allocated metadata space in the past, it may need that space again in the future.

Scrub and balancing are the main requirements. Filesystems can operate with just those two maintenance actions for years (outlasting all of their original hardware), and recover from multiple disk failures (one at a time) along the way.

(Extract from a post by Zygo Blaxell on linux-btrfs@vger.kernel.org, 3 July 2010, 6:37 AM)