
BTRFS and RAID

Basics

There is a lot of confusion about btrfs and its RAID features. Let's start from the basics:

  • RAID stands for "redundant array of inexpensive/independent disks". It is a way of organizing disks.
  • There are hardware and software RAID implementations, for instance:
    • mdadm manages md, the software RAID implementation in the Linux kernel (see the sketch after this list)
    • some CPUs and motherboard chipsets offer firmware-assisted ("fake") RAID
    • there are specialized hardware RAID controllers
    • Some filesystems implement software RAID (zfs, btrfs)
    • all of the examples above are independent of each other
  • There are many RAID configurations. Not all are really redundant, and they affect performance (for better or worse) in different ways. The basic concept is:
    • RAID0 doesn't offer any redundancy at all, but may increase performance. If you lose one disk, you lose all your data.
    • RAID1 duplicates all data over your disks (redundancy), so that if one disk fails you can save your data and replace the faulty disk.
    • RAID10 (or RAID 1+0) offers redundancy and potential performance benefits, as it is a combination of RAID0 and RAID1.
    • RAID5 and RAID6 offer redundancy through parity: RAID5 uses single parity and survives one failed disk, while RAID6 uses double parity and survives two.
  • Most RAID implementations don't do anything about silently corrupted data. They may warn you when a drive's SMART reports a problem, but they typically cannot detect or repair corruption on their own.
  • RAID is not a backup. If anything happens to your RAID array, you lose your data. However, a RAID array can definitely serve as one copy in a backup scheme, as long as you keep it separate from the other copies.
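
For comparison, here is a minimal sketch of how a traditional software RAID1 array is created with mdadm on Linux. The device names /dev/sda1, /dev/sdb1 and /dev/md0 are placeholders, not part of the original text:

# create a 2-disk software RAID1 array out of two example partitions
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
# watch the initial synchronization and the array state
cat /proc/mdstat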

BTRFS's peculiar RAID profiles

Many people complain that btrfs RAID isn't really RAID, because it is in fact very different from the traditional conception of RAID.

First things first: RAID5 and RAID6 are unstable and should not be used except for testing with throwaway data. It doesn't matter if you have used RAID5 or 6 for years and haven't run into a problem. The moment anything weird happens, the blame is on you for using them despite all the warnings, even if the problem apparently has nothing to do with RAID.

The other RAID profiles are stable and can be used. Btrfs can apply different profiles for metadata and data. For instance, to add a device to your btrfs file system and then convert the metadata to raid1 and the data to raid0:

btrfs device add /dev/sdb1 /mnt
btrfs balance start -mconvert=raid1 -dconvert=raid0 /mnt
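
A quick way to verify which profiles are actually in use after the balance finishes is to inspect the mounted filesystem (the mount point /mnt follows the example above):

# show the block group profiles for data, metadata and system
btrfs filesystem df /mnt
# show detailed allocation per device
btrfs filesystem usage /mnt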

RAID0 works pretty much like traditional RAID0. Data is striped across the disks to achieve better performance, and if one disk fails, you lose all data. Now, RAID1 and RAID10 are where things get different, and they are the main reason people complain about btrfs RAID. Traditional RAID1 offers more redundancy as you add more disks to your array: if you lose 2 disks of a 3-disk array, you still have all your data. Btrfs RAID1 only makes one redundant copy, even if you have more than 2 disks, so if you lose 2 disks or more, you are screwed. The same applies to RAID10. If you want more redundant copies, use btrfs RAID1C3 or RAID1C4 for 3 or 4 copies respectively.
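
For example, to keep 3 redundant copies of both metadata and data on an array with at least 3 disks, a conversion like the following could be used. This is a sketch, assuming the filesystem is mounted at /mnt and the kernel supports the raid1c3 profile (added in Linux 5.5):

btrfs balance start -mconvert=raid1c3 -dconvert=raid1c3 /mnt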

Btrfs RAID10 requires 2 disks at minimum. All data is mirrored and also striped, which makes me wonder why anyone would choose btrfs RAID1 over btrfs RAID10, even if btrfs RAID10's performance isn't better because of the current implementation. Someone else asked this at https://lore.kernel.org/linux-btrfs/[email protected]/T/ but no one was able to answer it.
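
As a sketch, a btrfs RAID10 filesystem can also be created directly at mkfs time. The four device names below are placeholders (older kernels required at least four devices for RAID10):

# use the RAID10 profile for both metadata (-m) and data (-d)
mkfs.btrfs -m raid10 -d raid10 /dev/sda /dev/sdb /dev/sdc /dev/sdd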

Another point of complaint about btrfs is its refusal to mount the filesystem by default when disks are missing. To do that, you have to supply the degraded mount option. Since RAID is supposed to improve the uptime of your storage, this design decision does not make any sense.
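
For reference, this is how a filesystem with a missing device can still be mounted and then repaired. The device /dev/sdb1 stands for a surviving member of the array, /dev/sdc1 for a new disk, and the device id 2 for the missing member; all of these are placeholders:

# mount the remaining devices despite the missing disk
mount -o degraded /dev/sdb1 /mnt
# replace the missing device (id 2) with a new disk
btrfs replace start 2 /dev/sdc1 /mnt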

Why use btrfs RAID instead of a traditional RAID?

As I said before, most RAID implementations can't fix corrupted data, while btrfs can, with one caveat: btrfs needs a block group profile with redundancy (such as DUP or RAID1 for metadata) to auto-repair corrupted data. You may also want to look into mdadm, which provides its own mechanism for fixing corrupted data.
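
The repair happens when btrfs reads a block whose checksum doesn't match and a good copy exists, and it can also be forced over the whole filesystem with a scrub. A minimal sketch, assuming the filesystem is mounted at /mnt:

# verify all checksums and repair bad copies from good ones
btrfs scrub start /mnt
# check progress and the number of corrected errors
btrfs scrub status /mnt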

Btrfs RAID profile summary

Profile    Redundant copies   Parity   Striping   Space usage   Min. devices
single     1                  -        -          100%          1
DUP        2 (on 1 device)    -        -          50%           1
RAID0      1                  -        1 to N     100%          1
RAID1      2                  -        -          50%           2
RAID1C3    3                  -        -          33%           3
RAID1C4    4                  -        -          25%           4
RAID10     2                  -        1 to N     50%           2
RAID5      1                  1        2 to N-1   (N-1)/N       2
RAID6      1                  2        3 to N-2   (N-2)/N       3