ZFS: Setting up a ZFS Mirror
Lately I have been performing upgrades to the hard drives in my computers. The hard-drive-related project today is setting up a ZFS mirror on my desktop which will contain my /home directory.
ZFS Intro
I assume that if you’re reading this you might know what ZFS is, but in case you don’t, ZFS is an advanced file system developed by Sun Microsystems. It provides a lot of really attractive features for data redundancy. ZFS allows you to combine many disks into a “zpool” for storage. There are many different configurations one could set up, but I’ll just focus on a simple ZFS mirror, where I have two disks, each of which will contain the same data, hence the name “mirror”.
This post covers my experience setting up a ZFS mirror. If you want to know more about ZFS (and trust me, there is a lot more to know), I’ve found this series from Aaron Toponce to be the best guide. That and the ArchWiki ZFS page, of course.
Motivation
So why set up a ZFS mirror? Well, I’ve always been interested in setting up a RAID on my desktop. In doing some research on RAID I came across ZFS, and it seemed to me like the better option. I’m pretty paranoid when it comes to keeping backups of my data, and while something like RAID or ZFS is no substitute for proper backups, the thought of redundant drives is appealing to me. Once set up, if we lose a disk to hardware failure the pool will keep running in a degraded state, giving us time to swap in a new drive.
Initial HDD Configuration
If you’re reading with the intention of setting up your own ZFS mirror, we’re probably starting from different initial configurations. My setup at the start involves two SSDs and two HDDs. The two SSDs are 120 GB each; one boots Windows, the other contains /, /boot/ and swap for my Arch Linux installation. (It probably shouldn’t have swap on it, but I’ll address that in another post.)
The two HDDs are different capacities. There is a 640GB drive, which holds my /home directory, and a 2TB drive which is split into two 1TB partitions, one for Windows, one for Linux, both for storage.
I did a lot of data shuffling to prep for setting up this ZFS mirror with the 2TB drive and a new 2TB drive I just purchased. So, I’m starting from a configuration where I’m booting from the Linux SSD and I have no home directory (it’s backed up externally). This gives us two empty 2TB drives to create the mirror on.
To get started we’ll need to install ZFS, and on Arch this is going to be from the AUR. Install it however you want, but I use yaourt.
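As a rough sketch, the install boils down to a single yaourt call. The package name zfs-git below is an assumption (AUR package names for ZFS have changed over time, so check the ArchWiki first), and the install_zfs wrapper is a hypothetical helper that only exists so nothing runs until you call it.

```shell
# Hypothetical helper: install ZFS from the AUR with yaourt.
# Assumption: "zfs-git" is the package name; pass another name if it differs.
install_zfs() {
    yaourt -S "${1:-zfs-git}"
}

# Usage (uncomment to run):
# install_zfs
```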
The kernel modules for ZFS are tied to a specific kernel version, which required a downgrade for me at the time I installed. Once you exit dependency hell, ZFS will be installed. Before we jump into making a zpool we should take a second to think about the drives we’re using.
Advanced Format Drives
Something I learned about while setting this up (learning new things is another large part of the motivation for me here) is Advanced Format drives. Older drives store data in 512-byte sectors. Newer “Advanced Format” drives store data in 4096-byte (4K) sectors.
Where this comes in with ZFS is a potential performance loss if you don’t explicitly define the sector size when making the zpool. This is pointed out in the ArchWiki, and can be done by providing the -o ashift=12 option.
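The 12 in ashift=12 is not arbitrary: ashift is the base-2 logarithm of the sector size, so 2^12 = 4096 bytes, while 512-byte sectors correspond to ashift=9. A small sketch of that arithmetic (ashift_for is a made-up helper, not part of ZFS):

```shell
# Compute the ashift value (log2 of the sector size in bytes).
# Assumes the size is a power of two, as real sector sizes are.
ashift_for() {
    size=$1
    exponent=0
    while [ "$size" -gt 1 ]; do
        size=$((size / 2))
        exponent=$((exponent + 1))
    done
    echo "$exponent"
}

ashift_for 512    # prints 9
ashift_for 4096   # prints 12
```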
Mixing Drive Types
Part of what made me read more and more about Advanced Format drives is that I didn’t know how to tell whether the drives I had were Advanced Format, or whether the two kinds could be mixed in a zpool. They’re both WD Black drives, but they were bought almost two years apart. Fortunately, I found this (pdf) model number format guide, which details the letters in each model number.
I have a new WD2003FZEX drive and an old WD2002FAEX drive. The difference is that the new drive is Advanced Format and the old drive isn’t. So, can we mix the two?
Several people on a few forums suggest forcing 4K sectors when mixing drives: not much performance is lost by aligning a 512-byte-sector drive to 4096-byte sectors, but there is a definite performance loss the other way around. So I’ll be passing that -o ashift=12 option when I create my zpool. If you’re using Advanced Format drives you should do the same.
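Besides decoding model numbers, you can ask the kernel what a drive reports. On Linux the logical and physical sector sizes are exposed under /sys/block/<device>/queue/. The sketch below reads them; sector_sizes and the SYSFS_ROOT override are hypothetical names introduced here (the override just makes the function easy to try against a fake directory). Some older drives are known to report 512 bytes even when they use 4K sectors internally, so the model number guide is still worth a look.

```shell
# Print the logical and physical sector sizes the kernel reports for a
# block device such as sda. SYSFS_ROOT is a hypothetical override for testing.
sector_sizes() {
    dev=$1
    queue_dir="${SYSFS_ROOT:-/sys}/block/$dev/queue"
    logical=$(cat "$queue_dir/logical_block_size")
    physical=$(cat "$queue_dir/physical_block_size")
    echo "$dev logical=$logical physical=$physical"
}

# Usage (device names will differ on your machine):
# sector_sizes sda
# sector_sizes sdc
```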
Setting Up the zpool
I wiped the data off my old drive, so both drives are unformatted (i.e. no file system on them). To see all your drives we can use parted (for example, parted -l as root).
I’m looking for my only two 2TB drives, which for me were /dev/sda and /dev/sdc; your drives will probably be different. We’ll want the disk ids to make the zpool, which we can get just by looking in /dev/ (on Linux they live under /dev/disk/by-id/).
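On Linux the persistent disk ids are symlinks in /dev/disk/by-id/ that point back at kernel names such as sda. A plain ls -l /dev/disk/by-id/ shows them all; the sketch below narrows that down to one device. The ids_for function and the BY_ID_DIR override are hypothetical helpers for illustration.

```shell
# List the by-id names whose symlink points at the given kernel device.
# Partition links (sda1, sda2, ...) are skipped because they do not match
# the whole-disk name. BY_ID_DIR is a hypothetical override for testing.
ids_for() {
    dev=$1
    dir=${BY_ID_DIR:-/dev/disk/by-id}
    for link in "$dir"/*; do
        [ -L "$link" ] || continue
        target=$(readlink "$link")
        if [ "${target##*/}" = "$dev" ]; then
            echo "${link##*/}"
        fi
    done
}

# Usage:
# ids_for sda
```

Expect more than one name per disk on a real system, since the same drive is usually also listed under a wwn- id.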
So what we want to note is that sda has the id ata-WDC_WD2003FZEX-00Z4SA0_WD-WMC5H0DAU37A and sdc has the id ata-WDC_WD2002FAEX-007BA0_WD-WCAY00770606.
We’ll need this info to make the zpool, which we can do with the zpool create command.
The options I’ve used are -f, which forces the use of the given vdevs (virtual devices; ours will be a mirror) even if they appear to be in use; -o ashift=12, which forces 4096-byte sectors for Advanced Format drives; and -m /home, which sets the mountpoint for the zpool. The rest of the command contains lotus, the name of my new pool; mirror, the type of ZFS pool we wanted to create; and the two disk ids.
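Putting the pieces described above together gives roughly the following. This is a sketch assembled from the options and arguments named in this section; the create_home_mirror wrapper is hypothetical and is only there so that nothing touches your disks until you call it.

```shell
# Create a two-disk mirror pool mounted at /home.
# Run as root. -f skips safety checks, so double-check the disk ids first.
create_home_mirror() {
    pool=$1
    disk_a=$2
    disk_b=$3
    zpool create -f -o ashift=12 -m /home "$pool" mirror "$disk_a" "$disk_b"
}

# Usage with the pool name and disk ids from this post:
# create_home_mirror lotus \
#     ata-WDC_WD2003FZEX-00Z4SA0_WD-WMC5H0DAU37A \
#     ata-WDC_WD2002FAEX-007BA0_WD-WCAY00770606
```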
Making a dataset
Now that we have the zpool we will create a dataset for our home directory.
Datasets allow you to do such things as setting disk usage quotas and maintaining individual snapshots per dataset.
Note that you’ll probably need to chown the new directory to use it.
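A sketch of those two steps, assuming one dataset per user. Because the pool is mounted at /home, a child dataset named lotus/<user> inherits that mountpoint and appears at /home/<user>. The create_home_dataset helper and the user name alice are hypothetical; this post does not say which dataset name was actually used.

```shell
# Create a per-user dataset under the pool and hand it to that user.
# Run as root. Assumes the pool's mountpoint is /home.
create_home_dataset() {
    pool=$1
    user=$2
    zfs create "$pool/$user"
    # With GNU chown, the trailing colon also sets the group to the
    # user's login group.
    chown "$user:" "/home/$user"
}

# Usage:
# create_home_dataset lotus alice
```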
Now we can start using our new zpool!
Finishing Up
This was a lot easier than I pictured years ago when I first thought about setting up a RAID. Granted, we used ZFS, which seems to be a lot nicer in many respects.
To wrap up we’ll do some quick tweaks. The first is auto-mounting at boot. The zfs daemon is capable of loading a zpool and mounting it at boot. In order to do this we need to set up a cache file, then enable the zfs daemon.
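Roughly, that means pointing the pool at a cache file and enabling the systemd units that import and mount pools at boot. The unit names below are an assumption based on current OpenZFS packaging (older packages used different unit names, so check what your package ships), /etc/zfs/zpool.cache is the conventional cache file location, and enable_zfs_at_boot is a hypothetical wrapper.

```shell
# Record the pool in the cache file and enable import/mount at boot.
# Run as root. Assumption: unit names follow current OpenZFS packaging.
enable_zfs_at_boot() {
    pool=$1
    zpool set cachefile=/etc/zfs/zpool.cache "$pool"
    systemctl enable zfs-import-cache.service zfs-import.target \
        zfs-mount.service zfs.target
}

# Usage:
# enable_zfs_at_boot lotus
```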
Next, we’re going to enable relatime, which cuts down on superfluous access-time writes to the zpool, and turn on lz4 compression to compress our data.
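Both settings are ordinary dataset properties set with zfs set; applied to the pool’s root dataset, they are inherited by the datasets beneath it. A sketch, where tune_pool is a hypothetical wrapper and lotus is the pool name from this post:

```shell
# Enable relatime and lz4 compression on a pool's root dataset.
# Run as root. Compression only applies to data written after it is enabled.
tune_pool() {
    pool=$1
    zfs set relatime=on "$pool"
    zfs set compression=lz4 "$pool"
}

# Usage:
# tune_pool lotus
# zfs get relatime,compression lotus   # confirm the new values
```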
And that’s it, we’ve set up a ZFS mirror as our home directory! You can check out your new zpool with a simple zpool status.
(Note: on a fresh pool you won’t see the “scan” line; it appears once you scrub your zpool, something I’ll describe how to automate in a later post.)
Then all we need to do is rsync our data to our home directory from our backups and enjoy the new disk redundancy.
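A sketch of the restore step, assuming the backup is an ordinary directory tree. restore_home and both paths in the usage comment are hypothetical. The -a flag preserves permissions, timestamps and (when run as root) ownership; the trailing slash on the source makes rsync copy the directory’s contents rather than the directory itself.

```shell
# Copy a backed-up home directory into the new dataset.
# Trailing slashes are normalized so the contents of src land in dest.
restore_home() {
    src=$1
    dest=$2
    rsync -avh "${src%/}/" "${dest%/}/"
}

# Usage (hypothetical paths):
# restore_home /mnt/backup/home/alice /home/alice
```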
Further Adjustments
I actually set up this pool about a month ago. There are many more things you can and should do with your zpool, such as having automated snapshots and scrubbing your zpool regularly. Setting all these things up at once takes a while the first time through, and I was planning on including it all in this post, but this already lengthy post was becoming way too long. So expect to see more ZFS-related posts in the near future.