ZFS: Scrubbing with an Asynchronous Cronjob
If you’re following along, we’ve covered setting up a ZFS mirror and upgrading ZFS. Today we’ll discuss scrubbing our zpool.
Scrubbing
ZFS scrubbing is a process that detects and corrects silent data errors in the pool. During a scrub, ZFS reads each block, recalculates its checksum, and compares it to the checksum stored when the block was written. If the values differ, ZFS tries to repair the block using a good copy elsewhere in the pool, such as the other side of a mirror.
We want to avoid silent data corruption, so we should scrub our zpool regularly. This can be done while the system is running and could easily be handled with a weekly cronjob. However, I suspend my desktop every night, so if I used a simple cronjob there would be a chance my computer would be asleep when the scrub is scheduled to run.
Anacron
Enter anacron, which stands for “anac(h)ronistic cron”. Anacron is like cron in that it schedules regularly occurring events; however, it does not assume the computer is on at the scheduled time. When the system is turned back on, anacron checks the last time a task was run and compares it to when it should be run next (sometimes this will be in the past). If it’s been, say, more than a week since our last zfs scrub, it runs the scrub. On Arch Linux anacron comes with cronie.
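The "has it been long enough since the last run?" check can be sketched in a few lines of shell. This is only an illustration of the idea, not anacron's actual code: the function name is_due and its arguments are made up for this example, and it assumes GNU date for the -d option.

```shell
#!/bin/sh
# Hypothetical helper illustrating the idea; this is NOT anacron's real code.
# is_due LAST_RUN TODAY PERIOD_DAYS   (dates given as YYYY-MM-DD)
# Succeeds (exit 0) when at least PERIOD_DAYS days separate the two dates.
is_due() {
    # Parse both dates as UTC midnight so DST shifts cannot skew the day count.
    last=$(date -u -d "$1" +%s) || return 2
    today=$(date -u -d "$2" +%s) || return 2
    elapsed_days=$(( (today - last) / 86400 ))
    [ "$elapsed_days" -ge "$3" ]
}

# Example: last scrub on 1 January, machine woken nine days later.
if is_due 2024-01-01 2024-01-10 7; then
    echo "scrub is overdue, run it now"
else
    echo "nothing to do yet"
fi
```

A plain cronjob would have simply missed its slot while the machine slept; a check like this catches up the next time the machine is awake.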
Installation and Setup
First we need to install cronie, for example with sudo pacman -S cronie on Arch Linux.
Now, to set up our zfs scrub, we need to add a line to /etc/anacrontab.
The comments in your anacrontab describe the columns, but for completeness: the first column is the period in days; we want to run the zpool scrub once a week, so I’ve entered 7. The second is the delay in minutes before the job starts, which we have set to 10. The third is the job-identifier, a unique name given to the job and used when storing its last run time; this could be anything. The last is the command we want to run, zpool scrub lotus. You’ll want to enter the name of your zpool where I’ve entered lotus.
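Assembled from the four columns just described, the entry would look roughly like the sketch below. The period, delay, and command are the values given above; the job identifier zfs-scrub is a placeholder of mine, since any unique name will do.

```
# period(days)  delay(min)  job-identifier  command
# "zfs-scrub" is a hypothetical identifier; "lotus" is the pool name, replace with yours
7               10          zfs-scrub       zpool scrub lotus
```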
Anacron adds a random delay to the start of each job it runs, on top of the base delay entered in the second column. This is meant to prevent all the jobs from starting at once and putting a sudden large load on the system.
Finally, we need to make sure cronie.service is running, which you can check with systemctl status cronie.service. If cronie is running, the output will show the service as active (running).
Scrub Status
To view the status of an ongoing scrub, or to see when you last finished one, you can run sudo zpool status. The output includes a line beginning with scan: that reports either the scrub's progress or the result of the last completed scrub.
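If you only care about that one line, a small filter can pull it out of the full status output. This is a minimal sketch: the function name scan_line is my own, and it assumes the line of interest starts with scan: after optional leading whitespace.

```shell
#!/bin/sh
# Hypothetical helper: print only the scan line(s) from `zpool status`
# output supplied on standard input.
scan_line() {
    grep -E '^[[:space:]]*scan:'
}

# Usage (pool name "lotus" as in this article; substitute your own):
#   sudo zpool status lotus | scan_line
```

The filter reads from standard input rather than calling zpool itself, so it works the same whether you pipe in live output or a saved copy.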
During a scrub there will be a lot of I/O operations occurring, so reading and writing for normal usage will slow down a bit. When I first set this up, I was curious whether I could suspend or shut down the computer during a scrub, as there is a chance I will do so, but I could not find much information on the topic. I have not had an issue with it yet, though I believe I have suspended during a scrub at least once since I set this up.
Now that we have automated scrubbing set up, one of the last automated maintenance tasks to set up is automated snapshots of the zpool. I’ll write about this hopefully next week, though it may get delayed by up to a month, as I have a rather involved, month-long exam for my degree beginning tomorrow.