Watch out, things break, stuff catches fire. Let’s talk about backups.
In my last post, I said I was going to shift focus away from NixOS commentary. That is still the plan. For now, though, I remain committed to NixOS thanks to the technical debt I have built up - migrations aren’t free. Until then, enjoy my NixOS posting :).
Last fall, I wanted to reformat my laptop’s NixOS deployment from BTRFS (encased within LVM2, itself encased in LUKS) to a ZFS partition plus a separate swap partition. My Nix install consists of a few artifacts:
- My git repository with the flake.nix and flake.lock files
- The workstation’s /secrets folder, sensitive data for service accounts
- The workstation’s /home folder
Both /secrets and /home are backed up via borgmatic (using borg) on a
nightly basis via a crufty old nixos module that I wrote (example of usage).
Both folders were also snapshotted by BTRBK every 15 minutes (via this nixos
configuration). This frequent snapshotting policy will continue on the ZFS
reinstall powered by zfs-autosnapshot.
The first test was to verify the integrity of the backed up artifacts. I was
able to execute a full restore from backup from within a virtual machine. This
included adapting my laptop’s flake configuration to the VM, rebuilding, then
executing the borg extract commands.
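One way to check a restored tree against the original is to compare checksums of every file. The following is a minimal sketch, assuming GNU coreutils (sha256sum) is available. The function names are hypothetical, and this is not necessarily how the test restore above was verified. Files that changed since the backup ran will naturally show up as differences.

```shell
#!/bin/sh
# Sketch: compare two directory trees by file checksum.
# Function names are hypothetical; assumes GNU coreutils (sha256sum).

# usage: tree_checksums DIR
# Prints one "checksum  ./relative/path" line per file, sorted by path.
tree_checksums() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2 )
}

# usage: compare_trees ORIGINAL RESTORED
# Prints the differing lines and returns non-zero if the trees differ.
compare_trees() {
    left=$(mktemp) && right=$(mktemp) || return 2
    tree_checksums "$1" > "$left"
    tree_checksums "$2" > "$right"
    if diff "$left" "$right"; then status=0; else status=1; fi
    rm -f "$left" "$right"
    return "$status"
}
```

For example, compare_trees /home /mnt/restore/home would list any file whose content differs or that exists on only one side (both paths are placeholders).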
Fun fact: using borg mount plus rsync is several times slower than running borg extract (using BorgBase). Keep that in mind when executing restores - if you need a full restore or a restore of a subdirectory, consider borg extract. If you need to pick and choose many files, consider borg mount.
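The two restore styles can be sketched roughly as follows. Treat this as an illustration under assumptions: the function names are made up, and the repository, archive name, and paths are arguments you supply, not values from my setup.

```shell
#!/bin/sh
# Sketch of the two restore styles. Function names are hypothetical;
# repository, archive, and paths are supplied by the caller.

# Full or subdirectory restore. borg extract writes into the current
# directory, so change into the target first.
# usage: restore_extract REPO ARCHIVE TARGET_DIR [PATH...]
restore_extract() {
    repo=$1; archive=$2; target=$3
    shift 3
    mkdir -p "$target" || return 1
    ( cd "$target" && borg extract "$repo::$archive" "$@" )
}

# Pick-and-choose restore: mount the archive so files can be browsed
# and copied individually.
# usage: restore_browse REPO ARCHIVE MOUNTPOINT
restore_browse() {
    repo=$1; archive=$2; mountpoint=$3
    mkdir -p "$mountpoint" || return 1
    borg mount "$repo::$archive" "$mountpoint" || return 1
    echo "Copy what you need from $mountpoint, then run: borg umount $mountpoint"
}
```

Paths given to borg extract are the paths as stored in the archive, which normally have no leading slash (for example home/alice rather than /home/alice).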
After the successful test restore, it was time to execute a final backup. On
my setup that’s as simple as systemctl start backup. Then I booted a NixOS
installer. Invoking parted /dev/nvme0n1, I came up with the following
partition layout:
Model: INTEL SSDPEKNU010TZ (nvme)
Disk /dev/nvme0n1: 1024GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
1 1049kB 1000MB 999MB fat32 boot boot, esp
2 1000MB 1007GB 1006GB zfs
3 1007GB 1023GB 16.2GB swap
The swap partition is used in conjunction with NixOS’s
swapDevices.*.randomEncryption.enable setting. With it, the partition is
encrypted via cryptsetup with a throwaway random key on every boot (plain
dm-crypt rather than a LUKS header), and the resulting device mapper device
is used for swap.
Then I followed my standard install instructions here on this blog.
§Recovery strategy
As part of this restore procedure, I tested my recovery strategy. It turned out that the thumb drive and the QR code holding its full disk encryption (FDE) key were useless as a pair. They simply didn’t work - the QR code was of a different key.
Had I needed to recover my setup in a real data loss scenario, I would likely have lost data because I had no access to my recovery material. Oops!
Next I created new recovery material. It consists of two components: a passphrase and an encrypted thumb drive. They live together; the passphrase is more of an “are you sure you want to open this?” than a security measure. The encrypted thumb drive contains my PGP private keys (encrypted with each key’s own passphrase) and my password database, encrypted against my “private use only” key.
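A mismatch like the one with the old QR code can be caught early by checking that the passphrase really unlocks the drive without mapping or mounting anything. A minimal sketch, assuming the drive is a LUKS container. The function name is hypothetical and the device path is whatever your drive shows up as.

```shell
#!/bin/sh
# Sketch: verify that a recovery passphrase unlocks a LUKS device.
# The function name is hypothetical. --test-passphrase only checks the
# passphrase; it does not create a device mapper entry.

# usage: check_recovery_key DEVICE
check_recovery_key() {
    device=$1
    if cryptsetup open --test-passphrase --type luks "$device"; then
        echo "OK: passphrase unlocks $device"
    else
        echo "FAIL: passphrase does not unlock $device" >&2
        return 1
    fi
}
```

cryptsetup prompts for the passphrase interactively, so typing it in from the stored copy tests the recovery material exactly as it would be used in an emergency.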
To restore from this media, I first open the LUKS container via
cryptsetup luksOpen /dev/disk/by-id/usb-...-part0 usbCrypt and then mount it
via mount /dev/mapper/usbCrypt /mnt/usb. Next I import the GPG keys into my
gpg keyring and run
gpg --decrypt --output - /mnt/usb/password-store/backups/stargate/passphrase
to get the borg storage passphrase for the backup. With that in hand, I can
set up borg to access my backup using the details from my backup service’s
dashboard.
Finally, I was able to run borg extract ... to kick off the restore on the
laptop.
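The manual steps above could be wrapped into a few helper functions. This is only a sketch: the function names are made up, the device path is an argument you supply, and the defaults usbCrypt and /mnt/usb simply mirror the commands above.

```shell
#!/bin/sh
# Sketch of the recovery steps as helper functions (hypothetical names).
# Run as root; the device path is supplied by the caller.

# usage: unlock_recovery_drive DEVICE [NAME] [MOUNTPOINT]
unlock_recovery_drive() {
    device=$1; name=${2:-usbCrypt}; mountpoint=${3:-/mnt/usb}
    cryptsetup luksOpen "$device" "$name" || return 1
    mkdir -p "$mountpoint" || return 1
    mount "/dev/mapper/$name" "$mountpoint"
}

# usage: show_borg_passphrase ENCRYPTED_FILE
# Decrypts the stored passphrase to stdout; needs the private key in
# the gpg keyring.
show_borg_passphrase() {
    gpg --decrypt --output - "$1"
}

# usage: lock_recovery_drive [NAME] [MOUNTPOINT]
# Unmounts the drive and closes the LUKS mapping again.
lock_recovery_drive() {
    name=${1:-usbCrypt}; mountpoint=${2:-/mnt/usb}
    umount "$mountpoint" && cryptsetup luksClose "$name"
}
```

The decrypted passphrase can be handed to borg through the BORG_PASSPHRASE environment variable, so it never has to be typed or written to disk during the restore.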
From start to finish, the redeploy and restore took 4 hours: 3 hours for the data restore and another hour for the manual steps this procedure involves. It’s not super automatic, but hey, it’s tested and it works!
§Well I guess the restore worked!
Test your backups. Until you do, they are but a speculative investment; you’re not sure if they work. In theory they should, but nobody really knows until a restore has succeeded. Go test your backups. Haven’t done it yet? Well, then, buy this sledge hammer and apply it to your storage devices, because your data is worthless without tested backups - it could disappear at any time. Theft, fire, PEBKAC, a software bug (like Steam’s infamous rm -rf script bug)… anything is possible.
§Want a T-shirt?
I’m selling T-shirts with that sledge hammer fellow on the back and “COMPUTERS WERE A MISTAKE” on the front. Of course it’s supposed to be cheeky and not too serious - we must laugh at technology before it destroys our human identity. And embrace the good parts. Computers are fun, if you let ’em. Have fun with ’em, but don’t let ’em control every aspect of your being.