I’m planning on setting up a NAS/home server (primarily storage with some Jellyfin and Nextcloud and such mixed in) and since it is primarily for data storage I’d like to follow the data preservation rules of 3-2-1 backups. 3 copies on 2 mediums with 1 offsite - well actually I’m more trying to go for a 2-1 with 2 copies and one offsite, but that’s beside the point. Now I’m wondering how to do the offsite backup properly.
My main goal would be to have an automatic system that does full system backups at a reasonable rate (I assume daily would be a bit much considering it’s gonna be a few TB worth of HDDs which aren’t exactly fast, but maybe weekly?) and then have 2-3 of those backups offsite at once as a sort of version control, if possible.
This has two components, the local upload system and the offsite storage provider. First the local system:
What is good software to encrypt the data before/while it’s uploaded?
While I’d preferably upload the data to a provider I trust, accidents happen, and since they don’t need to access the data, I’d prefer them not being able to, maliciously or not, so what is a good way to encrypt the data before it leaves my system?
What is a good way to upload the data?
After it has been encrypted, it needs to be sent. Is there any good software that can upload backups automatically on regular intervals? Maybe something that also handles the encryption part on the way?
Then there’s the offsite storage provider. Personally I’d appreciate as many suggestions as possible, as there is of course no one size fits all, so if you’ve got good experiences with any, please do send their names. I’m basically just looking for network attached drives. I send my data to them, I leave it there and trust it stays there, and in case too many drives in my system fail for RAID-Z to handle, so 2, I’d like to be able to get the data off there after I’ve replaced my drives. That’s all I really need from them.
For reference, this is gonna be my first NAS/Server/Anything of this sort. I realize it’s mostly a regular computer and am familiar enough with Linux, so I can handle that basic stuff, but for the things you wouldn’t do with a normal computer I am quite unfamiliar, so if any questions here seem dumb, I apologize. Thank you in advance for any information!
- wireguard
- rsync
- zfs
Rsync to a Hetzner storage box. I don’t do ALL my data, just the Nextcloud data. The rest is… Linux ISOs… so I can redownload at my convenience.
It’s not all my data but I use Backblaze for offsite backup. It’s one of the reasons I can’t drop Windows. I don’t have anywhere I travel often enough to do a physical drop-off, and when I tried setting a file server up at my parents’ place, they would break shit by fucking with their router every time they had an internet outage, or by moving it around (despite repeatedly being told to call me first).
I can’t stand the fact that they don’t support Linux
Object storage is something I prefer over their app anyway. Restic (or rustic) does the backup client side, with B2 or any other storage just holding the data. This way you also have no vendor lock-in.
Same - can sync snapshots from TrueNAS to Backblaze.
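A rough sketch of what the restic-to-B2 approach above might look like when scripted. The bucket name, paths and the keep-3-weekly policy are made-up placeholders; restic encrypts on the client, and for B2 it reads credentials from the B2_ACCOUNT_ID and B2_ACCOUNT_KEY environment variables. Treat it as a starting point, not a tested setup.

```python
import os
import subprocess


def restic_commands(repo, paths, keep_weekly=3):
    """Build the restic calls for one backup run followed by pruning."""
    backup = ["restic", "-r", repo, "backup", *paths]
    # Keep only the newest weekly snapshots and delete unreferenced data.
    forget = ["restic", "-r", repo, "forget",
              "--keep-weekly", str(keep_weekly), "--prune"]
    return [backup, forget]


def run_backup(repo, paths, password_file, dry_run=True):
    """Print the commands (dry run) or execute them with the password file set."""
    env = dict(os.environ, RESTIC_PASSWORD_FILE=password_file)
    for cmd in restic_commands(repo, paths):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, env=env, check=True)


if __name__ == "__main__":
    # Hypothetical repository and path, shown as a dry run only.
    run_backup("b2:my-bucket:nas", ["/srv/nextcloud"], "/root/.restic-pass")
```

Run it from a weekly cron job or systemd timer with dry_run=False once the repository has been initialised with restic init.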
If you want to get real fancy you could stash an N40L cube server at your mom’s house where she will never find it and VPN back to your local network and replicate snapshots to it
I have a storage VPS and use Borg backup with Borgmatic. In my case, I have multiple systems in different repos on the remote. There are several providers, such as Hetzner, BorgBase, and rsync.net, that offer Borg storage, in the event you don’t want to manage the server yourself.
I have a job, and the office is 35km away. I get a locker in my office.
I have two backup drives, and every month or so, I will rotate them by taking one into the office and bringing the other home. I do this immediately after running a backup.
The drives are LUKS encrypted btrfs. Btrfs allows snapshots and compression. LUKS enables me to securely password protect the drive. My backup job is just a btrfs snapshot followed by an rsync command.
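For anyone wondering what "a btrfs snapshot followed by an rsync command" could look like as a script, here is a minimal sketch. The subvolume, snapshot directory and mount point are hypothetical, and it assumes the LUKS drive is already unlocked and mounted.

```python
import subprocess
from datetime import date


def backup_commands(subvolume, snapshot_dir, target, today=None):
    """Build the two commands: a read-only snapshot, then rsync from it."""
    today = today or date.today()
    snapshot = f"{snapshot_dir}/{today.isoformat()}"
    return [
        # -r makes the snapshot read-only, so rsync copies a frozen view.
        ["btrfs", "subvolume", "snapshot", "-r", subvolume, snapshot],
        # Trailing slashes: copy the snapshot's contents into the target.
        ["rsync", "-a", "--delete", f"{snapshot}/", f"{target}/"],
    ]


def run_backup(subvolume, snapshot_dir, target):
    """Execute both steps, stopping at the first failure."""
    for cmd in backup_commands(subvolume, snapshot_dir, target):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical paths; printed rather than executed.
    for cmd in backup_commands("/data", "/data/.snapshots", "/mnt/backup"):
        print(" ".join(cmd))
```

Copying from the snapshot rather than the live subvolume means files cannot change halfway through the rsync run.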
I don’t trust cloud backups. There was an event at work where Google Cloud accidentally deleted an entire company just as I was about to start a project there.
I just use restic. I’m pretty sure it uses checksums to verify data on the backup target, so it doesn’t need to copy all of the data there.
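The idea behind "doesn't need to copy all of the data" is that chunks are stored under their hash, so anything already on the target is skipped. A toy sketch of that idea follows; it is not restic's actual implementation (real tools use content-defined chunking and much larger chunks), and the ChunkStore class is made up for illustration.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose so the example is easy to follow


class ChunkStore:
    """Toy backup target that stores each distinct chunk exactly once."""

    def __init__(self):
        self.chunks = {}   # sha256 hex digest -> chunk bytes
        self.uploaded = 0  # bytes actually sent to the target

    def put(self, data):
        """Store data and return the list of chunk ids that describe it."""
        ids = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:  # only new chunks are uploaded
                self.chunks[digest] = chunk
                self.uploaded += len(chunk)
            ids.append(digest)
        return ids

    def get(self, ids):
        """Reassemble the original data from its chunk ids."""
        return b"".join(self.chunks[d] for d in ids)


if __name__ == "__main__":
    store = ChunkStore()
    store.put(b"aaaabbbbcccc")           # first backup uploads everything
    ids = store.put(b"aaaabbbbdddd")     # second backup uploads one new chunk
    print(store.uploaded, store.get(ids))
```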
I use Linux, so encryption is easy with LUKS, and FreeFileSync to drives that rotate to a safety deposit box at the bank for a catastrophic event, such as a house fire. Usually anything from the last few months is still on my mobile devices.
I don’t 🙃
I tend to just store all my backups off-site in multiple geographically distant locations, seems to work well
So, you’re geocaching a bunch of USB drives then?
I bring 1 of my backup disks to my in-laws. I go there regularly so it’s a matter of swapping them when I’m there.
I’m just skipping that. How am I going to back up 48TB to an off-site backup?!
You ought to only be 3-2-1ing your irreplaceable/essential files like personal photos, videos, and documents. Unless you’re a huge photography guy, I can’t believe that takes up 48TB.
Only back up the essentials like photos and documents or rare media.
Don’t care about stuff like Avengers 4K that can easily be reacquired.
I don’t have Avengers 4K. It’s all just Linux ISOs.
This is what I’m currently doing, I use backblaze b2 for basically everything that’s not movies/shows/music/roms, along with backing up my docker stacks etc to the same external drive my media’s currently on.
I’m looking at a few good steps to upgrade this but nothing excessive:
- NAS for media and storing local backups
- Regular backups of everything but media to a small USB drive
- Get a big ass external HDD that I’ll update once a month with everything and keep in my storage unit and ditch backblaze
Not the cleanest setup but it’ll do the job. The media backup is definitely gonna be more of a 2-1-Pray system LMAO but at least the important things will be regularly taken care of
a “poor man’s” backup can be useful for things like this, movie/tv/music collections, and will only be a few MB instead of TB.
if things go south at least you can rebuild your collection in time. obviously if there’s some rare files that were hard to get then you can back up those ones, but even at that it will probably still be a small backup
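The comment doesn't spell out what the "poor man's" backup is. One common reading, and it is only an assumption here, is a plain list of file names and sizes, which is enough to know what to reacquire. A minimal sketch with made-up paths:

```python
import os


def write_manifest(root, manifest_path):
    """Write one 'size<TAB>relative path' line per file; return the file count."""
    count = 0
    with open(manifest_path, "w", encoding="utf-8") as out:
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()  # walk directories in a stable order
            for name in sorted(filenames):
                full = os.path.join(dirpath, name)
                rel = os.path.relpath(full, root)
                out.write(f"{os.path.getsize(full)}\t{rel}\n")
                count += 1
    return count


if __name__ == "__main__":
    # Hypothetical locations; keep the manifest outside the tree it lists.
    n = write_manifest("/srv/media", "/srv/backups/media-manifest.tsv")
    print(f"listed {n} files")
```

The manifest itself is small enough to go along with the essential files in whatever offsite backup you already have.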
Yup. My “essential” backups are well under 1TB, the rest can be reacquired.
Get a tiny ITX box with a couple 20TB refurbished HDDs, stick it at a friend’s house
In theory. But I already spent my pension for those 64TB drives (raidz2) xD. Getting off-site backup for all of that feels like such a waste of money (until you regret it). I know it isn’t a backup, but I’m praying the Raidz2 will be enough protection.
The cost of storage is always more than double the sticker price. The hidden fee is that you need a second and maybe a third one and a system to put it all in. Most of our operational lab cost is backups. I can’t replace the data if it’s lost.
Just a friendly reminder that RAID is not a backup…
Just consider if something accidentally overwrites some / all your files. This is a perfectly legit action and the checksums will happily match that new data, but your file(s) are gone…
I do weekly ZFS snapshots though and I’m very diligent on my SMART tests and scrubs. I also have a UPS and a lot of power surge protection. And ECC RAM. It’s as safe as it gets. But having a backup would definitely be better, you’re right. I just can’t afford it for this much storage.
That’s a great use-case for snapshots! RAID still isn’t a backup, but it can be quite robust.
Do you have to back up everything off site?
Maybe there are just a few critical files you need a disaster recovery plan for, and the rest is just covered by your raidz
Understanding the risks is half the battle, but we can only do what we can do.
Syncthing to a pi at my parents’ place.
using a mesh VPN like tailscale or netbird would be another option as well. it would allow you to use proper backup software like restic or whatever, and with tailscale on both devices, restic would be able to find the pi even if the other person moved to a new house. (although a pi with ethernet would be preferable so all they have to do is plug it in to their new network and everything would be good. if it was a pi zero then someone would have to update the wifi password)
A pi with multiple terabytes of storage?
My most critical data is only ~2-3TB, including backups of all my documents and family photos, so I have a 4TB SSD attached which the pi also boots from. I have ~40TB of other Linux ISOs that have 2-drive redundancy, but no backups. If I lose those, I can always redownload.
Huh, that’s a pretty good idea. I already have a Raspberry Pi setup at home, and it wouldn’t be hard to duplicate in another location.
Low power server in a friend’s basement running syncthing
But doesn’t that sync in real-time? Making it not a true backup?
You could use scheduled snapshots to provide the backup portion.
In theory you could set up a cron job with Docker Compose to fire up a container, sync, and shut down once all endpoint jobs are synced.
As it seemingly has an API, it should be possible.
Have it sync the backup files from the -2- part. You can then copy them out of the Syncthing folder to a local one, with a cron job to rotate them. That way you get the offsite sync and you can keep them out of the rotation as long as you want.
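The copy-out-and-rotate step could be sketched roughly like this, to be run from cron on the offsite box. The folder names are placeholders, and it assumes the backup files have names that sort chronologically (for example backup-2024-01-07.tar).

```python
import shutil
from pathlib import Path


def archive_and_rotate(sync_dir, archive_dir, keep=3):
    """Copy new files out of the synced folder, then keep only the newest `keep`."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    sync_dir, archive_dir = Path(sync_dir), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)

    for src in sync_dir.iterdir():
        dest = archive_dir / src.name
        # Skip directories and hidden files such as Syncthing's own markers.
        if src.is_file() and not src.name.startswith(".") and not dest.exists():
            shutil.copy2(src, dest)

    archived = sorted(p for p in archive_dir.iterdir() if p.is_file())
    for old in archived[:-keep]:  # everything except the newest `keep`
        old.unlink()
    return [p.name for p in archived[-keep:]]


if __name__ == "__main__":
    # Hypothetical folders on the offsite machine.
    print(archive_and_rotate("/srv/syncthing/backups", "/srv/archive", keep=3))
```

Because the archive folder sits outside Syncthing, a deletion or corruption that gets synced from home does not touch the copies already archived.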
Agreed. I have it configured on a delay and with multiple file versions. I also have another pi running rsnapshot (rsync tool).
How’d you do that?
For the delay, I just reduce how often it checks for new files instead of instantaneously.
Edit the share, enable file versioning, choose which flavor.
Hetzner Storagebox
Just recently moved from an S3 cloud provider to a storagebox. Prices are ok and sub accounts help clean things up.
There’s some really good options in this thread, just remember that whatever you pick, unless you test your backups, they are as good as nonexistent.
How does one realistically test their backups, if they are doing the 3-2-1 backup plan?
I validate (or whatever the term used is) my backups, once a month, and trust that it means something 😰
Until you test a backup it’s not complete, how you test it is up to you.
If you upload to a remote location, pull it down and unpack it. Check that you can open important files, if you can’t open them then the backup is not worth the dick space
Heh
Deploy the backup (or some part of it) to a test system. If it can boot or you can get the files back, they work.
For context, I have a single Synology NAS, so recovering and testing the entire backup set would not be practical in my case.
I have been able to test single files or entire folders and they work fine, but obviously I’d have no way of testing the entire backup set due to the above consideration. It is my understanding that the verify feature that Synology uses is to ensure that there’s no bit rot and that the file integrity is intact. My hope is that because of how many isolated backups I do keep, the chance of not being able to recover is slim to none.
Is there some good automated way of doing that? What would it look like, something that compares hashes?
I don’t trust automation for restoring from backup, so I keep the restoration process extremely simple:
- automate recreating services - have my podman files in a repository
- manually download and extract data to a standard location
- restart everything and verify that each service works properly
Do that once/year in a VM or something and you should be good. If things are simple enough, it shouldn’t take long (well under an hour).
That very much depends on your backup of choice, that’s also the point. How do you recover your backup?
Start with a manual recovery: pull down a backup and unpack it, and check that important files open. Write down all the steps you took, then work out how to automate them.
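On the earlier question of what a hash-based check might look like: one possible sketch is to restore into a scratch directory and compare SHA-256 hashes file by file against the source. The paths are placeholders, and the comparison is only meaningful for data that has not changed since the backup was taken.

```python
import hashlib
from pathlib import Path


def tree_hashes(root):
    """Map each file's relative path to its SHA-256 hex digest."""
    root = Path(root)
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            h = hashlib.sha256()
            with path.open("rb") as f:
                # Read in 1 MiB blocks so large files don't fill memory.
                for block in iter(lambda: f.read(1 << 20), b""):
                    h.update(block)
            result[path.relative_to(root).as_posix()] = h.hexdigest()
    return result


def compare_restore(original, restored):
    """Report files that are missing, unexpected, or different after a restore."""
    a, b = tree_hashes(original), tree_hashes(restored)
    return {
        "missing": sorted(a.keys() - b.keys()),
        "extra": sorted(b.keys() - a.keys()),
        "changed": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }


if __name__ == "__main__":
    # Hypothetical source and scratch restore directories.
    print(compare_restore("/srv/nextcloud/data", "/tmp/restore-test"))
```

Matching hashes show the bytes came back intact; they do not replace actually opening a few files or starting the restored service.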
Veeam Backup&Replication with a NFR license for me.
My personal setup:
First backup: Just a back up to a virtual drive stored on my NAS
Offsite backup: Essentially an export of what is available and then creates a full or incremental backup to an external USB drive.
I have two of those. One I keep at home in case my NAS explodes. The second is at my work place.
The off-site only contains my most important pieces of data.
As for frequency: As often as I remember to make one, as it requires manual interaction.
Our clients have (depending on their size) the following setups:
2 or more endpoints (excluding exceptions):
Veeam BR Server
First backup to NAS
Second backup (copy of the first) to USB drives (minimum of 3: drive 1 connected, drive 2 stored somewhere in the business, drive 3 at home/off-site; daily rotation)
Optionally an S3-compatible cloud backup.
Bigger customers may have mirroring, but we see those cases very rarely.
Edit: The backups can be encrypted at all steps (first backup or backup copies)
Edit 2: Veeam B/R is not (F)OSS but very reasonable for the free community edition. Has support for Windows, Mac and Linux (some distros, only x64/x86). The NFR license can be acquired relatively easily (from here), and they didn’t check me in any way.
I like the software as it’s very powerful and versatile. Both geared towards Fortune>500 and small shops/deployments.
And the next version will see a full Linux version both as a single install and a virtual appliance.
They also have a setup for hardened repositories.