I’m planning on setting up a NAS/home server (primarily storage, with some Jellyfin, Nextcloud and such mixed in), and since it is primarily for data storage I’d like to follow the 3-2-1 backup rule: 3 copies on 2 media with 1 offsite. Well, actually I’m aiming more for a 2-1, with 2 copies and one offsite, but that’s beside the point. Now I’m wondering how to do the offsite backup properly.

My main goal would be to have an automatic system that does full system backups at a reasonable rate (I assume daily would be a bit much considering it’s gonna be a few TB worth of HDDs which aren’t exactly fast, but maybe weekly?) and then have 2-3 of those backups offsite at once as a sort of version control, if possible.
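
As a rough sanity check on the daily-versus-weekly question, the time for one full upload is just data size divided by uplink speed. A minimal sketch of that arithmetic follows; the 4 TB size and the 40 Mbit/s uplink are made-up example numbers, not measurements from any real setup.

```python
def full_upload_hours(size_tb: float, uplink_mbit_s: float) -> float:
    """Hours needed to push size_tb (decimal terabytes) through the uplink."""
    size_bits = size_tb * 1e12 * 8                # terabytes -> bits
    seconds = size_bits / (uplink_mbit_s * 1e6)   # Mbit/s -> bit/s
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical numbers: 4 TB of data, 40 Mbit/s upload speed.
    hours = full_upload_hours(4, 40)
    print(f"full upload: {hours:.0f} h (~{hours / 24:.1f} days)")
```

With those example numbers a single full upload takes a bit over nine days, so full weekly uploads would not even finish in time. That is the main argument for tools that upload everything once and only send changes afterwards.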

This has two components, the local upload system and the offsite storage provider. First the local system:

What is good software to encrypt the data before/while it’s uploaded?

While I’d preferably upload the data to a provider I trust, accidents happen, and since they don’t need to access the data, I’d prefer them not being able to, maliciously or not, so what is a good way to encrypt the data before it leaves my system?

What is a good way to upload the data?

After it has been encrypted, it needs to be sent. Is there any good software that can upload backups automatically on regular intervals? Maybe something that also handles the encryption part on the way?

Then there’s the offsite storage provider. Personally I’d appreciate as many suggestions as possible, as there is of course no one-size-fits-all, so if you’ve had good experiences with any, please do send their names. I’m basically just looking for network-attached drives. I send my data to them, I leave it there and trust it stays there, and in case more drives in my system fail than RAID-Z can handle (so 2), I’d like to be able to get the data back from there after I’ve replaced my drives. That’s all I really need from them.

For reference, this is gonna be my first NAS/Server/Anything of this sort. I realize it’s mostly a regular computer and am familiar enough with Linux, so I can handle that basic stuff, but for the things you wouldn’t do with a normal computer I am quite unfamiliar, so if any questions here seem dumb, I apologize. Thank you in advance for any information!

  • Bassman1805@lemmy.world
    English
    arrow-up
    6
    ·
    1 day ago

    The easiest offsite backup would be any cloud platform. Downside is that you aren’t gonna own your own data like if you deployed your own system.

    Next option is an external SSD that you leave at your work desk and take home once a week or so to update.

    The most robust solution would be to find a friend or relative willing to let you set up a server in their house. Might need to cover part of their electric bill if your machine is hungry.

  • AustralianSimon@lemmy.world
    English
    arrow-up
    2
    ·
    edit-2
    1 day ago

    I have two large (8-bay) Synology NASes. They back up certain data between each other, replicate internally, and push to Backblaze. $6/mo.

  • merthyr1831@lemmy.ml
    English
    arrow-up
    9
    ·
    2 days ago

    Rsync to a Hetzner Storage Box. I don’t do ALL my data, just the Nextcloud data. The rest is… Linux ISOs… so I can redownload at my convenience.

  • toe@lemmy.world
    English
    arrow-up
    1
    ·
    1 day ago

    LTO-8 tapes in a box elsewhere.

    The price per terabyte became viable when a drive was on sale for half off at a local retailer.

    Works well and it was a fun learning experience.

  • pHr34kY@lemmy.world
    English
    arrow-up
    14
    ·
    edit-2
    2 days ago

    I have a job, and the office is 35km away. I get a locker in my office.

    I have two backup drives, and every month or so, I will rotate them by taking one into the office and bringing the other home. I do this immediately after running a backup.

    The drives are LUKS encrypted btrfs. Btrfs allows snapshots and compression. LUKS enables me to securely password protect the drive. My backup job is just a btrfs snapshot followed by an rsync command.
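
    The snapshot-then-rsync job described above could look roughly like this in Python. This is only a sketch, not the commenter's actual script: the function names and every path are hypothetical, and it assumes the LUKS volume is already unlocked and mounted and that the snapshot directory sits on the same btrfs filesystem as the subvolume.

```python
import subprocess
from datetime import date


def backup_commands(subvolume: str, snapshot_dir: str, target: str, today: date) -> list[list[str]]:
    """Build the two commands: a read-only snapshot, then an rsync of that snapshot."""
    snapshot = f"{snapshot_dir}/{today.isoformat()}"
    return [
        # -r makes the snapshot read-only, so rsync copies a frozen view of the data.
        ["btrfs", "subvolume", "snapshot", "-r", subvolume, snapshot],
        # -a preserves permissions and times; --delete mirrors removals to the target.
        ["rsync", "-a", "--delete", f"{snapshot}/", f"{target}/"],
    ]


def run_backup(subvolume: str, snapshot_dir: str, target: str) -> None:
    """Run both steps in order and stop at the first failure."""
    for cmd in backup_commands(subvolume, snapshot_dir, target, date.today()):
        subprocess.run(cmd, check=True)
```

    A call such as `run_backup("/home", "/home/.snapshots", "/mnt/backup/home")` (made-up paths) would need root privileges for the snapshot step.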

    I don’t trust cloud backups. There was an event at work where Google Cloud accidentally deleted an entire company just as I was about to start a project there.

  • randombullet@programming.dev
    English
    arrow-up
    1
    ·
    1 day ago

    My friend has 1G/1G Internet. I have an rsync cron job backing up there 2 times a week.

    It has an 8TB NVMe drive that I use for bulk data backup and a 2TB OS drive for VM stuff.

  • TrumpetX@programming.dev
    English
    arrow-up
    2
    ·
    1 day ago

    Look into Storj and Tardigrade. It’s a crypto thing, but don’t get scared. You back up to S3-compatible endpoints and it’s super cheap (and you can pay in USD with a credit card).

  • thecoffeehobbit@sopuli.xyz
    English
    arrow-up
    2
    ·
    2 days ago

    I have an external storage unit a couple of kilometers away and two 8TB hard drives with LUKS+btrfs. One of them is always in the box, and after taking backups, when I feel like it, I detach the drive and bike to the box to switch. I’m currently researching btrbk for updating the backup drive on my PC automatically; it’s pretty manual atm. For most scenarios the automatic btrfs snapshots on my main disks are going to be enough anyway.

      • huquad@lemmy.ml
        English
        arrow-up
        8
        ·
        3 days ago

        Agreed. I have it configured on a delay and with multiple file versions. I also have another pi running rsnapshot (rsync tool).

      • Appoxo@lemmy.dbzer0.com
        English
        arrow-up
        2
        ·
        2 days ago

        In theory you could set up a cron job with Docker Compose that fires up a container, syncs, and shuts it down again once all endpoint jobs are synced.
        As it seemingly has an API it should be possible.
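
        A rough sketch of that idea, under a few assumptions: that "it" is Syncthing (the parent comment is not shown here), that its REST endpoint `/rest/db/completion` with an `X-API-Key` header is used to check progress, and that the container is managed with `docker compose`. The function names, compose directory, folder ID and device ID are all hypothetical.

```python
import json
import subprocess
import time
import urllib.parse
import urllib.request


def completion_percent(base_url: str, api_key: str, folder: str, device: str) -> float:
    """Ask Syncthing how complete `device` is for `folder` (0 to 100)."""
    query = urllib.parse.urlencode({"folder": folder, "device": device})
    request = urllib.request.Request(
        f"{base_url}/rest/db/completion?{query}",
        headers={"X-API-Key": api_key},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return float(json.load(response)["completion"])


def wait_until_synced(get_percent, poll_seconds: float = 30, max_polls: int = 240) -> bool:
    """Poll get_percent() until it reports 100; give up after max_polls tries."""
    for _ in range(max_polls):
        if get_percent() >= 100:
            return True
        time.sleep(poll_seconds)
    return False


def sync_once(compose_dir: str, get_percent) -> bool:
    """Start the container, wait for the sync to finish, then stop it again."""
    subprocess.run(["docker", "compose", "up", "-d"], cwd=compose_dir, check=True)
    try:
        return wait_until_synced(get_percent)
    finally:
        subprocess.run(["docker", "compose", "stop"], cwd=compose_dir, check=True)
```

        Cron would then call something like `sync_once("/opt/syncthing", lambda: completion_percent("http://127.0.0.1:8384", API_KEY, FOLDER_ID, DEVICE_ID))`. A real job would probably also want to wait for the initial scan and for the remote device to connect before trusting a 100% reading.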

      • thejml@lemm.ee
        English
        arrow-up
        2
        ·
        2 days ago

        Have it sync the backup files from the -2- part. You can then copy them out of the syncthing folder to a local one with a cron to rotate them. That way you get the sync offsite and you can keep them out of the rotation as long as you want.
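
        The copy-out-and-rotate step could be sketched like this. It is an illustration only: the function name, the directory layout and the default of three kept copies are assumptions, and hidden files are skipped on the assumption that the sync tool keeps its markers and in-progress files under dot-names.

```python
import shutil
from pathlib import Path


def archive_and_rotate(sync_dir: Path, archive_dir: Path, keep: int = 3) -> list[Path]:
    """Copy new backup files out of the synced folder, then keep only the newest `keep`."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    for src in sync_dir.iterdir():
        if src.name.startswith(".") or not src.is_file():
            continue  # skip hidden marker/temporary entries and directories
        dest = archive_dir / src.name
        if not dest.exists():
            shutil.copy2(src, dest)  # copy2 keeps the modification time

    # Newest first, judged by modification time, with the name as tie-break.
    archived = sorted(
        (p for p in archive_dir.iterdir() if p.is_file()),
        key=lambda p: (p.stat().st_mtime, p.name),
        reverse=True,
    )
    for old in archived[keep:]:
        old.unlink()
    return archived[:keep]
```

        Because the archive lives outside the synced folder, a deletion or corruption that gets synced does not touch the copies that were already rotated out.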

      • huquad@lemmy.ml
        English
        arrow-up
        8
        ·
        edit-2
        2 days ago

        My most critical data is only ~2-3TB, including backups of all my documents and family photos, so I have a 4TB SSD attached which the Pi also boots from. I have ~40TB of other Linux ISOs that have 2-drive redundancy, but no backups. If I lose those, I can always redownload.

    • dave@lemmy.wtf
      English
      arrow-up
      1
      ·
      2 days ago

      Using a mesh VPN like Tailscale or NetBird would be another option as well. It would allow you to use proper backup software like restic or whatever, and with Tailscale on both devices, restic would be able to find the Pi even if the other person moved to a new house. (A Pi with Ethernet would be preferable, though, so all they have to do is plug it into their new network and everything would be good. If it was a Pi Zero, someone would have to update the Wi-Fi password.)

      • huquad@lemmy.ml
        English
        arrow-up
        2
        ·
        2 days ago

        Funny you mention it. This is exactly what I do. Don’t use the relay servers for syncthing, just my tailnet for device to device networking.

    • Matriks404@lemmy.world
      English
      arrow-up
      1
      ·
      edit-2
      2 days ago

      Huh, that’s a pretty good idea. I already have a Raspberry Pi set up at home, and it wouldn’t be hard to duplicate it in another location.

  • Ulrich@feddit.org
    English
    arrow-up
    2
    ·
    2 days ago

    I assume daily would be a bit much considering it’s gonna be a few TB worth of HDDs which aren’t exactly fast

    What is the concern here?

  • glizzyguzzler@lemmy.blahaj.zone
    English
    arrow-up
    2
    ·
    edit-2
    1 day ago

    I got my parents to get a NAS box and stuck it in their basement. They need to back up their stuff anyway. I put in two 18 TB drives (mirrored BTRFS raid1) from Server Part Deals (peeps have said that site has jacked up their prices, so look for alts). They only need like 4 TB at most. I made a backup Samba share for myself. It’s the cheapest Synology box possible, and I used their software to make a Samba share with a quota.

    I then set up a WireGuard connection on an RPi, taped that to the NAS, and I connect over WireGuard to their local network with a batch script. I mount the Samba share and then use restic to back up my data. It works great. Restic is encrypted, I don’t have to pay for storage monthly, their electricity is cheap af, they have backups, I keep tabs on it, everyone wins.

    Next step is to go the opposite way for them, but no rush on that goal, I don’t think their basement would get totaled in a fire and I don’t think their house (other than the basement) would get totaled in a flood.

    If you don’t have a friend or relative to do a box-at-their-house (peeps might be enticed with reciprocal backups), restic still fits the bill. Destination is encrypted, has simple commands to check data for validity.
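
    For illustration, a backup run with restic could be wrapped like this. It is a hedged sketch, not the setup described above: the helper names, repository path and data paths are hypothetical, and keeping three weekly snapshots is just an example retention policy. The restic pieces used are `backup`, `forget --keep-weekly ... --prune`, `check`, and the `RESTIC_PASSWORD_FILE` environment variable.

```python
import os
import subprocess


def restic_commands(repo: str, paths: list[str], keep_weekly: int = 3) -> list[list[str]]:
    """Build a backup, forget+prune, and check sequence for one restic repository."""
    base = ["restic", "-r", repo]
    return [
        # Encrypted, deduplicated snapshot of the given paths.
        base + ["backup", *paths],
        # Drop snapshots outside the retention policy and free their space.
        base + ["forget", "--keep-weekly", str(keep_weekly), "--prune"],
        # Verify the repository structure.
        base + ["check"],
    ]


def run_restic(repo: str, paths: list[str], password_file: str) -> None:
    """Run the sequence, pointing restic at a file that holds the repo password."""
    env = dict(os.environ, RESTIC_PASSWORD_FILE=password_file)
    for cmd in restic_commands(repo, paths):
        subprocess.run(cmd, env=env, check=True)
```

    The repository has to be created once with `restic init` first. Plain `check` verifies the repository structure; adding `--read-data` also re-reads the stored data, which is slower over a remote link.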

    Rclone crypt is not good enough. Too many issues (path length limits, password “obscured” but otherwise there, file structure preserved even if names are encrypted). On a VPS I use rclone as a pass-through for restic to back up a small amount of data to a Google Drive. Works great. Just don’t fuck with rclone crypt for major stuff.

    Lastly I do use rclone crypt to upload a copy of the restic binary to the destination, as the crypt means the binary can’t be fucked with and the binary there means that is all you need to recover the data (in addition to the restic password you stored safely!).

  • rutrum@programming.dev
    English
    arrow-up
    31
    ·
    3 days ago

    I use borg backup. It, and another tool called restic, are meant for creating encrypted backups. Further, it can create backups regularly and only backup differences. This means you could take a daily backup without making new copies of your entire library. They also allow you to, as part of compressing and encrypting, make a backup to a remote machine over ssh. I think you should start with either of those.

    One provider that’s built for being a cloud backup is BorgBase. It can be a location you back up a borg (or restic, I think) repository to. There are others that are made to be easily accessed with these backup tools.

    Lastly, I’ll mention that borg handles making a backup, but doesn’t handle the scheduling. Borgmatic is another tool that, given a yml configuration file, will perform the borgbackup commands on a schedule with the defined arguments. You could also use something like systemd/cron to run a schedule.
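
    To make the "borg does the backup, something else does the scheduling" split concrete, here is a small sketch of a script that cron or a systemd timer could call. It assumes borg 1.x command syntax; the helper names, repository location and paths are hypothetical, and keeping three weekly archives is only an example.

```python
import subprocess


def borg_commands(repo: str, paths: list[str], keep_weekly: int = 3) -> list[list[str]]:
    """Build `borg create` and `borg prune` calls (borg 1.x command syntax)."""
    return [
        # {hostname} and {now} are placeholders that borg itself expands.
        ["borg", "create", "--stats", f"{repo}::{{hostname}}-{{now}}", *paths],
        # Keep one archive per week for the last keep_weekly weeks.
        ["borg", "prune", "--keep-weekly", str(keep_weekly), repo],
    ]


def run_borg(repo: str, paths: list[str]) -> None:
    """Run create, then prune; stop at the first failure."""
    for cmd in borg_commands(repo, paths):
        subprocess.run(cmd, check=True)
```

    Since borg deduplicates, each `create` after the first only stores changed chunks, which is what makes a daily schedule realistic. For unattended runs the passphrase would have to come from somewhere like the `BORG_PASSPHRASE` environment variable.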

    Personally, I use borgbackup configured in NixOS (which makes the systemd units for making daily backups) and I back up to a different computer in my house and to borgbase. I have 3 copies, 1 cloud and 2 in my home.

  • Matt The Horwood@lemmy.horwood.cloud
    English
    arrow-up
    24
    ·
    3 days ago

    There are some really good options in this thread. Just remember that whatever you pick, unless you test your backups, they are as good as not existing.

    • dave@hal9000@lemmy.world
      English
      arrow-up
      6
      ·
      2 days ago

      Is there some good automated way of doing that? What would it look like, something that compares hashes?

      • Matt The Horwood@lemmy.horwood.cloud
        English
        arrow-up
        6
        ·
        2 days ago

        That very much depends on your backup of choice, that’s also the point. How do you recover your backup?

        Start with a manual recovery: pull down a backup, unpack it, and check that important files open. Write down all the steps you did, then work out how to automate them.
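
        On the hash question: once a backup has been restored to a scratch directory, comparing it against the source can be automated along these lines. A minimal sketch with hypothetical function names; it assumes the source has not changed since the backup was taken (for example because it is a snapshot).

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_hashes(root: Path) -> dict[str, str]:
    """Map each file's path relative to root onto its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_restore(source: Path, restored: Path) -> dict[str, list[str]]:
    """List files missing from the restore and files whose content differs."""
    src, dst = tree_hashes(source), tree_hashes(restored)
    return {
        "missing": sorted(src.keys() - dst.keys()),
        "changed": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
    }
```

        Matching hashes only show that the restore equals the source. They do not replace actually opening a few important files or starting a service from the restored data.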

      • sugar_in_your_tea@sh.itjust.works
        English
        arrow-up
        3
        ·
        edit-2
        2 days ago

        I don’t trust automation for restoring from backup, so I keep the restoration process extremely simple:

        1. automate recreating services - have my podman files in a repository
        2. manually download and extract data to a standard location
        3. restart everything and verify that each service works properly

        Do that once/year in a VM or something and you should be good. If things are simple enough, it shouldn’t take long (well under an hour).

    • Showroom7561@lemmy.ca
      English
      arrow-up
      2
      ·
      2 days ago

      How does one realistically test their backups, if they are doing the 3-2-1 backup plan?

      I validate (or whatever the term used is) my backups, once a month, and trust that it means something 😰

      • Matt The Horwood@lemmy.horwood.cloud
        English
        arrow-up
        3
        ·
        2 days ago

        Until you test a backup it’s not complete; how you test it is up to you.

        If you upload to a remote location, pull it down and unpack it. Check that you can open important files. If you can’t open them, then the backup is not worth the disk space.

      • Appoxo@lemmy.dbzer0.com
        English
        arrow-up
        3
        ·
        2 days ago

        Deploy the backup (or some part of it) to a test system. If it can boot or you can get the files back, they work.

        • Showroom7561@lemmy.ca
          English
          arrow-up
          1
          ·
          2 days ago

          For context, I have a single Synology NAS, so recovering and testing the entire backup set would not be practical in my case.

          I have been able to test single files or entire folders and they work fine, but obviously I’d have no way of testing the entire backup set due to the above consideration. It is my understanding that the verify feature that Synology uses is to ensure that there’s no bit rot and that the file integrity is intact. My hope is that because of how many isolated backups I do keep, the chance of not being able to recover is slim to none.