I know that for data storage the best bet is a NAS and RAID1 or something in that vein, but what about all the docker containers you are running, carefully configured services on your rpi, installed *arr services on your PC, etc.?

Do you have a simple way to automate backups and re-installs of these as well or are you just resigned to having to eventually reconfigure them all when the SD card fails, your OS needs a reinstall or the disk dies?

  • rentar42@kbin.social
    10 months ago

    There’s lots of very good approaches in the comments.

    But I’d like to play the devil’s advocate: how many of you have actually recovered from a disaster that way? Ideally as a test, of course.

    A backup system that has never performed a restore must be assumed to be broken. Similar logic should be applied to disaster recovery.

    And no: I use a combined Ansible/Docker approach that I’m reasonably sure could quite easily recover most stuff, but I’ve not yet fully rebuilt from just that.
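
A restore test like the one asked for above can be sketched in a few lines. This is only an illustration under assumptions: the backup is a tar archive whose members are stored relative to the source directory, and the helper names `file_digests` and `restore_matches_source` are made up for this example.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def file_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def restore_matches_source(source: Path, archive: Path) -> bool:
    """Restore the archive into a scratch directory and compare it to source."""
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive) as tar:
            tar.extractall(scratch)
        return file_digests(Path(scratch)) == file_digests(source)
```

Run against live data this will report a mismatch as soon as a file changes after the backup, so it is most useful right after a backup run or against a snapshot.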

    • Human Crayon@sh.itjust.works
      10 months ago

      I have (more often than I’d like to admit) recovered entirely from backups.

      I run proxmox, everything else in a VM. All VMs get backed up to three different places once a week, backups are tested monthly on a rando proxmox box to make sure they still work. I do like the backup system built into it, serves my needs well.

      Proxmox could die and it wouldn’t make much of a difference. I reinstall proxmox, restore the VMs and I’m good to go again.

    • Dandroid@dandroid.app
      10 months ago

      I restored from a backup when I swapped to a bigger SSD. Worked perfectly first try. I use rsnapshot for backups.

  • CameronDev@programming.dev
    10 months ago

    I rsync my root and everything under it to a NAS, which will hopefully save my data. I wrote some scripts by hand to do that.

    I think the next best thing to do is to document your setup as much as possible. Whether it’s typed-up notes or ansible/packer/whatever, any documentation is better than nothing if you have to rebuild.

    • darvocet@infosec.pub
      10 months ago

      I run history and then clean it up so I have a guide to follow on the next setup. It’s not even so much for drive failure but to move to the newer OS versions when available.

      The ‘data’ is backed up by scripts that tar folders up and scp them off to another server.
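
A tar-then-scp script of the kind described above might look roughly like this. It is a sketch, not the commenter's actual script: the function names, the archive naming scheme and the remote target are all assumptions.

```python
import tarfile
from datetime import datetime
from pathlib import Path


def make_archive(folder: Path, dest_dir: Path, now: datetime) -> Path:
    """Write <folder name>-<timestamp>.tar.gz into dest_dir and return its path."""
    archive = dest_dir / f"{folder.name}-{now:%Y%m%d-%H%M%S}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(folder, arcname=folder.name)
    return archive


def scp_command(archive: Path, remote: str) -> list[str]:
    """Build (but do not run) the scp invocation that ships the archive."""
    return ["scp", str(archive), remote]
```

The command list can be handed to `subprocess.run(..., check=True)` with a target such as `backup@otherhost:/srv/backups/` (a hypothetical host and path).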

    • foggy@lemmy.world
      10 months ago

      I have a 16 TB USB HDD that syncs to my NAS whenever my workstation is idle for 20 minutes.

  • friend_of_satan@lemmy.world
    10 months ago

    I’ve had a complete drive failure twice within the last year (really old hardware) and my ansible + docker + backup made it really easy to recover from. I got new hardware and was back up and running within a few hours.

    All of your service setup should be automated (through docker-compose or ansible or whatever) and all your configuration data should be backed up. This should make it easy to migrate services from one machine to another, and also to recover from a disaster.

  • dr_robot@kbin.social
    10 months ago

    My configuration and deployment is managed entirely via an Ansible playbook repository. In case of absolute disaster, I just have to redeploy the playbook. I do run all my stuff on top of mirrored drives so a single failure isn’t disastrous if I replace the drive quickly enough.

    For when that’s not enough, the data itself is backed up hourly (via ZFS snapshots) to a spare pair of drives and nightly to S3 buckets in the cloud (via restic). Everything automated with systemd timers and some scripts. The configuration for these backups is part of the playbooks of course. I test the backups every 6 months by trying to reproduce all the services in a test VM. This has identified issues with my restoration procedure (mostly due to potential UID mismatches).

    And yes, I have once been forced to reinstall from scratch and I managed to do that rather quickly through a combination of playbooks and well tested backups.

  • Eskuero@lemmy.fromshado.ws
    10 months ago

    My docker containers are all configured via docker compose, so I just tar the .yml files and the outside data volumes and back that up to an external drive.

    For configs living in /etc you can also back up all of them, but I guess it’s harder to remember what you modified and where, so this is why you document your setup step by step.

    Something nice and easy I use for personal documentation is mdBook.

    • Kaldo@kbin.socialOP
      10 months ago

      Ahh, so the best docker practice is to always just use outside data volumes and back those up separately; seems kinda obvious in retrospect. What about mounting them directly to the NAS (or even running docker from the NAS)? For local networks the performance is probably good enough? That way I wouldn’t have to schedule regular syncs and transfers between “local” device storage and the NAS. Dunno if it would have a negative effect on drive longevity compared to just running a daily backup.

      • Adam@doomscroll.n8e.dev
        10 months ago

        If you’ve got a good network path, NFS mounts work great. Don’t forget to also back up your compose files. Then bringing a machine back up is just a case of running them.

  • tetris11@lemmy.ml
    10 months ago

    Radical suggestion:

    • Once a year you buy a hard drive that can handle all of your data.
    • rsync everything to it
    • unplug it, put it back in cold storage

    • atzanteol@sh.itjust.works
      10 months ago

      Once a… year? There’s a lot that can change in a year. Cloud storage can be pretty cheap these days. Back up to something like Backblaze, S3 or Glacier nightly instead.

    • Haystack@lemmy.world
      10 months ago

      For real, saves so much space that would be used for VM backups.

      Aside from that, I have anything important backed up to my NAS, and Duplicati backs up from there to Backblaze B2.

  • namelivia@lemmy.world
    10 months ago

    I have all my configuration as Ansible and Terraform code, so everything can be destroyed and recreated with no effort.

    When it comes to the data, I wrote a bash script to copy, compress, encrypt and upload it. Not sure if this is the best way, but it is how I’m dealing with it right now.

    • rentar42@kbin.social
      10 months ago

      I’ve got a similar setup, but use Kopia for backup which does all that you describe but also handles deduplication of data very well.

      For example I’ve added older less structured backups to my “good” backup now and since there is a lot of duplication between a 4 year old backup and a 5 year old backup it barely increased the storage space usage.
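
The reason two largely identical backups take so little extra space can be shown with a toy content-addressed store. This is not Kopia's actual implementation (real tools typically use far more sophisticated chunking plus compression and encryption); the `ChunkStore` class and its tiny fixed chunk size are assumptions made for the example.

```python
import hashlib


class ChunkStore:
    """Content-addressed store: identical chunks are kept only once."""

    def __init__(self, chunk_size: int = 4) -> None:
        self.chunk_size = chunk_size
        self.chunks: dict[str, bytes] = {}

    def add(self, data: bytes) -> list[str]:
        """Store data and return the list of chunk hashes that rebuild it."""
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # no-op if already stored
            recipe.append(digest)
        return recipe

    def restore(self, recipe: list[str]) -> bytes:
        """Rebuild the original bytes from a recipe of chunk hashes."""
        return b"".join(self.chunks[digest] for digest in recipe)

    def stored_bytes(self) -> int:
        """Total size of the unique chunks actually kept."""
        return sum(len(chunk) for chunk in self.chunks.values())
```

Adding a second backup that shares most chunks with the first only grows the store by the chunks that differ, while both backups remain fully restorable.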

  • atzanteol@sh.itjust.works
    10 months ago

    1. Most systems are provisioned in proxmox with terraform.
    2. Configuration and setup is handled via ansible playbooks after the server is available.
       2.a) Do NOT make changes on the server without updating your ansible scripts, except during troubleshooting.
       2.b) Once troubleshooting is done, delete and re-create the VM from scratch using only scripts to ensure it works.
    3. VM storage is considered to be ephemeral. All long-term data/config that can’t be re-created with ansible is either stored on an NFS server with a RAID5 drive configuration or backed up to that same file server using rsnapshot.
    4. The NFS server is backed up nightly to Backblaze using duplicacy.
    5. Any other non-VM systems like personal laptops and the like are backed up nightly to the file server using rsnapshot. Those snapshots are then backed up to Backblaze using duplicacy.

  • CarbonatedPastaSauce@lemmy.world
    10 months ago

    I actually run everything in VMs and have two hypervisors that sync everything to each other constantly, so I have hot failover capability. They also back up their live VMs to each other every day or week depending on the criticality of the VM. That way I also have some protection against OS issues or a wonky update.

    Probably overkill for a self hosted setup but I’d rather spend money than time fixing shit because I’m lazy.

    • surewhynotlem@lemmy.world
      10 months ago

      HA is not a backup. It may protect from a drive failure but it completely ignores data corruption issues.

      I learned this the hard way when my cryptomator decided to corrupt some of my files, and I noticed but didn’t have backups.

  • adONis@lemmy.world
    10 months ago

    Most of the docker services use mounted folders/files, which I usually store in the user’s home folder under /home/username/Docker/servicename.

    Now, my personal habit of choice is to have user folders on a separate drive and mount them into /home/username. Additionally, one can also mount /var/lib/docker this way. I also spin up all of these services with portainer. The benefit is, if the system breaks, I don’t care that much, since everything is on a separate drive. If I need to set everything up again, I just spin up portainer again, which does the rest.

    However, this is not a backup, which should be done separately in one way or the other. But it’s for sure safer than putting all the trust into one drive/sdcard etc.

  • ikidd@lemmy.world
    10 months ago

    I run everything on a 2-node proxmox cluster with ZFS mirror volumes and replication of the VMs and CTs between them, run PBS with hourly snapshots, and sync that to multiple USB drives I swap off site.

    The docker VM can be ZFS snapshotted before major updates so I can rollback.

    • twei@feddit.de
      10 months ago

      You should get another node, otherwise when node1 fails node2 will reboot itself and then do nothing because it has no quorum.
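
The quorum problem comes down to simple arithmetic. The sketch below assumes the usual majority rule with one vote per cluster member; the function names are made up for illustration and are not a Proxmox or corosync API.

```python
def votes_needed(total_votes: int) -> int:
    """Strict majority of the configured votes."""
    return total_votes // 2 + 1


def has_quorum(total_votes: int, votes_online: int) -> bool:
    """True if the members still online hold a strict majority."""
    return votes_online >= votes_needed(total_votes)
```

With two votes configured, a single surviving node holds 1 vote but needs 2, so it is not quorate. With a third vote in the cluster, the survivor plus that third vote hold 2 of 3 and stay quorate.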

        • twei@feddit.de
          10 months ago

          I know, but every time I had to do that it felt like a jank solution. If you have a raspberry pi or something like that you can also set it up as a qdevice.

          …and if you’re completely fine with how it is you can also just leave it like it is

          • ikidd@lemmy.world
            10 months ago

            So I started to write a reply that said basically that I was OK doing that manually, but thought that “hell, I have a PBS box on the network that would do that fine”. So it took about 3 minutes to install the corosync-qdevice packages on all three and enable it. Good to go.

            Thanks for the kick in the ass.

          • ikidd@lemmy.world
            10 months ago

            So since I now had a “quorate” cluster again, I thought I’d try out HA. I’d always been under the impression that unless you had a shared storage LUN, you couldn’t HA anything. But I thought I’d trigger a replication and then down the 2nd node just as a test. And lo and behold, the first node brought up my OPNsense VM from the replicated image about 2 minutes after the second node lost contact, and the internet started working again.

            I’m really excited about having that feature working now. This was a good night, thank you.

            • twei@feddit.de
              10 months ago

              If you need another thing to do, you could try to make your OPNsense HA and never have your internet stop working while rebooting a node. It’s pretty simple to set up, you might finish it in 1-2 evenings. Happy clustering!

              • ikidd@lemmy.world
                10 months ago

                I’ll look into that. I did see the option in OPNsense once upon a time but never investigated it.

  • lemmyvore@feddit.nl
    10 months ago

    • Install Debian stable with the ssh server included.
    • Keep a list of the packages that were installed after (there aren’t many but still).
    • All docker containers have their compose files and persistent user data on a RAID1 array.
    • Have a backup running that rsyncs /etc, /home/user and /mnt/array1/docker once a day to daily/ on another RAID1 array; once a week daily/ is rsynced to weekly/, and once a month a timestamped tarball of weekly/ goes to monthly/. Once a month I also bring out an HDD from the drawer and do a backup of monthly/ with Borg.
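
The daily/weekly/monthly cascade in the last bullet can be driven by a small scheduling helper. This is a sketch under assumptions the comment does not state: the weekly step is taken to run on Sundays and the monthly step on the 1st, and `tiers_due` is a hypothetical name.

```python
from datetime import date

# What each tier does, as described in the list above.
TIER_ACTIONS = {
    "daily": "rsync sources to daily/",
    "weekly": "rsync daily/ to weekly/",
    "monthly": "write a timestamped tarball of weekly/ to monthly/",
}


def tiers_due(today: date) -> list[str]:
    """Return the backup tiers that should run on the given date."""
    tiers = ["daily"]
    if today.weekday() == 6:  # Sunday (assumed day for the weekly step)
        tiers.append("weekly")
    if today.day == 1:  # first of the month (assumed day for the monthly step)
        tiers.append("monthly")
    return tiers
```

A daily cron job or systemd timer could call `tiers_due(date.today())` and run the matching actions in order, so each tier copies from the one before it.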

    For recovery:

    • Reinstall Debian + extra packages.
    • Restore the docker compose and persistent files.
    • Run docker compose on containers.

    Note that some data may need additional handling; for example, databases should be dumped, not rsynced.
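
To illustrate the dump-instead-of-copy point, here is a minimal sketch that assumes the database is SQLite (other engines have their own dump tools, such as pg_dump or mysqldump). The helper name `dump_sqlite` is made up for this example.

```python
import sqlite3
from pathlib import Path


def dump_sqlite(db_path: Path, dump_path: Path) -> None:
    """Write a SQL dump of db_path that is safe to rsync afterwards."""
    conn = sqlite3.connect(db_path)
    try:
        with dump_path.open("w", encoding="utf-8") as out:
            # iterdump() yields the SQL statements that recreate the database.
            for statement in conn.iterdump():
                out.write(statement + "\n")
    finally:
        conn.close()
```

The resulting text file can be restored with `executescript()` on a fresh connection, and unlike a raw copy of a live database file it cannot be caught halfway through a write.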