# Backup Plan — Decision Document

> **Status**: Planning — awaiting decisions on open questions before implementation
> **Last updated**: 2026-03-13
> **Related**: [backup-strategies.md](backup-strategies.md) (aspirational doc, mostly not yet deployed)

---

## Current State (Honest)

| What | Status |
|---|---|
| Synology Hyper Backup (Atlantis → Calypso) | ✅ Running, configured in DSM GUI |
| Synology Hyper Backup (Atlantis → Setillo) | ✅ Running, configured in DSM GUI |
| Syncthing docker config sync (Atlantis/Calypso/Setillo) | ✅ Running |
| Synology snapshots for media volumes | ✅ Adequate — decided, no change needed |
| Scheduled database backups | ❌ Not deployed (Firefly sidecar is the only exception) |
| Docker volume backups for non-Synology hosts | ❌ Not deployed |
| Cloud (Backblaze B2) | ❌ Account exists, nothing uploading yet |
| Unified backup monitoring / alerting | ❌ Not deployed |

The migration scripts (`backup-matrix.sh`, `backup-mastodon.sh`, `backup.sh`) are one-off artifacts — not scheduled, not monitored.

---

## Recommended Tool: Borgmatic

Borgmatic wraps BorgBackup (deduplicated, encrypted, compressed backups) in a single YAML config file per host that handles scheduling, database hooks, and alerting.
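For context on what "one YAML file per host, one Docker container" means in practice, a deployment would look roughly like this hypothetical compose fragment. The image name, mount paths, and env-var handling are assumptions to verify against the docker-borgmatic project's documentation before use:

```yaml
# Hypothetical borgmatic stack — image name and paths are assumptions.
services:
  borgmatic:
    image: ghcr.io/borgmatic-collective/borgmatic:latest
    volumes:
      - /volume1/docker:/mnt/docker:ro     # source data, mounted read-only
      - ./config:/etc/borgmatic.d          # borgmatic YAML config(s)
      - ./state:/root/.cache/borg          # Borg chunk cache, keep persistent
      - ./ssh:/root/.ssh:ro                # key for reaching the remote repo
    environment:
      - BORG_PASSPHRASE=${BORG_PASSPHRASE} # injected at deploy time, not in git
    restart: unless-stopped
```

The read-only source mount and the persistent Borg cache directory are the two details most worth keeping regardless of which image ends up being used.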
| Concern | How Borgmatic addresses it |
|---|---|
| Deduplication | BorgBackup — only changed chunks are stored; daily full runs are cheap |
| Encryption | AES-256 at rest, passphrase-protected repo |
| Database backups | Native `postgresql_databases` and `mysql_databases` hooks — calls pg_dump/mysqldump before each run and streams the output into the Borg repo |
| Scheduling | Host cron/systemd timer, or the official borgmatic container image, which runs on a cron expression |
| Alerting | Native ntfy / healthchecks.io / email hooks — fires on failure |
| Restoration | `borgmatic extract` or direct `borg extract` — well-documented |
| Complexity | Low — one YAML file per host, one Docker container |

### Why not the alternatives

| Tool | Reason not chosen |
|---|---|
| Restic | No built-in DB hooks, no built-in scheduler — needs cron + wrapper scripts |
| Kopia | Newer, less battle-tested at this scale; no native DB hooks |
| Duplicati | Long history of stability bugs; no DB hooks; GUI-only config |
| rclone | Sync tool, not a backup tool — no dedup, no versioning, no DB hooks |
| Raw rsync | No dedup, no encryption, no DB hooks, fragile for large trees |

Restic is the closest alternative and would be an acceptable fallback if Borgmatic hits issues, but Borgmatic's native DB hooks are the deciding factor.
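For concreteness, the DB hooks from the table above are declared directly in the borgmatic YAML. A sketch, using service names from this plan as placeholders; note that `sqlite_databases` is a hook that only exists in recent borgmatic releases, so verify it is available before relying on it instead of pre-hook tars:

```yaml
# Sketch of borgmatic DB hook shapes — names and paths are placeholders.
postgresql_databases:
  - name: immich
    hostname: immich-db
    username: immich
    format: custom        # pg_dump -Fc; supports selective restore
mysql_databases:
  - name: seafile
    hostname: seafile-db
    username: seafile
sqlite_databases:          # only in recent borgmatic releases — verify
  - name: vaultwarden
    path: /mnt/docker/vaultwarden/db.sqlite3
```

Each dump is streamed into the Borg archive alongside the file backup, so database and config state land in the same dated archive.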
---

## Proposed Architecture

### What to back up per host

**Atlantis** (primary NAS, highest value — do first)
- `/volume2/metadata/docker2/` — all container config/data dirs (~194GB used)
- Databases via hooks:
  - `immich-db` (PostgreSQL) — photo metadata
  - `vaultwarden` (SQLite) — passwords, via pre-hook tar
  - `sonarr`, `radarr`, `prowlarr`, `bazarr`, `lidarr` (SQLite) — via pre-hook
  - `tdarr` (SQLite + JSON) — transcode config
- `/volume1/data/media/` — **covered by Synology snapshots, excluded from Borg**

**Calypso** (secondary NAS)
- `/volume1/docker/` — all container config/data dirs
- Databases via hooks:
  - `paperless-db` (PostgreSQL)
  - `authentik-db` (PostgreSQL)
  - `immich-db` (PostgreSQL, Calypso instance)
  - `seafile-db` (MySQL)
  - `gitea-db` (PostgreSQL) — see open question #5 below

**homelab-vm** (this machine, `100.67.40.126`)
- Docker named volumes — scrutiny, ntfy, syncthing, archivebox, openhands, hoarder, monitoring stack
- Mostly config-weight data, no large databases

**NUC (concord)**
- Docker named volumes — homeassistant, adguard, syncthing, invidious

**Pi-5**
- Docker named volumes — uptime-kuma (SQLite), glances, diun

**Setillo (Seattle VM)** — lower priority, open question (see below)

---

## Options — Borg Repo Destination

All hosts need a repo to write to.
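Whichever destination option is chosen, remote hosts reach the repo host over SSH, and the standard Borg pattern is a per-client key restricted to `borg serve` against that client's own repo path. A sketch (the key material, comment, and paths are assumptions):

```
# ~/.ssh/authorized_keys on the repo host (e.g. Atlantis), one entry per client.
# "restrict" disables forwarding/pty; --restrict-to-path confines this key to
# the client's own repo, so a compromised client cannot touch other repos.
command="borg serve --restrict-to-path /volume1/backups/borg/calypso",restrict ssh-ed25519 AAAA...key... borg-calypso
```

This keeps the SSH trust relationships narrow regardless of how many writers the repo host ends up serving.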
Three options:

### Option A — Atlantis as central repo host (simplest)

```
Atlantis (local) → /volume1/backups/borg/atlantis/
Calypso          → SSH → Atlantis:/volume1/backups/borg/calypso/
homelab-vm       → SSH → Atlantis:/volume1/backups/borg/homelab-vm/
NUC              → SSH → Atlantis:/volume1/backups/borg/nuc/
Pi-5             → SSH → Atlantis:/volume1/backups/borg/rpi5/
```

Pros:
- Atlantis already gets Hyper Backup → Calypso + rsync → Setillo, so all Borg repos are carried offsite for free with no extra work
- Single place to manage retention policies
- 46TB free on Atlantis — ample room

Cons:
- Atlantis is a single point of failure for all repos

### Option B — Atlantis ↔ Calypso cross-backup (more resilient)

```
Atlantis    → SSH → Calypso:/volume1/backups/borg/atlantis/
Calypso     → SSH → Atlantis:/volume1/backups/borg/calypso/
Other hosts → Atlantis (same as Option A)
```

Pros:
- If Atlantis dies completely, Calypso independently holds Atlantis's backup
- True cross-backup between the two most critical hosts

Cons:
- Two SSH trust relationships to set up and maintain
- Calypso's Borg repo would not live on Atlantis, so it doesn't reach Setillo via the existing Hyper Backup job unless that job is updated to include it

### Option C — Local repo per host, then push to Atlantis

- Each host writes a local repo first, then pushes to Atlantis
- Adds a local copy for fast restores without SSH
- Doubles storage use on each host
- Probably unnecessary given Synology's local snapshot coverage on Atlantis/Calypso

**Recommendation: Option A** if simplicity is the priority; **Option B** if Atlantis and Calypso should be truly independent backup failure domains.

---

## Options — Backblaze B2

B2 account exists. The question is what to push there.
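For any of the options below that do upload, the rclone side is a single remote definition in `rclone.conf`. A sketch, with a hypothetical remote name and placeholder credentials:

```
# ~/.config/rclone/rclone.conf — hypothetical "b2" remote.
# Use a B2 application key scoped to the one backup bucket, not the master key.
[b2]
type = b2
account = <B2_APPLICATION_KEY_ID>
key = <B2_APPLICATION_KEY>
```

Scoping the application key to a single bucket limits the blast radius if the key ever leaks from the uploading host.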
### Option 1 — Borg repos via rclone (recommended)

```
Atlantis (weekly cron): rclone sync /volume1/backups/borg/ b2:homelab-borg/
```

- BorgBackup's chunk-based dedup means only new/changed chunks upload each week
- Estimated size: initial ~50–200GB (configs + DBs only, media excluded), then small incrementals
- rclone runs as a container or cron job on Atlantis after the daily Borg runs complete
- Cost at B2 rates ($0.006/GB/month): ~$1–1.20/month for 200GB

### Option 2 — DB dumps only to B2

- Simpler — just upload the daily pg_dump files
- No dedup — each upload is a full dump
- Less efficient at scale but trivially easy to implement

### Option 3 — Skip B2 for now

- Setillo offsite rsync is sufficient for current risk tolerance
- Add B2 once monitoring is in place and Borgmatic is proven stable

**Recommendation: Option 1** — the dedup makes it cheap, and a full Borg repo in B2 means any host can be restored from the cloud without Setillo being online.

---

## Open Questions

These must be answered before implementation starts.

### 1. Which hosts to cover?

- [ ] Atlantis
- [ ] Calypso
- [ ] homelab-vm
- [ ] NUC
- [ ] Pi-5
- [ ] Setillo (Seattle VM)

### 2. Borg repo destination

- [ ] Option A: Atlantis only (simplest)
- [ ] Option B: Atlantis ↔ Calypso cross-backup (more resilient)
- [ ] Option C: Local first, then push to Atlantis

### 3. B2 scope

- [ ] Option 1: Borg repos via rclone (recommended)
- [ ] Option 2: DB dumps only
- [ ] Option 3: Skip for now

### 4. Secrets management

Borgmatic configs need: the Borg passphrase, an SSH private key (to reach the Atlantis repo), and a B2 app key (if B2 is enabled).

Option A — **Portainer env vars** (consistent with rest of homelab)
- Passphrase injected at deploy time, never in git
- SSH keys stored as host-mounted files, path referenced in config

Option B — **Files on host only**
- Drop secrets to e.g.
`/volume1/docker/borgmatic/secrets/` per host
- Mount read-only into the borgmatic container
- Nothing in git, nothing in Portainer

Option C — **Ansible vault**
- Encrypt secrets in git — fully tracked and reproducible
- More setup overhead

- [ ] Option A: Portainer env vars
- [ ] Option B: Files on host only
- [ ] Option C: Ansible vault

### 5. Gitea chicken-and-egg

CI runs on Gitea. If Borgmatic on Calypso backs up `gitea-db` and Calypso/Gitea goes down, restoring Gitea is a manual procedure outside of CI — which is acceptable. The alternative is to exclude `gitea-db` from Borgmatic and back it up separately (e.g. a simple daily pg_dump cron on Calypso that Hyper Backup then carries).

- [ ] Include gitea-db in Borgmatic (manual restore procedure documented)
- [ ] Exclude from Borgmatic, use separate pg_dump cron

### 6. Alerting ntfy topic

Borgmatic can push failure alerts to the existing ntfy stack on homelab-vm.

- [ ] Confirm ntfy topic name to use (e.g. `homelab-backups` or `homelab`)
- [ ] Confirm ntfy internal URL (e.g. `http://100.67.40.126:`)

---

## Implementation Phases (draft, not yet started)

Once the decisions above are made, implementation follows these phases in order:

**Phase 1 — Atlantis**
1. Create `hosts/synology/atlantis/borgmatic.yaml`
2. Config: backs up `/volume2/metadata/docker2`, DB hooks for all postgres/sqlite containers
3. Repo destination per decision on Q2
4. Alert on failure via ntfy

**Phase 2 — Calypso**
1. Create `hosts/synology/calypso/borgmatic.yaml`
2. Config: backs up `/volume1/docker`, DB hooks for paperless/authentik/immich/seafile/(gitea)
3. Repo: SSH to Atlantis (or cross-backup per Q2)

**Phase 3 — homelab-vm, NUC, Pi-5**
1. Create a borgmatic stack per host
2. Mount `/var/lib/docker/volumes` read-only into the container
3. Repos: SSH to Atlantis
4. Staggered schedule: 02:00 Atlantis / 03:00 Calypso / 04:00 homelab-vm / 04:30 NUC / 05:00 Pi-5

**Phase 4 — B2 cloud egress** (if Option 1 or 2 chosen)
1.
Add rclone container or cron on Atlantis
2. Weekly sync of Borg repos → `b2:homelab-borg/`

**Phase 5 — Monitoring**
1. Borgmatic ntfy hook per host — fires on any failure
2. Uptime Kuma push monitor per host — borgmatic pings after each successful run
3. Alert if no ping is received in 25h

---

## Borgmatic Config Skeleton (reference)

```yaml
# /etc/borgmatic/config.yaml (inside container)
# This is illustrative — actual configs will be generated per host

repositories:
  - path: ssh://borg@100.83.230.112/volume1/backups/borg/calypso
    label: atlantis-remote

source_directories:
  - /mnt/docker   # host /volume1/docker mounted here

exclude_patterns:
  - '*/cache'
  - '*/transcode'
  - '*/thumbs'
  - '*.tmp'
  - '*.log'

postgresql_databases:
  - name: paperless
    hostname: paperless-db
    username: paperless
    password: "REDACTED_PASSWORD"
    format: custom
  - name: authentik
    hostname: authentik-db
    username: authentik
    password: "REDACTED_PASSWORD"
    format: custom

# Retention — flat keys, matching the borgmatic >= 1.8 config style
# used by the rest of this skeleton
keep_daily: 14
keep_weekly: 8
keep_monthly: 6

ntfy:
  topic: homelab-backups
  server: http://100.67.40.126:2586
  states:
    - fail

encryption_passphrase: ${BORG_PASSPHRASE}
```

---

## Related Docs

- [backup-strategies.md](backup-strategies.md) — existing aspirational doc (partially outdated)
- [portainer-backup.md](portainer-backup.md) — Portainer-specific backup notes
- [disaster-recovery.md](../troubleshooting/disaster-recovery.md)