homelab-optimized/docs/admin/backup-plan.md
Sanitized mirror from private repository - 2026-03-23 10:17:31 UTC
Backup Plan — Decision Document

Status: Planning — awaiting decisions on open questions before implementation
Last updated: 2026-03-13
Related: backup-strategies.md (aspirational doc, mostly not yet deployed)


Current State (Honest)

| What | Status |
| --- | --- |
| Synology Hyper Backup (Atlantis → Calypso) | Running, configured in DSM GUI |
| Synology Hyper Backup (Atlantis → Setillo) | Running, configured in DSM GUI |
| Syncthing docker config sync (Atlantis/Calypso/Setillo) | Running |
| Synology snapshots for media volumes | Adequate — decided, no change needed |
| Scheduled database backups | Not deployed (Firefly sidecar is the only exception) |
| Docker volume backups for non-Synology hosts | Not deployed |
| Cloud (Backblaze B2) | Account exists, nothing uploading yet |
| Unified backup monitoring / alerting | Not deployed |

The migration scripts (backup-matrix.sh, backup-mastodon.sh, backup.sh) are one-off migration artifacts — not scheduled, not monitored.


Proposed Tool — Borgmatic

Borgmatic wraps BorgBackup (deduplicated, encrypted, compressed backups) with a single YAML config file that handles scheduling, database hooks, and alerting.

| Concern | How Borgmatic addresses it |
| --- | --- |
| Deduplication | BorgBackup — only changed chunks stored; daily full runs are cheap |
| Encryption | AES-256 at rest, passphrase-protected repo |
| Database backups | Native postgresql_databases and mysql_databases hooks — calls pg_dump/mysqldump before each run, streams output into the Borg repo |
| Scheduling | Built-in cron expression in config, or run as a container with the borgmatic-cron image |
| Alerting | Native ntfy / healthchecks.io / email hooks — fires on failure |
| Restoration | borgmatic extract or direct borg extract — well-documented |
| Complexity | Low — one YAML file per host, one Docker container |

Why not the alternatives

| Tool | Reason not chosen |
| --- | --- |
| Restic | No built-in DB hooks, no built-in scheduler — needs cron + wrapper scripts |
| Kopia | Newer, less battle-tested at this scale; no native DB hooks |
| Duplicati | Unstable history of bugs; no DB hooks; GUI-only config |
| rclone | Sync tool, not a backup tool — no dedup, no versioning, no DB hooks |
| Raw rsync | No dedup, no encryption, no DB hooks, fragile for large trees |

Restic is the closest alternative and would be acceptable if Borgmatic hits issues, but Borgmatic's native DB hooks are the deciding factor.


Proposed Architecture

What to back up per host

Atlantis (primary NAS, highest value — do first)

  • /volume2/metadata/docker2/ — all container config/data dirs (~194GB used)
  • Databases via hooks:
    • immich-db (PostgreSQL) — photo metadata
    • vaultwarden (SQLite) — passwords, via pre-hook tar
    • sonarr, radarr, prowlarr, bazarr, lidarr (SQLite) — via pre-hook
    • tdarr (SQLite + JSON) — transcode config
  • /volume1/data/media/ — covered by Synology snapshots, excluded from Borg
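
For the SQLite services above, note that borgmatic 1.7.9+ also ships a native sqlite_databases hook, which could replace the pre-hook tar approach. A minimal sketch; the container-side paths are assumptions for illustration, not verified against the actual volume layout:

```yaml
# Hypothetical fragment for Atlantis's borgmatic.yaml — paths are placeholders
sqlite_databases:
  - name: vaultwarden
    path: /mnt/docker/vaultwarden/db.sqlite3
  - name: sonarr
    path: /mnt/docker/sonarr/sonarr.db
```

The hook uses sqlite3's online-backup mechanism, so it avoids the torn-read risk of tarring a live database file.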

Calypso (secondary NAS)

  • /volume1/docker/ — all container config/data dirs
  • Databases via hooks:
    • paperless-db (PostgreSQL)
    • authentik-db (PostgreSQL)
    • immich-db (PostgreSQL, Calypso instance)
    • seafile-db (MySQL)
    • gitea-db (PostgreSQL) — see open question #5 below

homelab-vm (this machine, 100.67.40.126)

  • Docker named volumes — scrutiny, ntfy, syncthing, archivebox, openhands, hoarder, monitoring stack
  • Mostly config-weight data, no large databases

NUC (concord)

  • Docker named volumes — homeassistant, adguard, syncthing, invidious

Pi-5

  • Docker named volumes — uptime-kuma (SQLite), glances, diun

Setillo (Seattle VM) — lower priority, open question (see below)


Options — Borg Repo Destination

All hosts need a repo to write to. Three options:

Option A — Atlantis as central repo host (simplest)

Atlantis (local)  → /volume1/backups/borg/atlantis/
Calypso           → SSH → Atlantis:/volume1/backups/borg/calypso/
homelab-vm        → SSH → Atlantis:/volume1/backups/borg/homelab-vm/
NUC               → SSH → Atlantis:/volume1/backups/borg/nuc/
Pi-5              → SSH → Atlantis:/volume1/backups/borg/rpi5/

Pros:

  • Atlantis already gets Hyper Backup → Calypso + rsync → Setillo, so all Borg repos get carried offsite for free with no extra work
  • Single place to manage retention policies
  • 46TB free on Atlantis — ample room

Cons:

  • Atlantis is a single point of failure for all repos
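
The repo layout above can be bootstrapped on Atlantis with a few lines of shell. A sketch, assuming a dedicated borg SSH user; the BORG_ROOT default below is a throwaway temp dir so the sketch is safe to dry-run, whereas on Atlantis it would be /volume1/backups/borg:

```shell
# One Borg repo directory per host on Atlantis (Option A layout).
BORG_ROOT="${BORG_ROOT:-$(mktemp -d)}"
for host in atlantis calypso homelab-vm nuc rpi5; do
  mkdir -p "${BORG_ROOT}/${host}"
done

# Each client then initializes its own encrypted repo over SSH, e.g.:
#   borg init --encryption=repokey-blake2 \
#     ssh://borg@atlantis/volume1/backups/borg/calypso
```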

Option B — Atlantis ↔ Calypso cross-backup (more resilient)

Atlantis → SSH → Calypso:/volume1/backups/borg/atlantis/
Calypso  → SSH → Atlantis:/volume1/backups/borg/calypso/
Other hosts → Atlantis (same as Option A)

Pros:

  • If Atlantis dies completely, Calypso independently holds Atlantis's backup
  • True cross-backup between the two most critical hosts

Cons:

  • Two SSH trust relationships to set up and maintain
  • Calypso Borg repo would not be on Atlantis, so it doesn't get carried to Setillo via the existing Hyper Backup job unless the job is updated to include it

Option C — Local repo per host, then push to Atlantis

  • Each host writes a local repo first, then pushes to Atlantis
  • Adds a local copy for fast restores without SSH
  • Doubles storage use on each host
  • Probably unnecessary given Synology's local snapshot coverage on Atlantis/Calypso

Recommendation: Option A if simplicity is the priority; Option B if you want Atlantis and Calypso to be truly independent backup failure domains.


Options — Backblaze B2

B2 account exists. The question is what to push there.

Option 1 — Full Borg repos to B2 via rclone (recommended)

Atlantis (weekly cron):
  rclone sync /volume1/backups/borg/ b2:homelab-borg/

  • BorgBackup's chunk-based dedup means only new/changed chunks upload each week
  • Estimated size: initial ~50–200GB (configs + DBs only, media excluded), then small incrementals
  • rclone runs as a container or cron job on Atlantis after the daily Borg runs complete
  • Cost at B2 rates ($0.006/GB/month): ~$1.20/month at 200GB
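
Wired as a system crontab entry, the weekly sync could look like this sketch: Sunday 06:00, after the nightly Borg runs finish, so the repo is quiescent during upload. The flags are standard rclone options; --b2-hard-delete avoids paying for soft-deleted (hidden) chunk files:

```shell
# /etc/crontab fragment on Atlantis (illustrative — schedule and log path
# are assumptions; the bucket name comes from the plan above)
0 6 * * 0  root  rclone sync /volume1/backups/borg/ b2:homelab-borg/ --b2-hard-delete --transfers 8 --log-level INFO --log-file /var/log/rclone-b2.log
```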

Option 2 — DB dumps only to B2

  • Simpler — just upload the daily pg_dump files
  • No dedup — each upload is a full dump
  • Less efficient at scale but trivially easy to implement

Option 3 — Skip B2 for now

  • Setillo offsite rsync is sufficient for current risk tolerance
  • Add B2 once monitoring is in place and Borgmatic is proven stable

Recommendation: Option 1 — the dedup makes it cheap and the full Borg repo in B2 means any host can be restored from cloud without needing Setillo to be online.


Open Questions

These must be answered before implementation starts.

1. Which hosts to cover?

  • Atlantis
  • Calypso
  • homelab-vm
  • NUC
  • Pi-5
  • Setillo (Seattle VM)

2. Borg repo destination

  • Option A: Atlantis only (simplest)
  • Option B: Atlantis ↔ Calypso cross-backup (more resilient)
  • Option C: Local first, then push to Atlantis

3. B2 scope

  • Option 1: Borg repos via rclone (recommended)
  • Option 2: DB dumps only
  • Option 3: Skip for now

4. Secrets management

Borgmatic configs need: Borg passphrase, SSH private key (to reach Atlantis repo), B2 app key (if B2 enabled).

Option A — Portainer env vars (consistent with rest of homelab)

  • Passphrase injected at deploy time, never in git
  • SSH keys stored as host-mounted files, path referenced in config

Option B — Files on host only

  • Drop secrets to e.g. /volume1/docker/borgmatic/secrets/ per host
  • Mount read-only into borgmatic container
  • Nothing in git, nothing in Portainer

Option C — Ansible vault

  • Encrypt secrets in git — fully tracked and reproducible
  • More setup overhead

Decision needed:

  • Option A: Portainer env vars
  • Option B: Files on host only
  • Option C: Ansible vault

5. Gitea chicken-and-egg

CI runs on Gitea. If Borgmatic on Calypso backs up gitea-db and Calypso/Gitea goes down, restoring Gitea is a manual procedure outside of CI — which is acceptable. The alternative is to exclude gitea-db from Borgmatic and back it up separately (e.g. a simple daily pg_dump cron on Calypso that Hyper Backup then carries).

  • Include gitea-db in Borgmatic (manual restore procedure documented)
  • Exclude from Borgmatic, use separate pg_dump cron
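
If the separate-cron route is chosen, the job can stay tiny. A sketch, assuming the container is named gitea-db with a gitea role and database (verify both before deploying); note the \% escaping that cron requires in command lines:

```shell
# /etc/crontab fragment on Calypso (illustrative)
15 2 * * *  root  docker exec gitea-db pg_dump -U gitea -Fc gitea > /volume1/docker/backups/gitea-$(date +\%F).dump
```

Hyper Backup then carries /volume1/docker/backups/ offsite with the rest of the volume; add a find -mtime cleanup if the dump set should stay bounded.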

6. Alerting ntfy topic

Borgmatic can push failure alerts to the existing ntfy stack on homelab-vm.

  • Confirm ntfy topic name to use (e.g. homelab-backups or homelab)
  • Confirm ntfy internal URL (e.g. http://100.67.40.126:<port>)

Implementation Phases (draft, not yet started)

Once decisions above are made, implementation follows these phases in order:

Phase 1 — Atlantis

  1. Create hosts/synology/atlantis/borgmatic.yaml
  2. Config: backs up /volume2/metadata/docker2, DB hooks for all postgres/sqlite containers
  3. Repo destination per decision on Q2
  4. Alert on failure via ntfy

Phase 2 — Calypso

  1. Create hosts/synology/calypso/borgmatic.yaml
  2. Config: backs up /volume1/docker, DB hooks for paperless/authentik/immich/seafile/(gitea)
  3. Repo: SSH to Atlantis (or cross-backup per Q2)

Phase 3 — homelab-vm, NUC, Pi-5

  1. Create borgmatic stack per host
  2. Mount /var/lib/docker/volumes read-only into container
  3. Repos: SSH to Atlantis
  4. Staggered schedule: 02:00 Atlantis / 03:00 Calypso / 04:00 homelab-vm / 04:30 NUC / 05:00 Pi-5
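
Inside each host's borgmatic container, the stagger is just a crontab line running the default actions (prune, compact, create, check per config). Atlantis's 02:00 slot shown as a sketch; shift the hour per host:

```shell
# crontab for the borgmatic container (illustrative)
0 2 * * *  borgmatic --verbosity 1 --syslog-verbosity 1
```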

Phase 4 — B2 cloud egress (if Option 1 or 2 chosen)

  1. Add rclone container or cron on Atlantis
  2. Weekly sync of Borg repos → b2:homelab-borg/

Phase 5 — Monitoring

  1. Borgmatic ntfy hook per host — fires on any failure
  2. Uptime Kuma push monitor per host — borgmatic pings after each successful run
  3. Alert if no ping received in 25h
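
The per-host ping in step 2 can ride on borgmatic's command hooks. A sketch, assuming Uptime Kuma's default port 3001; the push token is a placeholder to be copied from the monitor's settings page:

```yaml
# Fragment for each host's borgmatic.yaml (illustrative) — runs only
# after a successful backup, so a failed run produces no ping and
# trips the 25h no-ping alert
after_backup:
  - curl -fsS -m 10 --retry 3 "http://100.67.40.126:3001/api/push/<push-token>?status=up&msg=backup-ok"
```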

Borgmatic Config Skeleton (reference)

# /etc/borgmatic/config.yaml (inside container)
# This is illustrative — actual configs will be generated per host

repositories:
  - path: ssh://borg@100.83.230.112/volume1/backups/borg/calypso
    label: atlantis-remote

source_directories:
  - /mnt/docker  # host /volume1/docker mounted here

exclude_patterns:
  - '*/cache'
  - '*/transcode'
  - '*/thumbs'
  - '*.tmp'
  - '*.log'

postgresql_databases:
  - name: paperless
    hostname: paperless-db
    username: paperless
    password: "REDACTED_PASSWORD"
    format: custom
  - name: authentik
    hostname: authentik-db
    username: authentik
    password: "REDACTED_PASSWORD"
    format: custom

# Retention — borgmatic 1.8+ uses a flat config, so keep_* keys sit at
# top level rather than under a retention: section
keep_daily: 14
keep_weekly: 8
keep_monthly: 6

ntfy:
  topic: homelab-backups
  server: http://100.67.40.126:2586
  states:
    - fail

encryption_passphrase: ${BORG_PASSPHRASE}
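
For completeness, the day-2 restore commands referenced in the tool comparison, shown against this skeleton. These are real borgmatic subcommands; "latest" is Borg's built-in alias for the newest archive, and extract paths are archive-relative, hence no leading slash:

```shell
# Illustrative restore session (requires a populated repo)
borgmatic list                                         # enumerate archives
borgmatic extract --archive latest --path mnt/docker/paperless
borgmatic restore --archive latest --database paperless
```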