Sanitized mirror from private repository - 2026-03-21 06:37:51 UTC
Gitea Mirror Bot
2026-03-21 06:37:51 +00:00
commit f273b940ad
1235 changed files with 306105 additions and 0 deletions

docs/admin/backup-plan.md Normal file

@@ -0,0 +1,324 @@
# Backup Plan — Decision Document
> **Status**: Planning — awaiting decisions on open questions before implementation
> **Last updated**: 2026-03-13
> **Related**: [backup-strategies.md](backup-strategies.md) (aspirational doc, mostly not yet deployed)
---
## Current State (Honest)
| What | Status |
|---|---|
| Synology Hyper Backup (Atlantis → Calypso) | ✅ Running, configured in DSM GUI |
| Synology Hyper Backup (Atlantis → Setillo) | ✅ Running, configured in DSM GUI |
| Syncthing docker config sync (Atlantis/Calypso/Setillo) | ✅ Running |
| Synology snapshots for media volumes | ✅ Adequate — decided, no change needed |
| Scheduled database backups | ❌ Not deployed (Firefly sidecar is the only exception) |
| Docker volume backups for non-Synology hosts | ❌ Not deployed |
| Cloud (Backblaze B2) | ❌ Account exists, nothing uploading yet |
| Unified backup monitoring / alerting | ❌ Not deployed |
The migration scripts (`backup-matrix.sh`, `backup-mastodon.sh`, `backup.sh`) are
one-off migration artifacts — not scheduled, not monitored.
---
## Recommended Tool: Borgmatic
Borgmatic wraps BorgBackup (deduplicated, encrypted, compressed backups) with a
single YAML config file that handles sources, retention, database hooks, and alerting.
| Concern | How Borgmatic addresses it |
|---|---|
| Deduplication | BorgBackup — only changed chunks stored; daily full runs are cheap |
| Encryption | AES-256 at rest, passphrase-protected repo |
| Database backups | Native `postgresql_databases` and `mysql_databases` hooks — calls pg_dump/mysqldump before each run, streams output into the Borg repo |
| Scheduling | Not built in: trigger `borgmatic` from cron or a systemd timer, or run a container image that bundles cron |
| Alerting | Native ntfy / healthchecks.io hooks, email via the Apprise hook; fires on failure |
| Restoration | `borgmatic extract` or direct `borg extract` — well-documented |
| Complexity | Low — one YAML file per host, one Docker container |
### Why not the alternatives
| Tool | Reason not chosen |
|---|---|
| Restic | No built-in DB hooks, no built-in scheduler — needs cron + wrapper scripts |
| Kopia | Newer, less battle-tested at this scale; no native DB hooks |
| Duplicati | History of stability bugs; no DB hooks; GUI-centric config |
| rclone | Sync tool, not a backup tool — no dedup, no versioning, no DB hooks |
| Raw rsync | No dedup, no encryption, no DB hooks, fragile for large trees |
Restic is the closest alternative and would be acceptable if Borgmatic hits issues,
but Borgmatic's native DB hooks are the deciding factor.
---
## Proposed Architecture
### What to back up per host
**Atlantis** (primary NAS, highest value — do first)
- `/volume2/metadata/docker2/` — all container config/data dirs (~194GB used)
- Databases via hooks:
  - `immich-db` (PostgreSQL): photo metadata
  - `vaultwarden` (SQLite): passwords, via pre-hook tar
  - `sonarr`, `radarr`, `prowlarr`, `bazarr`, `lidarr` (SQLite): via pre-hook
  - `tdarr` (SQLite + JSON): transcode config
- `/volume1/data/media/`: **covered by Synology snapshots, excluded from Borg**
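
For the SQLite-backed apps, borgmatic's `sqlite_databases` hook is a possible alternative to a hand-written pre-hook tar. A minimal sketch, assuming the container has `sqlite3` available and that the database files sit at the hypothetical paths shown (the real locations under `/volume2/metadata/docker2/` have not been verified):

```yaml
# Sketch only: both paths are assumptions, not taken from the deployed layout.
# /mnt/docker is assumed to be the host's /volume2/metadata/docker2 mount.
sqlite_databases:
  - name: vaultwarden
    path: /mnt/docker/vaultwarden/db.sqlite3
  - name: sonarr
    path: /mnt/docker/sonarr/sonarr.db
```

The hook dumps each database through `sqlite3` rather than copying the live file, which avoids capturing a half-written database while the app is running.
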
**Calypso** (secondary NAS)
- `/volume1/docker/` — all container config/data dirs
- Databases via hooks:
  - `paperless-db` (PostgreSQL)
  - `authentik-db` (PostgreSQL)
  - `immich-db` (PostgreSQL, Calypso instance)
  - `seafile-db` (MySQL)
  - `gitea-db` (PostgreSQL): see open question #5 below
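
The two hook types on Calypso could look roughly like this. It is a sketch, not the final config: the database names, usernames, and environment variable names are assumptions, and only the container hostnames come from the list above.

```yaml
# Sketch only: usernames, database names and env var names are assumptions
postgresql_databases:
  - name: immich
    hostname: immich-db
    username: postgres
    password: ${IMMICH_DB_PASSWORD}
    format: custom
mysql_databases:
  - name: all               # dump every database on the server
    hostname: seafile-db
    username: root
    password: ${SEAFILE_DB_PASSWORD}
```

Using `name: all` for Seafile avoids having to enumerate its databases by hand, at the cost of a single combined dump.
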
**homelab-vm** (this machine, `100.67.40.126`)
- Docker named volumes — scrutiny, ntfy, syncthing, archivebox, openhands, hoarder, monitoring stack
- Mostly config-weight data, no large databases
**NUC (concord)**
- Docker named volumes — homeassistant, adguard, syncthing, invidious
**Pi-5**
- Docker named volumes — uptime-kuma (SQLite), glances, diun
**Setillo (Seattle VM)** — lower priority, open question (see below)
---
## Options — Borg Repo Destination
All hosts need a repo to write to. Three options:
### Option A — Atlantis as central repo host (simplest)
```
Atlantis (local) → /volume1/backups/borg/atlantis/
Calypso → SSH → Atlantis:/volume1/backups/borg/calypso/
homelab-vm → SSH → Atlantis:/volume1/backups/borg/homelab-vm/
NUC → SSH → Atlantis:/volume1/backups/borg/nuc/
Pi-5 → SSH → Atlantis:/volume1/backups/borg/rpi5/
```
Pros:
- Atlantis is already copied offsite (Hyper Backup → Calypso and → Setillo), so all Borg
repos get carried offsite with no extra work
- Single place to manage retention policies
- 46TB free on Atlantis — ample room
Cons:
- Atlantis is a single point of failure for all repos
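
Under Option A, each remote host's borgmatic config would point at its own directory on Atlantis. A minimal sketch for homelab-vm, reusing the `borg@100.83.230.112` address from the config skeleton at the end of this document; the key path is hypothetical:

```yaml
# Sketch for homelab-vm under Option A; the key path is an assumption
repositories:
  - path: ssh://borg@100.83.230.112/volume1/backups/borg/homelab-vm
    label: atlantis
ssh_command: ssh -i /root/.ssh/borgmatic_ed25519
```

Giving each host its own key makes it possible to restrict that key on Atlantis to the host's own repo path (for example with `borg serve --restrict-to-path` in `authorized_keys`), so one compromised host cannot touch the other repos.
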
### Option B — Atlantis ↔ Calypso cross-backup (more resilient)
```
Atlantis → SSH → Calypso:/volume1/backups/borg/atlantis/
Calypso → SSH → Atlantis:/volume1/backups/borg/calypso/
Other hosts → Atlantis (same as Option A)
```
Pros:
- If Atlantis dies completely, Calypso independently holds Atlantis's backup
- True cross-backup between the two most critical hosts
Cons:
- Two SSH trust relationships to set up and maintain
- Atlantis's Borg repo would live on Calypso rather than on Atlantis, so it would not be carried to Setillo
via the existing Hyper Backup job unless a job is added or updated to include it
### Option C — Local repo per host, then push to Atlantis
- Each host writes a local repo first, then pushes to Atlantis
- Adds a local copy for fast restores without SSH
- Doubles storage use on each host
- Probably unnecessary given Synology's local snapshot coverage on Atlantis/Calypso
**Recommendation: Option A** if simplicity is the priority; **Option B** if you want
Atlantis and Calypso to be truly independent backup failure domains.
---
## Options — Backblaze B2
B2 account exists. The question is what to push there.
### Option 1 — Borg repos via rclone (recommended)
```
Atlantis (weekly cron):
rclone sync /volume1/backups/borg/ b2:homelab-borg/
```
- BorgBackup's chunk-based dedup means only new/changed chunks upload each week
- Estimated size: initial ~50-200GB (configs + DBs only, media excluded), then small incrementals
- rclone runs as a container or cron job on Atlantis after the daily Borg runs complete
- Cost at B2 rates ($0.006/GB/month): 200GB × $0.006 ≈ $1.20/month
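
If rclone runs as a container, a one-shot compose service is one way to express the weekly sync. This is a sketch under assumptions: the config directory on the host is hypothetical, an rclone remote named `b2` is assumed to be defined in it already, and the weekly trigger (cron or a scheduled task that starts the service) is not shown.

```yaml
# Sketch only: the host config path is an assumption; scheduling is external
services:
  rclone-b2:
    image: rclone/rclone:latest
    command: sync /data b2:homelab-borg/ --transfers 4 --fast-list
    volumes:
      - /volume1/backups/borg:/data:ro                  # Borg repos, read-only
      - /volume1/docker/rclone/config:/config/rclone    # holds rclone.conf
    restart: "no"
```

Because `rclone sync` mirrors deletions as well as additions, the job should run only after the daily Borg runs have finished, so that a repo is never uploaded mid-write.
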
### Option 2 — DB dumps only to B2
- Simpler — just upload the daily pg_dump files
- No dedup — each upload is a full dump
- Less efficient at scale but trivially easy to implement
### Option 3 — Skip B2 for now
- The existing offsite copy on Setillo is sufficient for current risk tolerance
- Add B2 once monitoring is in place and Borgmatic is proven stable
**Recommendation: Option 1** — the dedup makes it cheap and the full Borg repo in B2
means any host can be restored from cloud without needing Setillo to be online.
---
## Open Questions
These must be answered before implementation starts.
### 1. Which hosts to cover?
- [ ] Atlantis
- [ ] Calypso
- [ ] homelab-vm
- [ ] NUC
- [ ] Pi-5
- [ ] Setillo (Seattle VM)
### 2. Borg repo destination
- [ ] Option A: Atlantis only (simplest)
- [ ] Option B: Atlantis ↔ Calypso cross-backup (more resilient)
- [ ] Option C: Local first, then push to Atlantis
### 3. B2 scope
- [ ] Option 1: Borg repos via rclone (recommended)
- [ ] Option 2: DB dumps only
- [ ] Option 3: Skip for now
### 4. Secrets management
Borgmatic configs need: Borg passphrase, SSH private key (to reach Atlantis repo),
B2 app key (if B2 enabled).
Option A — **Portainer env vars** (consistent with rest of homelab)
- Passphrase injected at deploy time, never in git
- SSH keys stored as host-mounted files, path referenced in config
Option B — **Files on host only**
- Drop secrets to e.g. `/volume1/docker/borgmatic/secrets/` per host
- Mount read-only into borgmatic container
- Nothing in git, nothing in Portainer
Option C — **Ansible vault**
- Encrypt secrets in git — fully tracked and reproducible
- More setup overhead
- [ ] Option A: Portainer env vars
- [ ] Option B: Files on host only
- [ ] Option C: Ansible vault
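
For comparison, the "files on host only" choice could be wired into borgmatic roughly as follows. The sketch assumes the host's secrets directory is mounted read-only at `/secrets` inside the container; that mount point and both file names are hypothetical.

```yaml
# Sketch for "files on host only"; /secrets and the file names are assumptions
encryption_passcommand: cat /secrets/borg_passphrase
ssh_command: ssh -i /secrets/id_ed25519
```

With the Portainer route, `encryption_passphrase: ${BORG_PASSPHRASE}` from the config skeleton stays as it is and only the SSH key is mounted from the host.
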
### 5. Gitea chicken-and-egg
CI runs on Gitea. If Borgmatic on Calypso backs up `gitea-db` and Calypso/Gitea
goes down, restoring Gitea is a manual procedure outside of CI — which is acceptable.
The alternative is to exclude `gitea-db` from Borgmatic and back it up separately
(e.g. a simple daily pg_dump cron on Calypso that Hyper Backup then carries).
- [ ] Include gitea-db in Borgmatic (manual restore procedure documented)
- [ ] Exclude from Borgmatic, use separate pg_dump cron
### 6. Alerting ntfy topic
Borgmatic can push failure alerts to the existing ntfy stack on homelab-vm.
- [ ] Confirm ntfy topic name to use (e.g. `homelab-backups` or `homelab`)
- [ ] Confirm ntfy internal URL (e.g. `http://100.67.40.126:<port>`)
---
## Implementation Phases (draft, not yet started)
Once decisions above are made, implementation follows these phases in order:
**Phase 1 — Atlantis**
1. Create `hosts/synology/atlantis/borgmatic.yaml`
2. Config: backs up `/volume2/metadata/docker2`, DB hooks for all postgres/sqlite containers
3. Repo destination per decision on Q2
4. Alert on failure via ntfy
**Phase 2 — Calypso**
1. Create `hosts/synology/calypso/borgmatic.yaml`
2. Config: backs up `/volume1/docker`, DB hooks for paperless/authentik/immich/seafile/(gitea)
3. Repo: SSH to Atlantis (or cross-backup per Q2)
**Phase 3 — homelab-vm, NUC, Pi-5**
1. Create borgmatic stack per host
2. Mount `/var/lib/docker/volumes` read-only into container
3. Repos: SSH to Atlantis
4. Staggered schedule: 02:00 Atlantis / 03:00 Calypso / 04:00 homelab-vm / 04:30 NUC / 05:00 Pi-5
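
A per-host stack for Phase 3 might look like the sketch below. The image name and the container-side paths follow the community borgmatic container layout as best understood here and should be checked against that image's documentation; the host-side relative paths are hypothetical.

```yaml
# Sketch only: image name and container paths are assumptions to verify
services:
  borgmatic:
    image: ghcr.io/borgmatic-collective/borgmatic:latest
    restart: unless-stopped
    environment:
      - BORG_PASSPHRASE=${BORG_PASSPHRASE}   # injected at deploy time
    volumes:
      - /var/lib/docker/volumes:/mnt/source/volumes:ro   # named volumes, read-only
      - ./config:/etc/borgmatic.d            # config.yaml and crontab for this host
      - ./ssh:/root/.ssh                     # key and known_hosts for Atlantis
      - ./state/config:/root/.config/borg    # Borg security/key state
      - ./state/cache:/root/.cache/borg      # chunk cache, speeds up later runs
```

The staggered start time for each host would then live in that host's crontab inside the config directory, for example 04:00 for homelab-vm.
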
**Phase 4 — B2 cloud egress** (if Option 1 or 2 chosen)
1. Add rclone container or cron on Atlantis
2. Weekly sync of Borg repos → `b2:homelab-borg/`
**Phase 5 — Monitoring**
1. Borgmatic ntfy hook per host — fires on any failure
2. Uptime Kuma push monitor per host — borgmatic pings after each successful run
3. Alert if no ping received in 25h
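
The two signals in Phase 5 map onto two borgmatic hooks. A minimal sketch, assuming a borgmatic release recent enough to include the `uptime_kuma` hook; the push URL is a placeholder, and the ntfy values are the ones used in the config skeleton at the end of this document:

```yaml
# Sketch only: the Uptime Kuma host, port and token are placeholders
ntfy:
  topic: homelab-backups
  server: http://100.67.40.126:2586
  states:
    - fail                  # push a notification only when a run fails
uptime_kuma:
  push_url: http://<uptime-kuma-host>:<port>/api/push/<token>
  states:
    - start
    - finish
    - fail
```

The 25h window is configured on the Uptime Kuma side, by setting the push monitor's heartbeat interval to 90000 seconds.
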
---
## Borgmatic Config Skeleton (reference)
```yaml
# /etc/borgmatic/config.yaml (inside container)
# This is illustrative; actual configs will be generated per host

repositories:
  - path: ssh://borg@100.83.230.112/volume1/backups/borg/calypso
    label: atlantis-remote

source_directories:
  - /mnt/docker  # host /volume1/docker mounted here

exclude_patterns:
  - '*/cache'
  - '*/transcode'
  - '*/thumbs'
  - '*.tmp'
  - '*.log'

postgresql_databases:
  - name: paperless
    hostname: paperless-db
    username: paperless
    password: "REDACTED_PASSWORD"
    format: custom
  - name: authentik
    hostname: authentik-db
    username: authentik
    password: "REDACTED_PASSWORD"
    format: custom

# Retention: kept at the top level, matching the sectionless style of the other options
keep_daily: 14
keep_weekly: 8
keep_monthly: 6

ntfy:
  topic: homelab-backups
  server: http://100.67.40.126:2586
  states:
    - fail

encryption_passphrase: ${BORG_PASSPHRASE}
```
---
## Related Docs
- [backup-strategies.md](backup-strategies.md) — existing aspirational doc (partially outdated)
- [portainer-backup.md](portainer-backup.md) — Portainer-specific backup notes
- [disaster-recovery.md](../troubleshooting/disaster-recovery.md)