Sanitized mirror from private repository - 2026-03-21 09:31:10 UTC
152
docs/infrastructure/backup-strategy.md
Normal file
@@ -0,0 +1,152 @@
# Backup Strategy

Last updated: 2026-03-21

## Overview

The homelab follows a **3-2-1+ backup strategy**: 3 copies of data, 2 different storage types, 1 offsite location, plus cloud backup to Backblaze B2.

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                                 BACKUP FLOW                                 │
│                                                                             │
│  Atlantis (Primary) ──── Hyper Backup (weekly) ──── Calypso (Local copy)    │
│       │                                                                     │
│       ├── Syncthing (real-time) ──── Setillo (Tucson, offsite)              │
│       │                                                                     │
│       └── Hyper Backup S3 (weekly) ──── Backblaze B2 (cloud)                │
│                                            │                                │
│  Calypso ──── Hyper Backup S3 (daily) ─────┘                                │
│                                                                             │
│  Guava ──── No backup (risk)                                                │
│  Jellyfish ──── No backup (risk)                                            │
└─────────────────────────────────────────────────────────────────────────────┘
```

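As a rough illustration, the 3-2-1 rule can be checked mechanically against an inventory of where a dataset lives. The sketch below is not part of the homelab tooling: the `BackupCopy` type, the `media` labels, and the inventory itself are assumptions made up for the example, and in practice not every Atlantis share is replicated to every destination.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    host: str      # where the copy lives
    media: str     # storage type label (hypothetical naming)
    offsite: bool  # True if the copy is outside the primary site


def check_3_2_1(copies):
    """Return which parts of the 3-2-1 rule a set of copies satisfies."""
    return {
        "three_copies": len(copies) >= 3,
        "two_media_types": len({c.media for c in copies}) >= 2,
        "one_offsite": any(c.offsite for c in copies),
    }


# Assumed inventory for data held on Atlantis, following the diagram above.
ATLANTIS_COPIES = [
    BackupCopy("Atlantis", "nas", offsite=False),
    BackupCopy("Calypso", "nas", offsite=False),
    BackupCopy("Setillo", "nas", offsite=True),
    BackupCopy("Backblaze B2", "object-storage", offsite=True),
]

if __name__ == "__main__":
    print(check_3_2_1(ATLANTIS_COPIES))
    # A host with a single local copy fails every check.
    print(check_3_2_1([BackupCopy("Guava", "zfs-mirror", offsite=False)]))
```
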
## Backup Tasks

### Atlantis → Backblaze B2 (Cloud)

| Setting | Value |
|---------|-------|
| **Task name** | Backblaze b2 |
| **Schedule** | Weekly, Sundays 00:00 |
| **Destination** | `s3.us-west-004.backblazeb2.com` |
| **Bucket** | `vk-atlantis` |
| **Encrypted** | Yes (client-side) |
| **Versioned** | Yes (Smart Recycle) |
| **Rotation** | Keep daily for 3 days, weekly for 4 weeks, then weekly indefinitely |

**What's backed up:**

- `/archive` — long-term cold storage
- `/documents/msi_uqiyoe` — PC sync documents
- `/documents/pc_sync_documents` — PC sync documents
- `/downloads` — download staging
- `/photo` — Synology Photos library
- `/homes/vish/Photos` — user photo library
- Apps: SynologyPhotos, SynologyDrive, FileStation, HyperBackup, SynoFinder

**What's NOT backed up to cloud:**

- `/volume1/media` (~60 TB) — too large for cloud backup, replicated to Setillo instead
- `/volume1/docker` — container data (stateless, can be redeployed from git)

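To answer "is this file covered by the cloud task?", the include list above can be treated as a set of folder roots. The following is a minimal sketch, assuming every path is written in the same notation as the include list (the document mixes share-relative and `/volume1/...` paths, so real paths may need normalising first). `covering_share` is a hypothetical helper, not something Hyper Backup provides.

```python
from pathlib import PurePosixPath

# Folders listed under "What's backed up" for the Atlantis B2 task.
ATLANTIS_B2_INCLUDED = [
    "/archive",
    "/documents/msi_uqiyoe",
    "/documents/pc_sync_documents",
    "/downloads",
    "/photo",
    "/homes/vish/Photos",
]


def covering_share(path, included=ATLANTIS_B2_INCLUDED):
    """Return the included folder that contains `path`, or None if uncovered."""
    target = PurePosixPath(path)
    for share in included:
        root = PurePosixPath(share)
        # Compare whole path components, so "/photography" does not match "/photo".
        if target == root or root in target.parents:
            return share
    return None


if __name__ == "__main__":
    for candidate in ("/photo/2024/img.jpg", "/volume1/media/film.mkv"):
        print(candidate, "->", covering_share(candidate))
```
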
### Calypso → Backblaze B2 (Cloud)

| Setting | Value |
|---------|-------|
| **Task name** | Backblaze S3 |
| **Schedule** | Daily, 00:00 |
| **Destination** | `s3.us-west-004.backblazeb2.com` |
| **Bucket** | `vk-concord-1` |
| **Encrypted** | Yes (client-side) |
| **Versioned** | Yes (Smart Recycle) |

**What's backed up:**

- `/docker/authentik` — SSO provider data (critical)
- `/docker/gitea` — Git hosting data (critical)
- `/docker/headscale` — VPN control plane (critical)
- `/docker/immich` — Photo management DB
- `/docker/nginx-proxy-manager` — old NPM config (historical)
- `/docker/paperlessngx` — Document management DB
- `/docker/retro_site` — Personal website
- `/docker/seafile` — File storage data
- `/data/media/misc` — miscellaneous media
- `/data/media/music` — music library
- `/data/media/photos` — photo library
- Apps: Gitea, MariaDB10, CloudSync, Authentik, Immich, Paperless, HyperBackup

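The two B2 tasks run on different schedules: weekly on Sundays at 00:00 for Atlantis, daily at 00:00 for Calypso. The sketch below shows how the next run time follows from those schedules. It assumes naive local datetimes and uses hypothetical function names; Hyper Backup's own scheduler is what actually triggers the tasks.

```python
from datetime import datetime, timedelta


def _last_midnight(now: datetime) -> datetime:
    """Most recent 00:00 at or before `now`."""
    return now.replace(hour=0, minute=0, second=0, microsecond=0)


def next_daily_run(now: datetime) -> datetime:
    """Next 00:00 strictly after `now` (Calypso task: daily, 00:00)."""
    return _last_midnight(now) + timedelta(days=1)


def next_sunday_run(now: datetime) -> datetime:
    """Next Sunday 00:00 strictly after `now` (Atlantis task: weekly, Sundays 00:00)."""
    midnight = _last_midnight(now)
    days_ahead = (6 - midnight.weekday()) % 7  # Monday is 0, Sunday is 6
    candidate = midnight + timedelta(days=days_ahead)
    if candidate <= now:
        # It is already Sunday and 00:00 has passed, so wait a full week.
        candidate += timedelta(days=7)
    return candidate


if __name__ == "__main__":
    now = datetime(2026, 3, 21, 9, 31, 10)  # a Saturday
    print("Calypso next run: ", next_daily_run(now))
    print("Atlantis next run:", next_sunday_run(now))
```
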
### Atlantis → Calypso (Local Copy)

| Setting | Value |
|---------|-------|
| **Method** | Hyper Backup |
| **Schedule** | Weekly |
| **Destination** | Calypso `/volume1/backups/` |
| **What** | Media, photos, documents |
| **Encrypted** | Yes |

### Atlantis/Calypso → Setillo (Offsite)

| Setting | Value |
|---------|-------|
| **Method** | Syncthing (real-time replication) |
| **Destination** | Setillo `/volume1/syncthing/` (Tucson, AZ) |
| **Distance** | ~1,000 miles from primary site |
| **What** | Docker configs, critical data |

### Disabled Tasks

| Task | Host | Reason |
|------|------|--------|
| Backblaze S3 Atlantis (ID 12) | Atlantis | Old task, replaced by "Backblaze b2" (ID 20) |

## Hosts Without Backup

| Host | Data at Risk | Mitigation |
|------|-------------|------------|
| **Guava** (TrueNAS) | 3 TB personal data, 204 GB Jellyfin, 159 GB photos, 64 GB LLM models | ZFS mirror provides drive-failure protection but no offsite/cloud backup |
| **Jellyfish** (RPi 5) | 1.8 TB photos (LUKS2 encrypted NVMe) | LUKS encryption protects data at rest, but there is no redundancy beyond the single drive |
| **Homelab VM** | Docker data, monitoring databases | Mostly stateless: all compose files are in git and most data is regenerable. The NetBox/Semaphore DBs are the main risk |
| **Concord NUC** | Home Assistant config, AdGuard | Container data is relatively small and rebuildable |

**Recommendation:** Set up Backblaze B2 backup for Guava (personal data, photos) and Jellyfish (photo archive). Both hold irreplaceable data.

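To size the recommendation, the figures from the table can be summed. This is back-of-the-envelope arithmetic only: it assumes the listed categories do not overlap and that the sizes are decimal units (1 TB = 1000 GB), neither of which the table states. The dictionary layout and function names are hypothetical.

```python
# Sizes from the "Hosts Without Backup" table, in GB (assuming 1 TB = 1000 GB).
AT_RISK_GB = {
    "Guava": {"personal data": 3000, "Jellyfin": 204, "photos": 159, "LLM models": 64},
    "Jellyfish": {"photos": 1800},
}

# Scope named in the recommendation: Guava personal data and photos, Jellyfish photos.
RECOMMENDED_SCOPE = {
    "Guava": ["personal data", "photos"],
    "Jellyfish": ["photos"],
}


def totals_gb(at_risk):
    """Total unprotected data per host."""
    return {host: sum(items.values()) for host, items in at_risk.items()}


def scope_gb(at_risk, scope):
    """Total size of just the datasets selected in `scope`."""
    return sum(at_risk[host][item] for host, items in scope.items() for item in items)


if __name__ == "__main__":
    print(totals_gb(AT_RISK_GB))
    print(scope_gb(AT_RISK_GB, RECOMMENDED_SCOPE), "GB to seed to B2")
```

Under those assumptions the initial upload would be just under 5 TB, which is worth knowing before choosing an upload window or a bucket lifecycle policy.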
## Recovery Procedures

### Full NAS Recovery (Atlantis)

1. Replace failed hardware / reinstall DSM
2. Restore from Calypso (fastest: local copy, refreshed weekly)
3. If Calypso is unavailable, restore from Backblaze B2 instead (slower: download over the internet)
4. Redeploy Docker stacks from git (all GitOps-managed)

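The gap between a local and a cloud restore can be estimated from data size and link speed. The helper below is a rough sketch: the dataset size, the link speeds, and the `efficiency` factor are all placeholder assumptions, since none of them are stated in this document, and real restores also spend time on decryption and version reassembly.

```python
def transfer_hours(size_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate hours needed to move `size_gb` (decimal GB) over a `link_mbps` link.

    `efficiency` is the assumed fraction of nominal bandwidth actually achieved.
    """
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = size_gb * 8e9                       # 1 GB = 8e9 bits
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    size_gb = 2000  # hypothetical restore size
    print(f"LAN (1000 Mbps): {transfer_hours(size_gb, 1000):.1f} h")
    print(f"WAN (200 Mbps):  {transfer_hours(size_gb, 200):.1f} h")
```
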
### Service Recovery (Any Host)

1. All Docker stacks are in git (`hosts/` directory)
2. Portainer GitOps auto-deploys stacks on push
3. To recover a service, create a Portainer stack pointing to its compose file
4. Restore service-specific data from backup where needed

### Critical Service Priority

| Priority | Service | Backup Source | Recovery Time |
|----------|---------|--------------|---------------|
| 1 | Authentik (SSO) | Calypso B2 daily | ~30 min |
| 2 | Gitea (Git) | Calypso B2 daily | ~30 min |
| 3 | NPM (Reverse Proxy) | Calypso B2 daily / matrix-ubuntu local | ~5 min (redeploy) |
| 4 | Plex (Media) | Atlantis B2 weekly | ~1 hr (metadata only, media on disk) |
| 5 | Paperless (Documents) | Calypso B2 daily | ~30 min |

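The priority table can also be read as data. The sketch below orders services by priority and accumulates the estimated recovery times, assuming services are restored strictly one after another and that the table's estimates simply add up. The `Service` type and field names are hypothetical; `backup_interval` is taken from the schedule in the "Backup Source" column and bounds how old the restored data can be if the latest run succeeded.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Service:
    priority: int
    name: str
    backup_source: str
    backup_interval: timedelta  # schedule of the backup task
    recovery_minutes: int       # estimate from the table


SERVICES = [
    Service(4, "Plex", "Atlantis B2 weekly", timedelta(weeks=1), 60),
    Service(1, "Authentik", "Calypso B2 daily", timedelta(days=1), 30),
    Service(3, "NPM", "Calypso B2 daily / matrix-ubuntu local", timedelta(days=1), 5),
    Service(2, "Gitea", "Calypso B2 daily", timedelta(days=1), 30),
    Service(5, "Paperless", "Calypso B2 daily", timedelta(days=1), 30),
]


def recovery_plan(services):
    """Return (service name, minutes elapsed when it is back) in priority order."""
    elapsed = 0
    plan = []
    for svc in sorted(services, key=lambda s: s.priority):
        elapsed += svc.recovery_minutes
        plan.append((svc.name, elapsed))
    return plan


if __name__ == "__main__":
    for name, minutes in recovery_plan(SERVICES):
        print(f"{name:<10} restored after ~{minutes} min")
```

Under those assumptions the last service in the list is back after roughly 155 minutes; restoring services in parallel would shorten that.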
## Monitoring

- **DIUN**: Monitors container image updates (weekly, ntfy notification)
- **Uptime Kuma**: Monitors service availability (97 monitors)
- **Hyper Backup**: Sends DSM notification on backup success/failure
- **Backblaze B2**: Dashboard at `https://secure.backblaze.com/b2_buckets.htm`

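Success/failure notifications cover runs that actually happen; a task that stops running altogether may go unnoticed. One possible complement is a staleness check against the expected schedule. This is a sketch only: how the last-success timestamp is obtained is left open, and the 6-hour grace period is an arbitrary assumption.

```python
from datetime import datetime, timedelta


def is_stale(last_success: datetime, interval: timedelta, now: datetime,
             grace: timedelta = timedelta(hours=6)) -> bool:
    """True if more than one schedule interval plus a grace period has passed."""
    return now - last_success > interval + grace


if __name__ == "__main__":
    now = datetime(2026, 3, 21, 9, 31)
    # Hypothetical timestamps of the last successful run of each task.
    calypso_last = datetime(2026, 3, 20, 0, 10)   # daily task
    atlantis_last = datetime(2026, 3, 15, 0, 30)  # weekly task
    print("Calypso stale: ", is_stale(calypso_last, timedelta(days=1), now))
    print("Atlantis stale:", is_stale(atlantis_last, timedelta(weeks=1), now))
```
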
## Related Documentation

- [Storage Topology](../diagrams/storage-topology.md) — detailed storage layout per host
- [Image Update Guide](../admin/IMAGE_UPDATE_GUIDE.md) — how services are updated
- [Offline & Remote Access](offline-and-remote-access.md) — accessing services when internet is down
- [Ansible Playbook Guide](../admin/ANSIBLE_PLAYBOOK_GUIDE.md) — `backup_configs.yml` and `backup_databases.yml` playbooks