# 🔄 Portainer Backup & Recovery Plan
**Last Updated**: 2026-01-27

This document outlines the backup strategy for Portainer and all managed Docker infrastructure.

---
## Overview
Portainer manages **5 endpoints** with **130+ containers** across the homelab. A comprehensive backup strategy ensures quick recovery from failures.
### Current Backup Configuration ✅
| Setting | Value |
|---------|-------|
| **Destination** | Backblaze B2 (`vk-portainer` bucket) |
| **Schedule** | Daily at 3:00 AM |
| **Retention** | 30 days (auto-delete lifecycle rule) |
| **Encryption** | Yes (AES-256) |
| **Backup Size** | ~30 MB per backup |
| **Max Storage** | ~900 MB |
| **Monthly Cost** | ~$0.005 |
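
The storage and cost figures in the table follow from simple arithmetic, sketched below. The per-GB rate of `0.006` USD per month is an assumption used only for illustration; substitute the current Backblaze B2 price. The function names are hypothetical.

```shell
#!/bin/bash
# Steady-state bucket size: one backup per day, kept for the retention window.
max_storage_mb() {
  local backup_mb=$1 retention_days=$2
  echo $(( backup_mb * retention_days ))
}

# Monthly cost in USD for a given size in MB and a per-GB monthly rate.
# The rate passed in is an assumption, not a quoted Backblaze price.
monthly_cost_usd() {
  local size_mb=$1 rate_per_gb=$2
  awk -v mb="$size_mb" -v rate="$rate_per_gb" \
    'BEGIN { printf "%.4f\n", mb / 1000 * rate }'
}

size_mb=$(max_storage_mb 30 30)   # ~30 MB per backup, 30 days retention
echo "Max storage: ${size_mb} MB"
echo "Monthly cost: \$$(monthly_cost_usd "$size_mb" 0.006)"
```

With 30 MB backups and 30 days of retention this gives 900 MB, which at the assumed rate comes to roughly half a cent per month.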
### What's Backed Up
| Component | Location | Backup Method | Frequency |
|-----------|----------|---------------|-----------|
| Portainer DB | Atlantis:/portainer | **Backblaze B2** | Daily 3AM |
| Stack definitions | Git repo | Already versioned | On change |
| Container volumes | Per-host | Scheduled rsync | Daily |
| Secrets/Env vars | Portainer | Included in B2 backup | Daily |
---
## Portainer Server Backup
### Active Configuration: Backblaze B2 ✅
Automatic backups are configured via the Portainer UI:
- **Settings → Backup configuration → S3 Compatible**

**Current Settings:**
```
S3 Host: https://s3.us-west-004.backblazeb2.com
Bucket: vk-portainer
Region: us-west-004
Schedule: 0 3 * * * (daily at 3 AM)
Encryption: Enabled
```
### Option 1: Manual Backup via API
```bash
# Trigger immediate backup
curl -X POST "http://vishinator.synology.me:10000/api/backup/s3/execute" \
  -H "X-API-Key: REDACTED_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "accessKeyID": "<b2-key-id>",
    "secretAccessKey": "<b2-application-key>",
    "region": "us-west-004",
    "bucketName": "vk-portainer",
    "password": "<backup-password>",
    "s3CompatibleHost": "https://s3.us-west-004.backblazeb2.com"
  }'
# Download backup locally
curl -X GET "http://vishinator.synology.me:10000/api/backup" \
  -H "X-API-Key: REDACTED_API_KEY" \
  -o "portainer-backup-$(date +%Y%m%d).tar.gz"
```
### Option 2: Volume Backup (Manual)
```bash
# On Atlantis (where Portainer runs)
# Stop Portainer temporarily
docker stop portainer
# Backup the data volume
tar -czvf "/volume1/backups/portainer/portainer-$(date +%Y%m%d).tar.gz" \
  /volume1/docker/portainer/data
# Restart Portainer
docker start portainer
```
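One risk with the stop/archive/start sequence is that a failed `tar` leaves Portainer stopped if the commands run from a script that aborts on errors. The sketch below wraps the same steps in a function that always restarts the container. The function name and the `DOCKER` override variable are hypothetical; the override exists only so the function can be dry-run with `DOCKER=echo`.

```shell
#!/bin/bash
# DOCKER is a hypothetical override (defaults to the real docker CLI) so the
# function can be exercised without touching a running container.
backup_portainer_cold() {
  local src=$1 dest=$2 rc=0
  ${DOCKER:-docker} stop portainer || return 1
  # Archive the data directory; remember a failure instead of exiting early
  tar -czf "$dest" "$src" || rc=$?
  # Always restart Portainer, whether or not tar succeeded
  ${DOCKER:-docker} start portainer
  return "$rc"
}

# Example, using the paths from the commands above:
# backup_portainer_cold /volume1/docker/portainer/data \
#   "/volume1/backups/portainer/portainer-$(date +%Y%m%d).tar.gz"
```

The function returns the exit status of `tar`, so a cron wrapper can still detect a failed backup even though Portainer is running again.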
### Option 3: Scheduled Backup Script
Create `/volume1/scripts/backup-portainer.sh`:
```bash
#!/bin/bash
set -euo pipefail

BACKUP_DIR="/volume1/backups/portainer"
DATE=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=30

# Create backup directory
mkdir -p "$BACKUP_DIR"

# Back up Portainer data while it keeps running (hot backup, no downtime).
# The volume is mounted read-only so the helper container cannot modify it.
docker run --rm \
  -v portainer_data:/data:ro \
  -v "$BACKUP_DIR":/backup \
  alpine tar -czvf "/backup/portainer-$DATE.tar.gz" /data

# Delete backups older than the retention window
find "$BACKUP_DIR" -name "portainer-*.tar.gz" -mtime +"$RETENTION_DAYS" -delete

echo "Backup completed: portainer-$DATE.tar.gz"
```
Add to crontab:
```bash
# Daily at 3 AM
0 3 * * * /volume1/scripts/backup-portainer.sh >> /var/log/portainer-backup.log 2>&1
```
---
## Stack Definitions Backup
All stack definitions are stored in Git (git.vish.gg/Vish/homelab), providing:
- ✅ Version history
- ✅ Change tracking
- ✅ Easy rollback
- ✅ Multi-location redundancy
### Git Repository Structure
```
homelab/
├── Atlantis/ # Atlantis stack configs
├── Calypso/ # Calypso stack configs
├── homelab_vm/ # Homelab VM configs
│ ├── monitoring.yaml
│ ├── openhands.yaml
│ ├── ntfy.yaml
│ └── prometheus_grafana_hub/
│ └── alerting/
├── concord_nuc/ # NUC configs
└── docs/ # Documentation
```
### Backup Git Repo Locally
```bash
# Clone full repo with history
git clone --mirror https://git.vish.gg/Vish/homelab.git homelab-backup.git
# Update existing mirror
cd homelab-backup.git && git remote update
```
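For a scheduled job, the clone and update commands can be combined into one idempotent function, sketched below. The function name and the destination path in the example are assumptions for illustration, not part of the existing setup.

```shell
#!/bin/bash
# Clone a bare mirror on the first run; refresh it on later runs.
sync_mirror() {
  local url=$1 dest=$2
  if [ -d "$dest" ]; then
    # Fetch all refs and drop branches deleted upstream
    git -C "$dest" remote update --prune
  else
    git clone --mirror "$url" "$dest"
  fi
}

# Example (destination path is hypothetical):
# sync_mirror https://git.vish.gg/Vish/homelab.git /volume1/backups/git-mirrors/homelab.git
```

Because the function checks for the destination directory first, the same cron entry works for both the initial clone and every later refresh.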
---
## Container Volume Backup Strategy
### Critical Volumes to Backup
| Service | Volume Path | Priority | Size |
|---------|-------------|----------|------|
| Grafana | /var/lib/grafana | High | ~500MB |
| Prometheus | /prometheus | Medium | ~2GB |
| ntfy | /var/cache/ntfy | Low | ~100MB |
| Alertmanager | /alertmanager | Medium | ~50MB |
### Backup Script for Homelab VM
Create `/home/homelab/scripts/backup-volumes.sh`:
```bash
#!/bin/bash
set -euo pipefail

BACKUP_DIR="/home/homelab/backups"
DATE=$(date +%Y%m%d)
REMOTE="atlantis:/volume1/backups/homelab-vm"

# Create today's local backup directory
mkdir -p "$BACKUP_DIR/$DATE"

# Back up critical volumes (each mounted read-only in a helper container)
for vol in grafana prometheus alertmanager; do
  docker run --rm \
    -v "${vol}_data":/data:ro \
    -v "$BACKUP_DIR/$DATE":/backup \
    alpine tar -czvf "/backup/${vol}.tar.gz" /data
done

# Sync to remote (Atlantis NAS)
rsync -av --delete "$BACKUP_DIR/$DATE/" "$REMOTE/$DATE/"

# Keep the last 7 days locally (-mindepth 1 protects $BACKUP_DIR itself)
find "$BACKUP_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} +

echo "Backup completed: $DATE"
```
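The reverse operation, restoring one of these archives into a named volume, might look like the following sketch. It assumes the archive was produced by the script above, so its entries are stored under `data/`. The function name and the `DOCKER` override are hypothetical; the override only allows a dry run with `DOCKER=echo`.

```shell
#!/bin/bash
# Restore a volume archive into a named Docker volume.
# DOCKER is a hypothetical override that defaults to the real docker CLI.
restore_volume() {
  local volume=$1 archive=$2
  [ -f "$archive" ] || { echo "archive not found: $archive" >&2; return 1; }
  local dir file
  dir=$(cd "$(dirname "$archive")" && pwd)   # absolute path for the bind mount
  file=$(basename "$archive")
  # Entries are stored as data/..., so extracting at / fills the /data mount
  ${DOCKER:-docker} run --rm \
    -v "${volume}:/data" \
    -v "${dir}:/backup:ro" \
    alpine tar -xzf "/backup/${file}" -C /
}

# Example: stop the service first, then restore and start it again
# restore_volume grafana_data /home/homelab/backups/20260127/grafana.tar.gz
```

Stop the container that uses the volume before restoring, otherwise the running service may overwrite the restored files.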
---
## Disaster Recovery Procedures
### Scenario 1: Portainer Server Failure
**Recovery Steps:**
1. Deploy new Portainer instance on Atlantis
2. Restore from backup
3. Confirm that the edge agents reconnect (they should do so automatically)
```bash
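# Deploy fresh Portainer
docker run -d -p 10000:9000 -p 8000:8000 \
  --name portainer --restart always \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v portainer_data:/data \
  portainer/portainer-ee:latest
# Restore a volume archive (as produced by the Option 3 script) into the
# portainer_data volume; run this from the directory that holds the archive
docker stop portainer
docker run --rm \
  -v portainer_data:/data \
  -v "$(pwd)":/backup:ro \
  alpine tar -xzvf /backup/portainer-backup.tar.gz -C /
docker start portainer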
```
### Scenario 2: Edge Agent Failure (e.g., Homelab VM)
**Recovery Steps:**
1. Reinstall Docker on the host
2. Install Portainer agent
3. Redeploy stacks from Git
```bash
# Install Portainer Edge Agent
docker run -d \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /var/lib/docker/volumes:/var/lib/docker/volumes \
  -v portainer_agent_data:/data \
  --name portainer_edge_agent \
  --restart always \
  -e EDGE=1 \
  -e EDGE_ID=<edge-id> \
  -e EDGE_KEY=<edge-key> \
  -e EDGE_INSECURE_POLL=1 \
  portainer/agent:latest
# Stacks will auto-deploy from Git (if AutoUpdate enabled)
# Or manually trigger via Portainer API
```
### Scenario 3: Complete Infrastructure Loss
**Recovery Priority:**
1. Network (router, switch)
2. Atlantis NAS (Portainer server)
3. Git server (Gitea on Calypso)
4. Edge agents

**Full Recovery Checklist:**
- [ ] Restore network connectivity
- [ ] Boot Atlantis, restore Portainer backup
- [ ] Boot Calypso, verify Gitea accessible
- [ ] Start edge agents on each host
- [ ] Verify all stacks deployed from Git
- [ ] Test alerting notifications
- [ ] Verify monitoring dashboards
---
## Portainer API Backup Commands
### Export All Stack Definitions
```bash
#!/bin/bash
API_KEY=REDACTED_API_KEY
BASE_URL="http://vishinator.synology.me:10000"
OUTPUT_DIR="./portainer-export-$(date +%Y%m%d)"
mkdir -p "$OUTPUT_DIR"
# Get all stacks as "id name endpoint" lines
curl -s -H "X-API-Key: $API_KEY" "$BASE_URL/api/stacks" | \
  jq -r '.[] | "\(.Id) \(.Name) \(.EndpointId)"' | \
  while read -r id name endpoint; do
    echo "Exporting stack: $name (ID: $id, endpoint: $endpoint)"
    # The stack file endpoint returns the compose text in StackFileContent
    curl -s -H "X-API-Key: $API_KEY" \
      "$BASE_URL/api/stacks/$id/file" | \
      jq -r '.StackFileContent' > "$OUTPUT_DIR/${name}.yaml"
  done
echo "Exported to $OUTPUT_DIR"
```
### Export Endpoint Configuration
```bash
# Reuses API_KEY and BASE_URL from the stack export script
curl -s -H "X-API-Key: $API_KEY" \
  "$BASE_URL/api/endpoints" | jq '.' > endpoints-backup.json
```
---
## Automated Backup Schedule
| Backup Type | Frequency | Retention | Location |
|-------------|-----------|-----------|----------|
| Portainer DB | Daily 3AM | 30 days | Atlantis NAS |
| Git repo mirror | Daily 4AM | Unlimited | Calypso NAS |
| Container volumes | Daily 5AM | 7 days local, 30 days remote | Atlantis NAS |
| Full export | Weekly Sunday | 4 weeks | Off-site (optional) |
---
## Verification & Testing
### Monthly Backup Test Checklist
- [ ] Verify Portainer backup file integrity
- [ ] Test restore to staging environment
- [ ] Verify Git repo clone works
- [ ] Test volume restore for one service
- [ ] Document any issues found
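
The integrity check in the first item can be partly automated for plain `tar.gz` archives, such as those produced by the volume backup scripts. The sketch below uses a hypothetical function name. Password-protected Portainer backups are encrypted and will likely fail this check, so they need to be verified through a test restore instead.

```shell
#!/bin/bash
# Return 0 if the file is a readable, non-empty gzip-compressed tar archive.
verify_archive() {
  local archive=$1 entries
  [ -s "$archive" ] || { echo "missing or empty: $archive" >&2; return 1; }
  # gzip -t checks the compressed stream without extracting anything
  gzip -t "$archive" 2>/dev/null || { echo "gzip check failed: $archive" >&2; return 1; }
  # Count archive members; the arithmetic expansion strips wc padding
  entries=$(( $(tar -tzf "$archive" 2>/dev/null | wc -l) ))
  [ "$entries" -gt 0 ] || { echo "no entries in: $archive" >&2; return 1; }
  echo "OK: $archive ($entries entries)"
}

# Example: check every Portainer volume archive on Atlantis
# for f in /volume1/backups/portainer/portainer-*.tar.gz; do verify_archive "$f"; done
```

This only proves the archive is readable; it does not replace the staging restore test in the checklist.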
### Backup Monitoring
Add to Prometheus alerting:
```yaml
- alert: BackupFailed
  expr: time() - backup_last_success_timestamp > 86400
  for: 1h
  labels:
    severity: warning
  annotations:
    summary: "Backup hasn't run in 24 hours"
```
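The alert only works if something exports `backup_last_success_timestamp`. One possible way, sketched here, is to have each backup script write the metric to a file read by node_exporter's textfile collector. This assumes node_exporter runs on the host with `--collector.textfile.directory` pointing at the directory used below; the directory path, file name, and function name are all hypothetical.

```shell
#!/bin/bash
# Directory read by node_exporter's textfile collector (assumed path; it must
# match the --collector.textfile.directory flag node_exporter was started with).
TEXTFILE_DIR="${TEXTFILE_DIR:-/var/lib/node_exporter/textfile_collector}"

# Write the current Unix time as the last successful backup timestamp.
record_backup_success() {
  local dir=${1:-$TEXTFILE_DIR}
  local tmp="${dir}/backup.prom.$$"
  {
    echo "# HELP backup_last_success_timestamp Unix time of the last successful backup."
    echo "# TYPE backup_last_success_timestamp gauge"
    echo "backup_last_success_timestamp $(date +%s)"
  } > "$tmp" && mv "$tmp" "${dir}/backup.prom"   # rename so scrapes never see a partial file
}

# Example: call as the last step of a backup script, after everything succeeded
# record_backup_success
```

If the metric has never been written, the expression returns no data and the alert stays silent, so it is worth confirming the metric appears in Prometheus after the first run.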
---
## Quick Reference
### Backup Locations
```
Atlantis:/volume1/backups/
├── portainer/ # Portainer DB backups
├── homelab-vm/ # Homelab VM volume backups
├── calypso/ # Calypso volume backups
└── git-mirrors/ # Git repository mirrors
```
### Important Files
- Portainer API Key: `ptr_REDACTED_PORTAINER_TOKEN`
- Git repo: `https://git.vish.gg/Vish/homelab`
- Edge agent keys: Stored in Portainer (Settings → Environments)
### Emergency Contacts
- Synology Support: 1-425-952-7900
- Portainer Support: https://www.portainer.io/support