🔄 Portainer Backup & Recovery Plan
Last Updated: 2026-01-27
This document outlines the backup strategy for Portainer and all managed Docker infrastructure.
Overview
Portainer manages 5 endpoints with 130+ containers across the homelab. A comprehensive backup strategy ensures quick recovery from failures.
Current Backup Configuration ✅
| Setting | Value |
|---|---|
| Destination | Backblaze B2 (vk-portainer bucket) |
| Schedule | Daily at 3:00 AM |
| Retention | 30 days (auto-delete lifecycle rule) |
| Encryption | Yes (AES-256) |
| Backup Size | ~30 MB per backup |
| Max Storage | ~900 MB |
| Monthly Cost | ~$0.005 |
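The storage and cost rows follow from the backup size and the retention window. The sketch below reproduces that arithmetic; the price of $0.006 per GB-month is an assumption (check current Backblaze B2 pricing), as is billing in decimal gigabytes, and `estimate_storage` is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: steady-state storage and monthly cost for a retention window.
# Arguments: backup size (MB), retention (days), price per GB-month in USD (assumed).
estimate_storage() {
  awk -v size_mb="$1" -v days="$2" -v price="$3" 'BEGIN {
    total_mb = size_mb * days            # one daily backup kept per retained day
    cost = total_mb / 1000 * price       # decimal GB, an assumed billing unit
    printf "%d MB, $%.4f/month\n", total_mb, cost
  }'
}

# 30 MB per backup, 30 days retention, assumed $0.006 per GB-month
estimate_storage 30 30 0.006
```

With these inputs the result is 900 MB and about half a cent per month, in line with the table.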
What's Backed Up
| Component | Location | Backup Method | Frequency |
|---|---|---|---|
| Portainer DB | Atlantis:/portainer | Backblaze B2 | Daily 3AM |
| Stack definitions | Git repo | Already versioned | On change |
| Container volumes | Per-host | Scheduled rsync | Daily |
| Secrets/Env vars | Portainer | Included in B2 backup | Daily |
Portainer Server Backup
Active Configuration: Backblaze B2 ✅
Automatic backups are configured in the Portainer UI:
- Settings → Backup configuration → S3 Compatible
Current Settings:
S3 Host: https://s3.us-west-004.backblazeb2.com
Bucket: vk-portainer
Region: us-west-004
Schedule: 0 3 * * * (daily at 3 AM)
Encryption: Enabled
Manual Backup via API
# Trigger immediate backup
curl -X POST "http://vishinator.synology.me:10000/api/backup/s3/execute" \
-H "X-API-Key: REDACTED_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"accessKeyID": "004d35b7f4bf4300000000001",
"secretAccessKey": "K004SyhG7s+Xv/LDB32SAJFLKhe5dj0",
"region": "us-west-004",
"bucketName": "vk-portainer",
"password": "portainer-backup-2026",
"s3CompatibleHost": "https://s3.us-west-004.backblazeb2.com"
}'
# Download backup locally
curl -X GET "http://vishinator.synology.me:10000/api/backup" \
-H "X-API-Key: REDACTED_API_KEY" \
-o portainer-backup-$(date +%Y%m%d).tar.gz
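The commands above carry the API key inline. One way to keep credentials out of scripts and shell history is to read them from exported environment variables and refuse to run when any is missing. This is a minimal sketch, not the configured method: `PORTAINER_URL`, `PORTAINER_API_KEY`, `require_env`, and `download_backup` are hypothetical names, and the endpoint is the same `GET /api/backup` call used above.

```shell
#!/bin/bash
# Hypothetical wrapper; the variable and function names are assumptions.

# Return non-zero (and name the culprit) if any listed variable is unset,
# empty, or not exported.
require_env() {
  local name missing=0
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "missing required environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Download a backup using credentials taken from the environment.
download_backup() {
  require_env PORTAINER_URL PORTAINER_API_KEY || return 1
  # -f makes curl exit non-zero on HTTP 4xx/5xx instead of saving an error page.
  curl -fsS -X GET "$PORTAINER_URL/api/backup" \
    -H "X-API-Key: $PORTAINER_API_KEY" \
    -o "portainer-backup-$(date +%Y%m%d).tar.gz"
}
```

Export both variables (for example from a root-only file sourced by the cron job), then call `download_backup`.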
Option 2: Volume Backup (Manual)
# On Atlantis (where Portainer runs)
# Stop Portainer temporarily
docker stop portainer
# Backup the data volume
tar -czvf /volume1/backups/portainer/portainer-$(date +%Y%m%d).tar.gz \
/volume1/docker/portainer/data
# Restart Portainer
docker start portainer
Option 3: Scheduled Backup Script
Create /volume1/scripts/backup-portainer.sh:
#!/bin/bash
BACKUP_DIR="/volume1/backups/portainer"
DATE=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=30
# Create backup directory
mkdir -p "$BACKUP_DIR"
# Backup Portainer data (hot backup - no downtime)
docker run --rm \
-v portainer_data:/data \
-v "$BACKUP_DIR":/backup \
alpine tar -czvf "/backup/portainer-$DATE.tar.gz" /data
# Cleanup old backups
find "$BACKUP_DIR" -name "portainer-*.tar.gz" -mtime +"$RETENTION_DAYS" -delete
echo "Backup completed: portainer-$DATE.tar.gz"
Add to crontab:
# Daily at 3 AM
0 3 * * * /volume1/scripts/backup-portainer.sh >> /var/log/portainer-backup.log 2>&1
Stack Definitions Backup
All stack definitions are stored in Git (git.vish.gg/Vish/homelab), providing:
- ✅ Version history
- ✅ Change tracking
- ✅ Easy rollback
- ✅ Multi-location redundancy
Git Repository Structure
homelab/
├── Atlantis/ # Atlantis stack configs
├── Calypso/ # Calypso stack configs
├── homelab_vm/ # Homelab VM configs
│ ├── monitoring.yaml
│ ├── openhands.yaml
│ ├── ntfy.yaml
│ └── prometheus_grafana_hub/
│ └── alerting/
├── concord_nuc/ # NUC configs
└── docs/ # Documentation
Backup Git Repo Locally
# Clone full repo with history
git clone --mirror https://git.vish.gg/Vish/homelab.git homelab-backup.git
# Update existing mirror
cd homelab-backup.git && git remote update
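For a scheduled job, the clone and update steps can be folded into one idempotent helper that creates the mirror on the first run and refreshes it afterwards. This is a sketch; `mirror_repo` is a hypothetical name, and `--prune` is an addition that drops refs deleted upstream.

```shell
#!/bin/bash
# Hypothetical helper: create the mirror if it is missing, otherwise refresh it.
# Arguments: repository URL, local mirror directory.
mirror_repo() {
  local url="$1" dir="$2"
  if [ -d "$dir" ]; then
    # Existing mirror: fetch all refs and remove ones deleted on the remote.
    git -C "$dir" remote update --prune
  else
    # First run: full clone including every branch, tag, and ref.
    git clone --mirror "$url" "$dir"
  fi
}
```

Called as `mirror_repo https://git.vish.gg/Vish/homelab.git homelab-backup.git`, it behaves the same whether or not the mirror already exists.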
Container Volume Backup Strategy
Critical Volumes to Backup
| Service | Volume Path | Priority | Size |
|---|---|---|---|
| Grafana | /var/lib/grafana | High | ~500MB |
| Prometheus | /prometheus | Medium | ~2GB |
| ntfy | /var/cache/ntfy | Low | ~100MB |
| Alertmanager | /alertmanager | Medium | ~50MB |
Backup Script for Homelab VM
Create /home/homelab/scripts/backup-volumes.sh:
#!/bin/bash
BACKUP_DIR="/home/homelab/backups"
DATE=$(date +%Y%m%d)
REMOTE="atlantis:/volume1/backups/homelab-vm"
# Create local backup
mkdir -p "$BACKUP_DIR/$DATE"
# Backup critical volumes
for vol in grafana prometheus alertmanager; do
docker run --rm \
-v "${vol}_data":/data \
-v "$BACKUP_DIR/$DATE":/backup \
alpine tar -czvf "/backup/${vol}.tar.gz" /data
done
# Sync to remote (Atlantis NAS)
rsync -av --delete "$BACKUP_DIR/$DATE/" "$REMOTE/$DATE/"
# Keep last 7 days locally (-mindepth 1 so the backup root itself is never matched)
find "$BACKUP_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} \;
echo "Backup completed: $DATE"
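The reverse operation, restoring one volume from a dated backup directory, can be sketched as follows. It assumes the archive layout produced by the script above (paths stored as `data/...`) and the same `<service>_data` volume naming; `restore_volume` is a hypothetical helper, and the service using the volume should be stopped before running it.

```shell
#!/bin/bash
# Hypothetical helper: restore one volume archive created by backup-volumes.sh.
# Arguments: volume prefix (e.g. grafana), absolute path of the dated backup directory.
restore_volume() {
  local vol="$1" backup_dir="$2"
  if [ ! -f "$backup_dir/$vol.tar.gz" ]; then
    echo "archive not found: $backup_dir/$vol.tar.gz" >&2
    return 1
  fi
  # Members are stored as data/..., so extracting at / repopulates /data.
  # The backup directory is mounted read-only to protect the archive.
  docker run --rm \
    -v "${vol}_data:/data" \
    -v "$backup_dir:/backup:ro" \
    alpine tar -xzf "/backup/$vol.tar.gz" -C /
}
```

For example, `restore_volume grafana /home/homelab/backups/<date>` restores Grafana's data from the directory for that date. The backup directory must be an absolute path, since Docker treats a bare name in `-v` as a named volume.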
Disaster Recovery Procedures
Scenario 1: Portainer Server Failure
Recovery Steps:
- Deploy new Portainer instance on Atlantis
- Restore from backup
- Confirm that the edge agents reconnect (they should do so automatically)
# Deploy fresh Portainer
docker run -d -p 10000:9000 -p 8000:8000 \
--name portainer --restart always \
-v /var/run/docker.sock:/var/run/docker.sock \
-v portainer_data:/data \
portainer/portainer-ee:latest
# Restore from backup
docker stop portainer
tar -xzvf portainer-backup.tar.gz -C /
docker start portainer
Scenario 2: Edge Agent Failure (e.g., Homelab VM)
Recovery Steps:
- Reinstall Docker on the host
- Install Portainer agent
- Redeploy stacks from Git
# Install Portainer Edge Agent
docker run -d \
-v /var/run/docker.sock:/var/run/docker.sock \
-v /var/lib/docker/volumes:/var/lib/docker/volumes \
-v portainer_agent_data:/data \
--name portainer_edge_agent \
--restart always \
-e EDGE=1 \
-e EDGE_ID=<edge-id> \
-e EDGE_KEY=<edge-key> \
-e EDGE_INSECURE_POLL=1 \
portainer/agent:latest
# Stacks will auto-deploy from Git (if AutoUpdate enabled)
# Or manually trigger via Portainer API
Scenario 3: Complete Infrastructure Loss
Recovery Priority:
- Network (router, switch)
- Atlantis NAS (Portainer server)
- Git server (Gitea on Calypso)
- Edge agents
Full Recovery Checklist:
- Restore network connectivity
- Boot Atlantis, restore Portainer backup
- Boot Calypso, verify Gitea accessible
- Start edge agents on each host
- Verify all stacks deployed from Git
- Test alerting notifications
- Verify monitoring dashboards
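Part of the checklist above can be scripted as a simple reachability check. The sketch below only tests whether each URL answers with a non-error HTTP status; it does not confirm that stacks are deployed or that alerts fire. `check_services` is a hypothetical helper, and the 10-second timeout is an arbitrary choice.

```shell
#!/bin/bash
# Hypothetical smoke test: report which service URLs respond without an HTTP error.
# Returns non-zero if any URL is unreachable or answers with 4xx/5xx.
check_services() {
  local url failed=0
  for url in "$@"; do
    # -f turns HTTP 4xx/5xx into a failing exit code; the body is discarded.
    if curl -fsS -o /dev/null --max-time 10 "$url"; then
      echo "UP   $url"
    else
      echo "DOWN $url"
      failed=1
    fi
  done
  return "$failed"
}
```

A possible invocation after recovery is `check_services http://vishinator.synology.me:10000 https://git.vish.gg`, run once Portainer and Gitea are expected to be back.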
Portainer API Backup Commands
Export All Stack Definitions
#!/bin/bash
API_KEY=REDACTED_API_KEY
BASE_URL="http://vishinator.synology.me:10000"
OUTPUT_DIR="./portainer-export-$(date +%Y%m%d)"
mkdir -p "$OUTPUT_DIR"
# Get all stacks
curl -s -H "X-API-Key: $API_KEY" "$BASE_URL/api/stacks" | \
jq -r '.[] | "\(.Id) \(.Name) \(.EndpointId)"' | \
while read -r id name endpoint; do
echo "Exporting stack: $name (ID: $id)"
# The file endpoint returns JSON; the compose text is in StackFileContent
curl -s -H "X-API-Key: $API_KEY" \
"$BASE_URL/api/stacks/$id/file" | \
jq -r '.StackFileContent' > "$OUTPUT_DIR/${name}.yaml"
done
echo "Exported to $OUTPUT_DIR"
Export Endpoint Configuration
curl -s -H "X-API-Key: $API_KEY" \
"$BASE_URL/api/endpoints" | jq '.' > endpoints-backup.json
Automated Backup Schedule
| Backup Type | Frequency | Retention | Location |
|---|---|---|---|
| Portainer DB | Daily 3AM | 30 days | Atlantis NAS |
| Git repo mirror | Daily 4AM | Unlimited | Calypso NAS |
| Container volumes | Daily 5AM | 7 days local, 30 days remote | Atlantis NAS |
| Full export | Weekly Sunday | 4 weeks | Off-site (optional) |
Verification & Testing
Monthly Backup Test Checklist
- Verify Portainer backup file integrity
- Test restore to staging environment
- Verify Git repo clone works
- Test volume restore for one service
- Document any issues found
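The integrity step can be partly automated for the plain `.tar.gz` archives produced by the scripts in this document. This sketch assumes an unencrypted archive; a password-protected Portainer backup would need to be checked by restoring it instead. `verify_archive` is a hypothetical helper.

```shell
#!/bin/bash
# Hypothetical helper: check that a .tar.gz backup decompresses and lists entries.
verify_archive() {
  local archive="$1" entries
  # gzip -t tests the compressed stream without writing any output.
  if ! gzip -t "$archive" 2>/dev/null; then
    echo "corrupt or non-gzip file: $archive" >&2
    return 1
  fi
  # Count the members tar can list; tr strips padding some wc versions add.
  entries="$(tar -tzf "$archive" 2>/dev/null | wc -l | tr -d ' ')"
  if [ "$entries" -eq 0 ]; then
    echo "archive has no entries: $archive" >&2
    return 1
  fi
  echo "OK: $archive ($entries entries)"
}
```

A listing check catches truncated or empty uploads but is not a substitute for the staging restore in the checklist.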
Backup Monitoring
Add to Prometheus alerting:
- alert: BackupFailed
  expr: time() - backup_last_success_timestamp > 86400
  for: 1h
  labels:
    severity: warning
  annotations:
    summary: "Backup hasn't run in 24 hours"
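The rule depends on a `backup_last_success_timestamp` metric, which something has to publish. One option, assuming node_exporter runs on the backup host with `--collector.textfile.directory` pointing at a writable directory, is to write the metric at the end of each successful backup. The helper name `record_backup_success` and the file name `backup.prom` are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: publish the time of the last successful backup.
# Argument: the directory node_exporter's textfile collector reads (assumed setup).
record_backup_success() {
  local dir="$1" tmp
  # The temporary name does not end in .prom, so the collector ignores it.
  tmp="$(mktemp "$dir/backup.prom.XXXXXX")" || return 1
  {
    echo "# HELP backup_last_success_timestamp Unix time of the last successful backup."
    echo "# TYPE backup_last_success_timestamp gauge"
    echo "backup_last_success_timestamp $(date +%s)"
  } > "$tmp"
  # mktemp creates the file owner-only; make it readable for the exporter.
  chmod 644 "$tmp"
  # Rename within the same directory so a partial file is never scraped.
  mv "$tmp" "$dir/backup.prom"
}
```

Calling this as the last line of a backup script, only after every earlier step has succeeded, keeps the timestamp from advancing on failed runs.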
Quick Reference
Backup Locations
Atlantis:/volume1/backups/
├── portainer/ # Portainer DB backups
├── homelab-vm/ # Homelab VM volume backups
├── calypso/ # Calypso volume backups
└── git-mirrors/ # Git repository mirrors
Important Files
- Portainer API Key: ptr_REDACTED_PORTAINER_TOKEN
- Git repo: https://git.vish.gg/Vish/homelab
- Edge agent keys: Stored in Portainer (Settings → Environments)
Emergency Contacts
- Synology Support: 1-425-952-7900
- Portainer Support: https://www.portainer.io/support