Homelab Disaster Recovery Guide
🚨 Avoiding the Chicken and Egg Problem
This guide describes how to recover homelab services when some of the infrastructure they normally depend on is down, so that no recovery step requires a service that is itself offline.
🎯 Recovery Priority Order
Phase 1: Core Infrastructure (No Dependencies)
- Router/Network - Physical access required
- Calypso Server - Direct console/SSH access
- Basic Docker - Local container management
Phase 2: Essential Services (Minimal Dependencies)
- Nginx Proxy Manager - Enables external access
- Gitea - Code repository access
- DNS/DHCP - Network services
Phase 3: Application Services (Depends on Phase 1+2)
- Reactive Resume v5 - Depends on NPM for external access
- Other applications - Can be restored after core services
🔧 Emergency Access Methods
If Gitea is Down
# Access via direct IP (bypass DNS)
ssh Vish@192.168.0.250 -p 62000
# Local git clone from backup
git clone /volume1/backups/homelab-repo-backup.git
# Manual deployment from local files
scp -P 62000 docker-compose.yml Vish@192.168.0.250:/volume1/docker/service/
If NPM is Down
# Direct service access via IP:PORT
http://192.168.0.250:9751 # Reactive Resume
http://192.168.0.250:3000 # Gitea
http://192.168.0.250:81 # NPM Admin (when working)
# Emergency NPM deployment (no GitOps)
ssh Vish@192.168.0.250 -p 62000
sudo /usr/local/bin/docker run -d \
--name nginx-proxy-manager-emergency \
-p 8880:80 -p 8443:443 -p 81:81 \
-v /volume1/docker/nginx-proxy-manager/data:/data \
-v /volume1/docker/nginx-proxy-manager/letsencrypt:/etc/letsencrypt \
jc21/nginx-proxy-manager:latest
If DNS is Down
# Use IP addresses directly
192.168.0.250 # Calypso
192.168.0.1 # Router
8.8.8.8 # Google DNS
# Edit local hosts file (needs root; a plain ">>" redirect is not covered by sudo)
echo "192.168.0.250 calypso.local git.local" | sudo tee -a /etc/hosts
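Appending to the hosts file on every recovery attempt leaves duplicate entries behind. A minimal sketch of an idempotent variant follows; the function name `add_hosts_entry` is hypothetical and not part of this repository, and the file path is a parameter so the helper can be tried against a scratch file before touching `/etc/hosts`.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not an existing script in this repo):
# append a hosts entry only if the exact line is not already present.
add_hosts_entry() {
    hosts_file=$1
    shift
    entry=$*    # remaining arguments joined with spaces: "IP name1 name2 ..."
    # -q quiet, -x whole-line match, -F fixed string (no regex interpretation)
    if ! grep -qxF "$entry" "$hosts_file" 2>/dev/null; then
        printf '%s\n' "$entry" >> "$hosts_file"
    fi
}
```

Run as root against the real file, for example `add_hosts_entry /etc/hosts 192.168.0.250 calypso.local git.local`; repeating the call leaves the file unchanged.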
📦 Offline Deployment Packages
Create Emergency Deployment Kit
# Create offline deployment package
mkdir -p /volume1/backups/emergency-kit
cd /home/homelab/organized/repos/homelab
# Package NPM deployment
tar -czf /volume1/backups/emergency-kit/npm-deployment.tar.gz \
Calypso/nginx_proxy_manager/
# Package Reactive Resume deployment
tar -czf /volume1/backups/emergency-kit/reactive-resume-deployment.tar.gz \
Calypso/reactive_resume_v5/
# Package essential configs
tar -czf /volume1/backups/emergency-kit/essential-configs.tar.gz \
Calypso/*.yaml Calypso/*.yml
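An emergency kit is only useful if its archives are intact when they are needed. The sketch below records checksums next to the archives and verifies them later. It assumes `sha256sum` is available on the host; the function names and the `SHA256SUMS` file name are illustrative choices, not something the existing scripts create.

```shell
#!/bin/sh
# Hypothetical helpers (assumptions, not existing scripts in this repo).

# Record a SHA-256 checksum for every archive in the kit directory.
write_kit_manifest() {
    kit_dir=$1
    ( cd "$kit_dir" && sha256sum *.tar.gz > SHA256SUMS )
}

# Re-check the archives against the recorded checksums.
# Returns non-zero if any archive is missing or has changed.
verify_kit_manifest() {
    kit_dir=$1
    ( cd "$kit_dir" && sha256sum -c SHA256SUMS )
}
```

Calling `write_kit_manifest /volume1/backups/emergency-kit` right after packaging, and `verify_kit_manifest` on the same directory before extracting, would catch a truncated or corrupted archive before a deployment is attempted.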
Use Emergency Kit
# Extract and deploy without Git
ssh Vish@192.168.0.250 -p 62000
cd /volume1/backups/emergency-kit
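# Deploy NPM first (archives unpack under Calypso/, the path they were created with)
tar -xzf npm-deployment.tar.gz
cd Calypso/nginx_proxy_manager
chmod +x deploy.sh
./deploy.sh deploy
# Deploy Reactive Resume
cd /volume1/backups/emergency-kit
tar -xzf reactive-resume-deployment.tar.gz
cd Calypso/reactive_resume_v5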
chmod +x deploy.sh
./deploy.sh deploy
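Because a kit archive unpacks under whatever relative path it was created with, it can help to locate `deploy.sh` after extraction instead of hard-coding the directory. The following is a sketch only; `extract_service` is a hypothetical name, and it assumes each archive contains exactly one `deploy.sh`.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not an existing script in this repo):
# extract a kit archive into dest and print the directory holding deploy.sh.
extract_service() {
    archive=$1
    dest=$2
    # Listing the archive first fails early on a truncated or corrupt file
    tar -tzf "$archive" >/dev/null || return 1
    mkdir -p "$dest" || return 1
    tar -xzf "$archive" -C "$dest" || return 1
    script=$(find "$dest" -type f -name deploy.sh | head -n 1)
    [ -n "$script" ] || return 1
    dirname "$script"
}
```

A deployment could then run as `cd "$(extract_service npm-deployment.tar.gz /tmp/kit-npm)" && chmod +x deploy.sh && ./deploy.sh deploy`, where `/tmp/kit-npm` is an arbitrary scratch directory.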
🔄 Service Dependencies Map
Internet Access
↓
Router (Physical)
↓
Calypso Server (SSH: 192.168.0.250:62000)
↓
Docker Engine (Local)
↓
┌───────────────────┬───────────────────┐
│ NPM (Port 81)     │ Gitea (Port 3000) │ ← Independent services
└───────────────────┴───────────────────┘
          ↓                   ↓
   External Access     Code Repository
          ↓                   ↓
 Reactive Resume v5 ← GitOps Deployment
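The map can be walked top to bottom to find the first broken layer, since anything below a failed layer cannot be trusted. A minimal sketch, assuming each layer can be probed with a single shell command; `check_chain` and the `name=command` argument format are illustrative, and the probe commands in the trailing comments are suggestions, not tested against this network.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not an existing script in this repo):
# run one probe per layer, in order, and stop at the first failure.
# Each argument has the form "name=command".
check_chain() {
    for layer in "$@"; do
        name=${layer%%=*}   # text before the first "="
        cmd=${layer#*=}     # text after the first "="
        if sh -c "$cmd" >/dev/null 2>&1; then
            echo "OK $name"
        else
            echo "FAIL $name"
            return 1
        fi
    done
}

# Possible invocation using the addresses from this guide:
#   check_chain \
#     "router=ping -c 1 192.168.0.1" \
#     "calypso-ssh=nc -z -w 2 192.168.0.250 62000" \
#     "npm-admin=curl -fsI http://192.168.0.250:81" \
#     "gitea=curl -fsI http://192.168.0.250:3000"
```

The first `FAIL` line names the layer to repair first, which mirrors the recovery priority order.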
🚀 Bootstrap Procedures
Complete Infrastructure Loss
- Physical Access: Console to Calypso
- Network Setup: Configure static IP if DHCP down
- Docker Start: sudo systemctl start docker
- Manual NPM: Deploy NPM container directly
- Git Access: Clone from backup or external source
- GitOps Resume: Use deployment scripts
Partial Service Loss
# If only applications are down (NPM working)
cd /home/homelab/organized/repos/homelab/Calypso/reactive_resume_v5
./deploy.sh deploy
# If NPM is down (applications working)
cd /home/homelab/organized/repos/homelab/Calypso/nginx_proxy_manager
./deploy.sh deploy
# If Git is down (use local backup)
mkdir -p /tmp/homelab-recovery
cp -r /volume1/backups/homelab-latest/* /tmp/homelab-recovery/
cd /tmp/homelab-recovery/Calypso/reactive_resume_v5
./deploy.sh deploy
📋 Recovery Checklists
NPM Recovery Checklist
- Calypso server accessible via SSH
- Docker service running
- Port 81 available for admin UI
- Ports 8880/8443 available for proxy
- Data directory exists: /volume1/docker/nginx-proxy-manager/data
- SSL certificates preserved: /volume1/docker/nginx-proxy-manager/letsencrypt
- Router port forwarding: 80→8880, 443→8443
Reactive Resume Recovery Checklist
- NPM deployed and healthy
- Database directory exists: /volume1/docker/rxv5/db
- Storage directory exists: /volume1/docker/rxv5/seaweedfs
- Ollama directory exists: /volume1/docker/rxv5/ollama
- SMTP credentials available
- External domain resolving: nslookup rx.vish.gg
- NPM proxy hosts configured
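The directory items in both checklists can be checked in one pass before any container is started. This is a sketch under the assumption that a missing directory should block the deployment; `check_dirs` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not an existing script in this repo):
# report each required directory and return non-zero if any is missing.
check_dirs() {
    missing=0
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            echo "ok $dir"
        else
            echo "MISSING $dir"
            missing=1
        fi
    done
    return "$missing"
}
```

For the NPM checklist this could be called as `check_dirs /volume1/docker/nginx-proxy-manager/data /volume1/docker/nginx-proxy-manager/letsencrypt`, and for Reactive Resume with the three `/volume1/docker/rxv5/` paths listed above.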
🔐 Emergency Credentials
Default Service Credentials
# NPM Default (change immediately)
Email: admin@example.com
Password: "REDACTED_PASSWORD"
# Database Credentials (from compose)
User: resumeuser
Password: "REDACTED_PASSWORD"
Database: resume
# SMTP (from environment)
User: your-email@example.com
Password: "REDACTED_PASSWORD" # Stored in compose file
SSH Access
# Primary access
ssh Vish@192.168.0.250 -p 62000
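# If key authentication fails, fall back to password login.
# This only works if password auth is already enabled in sshd (PasswordAuthentication yes);
# otherwise use the physical console.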
📞 Emergency Contacts & Resources
External Resources (No Local Dependencies)
- Docker Hub: https://hub.docker.com/
- Ollama Models: https://ollama.ai/library
- GitHub Backup: https://github.com/yourusername/homelab-backup
- Documentation: This file (print/save offline)
Recovery Commands Reference
# Check what's running
sudo /usr/local/bin/docker ps -a
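# Emergency container cleanup
# Destructive: removes ALL stopped containers, unused networks, and unused images.
# Avoid while Docker Hub is unreachable, since removed images cannot be re-pulled.
sudo /usr/local/bin/docker system prune -af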
# Network troubleshooting
ping 8.8.8.8
nslookup rx.vish.gg
curl -I http://192.168.0.250:81
# Service health checks
curl http://192.168.0.250:9751/health
curl http://192.168.0.250:11434/api/tags
🎯 Prevention Strategies
Regular Backups
# Weekly automated backup
0 2 * * 0 /usr/local/bin/backup-homelab.sh
# Backup script creates:
# - Git repository backup
# - Docker volume backups
# - Configuration exports
# - Emergency deployment kits
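The contents of `/usr/local/bin/backup-homelab.sh` are not shown in this guide. As an illustration of the configuration-export step only, a timestamped archive helper might look like the sketch below; `archive_dir`, its arguments, and the naming scheme are all assumptions.

```shell
#!/bin/sh
# Hypothetical helper (an assumption; the real backup script is not shown here):
# archive one directory into dest_dir as <label>-<timestamp>.tar.gz
# and print the path of the archive that was written.
archive_dir() {
    src=$1
    dest_dir=$2
    label=$3
    stamp=$(date +%Y%m%d-%H%M%S)
    mkdir -p "$dest_dir" || return 1
    archive="$dest_dir/$label-$stamp.tar.gz"
    # -C keeps paths inside the archive relative to the parent of src
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    echo "$archive"
}
```

A call such as `archive_dir /volume1/docker/nginx-proxy-manager /volume1/backups npm-data` would produce one dated archive per run, which keeps earlier backups from being overwritten.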
Health Monitoring
# Daily health checks
0 8 * * * /usr/local/bin/health-check.sh
# Alerts on:
# - Service failures
# - Disk space issues
# - Network connectivity problems
# - SSL certificate expiration
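The contents of `/usr/local/bin/health-check.sh` are likewise not shown. The disk space alert could be sketched as follows, assuming a simple percentage threshold on one mount point; both function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers (assumptions; the real health-check script is not shown here).

# Print the used-space percentage (digits only) of the filesystem holding a path.
disk_usage_percent() {
    # df -P prints one POSIX-format line per filesystem; field 5 is "NN%"
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed when usage is at or above the threshold (a whole-number percentage).
disk_over_threshold() {
    usage=$(disk_usage_percent "$1")
    [ -n "$usage" ] || return 2
    [ "$usage" -ge "$2" ]
}
```

In a cron-driven check, `disk_over_threshold /volume1 90 && echo "disk usage high on /volume1"` would emit a line only when the volume is at least 90% full; the 90 is an arbitrary example value.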
Documentation Maintenance
- Keep this file updated with any infrastructure changes
- Test recovery procedures quarterly
- Maintain offline copies of critical documentation
- Document any custom configurations, and record where passwords are stored
Last Updated: 2026-02-16
Tested: Recovery procedures verified
Next Review: 2026-05-16