# Homelab Disaster Recovery Guide

## 🚨 Avoiding the Chicken and Egg Problem

This guide describes how to recover homelab services even when part of the infrastructure they normally depend on (Git, the reverse proxy, DNS) is down.

## 🎯 Recovery Priority Order

### Phase 1: Core Infrastructure (No Dependencies)

1. **Router/Network** - Physical access required
2. **Calypso Server** - Direct console/SSH access
3. **Basic Docker** - Local container management

### Phase 2: Essential Services (Minimal Dependencies)

1. **Nginx Proxy Manager** - Enables external access
2. **Gitea** - Code repository access
3. **DNS/DHCP** - Network services

### Phase 3: Application Services (Depends on Phase 1+2)

1. **Reactive Resume v5** - Depends on NPM for external access
2. **Other applications** - Can be restored after core services

## 🔧 Emergency Access Methods

### If Gitea is Down

```bash
# Access via direct IP (bypass DNS)
ssh Vish@192.168.0.250 -p 62000

# Local git clone from backup
git clone /volume1/backups/homelab-repo-backup.git

# Manual deployment from local files
scp -P 62000 docker-compose.yml Vish@192.168.0.250:/volume1/docker/service/
```

### If NPM is Down

```bash
# Direct service access via IP:PORT (open in a browser)
#   http://192.168.0.250:9751   Reactive Resume
#   http://192.168.0.250:3000   Gitea
#   http://192.168.0.250:81     NPM Admin (when working)

# Emergency NPM deployment (no GitOps)
ssh Vish@192.168.0.250 -p 62000
sudo /usr/local/bin/docker run -d \
  --name nginx-proxy-manager-emergency \
  -p 8880:80 -p 8443:443 -p 81:81 \
  -v /volume1/docker/nginx-proxy-manager/data:/data \
  -v /volume1/docker/nginx-proxy-manager/letsencrypt:/etc/letsencrypt \
  jc21/nginx-proxy-manager:latest
```

### If DNS is Down

```bash
# Use IP addresses directly
#   192.168.0.250   Calypso
#   192.168.0.1     Router
#   8.8.8.8         Google DNS

# Add entries to the local hosts file (writing to /etc/hosts requires root)
echo "192.168.0.250 calypso.local git.local" | sudo tee -a /etc/hosts
```

## 📦 Offline Deployment Packages

### Create Emergency Deployment Kit

```bash
# Create offline deployment package
mkdir -p /volume1/backups/emergency-kit
cd /home/homelab/organized/repos/homelab

# Package NPM deployment
# (-C Calypso keeps the Calypso/ prefix out of the archive, so it
#  extracts to nginx_proxy_manager/ as the "Use Emergency Kit" steps expect)
tar -czf /volume1/backups/emergency-kit/npm-deployment.tar.gz \
  -C Calypso nginx_proxy_manager/

# Package Reactive Resume deployment
tar -czf /volume1/backups/emergency-kit/reactive-resume-deployment.tar.gz \
  -C Calypso reactive_resume_v5/

# Package essential configs
tar -czf /volume1/backups/emergency-kit/essential-configs.tar.gz \
  Calypso/*.yaml Calypso/*.yml
```

### Use Emergency Kit

```bash
# Extract and deploy without Git
ssh Vish@192.168.0.250 -p 62000
cd /volume1/backups/emergency-kit

# Deploy NPM first
tar -xzf npm-deployment.tar.gz
cd nginx_proxy_manager
chmod +x deploy.sh
./deploy.sh deploy

# Deploy Reactive Resume
cd ../
tar -xzf reactive-resume-deployment.tar.gz
cd reactive_resume_v5
chmod +x deploy.sh
./deploy.sh deploy
```

## 🔄 Service Dependencies Map

```
Internet Access
      ↓
Router (Physical)
      ↓
Calypso Server (SSH: 192.168.0.250:62000)
      ↓
Docker Engine (Local)
      ↓
┌─────────────────┬───────────────────┐
│ NPM (Port 81)   │ Gitea (Port 3000) │  ← Independent services
└─────────────────┴───────────────────┘
      ↓                   ↓
External Access     Code Repository
      ↓                   ↓
Reactive Resume v5  ← GitOps Deployment
```

## 🚀 Bootstrap Procedures

### Complete Infrastructure Loss

1. **Physical Access**: Console to Calypso
2. **Network Setup**: Configure static IP if DHCP down
3. **Docker Start**: `sudo systemctl start docker`
4. **Manual NPM**: Deploy NPM container directly
5. **Git Access**: Clone from backup or external source
6. **GitOps Resume**: Use deployment scripts

### Partial Service Loss

```bash
# If only applications are down (NPM working)
cd /home/homelab/organized/repos/homelab/Calypso/reactive_resume_v5
./deploy.sh deploy

# If NPM is down (applications working)
cd /home/homelab/organized/repos/homelab/Calypso/nginx_proxy_manager
./deploy.sh deploy

# If Git is down (use local backup)
mkdir -p /tmp/homelab-recovery   # cp fails if the target directory is missing
cp -r /volume1/backups/homelab-latest/* /tmp/homelab-recovery/
cd /tmp/homelab-recovery/Calypso/reactive_resume_v5
./deploy.sh deploy
```

## 📋 Recovery Checklists

### NPM Recovery Checklist

- [ ] Calypso server accessible via SSH
- [ ] Docker service running
- [ ] Port 81 available for admin UI
- [ ] Ports 8880/8443 available for proxy
- [ ] Data directory exists: `/volume1/docker/nginx-proxy-manager/data`
- [ ] SSL certificates preserved: `/volume1/docker/nginx-proxy-manager/letsencrypt`
- [ ] Router port forwarding: 80→8880, 443→8443

### Reactive Resume Recovery Checklist

- [ ] NPM deployed and healthy
- [ ] Database directory exists: `/volume1/docker/rxv5/db`
- [ ] Storage directory exists: `/volume1/docker/rxv5/seaweedfs`
- [ ] Ollama directory exists: `/volume1/docker/rxv5/ollama`
- [ ] SMTP credentials available
- [ ] External domain resolving: `nslookup rx.vish.gg`
- [ ] NPM proxy hosts configured

## 🔐 Emergency Credentials

### Default Service Credentials

```bash
# NPM Default (change immediately)
Email: admin@example.com
Password: "REDACTED_PASSWORD"

# Database Credentials (from compose)
User: resumeuser
Password: "REDACTED_PASSWORD"
Database: resume

# SMTP (from environment)
User: your-email@example.com
Password: "REDACTED_PASSWORD"  # Stored in compose file
```

### SSH Access

```bash
# Primary access
ssh Vish@192.168.0.250 -p 62000

# If SSH key fails, use password
# Ensure password auth is enabled in emergency
```

## 📞 Emergency Contacts & Resources

### External Resources (No Local Dependencies)

- **Docker Hub**: https://hub.docker.com/
- **Ollama Models**: https://ollama.ai/library
- **GitHub Backup**: https://github.com/yourusername/homelab-backup
- **Documentation**: This file (print/save offline)

### Recovery Commands Reference

```bash
# Check what's running
sudo /usr/local/bin/docker ps -a

# Emergency container cleanup
# Caution: removes ALL stopped containers and unused images; volumes are kept
sudo /usr/local/bin/docker system prune -af

# Network troubleshooting
ping 8.8.8.8
nslookup rx.vish.gg
curl -I http://192.168.0.250:81

# Service health checks
curl http://192.168.0.250:9751/health
curl http://192.168.0.250:11434/api/tags
```

## 🎯 Prevention Strategies

### Regular Backups

```bash
# Weekly automated backup (crontab entry: Sundays at 02:00)
0 2 * * 0 /usr/local/bin/backup-homelab.sh

# Backup script creates:
# - Git repository backup
# - Docker volume backups
# - Configuration exports
# - Emergency deployment kits
```

### Health Monitoring

```bash
# Daily health checks (crontab entry: every day at 08:00)
0 8 * * * /usr/local/bin/health-check.sh

# Alerts on:
# - Service failures
# - Disk space issues
# - Network connectivity problems
# - SSL certificate expiration
```

### Documentation Maintenance

- Keep this file updated with any infrastructure changes
- Test recovery procedures quarterly
- Maintain offline copies of critical documentation
- Document any custom configurations or passwords

---

**Last Updated**: 2026-02-16

**Tested**: Recovery procedures verified

**Next Review**: 2026-05-16
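
The individual `curl` health checks in the Recovery Commands Reference can be combined into one pass that probes the services and reports which ones are down. The following is a minimal sketch, not the contents of `/usr/local/bin/health-check.sh`: the function names `probe`, `report`, and `run_checks` are hypothetical, the endpoints are the ones listed in this guide, and it is an assumption that each endpoint answers with a non-error HTTP status when the service is healthy.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; URLs come from this guide.

# probe URL: succeed only if the endpoint returns a non-error HTTP status
# within 5 seconds (--fail makes curl exit non-zero on HTTP >= 400).
probe() {
  curl --fail --silent --output /dev/null --max-time 5 "$1"
}

# report NAME URL: print one status line; return 1 if the probe failed.
report() {
  if probe "$2"; then
    printf 'OK    %s (%s)\n' "$1" "$2"
  else
    printf 'DOWN  %s (%s)\n' "$1" "$2"
    return 1
  fi
}

# run_checks: probe each service and return the number of failures.
run_checks() {
  failures=0
  report "NPM admin"       "http://192.168.0.250:81"             || failures=$((failures + 1))
  report "Gitea"           "http://192.168.0.250:3000"           || failures=$((failures + 1))
  report "Reactive Resume" "http://192.168.0.250:9751/health"    || failures=$((failures + 1))
  report "Ollama"          "http://192.168.0.250:11434/api/tags" || failures=$((failures + 1))
  return "$failures"
}
```

Calling `run_checks` after sourcing the file prints one line per service, and its exit status is the number of services that did not respond, so a non-zero status can drive whatever alerting is in place.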