# 🚨 Disaster Recovery Guide

**🔴 Advanced Guide**

This guide covers critical disaster recovery scenarios for your homelab, including complete router failure, network reconfiguration, and service restoration procedures.

## 🎯 Disaster Scenarios Covered

1. **🔥 Router Failure** - Complete router replacement and reconfiguration
2. **🌐 Network Reconfiguration** - ISP changes, subnet changes, IP conflicts
3. **🔌 Power Outage Recovery** - Bringing services back online in correct order
4. **💾 Storage Failure** - Data recovery and service restoration
5. **🔐 Password Manager Outage** - Accessing credentials when Vaultwarden is down

---

## 🔥 Router Failure Recovery

### 📋 **Pre-Disaster Preparation (Do This Now!)**

#### 1. **Document Current Network Configuration**

```bash
# Create network documentation file
mkdir -p ~/homelab-recovery
cat > ~/homelab-recovery/network-config.md << 'EOF'
# Network Configuration Backup

## Router Information
- **Model**: [Your Router Model]
- **Firmware**: [Version]
- **Admin URL**: http://192.168.1.1
- **Admin User**: admin
- **Admin Password**: [Document in password manager]

## Network Settings
- **WAN Type**: DHCP / Static / PPPoE
- **ISP Settings**: [Document ISP-specific settings]
- **Subnet**: 192.168.1.0/24
- **DHCP Range**: 192.168.1.100-192.168.1.200
- **DNS Servers**: 1.1.1.1, 8.8.8.8

## Static IP Assignments
EOF

# Document all static IPs: append one line per host under the heading above, e.g.
echo "- atlantis.vish.local: 192.168.1.100 (MAC: [Document MAC])" >> ~/homelab-recovery/network-config.md
```

#### 2. **Export Router Configuration**

```bash
# Most routers allow config export
# Log in to the router web interface
# Look for: System → Backup/Restore → Export Configuration
# Save to: ~/homelab-recovery/router-backup-$(date +%Y%m%d).bin
```

#### 3. **Document Port Forwarding Rules**

```bash
cat > ~/homelab-recovery/port-forwarding.md << 'EOF'
# Port Forwarding Rules

## Essential Services
| External Port | Internal IP | Internal Port | Protocol | Service |
|---------------|-------------|---------------|----------|---------|
| 51820 | 192.168.1.100 | 51820 | UDP | WireGuard (Atlantis) |
| 51821 | 192.168.1.102 | 51820 | UDP | WireGuard (Concord) |
| 80 | 192.168.1.100 | 8341 | TCP | HTTP (Nginx Proxy) |
| 443 | 192.168.1.100 | 8766 | TCP | HTTPS (Nginx Proxy) |

## Gaming Services (Optional)
| External Port | Internal IP | Internal Port | Protocol | Service |
|---------------|-------------|---------------|----------|---------|
| 7777 | 192.168.1.103 | 7777 | TCP/UDP | Satisfactory |
| 27015 | 192.168.1.103 | 27015 | TCP/UDP | L4D2 Server |

## Dynamic DNS
- **Service**: [Your DDNS Provider]
- **Hostname**: vishinator.synology.me
- **Update URL**: [Document update mechanism]
EOF
```

### 🛠️ **Router Replacement Procedure**

#### **Step 1: Physical Setup**

```bash
# 1. Connect new router to modem
# 2. Connect computer directly to router via Ethernet
# 3. Power on router and wait for boot (2-3 minutes)
```

#### **Step 2: Basic Network Configuration**

```bash
# Access router admin interface
# Default is usually: http://192.168.1.1 or http://192.168.0.1
# For TP-Link Archer BE800 v1.6: http://192.168.0.1 or http://tplinkwifi.net
# Default login: admin/admin

# If different subnet, find router IP:
ip route | grep default
# or
arp -a | grep -E "(router|gateway)"
```

**Router Configuration Checklist:**

```bash
# ✅ Set admin password (use password manager)
# ✅ Configure WAN connection (DHCP/Static/PPPoE)
# ✅ Set WiFi SSID and password
# ✅ Configure subnet: 192.168.1.0/24
# ✅ Set DHCP range: 192.168.1.100-192.168.1.200
# ✅ Configure DNS servers: 1.1.1.1, 8.8.8.8
# ✅ Enable UPnP (if needed)
# ✅ Disable WPS (security)
```

**📖 For TP-Link Archer BE800 v1.6 specific instructions, see: [TP-Link Archer BE800 Setup Guide](../infrastructure/tplink-archer-be800-setup.md)**

#### **Step 3: Static IP Assignment**

**Critical Static IPs (Configure First):**

```bash
# In router DHCP reservation settings:

# Primary Infrastructure
atlantis.vish.local      → 192.168.1.100  # MAC: [Document MAC]
calypso.vish.local       → 192.168.1.101  # MAC: [Document MAC]
concord-nuc.vish.local   → 192.168.1.102  # MAC: [Document MAC]

# Virtual Machines
homelab-vm.vish.local    → 192.168.1.103  # MAC: [Document MAC]
chicago-vm.vish.local    → 192.168.1.104  # MAC: [Document MAC]
bulgaria-vm.vish.local   → 192.168.1.105  # MAC: [Document MAC]

# Specialized Hosts
anubis.vish.local        → 192.168.1.106  # MAC: [Document MAC]
guava.vish.local         → 192.168.1.107  # MAC: [Document MAC]
setillo.vish.local       → 192.168.1.108  # MAC: [Document MAC]

# Raspberry Pi Cluster
rpi-vish.vish.local      → 192.168.1.109  # MAC: [Document MAC]
rpi-kevin.vish.local     → 192.168.1.110  # MAC: [Document MAC]

# Edge Devices
nvidia-shield.vish.local → 192.168.1.111  # MAC: [Document MAC]
```

**Find MAC Addresses:**

```bash
# On each host, run:
ip link show | grep -E "(ether|link)"
# or (replace eth0 with the host's interface name)
cat /sys/class/net/eth0/address

# From router, check DHCP client list
# Or use network scanner:
nmap -sn 192.168.1.0/24
arp -a
```

#### **Step 4: Port Forwarding Configuration**

**Essential Port Forwards (Configure Immediately):**

```bash
# VPN Access (Highest Priority)
External: 51820/UDP → Internal: 192.168.1.100:51820 (Atlantis WireGuard)
External: 51821/UDP → Internal: 192.168.1.102:51820 (Concord WireGuard)

# Web Services (If needed)
External: 80/TCP → Internal: 192.168.1.100:8341 (HTTP)
External: 443/TCP → Internal: 192.168.1.100:8766 (HTTPS)
```

**Gaming Services (If hosting public games):**

```bash
# Satisfactory Server
External: 7777/TCP → Internal: 192.168.1.103:7777
External: 7777/UDP → Internal: 192.168.1.103:7777

# Left 4 Dead 2 Server
External: 27015/TCP → Internal: 192.168.1.103:27015
External: 27015/UDP → Internal: 192.168.1.103:27015
External: 27020/UDP → Internal: 192.168.1.103:27020
External: 27005/UDP → Internal: 192.168.1.103:27005
```

#### **Step 5: Dynamic DNS Configuration**

**Update DDNS Settings:**

```bash
# Method 1: Router Built-in DDNS
# Configure in router: Advanced → Dynamic DNS
# Service: [Your provider]
# Hostname: vishinator.synology.me
# Username: [Your DDNS username]
# Password: [Your DDNS password]

# Method 2: Manual Update (if router doesn't support your provider)
# SSH to a homelab host and run:
curl -u "username:password" \
  "https://your-ddns-provider.com/update?hostname=vishinator.synology.me&myip=$(curl -s ifconfig.me)"
```

**Test DDNS:**

```bash
# Wait 5-10 minutes, then test:
nslookup vishinator.synology.me
dig vishinator.synology.me

# Should return your new external IP
curl ifconfig.me  # Compare with DDNS result
```

### 🔧 **Service Recovery Order**

**Phase 1: Core Infrastructure (First 30 minutes)**

```bash
# 1. Verify network connectivity
ping 8.8.8.8
ping google.com

# 2. Check all hosts are reachable
ping atlantis.vish.local
ping calypso.vish.local
ping concord-nuc.vish.local

# 3. Verify DNS resolution
nslookup atlantis.vish.local
```

**Phase 2: Essential Services (Next 30 minutes)**

```bash
# 4. Check VPN services
# Test WireGuard from external device
# Verify Tailscale connectivity

# 5. Verify password manager
curl -I https://atlantis.vish.local:8222  # Vaultwarden

# 6. Check monitoring
curl -I https://atlantis.vish.local:3000  # Grafana
curl -I https://atlantis.vish.local:3001  # Uptime Kuma
```

**Phase 3: Media and Applications (Next hour)**

```bash
# 7. Media services
curl -I https://atlantis.vish.local:32400  # Plex
curl -I https://calypso.vish.local:2283    # Immich

# 8. Communication services
curl -I https://homelab-vm.vish.local:8065  # Mattermost

# 9. Development services
curl -I https://atlantis.vish.local:8929  # GitLab
```

### 📱 **Mobile Hotspot Emergency Access**

If your internet is down but you need to configure the router:

```bash
# 1. Connect phone to new router WiFi
# 2. Enable mobile hotspot on another device
# 3. Connect computer to mobile hotspot
# 4. Access router via: http://192.168.1.1
# 5. Configure WAN settings to use mobile hotspot temporarily
```

---

## 🌐 Network Reconfiguration Scenarios

### **ISP Changes (New Modem/Different Settings)**

#### **Scenario 1: New Cable Modem**

```bash
# 1. Connect new modem to router WAN port
# 2. Power cycle both devices (modem first, then router)
# 3. Check WAN connection in router interface
# 4. Update DDNS if external IP changed
# 5. Test port forwarding from external network
```

#### **Scenario 2: Fiber Installation**

```bash
# 1. Configure router for new connection type
# 2. May need PPPoE credentials from ISP
# 3. Update MTU settings if required (usually 1500 for fiber; 1492 if the connection uses PPPoE)
# 4. Test speed and latency
# 5. Update monitoring dashboards with new metrics
```

#### **Scenario 3: Subnet Change Required**

```bash
# If you need to change from 192.168.1.x to different subnet:

# 1. Plan new IP scheme
# Old: 192.168.1.0/24
# New: 192.168.2.0/24 (example)

# 2. Update router DHCP settings
# 3. Update static IP reservations
# 4. Update all service configurations
# 5. Update Tailscale subnet routes
# 6. Update monitoring configurations
# 7. Update documentation
```

### **IP Conflict Resolution**

```bash
# If new router uses different default subnet:

# 1. Identify conflicts
nmap -sn 192.168.0.0/24  # Scan new subnet
nmap -sn 192.168.1.0/24  # Scan old subnet

# 2. Choose resolution strategy:
# Option A: Change router to use 192.168.1.x
# Option B: Reconfigure all devices for new subnet

# 3. Update all static configurations
# 4. Update firewall rules
# 5. Update service discovery
```

---

## 🔌 Power Outage Recovery

### **Startup Sequence (Critical Order)**

```bash
# Phase 1: Infrastructure (0-5 minutes)
# 1. Modem/Internet connection
# 2. Router/Switch
# 3. NAS devices (Atlantis, Calypso) - these take longest to boot

# Phase 2: Core Services (5-10 minutes)
# 4. Primary compute hosts (concord-nuc)
# 5. Virtual machine hosts

# Phase 3: Applications (10-15 minutes)
# 6. Raspberry Pi devices
# 7. Edge devices
# 8. Verify all services are running
```

**Automated Startup Script:**

```bash
#!/bin/bash
# ~/homelab-recovery/startup-sequence.sh

echo "🔌 Starting homelab recovery sequence..."

# Wait for network
echo "⏳ Waiting for network connectivity..."
while ! ping -c 1 8.8.8.8 >/dev/null 2>&1; do
    sleep 5
done
echo "✅ Network is up"

# Check each host
hosts=(
    "atlantis.vish.local"
    "calypso.vish.local"
    "concord-nuc.vish.local"
    "homelab-vm.vish.local"
    "chicago-vm.vish.local"
    "bulgaria-vm.vish.local"
)

for host in "${hosts[@]}"; do
    echo "🔍 Checking $host..."
    if ping -c 1 "$host" >/dev/null 2>&1; then
        echo "✅ $host is responding"
    else
        echo "❌ $host is not responding"
    fi
done

echo "🎯 Recovery sequence complete"
```

---

## 💾 Storage Failure Recovery

### **Backup Verification**

```bash
# Before disaster strikes, verify backups exist:

# 1. Docker volume backups
ls -la /volume1/docker/*/
du -sh /volume1/docker/*/

# 2. Configuration backups
find ~/homelab-recovery -name "*.yml" -o -name "*.yaml"

# 3. Database backups
ls -la /volume1/docker/*/backup/
ls -la /volume1/docker/*/db_backup/
```

### **Service Restoration Priority**

```bash
# 1. Password Manager (Vaultwarden) - Need passwords for everything else
# 2. DNS/DHCP (Pi-hole) - Network services
# 3. Monitoring (Grafana/Prometheus) - Visibility into recovery
# 4. VPN (WireGuard/Tailscale) - Remote access
# 5. Media services - Lower priority
# 6. Development services - Lowest priority
```

---

## 🔧 Emergency Toolkit

### **Essential Recovery Files**

Create and maintain these files:

```bash
# Create recovery directory
mkdir -p ~/homelab-recovery/{configs,scripts,docs,backups}

# The paths below are the files to maintain, not commands to run

# Network configuration
~/homelab-recovery/docs/network-config.md
~/homelab-recovery/docs/port-forwarding.md
~/homelab-recovery/docs/static-ips.md

# Service configurations
~/homelab-recovery/configs/docker-compose-essential.yml
~/homelab-recovery/configs/nginx-proxy-manager.conf
~/homelab-recovery/configs/wireguard-configs/

# Recovery scripts
~/homelab-recovery/scripts/startup-sequence.sh
~/homelab-recovery/scripts/test-connectivity.sh
~/homelab-recovery/scripts/restore-services.sh

# Backup files
~/homelab-recovery/backups/router-config-$(date +%Y%m%d).bin
~/homelab-recovery/backups/vaultwarden-backup.json
~/homelab-recovery/backups/essential-passwords.txt.gpg
```

### **Emergency Contact Information**

```bash
cat > ~/homelab-recovery/docs/emergency-contacts.md << 'EOF'
# Emergency Contacts

## ISP Support
- **Provider**: [Your ISP]
- **Phone**: [Support number]
- **Account**: [Account number]
- **Service Address**: [Your address]

## Hardware Vendors
- **Router**: [Manufacturer support]
- **NAS**: Synology Support
- **Server**: [Hardware vendor]

## Service Providers
- **Domain Registrar**: [Your registrar]
- **DDNS Provider**: [Your DDNS service]
- **Cloud Backup**: [Your backup service]
EOF
```

### **Quick Reference Commands**

```bash
# Network diagnostics
ping 8.8.8.8                 # Internet connectivity
nslookup google.com          # DNS resolution
ip route                     # Routing table
arp -a                       # ARP table
netstat -rn                  # Network routes

# Service checks
docker ps                    # Running containers
systemctl status tailscaled  # Tailscale status
systemctl status docker      # Docker status

# Port checks
nmap -p 22,80,443,51820 localhost
telnet hostname port
nc -zv hostname port
```

---

## 📋 Recovery Checklists

### **🔥 Router Failure Checklist**

```bash
☐ Physical setup (modem → router → computer)
☐ Access router admin interface
☐ Configure basic settings (SSID, password, subnet)
☐ Set static IP reservations for all hosts
☐ Configure port forwarding rules
☐ Update DDNS settings
☐ Test VPN connectivity
☐ Verify all services accessible
☐ Update documentation with any changes
☐ Test from external network
```

### **🌐 Network Change Checklist**

```bash
☐ Document old configuration
☐ Plan new IP scheme
☐ Update router settings
☐ Update static IP reservations
☐ Update service configurations
☐ Update Tailscale subnet routes
☐ Update monitoring dashboards
☐ Update documentation
☐ Test all services
☐ Update backup scripts
```

### **🔌 Power Outage Checklist**

```bash
☐ Wait for stable power (use UPS if available)
☐ Start devices in correct order
☐ Verify network connectivity
☐ Check all hosts are responding
☐ Verify essential services are running
☐ Check for any corrupted data
☐ Update monitoring dashboards
☐ Document any issues encountered
```

---

## 🚨 Emergency Procedures

### **If Everything is Down**

```bash
# 1. Stay calm and work systematically
# 2. Check physical connections first
# 3. Verify power to all devices
# 4. Check internet connectivity with direct connection
# 5. Work through recovery checklists step by step
# 6. Document everything for future reference
```

### **If You're Locked Out**

```bash
# 1. Try default router credentials (often admin/admin)
# 2. Look for reset button on router (hold 10-30 seconds)
# 3. Check router label for default WiFi password
# 4. Use mobile hotspot for internet access during recovery
# 5. Access password manager from mobile device if needed
```

### **If Services Won't Start**

```bash
# 1. Check Docker daemon is running
systemctl status docker

# 2. Check disk space
df -h

# 3. Check for port conflicts
netstat -tulpn | grep :port

# 4. Check container logs
docker logs container-name

# 5. Try starting services individually
docker-compose up service-name
```

---

## 📚 Related Documentation

- [Tailscale Setup Guide](../infrastructure/tailscale-setup-guide.md) - Alternative access method
- [Port Forwarding Guide](../infrastructure/port-forwarding-guide.md) - Detailed port configuration
- [Security Model](../infrastructure/security.md) - Security considerations during recovery
- [Offline Password Access](offline-password-access.md) - Accessing passwords when Vaultwarden is down
- [Authentik SSO Rebuild](authentik-sso-rebuild.md) - Complete SSO/OAuth2 disaster recovery
- [Authentik SSO Setup](../infrastructure/authentik-sso.md) - SSO configuration reference

---

**💡 Pro Tip**: Practice these procedures when everything is working! Run through the checklists quarterly to ensure your documentation is current and you're familiar with the process. A disaster is not the time to learn these procedures for the first time.
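
One way to make the quarterly drill repeatable is a small connectivity script, such as the `scripts/test-connectivity.sh` listed in the Emergency Toolkit. This guide does not show that script's contents, so the following is only a hypothetical sketch: the function names `check_host` and `check_all` and the `PING_CMD` override are assumptions made for illustration, and `-W 2` assumes the Linux `ping` (a two-second reply timeout).

```shell
#!/bin/bash
# Hypothetical sketch of ~/homelab-recovery/scripts/test-connectivity.sh.
# check_host, check_all and PING_CMD are illustrative names, not part of the guide.

# Command used to probe a host; override it (e.g. PING_CMD=true) for a dry run.
PING_CMD="${PING_CMD:-ping -c 1 -W 2}"

# Print "OK" or "FAIL" for one host and return 0 or 1 accordingly.
check_host() {
    local host="$1"
    # PING_CMD is intentionally unquoted so its flags split into separate words.
    if $PING_CMD "$host" >/dev/null 2>&1; then
        echo "OK   $host"
        return 0
    else
        echo "FAIL $host"
        return 1
    fi
}

# Check every host given as an argument and print a summary line.
# Returns non-zero if at least one host did not respond.
check_all() {
    local host failed=0
    for host in "$@"; do
        check_host "$host" || failed=$((failed + 1))
    done
    echo "$failed host(s) unreachable"
    [ "$failed" -eq 0 ]
}

# Run against the hosts passed on the command line, if any.
if [ "$#" -gt 0 ]; then
    check_all "$@"
fi
```

For example, `./test-connectivity.sh atlantis.vish.local calypso.vish.local concord-nuc.vish.local` would report each host and exit non-zero if any of them fails to answer, which makes the script easy to call from a cron job or a monitoring check.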