# Service Migration Runbook
## Overview
This runbook guides you through migrating a containerized service from one host to another in the homelab. The procedure is designed to keep downtime short and to preserve data integrity throughout the migration.
## Prerequisites
- [ ] SSH access to both source and target hosts
- [ ] Sufficient disk space on target host
- [ ] Network connectivity between hosts (Tailscale recommended)
- [ ] Service backup completed and verified
- [ ] Maintenance window scheduled (if downtime required)
- [ ] Portainer access for both hosts
## Metadata
- **Estimated Time**: 1-3 hours (depending on data size)
- **Risk Level**: Medium-High (data migration involved)
- **Requires Downtime**: Yes (typically 15-60 minutes)
- **Reversible**: Yes (can roll back to source host)
- **Tested On**: 2026-02-14
## When to Migrate Services
Common reasons for service migration:

| Scenario | Example | Recommended Target |
|----------|---------|-------------------|
| **Resource constraints** | NAS running out of CPU | Move to NUC or VM |
| **Storage constraints** | Running out of disk space | Move to larger NAS |
| **Performance issues** | High I/O affecting other services | Move to dedicated host |
| **Host consolidation** | Reducing number of active hosts | Consolidate to primary hosts |
| **Hardware maintenance** | Planned hardware upgrade | Temporary or permanent move |
| **Improved organization** | Group related services | Move to appropriate host |
## Migration Types
### Type 1: Simple Migration (Stateless Service)
- No persistent data
- Can be redeployed from scratch
- Example: Nginx, static web servers
- **Downtime**: Minimal (5-15 minutes)
### Type 2: Standard Migration (Small Data)
- Persistent data < 10GB
- Configuration and databases
- Example: Uptime Kuma, AdGuard Home
- **Downtime**: 15-30 minutes
### Type 3: Large Data Migration
- Persistent data > 10GB
- Media libraries, large databases
- Example: Plex, Immich, Jellyfin
- **Downtime**: 1-4 hours (depending on size)
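
The three types above can be turned into a quick lookup once the data size is known. The sketch below is only an illustration: `classify_migration` is a hypothetical helper name, sizes are whole gigabytes, and treating exactly 10 GB as Type 3 is an assumption, since the runbook only defines "< 10GB" and "> 10GB".

```bash
# Hypothetical helper: map a data size in whole GB to the migration types above.
# Assumption: 0 means no persistent data; exactly 10 GB is treated as Type 3.
classify_migration() {
    size_gb="$1"
    if [ "$size_gb" -eq 0 ]; then
        echo "Type 1: Simple Migration"
    elif [ "$size_gb" -lt 10 ]; then
        echo "Type 2: Standard Migration"
    else
        echo "Type 3: Large Data Migration"
    fi
}
```

For example, `classify_migration 3` prints `Type 2: Standard Migration`. The size itself comes from the `du -sh` check in Step 1 below.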
## Pre-Migration Planning
### Step 1: Assess the Service
```bash
# SSH to source host
ssh [source-host]
# Identify container and volumes
docker ps | grep [service-name]
docker inspect [service-name] | grep -A 10 Mounts
# Check data size
docker exec [service-name] du -sh /config /data
# List all volumes used by service
docker volume ls | grep [service-name]
# Check volume sizes
docker system df -v | grep [service-name]
```
Document findings:
- Container name: ___________
- Image and tag: ___________
- Data size: ___________
- Volume count: ___________
- Network dependencies: ___________
- Port mappings: ___________
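
Most of the findings listed above can be read directly from `docker inspect` with Go templates instead of grepping its JSON output. This is a minimal sketch, assuming the container name is passed as the first argument; `assess_service` is a hypothetical helper name, and the mount count includes bind mounts as well as named volumes.

```bash
# Hypothetical helper: print the Step 1 findings for one container.
assess_service() {
    name="$1"
    echo "Container name: $name"
    echo "Image and tag: $(docker inspect --format '{{.Config.Image}}' "$name")"
    # .Mounts covers both named volumes and bind mounts
    echo "Mount count: $(docker inspect --format '{{len .Mounts}}' "$name")"
    echo "Mounts:"
    docker inspect --format '{{range .Mounts}}{{.Type}} {{.Source}} -> {{.Destination}}{{println}}{{end}}' "$name"
    echo "Port mappings: $(docker inspect --format '{{json .NetworkSettings.Ports}}' "$name")"
    echo "Networks: $(docker inspect --format '{{range $k, $v := .NetworkSettings.Networks}}{{$k}} {{end}}' "$name")"
}
```

Run it on the source host as `assess_service [service-name]` and paste the output into the migration notes. Data size still has to be measured separately with `du`.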
### Step 2: Check Target Host Capacity
```bash
# SSH to target host
ssh [target-host]
# Check available resources
df -h # Disk space
free -h # RAM
nproc # CPU cores
docker ps | wc -l # Current container count
# Check port conflicts (use ss -tlnp instead if netstat is not installed)
netstat -tlnp | grep [required-port]
```
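The disk space check above can be made pass/fail rather than read by eye. The following sketch compares the free space reported by `df` against a required amount in kilobytes; `has_free_space` is a hypothetical helper name, and any headroom on top of the measured data size is an assumption you choose yourself.

```bash
# Hypothetical helper: succeed if the filesystem holding $1 has at least $2 KB available.
has_free_space() {
    path="$1"
    required_kb="$2"
    # df -P prints stable POSIX columns; with -k, column 4 is available space in KB
    available_kb=$(df -Pk "$path" | awk 'NR==2 {print $4}')
    [ "$available_kb" -ge "$required_kb" ]
}
```

For example, `has_free_space /path/to/docker 12582912 && echo "enough space"` checks for 12 GiB, which would leave about 20% headroom for a 10 GiB data set.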
### Step 3: Create Migration Plan
**Downtime Window**:
- Start: ___________
- End: ___________
- Duration: ___________
**Dependencies**:
- Services that depend on this: ___________
- Services this depends on: ___________
**Notification**:
- Who to notify: ___________
- When to notify: ___________
## Migration Procedure
### Method A: GitOps Migration (Recommended)
Best for: Most services with proper version control
#### Step 1: Backup Current Service
```bash
# SSH to source host
ssh [source-host]
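# Stop the service so data is consistent, then export the container filesystem
# (docker export does not include volume contents; those are backed up below)
docker stop [service-name]
docker export [service-name] > /tmp/[service-name]-backup.tar
# Backup volumes
for vol in $(docker volume ls -q | grep [service-name]); do
  docker run --rm -v "$vol":/source -v /tmp:/backup alpine \
    tar czf "/backup/$vol.tar.gz" -C /source .
done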
# Copy backups to safe location
scp /tmp/[service-name]*.tar* [backup-location]:~/backups/
```
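The prerequisites call for a verified backup, so it can help to confirm each archive is readable before continuing. The sketch below lists the archive end to end and stores a SHA-256 checksum beside it; `verify_backup` is a hypothetical helper name, and it assumes a `tar` that auto-detects compression when reading (GNU tar does) plus `sha256sum` from coreutils.

```bash
# Hypothetical helper: confirm a tar backup is readable and record its checksum.
verify_backup() {
    archive="$1"
    # Listing the archive reads it end to end without extracting anything
    if ! tar -tf "$archive" > /dev/null 2>&1; then
        echo "CORRUPT: $archive" >&2
        return 1
    fi
    # Store the checksum with a relative filename so it can be re-checked after copying
    dir=$(dirname "$archive")
    file=$(basename "$archive")
    (cd "$dir" && sha256sum "$file" > "$file.sha256")
    echo "OK: $archive"
}
```

If the `.sha256` files are copied along with the archives, running `sha256sum -c [file].sha256` in the backup directory re-checks the copy at the backup location.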
#### Step 2: Export Configuration
```bash
# Get current docker-compose configuration
cd ~/Documents/repos/homelab
cat hosts/[source-host]/[service-name].yaml > /tmp/service-config.yaml
# Note environment variables
docker inspect [service-name] | grep -A 50 Env
```
#### Step 3: Copy Data to Target Host
**For Small Data (< 10GB)**: Use SCP
```bash
# From your workstation
scp -r [source-host]:/volume1/docker/[service-name] /tmp/
scp -r /tmp/[service-name] [target-host]:/path/to/docker/
```
**For Large Data (> 10GB)**: Use Rsync
```bash
# From source host to target host via Tailscale
ssh [source-host]
rsync -avz --progress /volume1/docker/[service-name]/ \
[target-host-tailscale-ip]:/path/to/docker/[service-name]/
# Monitor progress (run in a second terminal on the target host)
watch -n 5 'du -sh /path/to/docker/[service-name]'
```
**For Very Large Data (> 100GB)**: Consider physical transfer
```bash
# Copy to USB drive, physically move, then copy to target
# Or use network-attached storage as intermediate
```
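Whichever transfer method is used, it is worth confirming that the copy matches the source before starting the service. One possible approach, sketched below, reduces a directory tree to a single digest that can be compared across hosts. `tree_digest` is a hypothetical helper name; the sketch assumes GNU `find`, `sort`, `xargs`, and `sha256sum`, covers regular files only (not symlinks, ownership, or permissions), and reads every byte, so it is slow on large media libraries.

```bash
# Hypothetical helper: print one SHA-256 digest summarizing every regular file under $1.
tree_digest() {
    [ -d "$1" ] || return 1
    # Hash each file by relative path in a fixed sort order, then hash the resulting manifest
    (cd "$1" && find . -type f -print0 | LC_ALL=C sort -z | xargs -0 -r sha256sum) \
        | sha256sum | cut -d' ' -f1
}
```

Run `tree_digest /volume1/docker/[service-name]` on the source and `tree_digest /path/to/docker/[service-name]` on the target. Matching digests suggest the file contents arrived intact.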
#### Step 4: Stop Service on Source Host
```bash
# SSH to source host
ssh [source-host]
# Stop the container
docker stop [service-name]
# Verify it's stopped
docker ps -a | grep [service-name]
```
#### Step 5: Update Git Configuration
```bash
# On your workstation
cd ~/Documents/repos/homelab
# Move service definition to new host
git mv hosts/[source-host]/[service-name].yaml \
hosts/[target-host]/[service-name].yaml
# Update paths in the configuration file if needed
nano hosts/[target-host]/[service-name].yaml
# Update volume paths for target host
# Atlantis/Calypso: /volume1/docker/[service-name]
# NUC/VM: /home/user/docker/[service-name]
# Raspberry Pi: /home/pi/docker/[service-name]
# Commit changes
git add hosts/[target-host]/[service-name].yaml
git commit -m "Migrate [service-name] from [source-host] to [target-host]

- Move service configuration
- Update volume paths for target host
- Migration date: $(date +%Y-%m-%d)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>"
git push origin main
```
#### Step 6: Deploy on Target Host
**Via Portainer UI**:
1. Open Portainer → Select target host endpoint
2. Go to **Stacks** → **Add stack** → **Git Repository**
3. Configure:
   - Repository URL: Your git repository
   - Compose path: `hosts/[target-host]/[service-name].yaml`
   - Enable GitOps (optional)
4. Click **Deploy the stack**
**Via GitOps Auto-Sync**:
- Wait 5-10 minutes for automatic deployment
- Monitor Portainer for new stack appearance
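
Rather than watching Portainer by hand during the auto-sync window, the wait can be scripted on the target host. This is a minimal sketch: `wait_until` and `container_running` are hypothetical helper names, and the 600-second timeout in the usage note simply mirrors the upper end of the 5-10 minute window above.

```bash
# Hypothetical helper: retry a command every $2 seconds until it succeeds or $1 seconds pass.
wait_until() {
    timeout_s="$1"
    interval_s="$2"
    shift 2
    elapsed=0
    until "$@" > /dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout_s" ]; then
            return 1
        fi
        sleep "$interval_s"
        elapsed=$(( elapsed + interval_s ))
    done
}

# Hypothetical helper: succeed only if the named container exists and is running.
container_running() {
    [ "$(docker inspect --format '{{.State.Running}}' "$1" 2>/dev/null)" = "true" ]
}
```

For example, `wait_until 600 15 container_running [service-name] && echo "deployed"` polls every 15 seconds and gives up after 10 minutes.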
#### Step 7: Verify Migration
```bash
# SSH to target host
ssh [target-host]
# Check container is running
docker ps | grep [service-name]
# Check logs for errors
docker logs [service-name] --tail 100
# Test service accessibility
curl http://localhost:[port] # Internal
curl https://[service].vish.gg # External (if applicable)
# Verify data integrity
docker exec [service-name] ls -lah /config
docker exec [service-name] ls -lah /data
# Check resource usage
docker stats [service-name] --no-stream
```
#### Step 8: Update DNS/Reverse Proxy (If Applicable)
```bash
# Update Nginx Proxy Manager or reverse proxy configuration
# Point [service].vish.gg to new host IP
# Update Cloudflare DNS if using Cloudflare Tunnels
# Update local DNS (AdGuard Home) if applicable
```
#### Step 9: Remove from Source Host
**Only after verifying target is working correctly!**
```bash
# SSH to source host
ssh [source-host]
# Remove the container (volumes and data are handled separately below)
docker stop [service-name]
docker rm [service-name]
# Optional: Remove volumes (only if data copied successfully)
# docker volume rm $(docker volume ls -q | grep [service-name])
# Remove data directory
rm -rf /volume1/docker/[service-name] # BE CAREFUL!
# Remove from Portainer if manually managed
# Portainer UI → Stacks → Remove stack
```
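Because the `rm -rf` above is destructive and the path is built from a placeholder, a small guard can reduce the chance of deleting the wrong directory if a variable is empty or mistyped. The sketch below is one way to do that; `safe_remove_service_dir` is a hypothetical helper name, and the checks shown are assumptions about what counts as unsafe, not an exhaustive list.

```bash
# Hypothetical guard: remove <docker-root>/<service> only when both parts look sane.
safe_remove_service_dir() {
    docker_root="$1"    # for example /volume1/docker
    service="$2"
    case "$docker_root" in
        ""|/) echo "Refusing: unsafe docker root '$docker_root'" >&2; return 1 ;;
    esac
    # Reject empty names, dot entries, and anything containing a slash
    case "$service" in
        ""|.|..|*/*) echo "Refusing: invalid service name '$service'" >&2; return 1 ;;
    esac
    target="$docker_root/$service"
    if [ ! -d "$target" ]; then
        echo "Nothing to remove: $target" >&2
        return 1
    fi
    rm -rf -- "$target"
}
```

Usage on the source host would look like `safe_remove_service_dir /volume1/docker [service-name]`.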
### Method B: Manual Export/Import
Best for: Quick migrations without git changes, or when testing
#### Step 1: Stop and Export
```bash
# SSH to source host
ssh [source-host]
# Stop service
docker stop [service-name]
# Export the data volume to a compressed archive
docker run --rm \
-v [service-name]_data:/source \
-v /tmp:/backup \
alpine tar czf /backup/[service-name]-data.tar.gz -C /source .
# Export configuration
docker inspect [service-name] > /tmp/[service-name]-config.json
```
#### Step 2: Transfer to Target
```bash
# Copy data to target host
scp /tmp/[service-name]-data.tar.gz [target-host]:/tmp/
scp /tmp/[service-name]-config.json [target-host]:/tmp/
```
#### Step 3: Import on Target
```bash
# SSH to target host
ssh [target-host]
# Create volume
docker volume create [service-name]_data
# Import data
docker run --rm \
-v [service-name]_data:/target \
-v /tmp:/backup \
alpine tar xzf /backup/[service-name]-data.tar.gz -C /target
# Create and start container using saved configuration
# Adjust paths and ports as needed
docker create --name [service-name] \
[options-from-config.json] \
[image:tag]
docker start [service-name]
```
## Post-Migration Tasks
### Update Documentation
```bash
# Update service inventory
nano docs/services/VERIFIED_SERVICE_INVENTORY.md
# Update the host column for migrated service
# | Service | Host | Port | URL | Status |
# | Service | [NEW-HOST] | 8080 | https://service.vish.gg | ✅ Active |
```
### Update Monitoring
```bash
# Update Prometheus configuration if needed
nano prometheus/prometheus.yml
# Update target host IP for scraped metrics
# Restart Prometheus if configuration changed
```
### Test Backups
```bash
# Verify backups work on new host
./backup.sh --test
# Ensure service data is included in backup
ls -lah /path/to/backups/[service-name]
```
### Performance Baseline
```bash
# Document baseline performance on new host
docker stats [service-name] --no-stream
# Monitor for 24 hours to ensure stability
```
## Verification Checklist
- [ ] Service running on target host: `docker ps`
- [ ] All data migrated correctly
- [ ] Configuration preserved
- [ ] Logs show no errors: `docker logs [service]`
- [ ] External access works (if applicable)
- [ ] Internal service connectivity works
- [ ] Reverse proxy updated (if applicable)
- [ ] DNS records updated (if applicable)
- [ ] Monitoring updated
- [ ] Documentation updated
- [ ] Backups include new location
- [ ] Old host cleaned up
- [ ] Users notified of any URL changes
## Rollback Procedure
If migration fails or causes issues:
### Quick Rollback (Within 24 hours)
```bash
# SSH to source host
ssh [source-host]
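# Restore from backup (docker import creates an image from the exported filesystem;
# container configuration such as CMD and ENV is not preserved and must be supplied again)
docker import /tmp/[service-name]-backup.tar [service-name]:backup
# Or redeploy from git (revert the migration commit; HEAD assumes it is the latest commit)
cd ~/Documents/repos/homelab
git revert HEAD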
git push origin main
# Restart service on source host
# Via Portainer or:
docker start [service-name]
```
### Full Rollback (After cleanup)
```bash
# Restore from backup
./restore.sh [backup-date]
# Redeploy to original host
# Follow original deployment procedure
```
## Troubleshooting
### Issue: Data Transfer Very Slow
**Symptoms**: Rsync taking hours for moderate data
**Solutions**:
```bash
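# Option 1: compression (-z) helps on slow links with compressible data
rsync -avz --compress-level=6 --progress /source/ [target]:/dest/
# Option 2: on a fast LAN or with already-compressed media, -z may cost more CPU than it saves
rsync -av --progress /source/ [target]:/dest/
# Option 3: run transfers in parallel (GNU parallel)
# Install: sudo apt-get install parallel
# rsync -R recreates each file's relative path under /dest/
cd /source && find . -type f | parallel -j 4 rsync -aR {} [target]:/dest/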
# For extremely large transfers, consider:
# 1. Physical USB drive transfer
# 2. NFS mount between hosts
# 3. Transfer during off-peak hours
```
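To judge whether a transfer is really slow or just large, a rough estimate from data size and link speed helps. The arithmetic below is a sketch under simplifying assumptions: decimal units (1 GB = 8000 Mbit), a link that sustains the stated rate, and no allowance for protocol overhead or disk speed. `estimate_transfer_minutes` is a hypothetical helper name.

```bash
# Hypothetical helper: rough transfer time in minutes for $1 GB over a link sustaining $2 Mbit/s.
estimate_transfer_minutes() {
    size_gb="$1"
    link_mbps="$2"
    # 1 GB = 8000 Mbit in decimal units
    seconds=$(( size_gb * 8000 / link_mbps ))
    # Round up to whole minutes
    echo $(( (seconds + 59) / 60 ))
}
```

For example, `estimate_transfer_minutes 100 1000` prints `14`, while `estimate_transfer_minutes 500 100` prints `667` (over 11 hours), which is the range where a physical transfer starts to look attractive.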
### Issue: Service Won't Start on Target Host
**Symptoms**: Container starts then immediately exits
**Solutions**:
```bash
# Check logs
docker logs [service-name]
# Common issues:
# 1. Path issues - Update volume paths in compose file
# 2. Permission issues - Check PUID/PGID
# 3. Port conflicts - Check if port already in use
# 4. Missing dependencies - Ensure all required services running
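# Fix permissions on the host data path (docker exec cannot be used while the container keeps exiting)
# Replace 1000:1000 with the PUID:PGID from the compose file
sudo chown -R 1000:1000 /path/to/docker/[service-name]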
```
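Before changing ownership recursively, it can be useful to see which files actually differ from the expected PUID/PGID. The following sketch assumes GNU `stat` and a `find` that supports `-uid` and `-gid`; `owner_matches` and `list_wrong_owner` are hypothetical helper names.

```bash
# Hypothetical helper: succeed if path $1 is owned by the UID:GID given in $2 (for example 1000:1000).
owner_matches() {
    [ "$(stat -c '%u:%g' "$1")" = "$2" ]
}

# Hypothetical helper: list entries under $1 whose owner is not UID $2 or whose group is not GID $3.
list_wrong_owner() {
    find "$1" \( ! -uid "$2" -o ! -gid "$3" \) -print
}
```

For example, `list_wrong_owner /path/to/docker/[service-name] 1000 1000 | head` shows the first few mismatches, which often points to a single subdirectory that was copied as root.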
### Issue: Lost Configuration Data
**Symptoms**: Service starts but settings are default
**Solutions**:
```bash
# Check if volumes mounted correctly
docker inspect [service-name] | grep -A 10 Mounts
# Restore configuration from backup
docker stop [service-name]
docker run --rm -v [service-name]_config:/target -v /tmp:/backup alpine \
tar xzf /backup/config-backup.tar.gz -C /target
docker start [service-name]
```
### Issue: Network Connectivity Problems
**Symptoms**: Service can't reach other services
**Solutions**:
```bash
# Check network configuration
docker network ls
docker network inspect [network-name]
# Add service to required networks
docker network connect [network-name] [service-name]
# Verify DNS resolution
docker exec [service-name] ping [other-service]
```
## Migration Examples
### Example 1: Migrate Uptime Kuma from Calypso to Homelab VM
```bash
# 1. Backup on Calypso (archive relative to /volume1/docker so it extracts cleanly)
ssh calypso
docker stop uptime-kuma
tar czf /tmp/uptime-kuma-data.tar.gz -C /volume1/docker uptime-kuma
# 2. Transfer and extract on the target
scp /tmp/uptime-kuma-data.tar.gz homelab-vm:/tmp/
ssh homelab-vm 'mkdir -p /home/user/docker && tar xzf /tmp/uptime-kuma-data.tar.gz -C /home/user/docker'
# 3. Update git (back on your workstation)
cd ~/Documents/repos/homelab
git mv hosts/synology/calypso/uptime-kuma.yaml \
  hosts/vms/homelab-vm/uptime-kuma.yaml
# Update paths in file
sed -i 's|/volume1/docker/uptime-kuma|/home/user/docker/uptime-kuma|g' \
  hosts/vms/homelab-vm/uptime-kuma.yaml
# 4. Deploy on target
git add . && git commit -m "Migrate Uptime Kuma to Homelab VM" && git push
# 5. Verify and cleanup Calypso
```
### Example 2: Migrate AdGuard Home between Hosts
```bash
# AdGuard Home requires DNS configuration updates
# 1. Note current DNS settings on clients
# 2. Migrate service (as above)
# 3. Update client DNS to point to new host IP
# 4. Test DNS resolution from clients
```
## Related Documentation
- [Add New Service](add-new-service.md)
- [Infrastructure Overview](../infrastructure/INFRASTRUCTURE_OVERVIEW.md)
- [Backup Strategies](../admin/backup-strategies.md)
- [Deployment Workflow](../admin/DEPLOYMENT_WORKFLOW.md)
## Change Log
- 2026-02-14 - Initial creation with multiple migration methods
- 2026-02-14 - Added large data migration strategies