Sanitized mirror from private repository - 2026-03-21 05:42:25 UTC
Gitea Mirror Bot
2026-03-21 05:42:25 +00:00
commit 2fcf09efcf
1235 changed files with 306088 additions and 0 deletions

# NPM Migration: Calypso → matrix-ubuntu
**Status:** COMPLETE
**Completed:** 2026-03-20
**Risk:** Medium (all proxied services briefly down during cutover)
## Overview
Migrate Nginx Proxy Manager from Calypso (Synology DS723+) to matrix-ubuntu VM (192.168.0.154) to enable split-horizon DNS. Synology's built-in nginx occupies ports 80/443 and can't be easily moved, so NPM gets a new home where it can bind 80/443 directly.
## Current State
```
Internet → Router:443 → Calypso:8443 (NPM) → backends
Internet → Router:80 → Calypso:8880 (NPM) → backends
```
| Component | Location | Ports |
|-----------|----------|-------|
| NPM | Calypso (192.168.0.250) | 8880/8443/81 |
| Host nginx | matrix-ubuntu (192.168.0.154) | 443 (mastodon, matrix, mattermost) |
| Synology nginx | Calypso (192.168.0.250) | 80/443 (DSM redirect, can't remove) |
## Target State
```
Internet → Router:443 → matrix-ubuntu:443 (NPM) → backends
Internet → Router:80 → matrix-ubuntu:80 (NPM) → backends
LAN → AdGuard → matrix-ubuntu:443 (NPM) → backends (split-horizon)
```
| Component | Location | Ports |
|-----------|----------|-------|
| NPM | matrix-ubuntu (192.168.0.154) | **80/443/81** |
| Host nginx | **removed** (NPM handles all routing) | — |
| Synology nginx | Calypso (unchanged) | 80/443 (irrelevant, not used) |
## Pre-Migration Checklist
- [x] Back up Calypso NPM data (`/home/homelab/backups/npm-migration-20260320/npm-backup-20260320.tar.gz`)
- [x] Back up matrix-ubuntu nginx config (`/home/homelab/backups/npm-migration-20260320/nginx-backup-20260320.tar.gz`)
- [x] Verify matrix-ubuntu has sufficient resources (7.7GB RAM, 25GB disk free)
- [x] Verify port 80 is free on matrix-ubuntu
- [x] Port 443 freed — host nginx stopped and disabled during migration
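The two port checks above can be scripted. The following is a minimal sketch, not part of the original runbook: `port_in_use` is a hypothetical helper that parses `ss -ltnH` output from stdin, assuming the usual `ss` column layout in which the fourth column is `Local Address:Port`.

```bash
#!/usr/bin/env bash
# port_in_use: hypothetical helper (not from the original runbook).
# Reads `ss -ltnH` output on stdin and returns 0 if any listener uses the port.
port_in_use() {
  local port="$1"
  # Column 4 is Local Address:Port, e.g. 0.0.0.0:443 or [::]:443.
  # Splitting on ":" and taking the last field handles IPv4 and IPv6 forms.
  awk -v p="$port" '
    { n = split($4, a, ":"); if (a[n] == p) found = 1 }
    END { exit !found }
  '
}
```

One way to use it is `ssh matrix-ubuntu "ss -ltnH" | port_in_use 80 && echo "port 80 is taken"`. Reading from stdin keeps the parsing separate from the SSH call, so the same function works on saved output.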
## Services Currently on matrix-ubuntu's Host Nginx
These 3 services use host nginx on port 443 with SNI-based routing:
| Domain | Backend | nginx Config |
|--------|---------|-------------|
| mastodon.vish.gg | localhost:3000 (Mastodon web) | `/etc/nginx/sites-enabled/mastodon` |
| mx.vish.gg | localhost:8008 (Synapse) on 443, localhost:8018 on 8082 | `/etc/nginx/sites-enabled/matrix` |
| mm.crista.love | localhost:8065 (Mattermost) | `/etc/nginx/sites-enabled/mattermost` |
**These must be re-created as NPM proxy hosts** before removing host nginx.
Additional matrix-ubuntu nginx services on non-443 ports (can coexist or migrate):
| Domain | Port | Backend |
|--------|------|---------|
| matrix.thevish.io | 8081 | localhost:8008 |
| mx.vish.gg (federation) | 8082 | localhost:8018 |
| mx.vish.gg (client) | 8080 | localhost:8008 |
## Migration Steps
### Phase 1: Install NPM on matrix-ubuntu
```bash
# Create NPM data directory
ssh matrix-ubuntu "sudo mkdir -p /opt/npm/{data,letsencrypt}"
# Deploy NPM via docker compose (initially on temp ports to avoid conflict)
# Use ports 8880/8443/81 while host nginx still runs on 443
```
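Compose file to create at `hosts/vms/matrix-ubuntu/nginx-proxy-manager.yaml`. It is shown with the final 80/443/81 mappings; while host nginx still holds port 443, map host ports 8880 and 8443 to the container's 80 and 443 instead: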
```yaml
services:
  nginx-proxy-manager:
    image: jc21/nginx-proxy-manager:latest
    container_name: nginx-proxy-manager
    ports:
      - "80:80"     # HTTP
      - "443:443"   # HTTPS
      - "81:81"     # Admin UI
    environment:
      TZ: America/Los_Angeles
    volumes:
      - /opt/npm/data:/data
      - /opt/npm/letsencrypt:/etc/letsencrypt
    restart: unless-stopped
```
### Phase 2: Migrate NPM Data
```bash
# Copy NPM data from Calypso to matrix-ubuntu
scp /home/homelab/backups/npm-migration-20260320/npm-backup-20260320.tar.gz matrix-ubuntu:/tmp/
# Extract to NPM directory
ssh matrix-ubuntu "sudo tar xzf /tmp/npm-backup-20260320.tar.gz -C /opt/npm/data/"
```
This brings over all 36 proxy hosts, SSL certs, access lists, and configuration.
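Before extracting, it may be worth confirming that the archive is readable and contains the expected number of proxy hosts. The sketch below assumes NPM keeps one generated nginx config per proxy host at `nginx/proxy_host/<id>.conf` inside its data directory; that layout is an assumption, and `count_proxy_hosts` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# count_proxy_hosts: hypothetical helper (not from the original runbook).
# Assumes one generated config per proxy host at nginx/proxy_host/<id>.conf.
count_proxy_hosts() {
  local archive="$1" listing
  # `tar tzf` lists members without extracting; it fails on a corrupt archive.
  listing="$(tar tzf "$archive")" || return 1
  # Count only numbered .conf files directly under nginx/proxy_host/.
  printf '%s\n' "$listing" | grep -c -E '(^|/)nginx/proxy_host/[0-9]+\.conf$'
}
```

If the layout assumption holds, running `count_proxy_hosts /tmp/npm-backup-20260320.tar.gz` on matrix-ubuntu should print a number matching the 36 proxy hosts mentioned above. A different number suggests either a different archive layout or an incomplete backup.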
### Phase 3: Update Proxy Host Backends
Several proxy hosts currently point to `192.168.0.250` (Calypso LAN IP) for services still on Calypso. These stay the same — NPM on matrix-ubuntu will proxy to Calypso's IP just like before.
Proxy hosts that currently point to `100.67.40.126` (homelab-vm Tailscale) should be updated to LAN IPs for better performance:
| Domain | Current Backend | New Backend |
|--------|----------------|-------------|
| gf.vish.gg | 100.67.40.126:3300 | 192.168.0.210:3300 |
| nb.vish.gg | 100.67.40.126:8443 | 192.168.0.210:8443 |
| ntfy.vish.gg | 100.67.40.126:8081 | 192.168.0.210:8081 |
| scrutiny.vish.gg | 100.67.40.126:8090 | 192.168.0.210:8090 |
| hoarder.thevish.io | 100.67.40.126:3482 | 192.168.0.210:3482 |
| binterest.thevish.io | 100.67.40.126:21544 | 192.168.0.210:21544 |
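The rewrite in the table is mechanical: the Tailscale address is swapped for the LAN address and the port is kept. A minimal sketch follows, assuming bash and the coreutils `timeout` command are available; `to_lan_backend` and `backend_reachable` are hypothetical helpers for checking the new targets before editing NPM.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not from the original runbook).

# to_lan_backend: swap the homelab-vm Tailscale IP for its LAN IP, keep the port.
# Backends that do not start with the Tailscale IP are printed unchanged.
to_lan_backend() {
  local backend="$1"
  printf '%s\n' "${backend/#100.67.40.126:/192.168.0.210:}"
}

# backend_reachable: return 0 if a TCP connection to host:port opens within 3 s.
# Uses bash's /dev/tcp redirection, so it needs bash rather than plain sh.
backend_reachable() {
  local host="${1%:*}" port="${1##*:}"
  timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}
```

For example, `backend_reachable "$(to_lan_backend 100.67.40.126:3300)" || echo "gf.vish.gg backend not reachable on LAN"` run from matrix-ubuntu would flag a service that only listens on the Tailscale interface.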
Add new proxy hosts for services currently handled by host nginx:
| Domain | Backend | SSL |
|--------|---------|-----|
| mastodon.vish.gg | http://127.0.0.1:3000 | *.vish.gg cert |
| mx.vish.gg | http://127.0.0.1:8008 | *.vish.gg cert |
| mm.crista.love | http://127.0.0.1:8065 | *.crista.love cert |
### Phase 4: Cutover (Downtime: ~2 minutes)
This phase requires a manual router change (step 4):
```
1. Stop host nginx on matrix-ubuntu
   ssh matrix-ubuntu "sudo systemctl stop nginx && sudo systemctl disable nginx"
2. Start NPM on matrix-ubuntu (binds 80/443)
   cd hosts/vms/matrix-ubuntu && docker compose -f nginx-proxy-manager.yaml up -d
3. Test locally:
   curl -sk -H "Host: nb.vish.gg" https://192.168.0.154/ -w "%{http_code}\n"
4. ** YOU: Change router port forwards **
   Old: WAN:443 → 192.168.0.250:8443
   New: WAN:443 → 192.168.0.154:443
   Old: WAN:80 → 192.168.0.250:8880
   New: WAN:80 → 192.168.0.154:80
5. Test externally:
   curl -s https://nb.vish.gg/ -o /dev/null -w "%{http_code}\n"
6. Stop old NPM on Calypso (after confirming everything works)
```
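The local test in step 3 sets only the `Host` header, so the TLS handshake does not carry the domain name in SNI. A possible alternative is curl's `--resolve` option, which pins the domain to a chosen IP so that SNI and `Host` both match the real name. The sketch below is an illustration only; `probe_via` and `code_ok` are hypothetical helpers, and treating any 2xx or 3xx status as healthy is an assumption that may not suit every service.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not from the original runbook).

# probe_via: request https://<domain>/ while forcing the connection to <ip>.
# Prints the HTTP status code; curl prints 000 if no response was received.
probe_via() {
  local domain="$1" ip="$2"
  curl -sk -o /dev/null --max-time 10 \
    --resolve "${domain}:443:${ip}" \
    -w '%{http_code}' "https://${domain}/"
}

# code_ok: succeed for 2xx and 3xx status codes (assumed "healthy" here).
code_ok() {
  [[ "$1" =~ ^[23][0-9][0-9]$ ]]
}
```

A loop such as `for d in nb.vish.gg gf.vish.gg mm.crista.love; do code_ok "$(probe_via "$d" 192.168.0.154)" || echo "FAIL $d"; done` checks several proxy hosts against the new NPM before the router forwards are changed.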
### Phase 5: Split-Horizon DNS
Once NPM is on matrix-ubuntu with ports 80/443:
1. Add AdGuard DNS rewrites (Calypso AdGuard at http://192.168.0.250:9080):
```
*.vish.gg → 192.168.0.154
*.thevish.io → 192.168.0.154
*.crista.love → 192.168.0.154
```
2. Set router DHCP DNS to 192.168.0.250 (AdGuard)
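Once the rewrites are in place, a LAN client should receive `192.168.0.154` for any of the wildcard domains when it queries AdGuard. A small sketch of that check follows; `resolves_to` is a hypothetical helper that reads resolver answers from stdin, one per line, as printed by `dig +short`.

```bash
#!/usr/bin/env bash
# resolves_to: hypothetical helper (not from the original runbook).
# Reads answers on stdin (one per line) and returns 0 if one of them is
# exactly the expected IP. -x matches whole lines, -F disables regex.
resolves_to() {
  grep -qxF -- "$1"
}
```

For example, `dig +short nb.vish.gg @192.168.0.250 | resolves_to 192.168.0.154 || echo "split-horizon rewrite missing"` queries the Calypso AdGuard instance directly, independent of whatever DNS server the client currently uses.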
### Phase 6: Cleanup
```bash
# Stop old NPM on Calypso
ssh calypso "cd /volume1/docker/nginx-proxy-manager && sudo docker compose down"
# Update DDNS — no changes needed (DDNS updates WAN IP, not internal routing)
# Update documentation
# - docs/infrastructure/split-horizon-dns.md
# - docs/infrastructure/npm-migration-jan2026.md
# - Authentik SSO docs (outpost URL may reference calypso)
```
## Rollback Plan
If anything goes wrong at any phase:
### Quick Rollback (< 1 minute)
```bash
# 1. Change router forwards back:
# WAN:443 → 192.168.0.250:8443
# WAN:80 → 192.168.0.250:8880
# 2. Calypso NPM is still running — traffic flows immediately
# 3. Stop new NPM on matrix-ubuntu first, so ports 80/443 are free:
ssh matrix-ubuntu "docker stop nginx-proxy-manager"
# 4. Restore host nginx on matrix-ubuntu (if stopped; it was also disabled in Phase 4):
ssh matrix-ubuntu "sudo systemctl enable --now nginx"
```
### Full Rollback
```bash
# If NPM data was corrupted during migration:
ssh matrix-ubuntu "
  docker stop nginx-proxy-manager
  sudo rm -rf /opt/npm/data/*
  sudo systemctl start nginx
"
# Router forwards back to Calypso
# Everything reverts to pre-migration state
# Backups at: /home/homelab/backups/npm-migration-20260320/
```
### Key Rollback Points
| Phase | Rollback Action | Downtime |
|-------|----------------|----------|
| Phase 1-2 (install/copy) | Just stop new NPM, old still running | None |
| Phase 3 (update backends) | Revert in NPM admin UI | None |
| Phase 4 (cutover) | Change router forwards back to Calypso | ~30 seconds |
| Phase 5 (split-horizon) | Remove AdGuard DNS rewrites | ~30 seconds |
| Phase 6 (cleanup) | Restart old Calypso NPM | ~10 seconds |
**The old NPM on Calypso should NOT be stopped until you've confirmed everything works for at least 24 hours.** Keep it as a warm standby.
## Risks
| Risk | Mitigation |
|------|-----------|
| Matrix federation breaks | mx.vish.gg must be re-created in NPM with correct `:8448` federation port handling |
| Mastodon WebSocket breaks | NPM proxy host must enable WebSocket support |
| SSL cert not trusted | Copy Cloudflare origin certs from Calypso NPM data or re-issue Let's Encrypt |
| Authentik outpost can't reach NPM | Update outpost external_host if it references calypso IP |
| Matrix-ubuntu VM goes down | Router forward change back to Calypso takes 30 seconds |
| Memory pressure | NPM uses ~100MB, matrix-ubuntu has 1.4GB available |
## Affected Documentation
After migration, update:
- `docs/infrastructure/split-horizon-dns.md` — NPM IP changes
- `docs/infrastructure/npm-migration-jan2026.md` — historical reference
- `docs/infrastructure/authentik-sso.md` — outpost URLs
- `docs/diagrams/service-architecture.md` — NPM location
- `docs/diagrams/network-topology.md` — traffic flow
- `hosts/synology/calypso/nginx-proxy-manager.yaml` — mark as decommissioned
- `hosts/vms/matrix-ubuntu/nginx-proxy-manager.yaml` — new compose file
## Backups
| What | Location | Size |
|------|----------|------|
| Calypso NPM full data | `/home/homelab/backups/npm-migration-20260320/npm-backup-20260320.tar.gz` | 200MB |
| matrix-ubuntu nginx config | `/home/homelab/backups/npm-migration-20260320/nginx-backup-20260320.tar.gz` | 7.5KB |
## Completion Notes (2026-03-20)
Migration completed successfully. All phases were executed and all follow-up items resolved:
| Item | Status |
|------|--------|
| NPM on matrix-ubuntu with ports 80/443/81 | Done |
| Router forwards updated to 192.168.0.154 | Done |
| Host nginx disabled on matrix-ubuntu | Done |
| mastodon.vish.gg, mx.vish.gg, mm.crista.love re-created as NPM proxy hosts | Done |
| Let's Encrypt wildcard certs issued (replaced CF Origin certs) | Done |
| Split-horizon DNS via dual AdGuard (Calypso + Atlantis) | Done |
| Headscale control plane unaffected (stays on Calypso) | Confirmed |
| DERP relay routing verified | Confirmed |
| Old NPM on Calypso stopped | Done |