Sanitized mirror from private repository - 2026-03-28 12:26:38 UTC
docs/infrastructure/split-horizon-dns.md (new file, 239 lines)

# Split-Horizon DNS Implementation Guide

Last updated: 2026-03-20

## Problem

Before split-horizon DNS, all DNS queries for `*.vish.gg`, `*.thevish.io`, and `*.crista.love` resolved to Cloudflare proxy IPs (104.21.x.x), even when the client was on the same LAN as the services. This meant:

1. **Hairpin NAT**: LAN traffic goes out to Cloudflare and back in through the router
2. **Internet dependency**: if the WAN link goes down, LAN services are unreachable by domain
3. **Added latency**: ~50ms round trip through Cloudflare vs ~1ms on LAN
4. **Cloudflare bottleneck**: all traffic is proxied through CF even when unnecessary

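To see which path a given answer implies, it can help to label the IP that a resolver returns. The following is a minimal sketch, not part of the deployed setup: `classify_ip` is a hypothetical helper, the `104.21.` prefix is taken from the range quoted above, and the Tailscale check assumes the CGNAT range 100.64.0.0/10.

```bash
# Hypothetical helper: label an IPv4 answer by the network it belongs to.
# Assumptions: 104.21.x.x = Cloudflare proxy (range quoted in this guide),
# 100.64.0.0/10 = Tailscale (CGNAT range), 192.168.0.x = this LAN.
classify_ip() {
  case "$1" in
    104.21.*)    echo "cloudflare" ;;
    192.168.0.*) echo "lan" ;;
    100.*)
      # Extract the second octet and test for 64..127 (100.64.0.0/10).
      second=${1#100.}
      second=${second%%.*}
      if [ "$second" -ge 64 ] && [ "$second" -le 127 ]; then
        echo "tailscale"
      else
        echo "other"
      fi ;;
    *) echo "other" ;;
  esac
}

classify_ip 104.21.73.214   # prints "cloudflare"
classify_ip 100.85.21.51    # prints "tailscale"
```

On a LAN host, something like `classify_ip "$(dig +short nb.vish.gg | tail -n 1)"` would show which path the default resolver is handing out.
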
## Solution

**Status: IMPLEMENTED (2026-03-20)**

Use AdGuard Home on Calypso (primary) and Atlantis (backup) as **split-horizon DNS resolvers** that return local IPs for homelab domains when queried from the LAN, while external clients continue to use Cloudflare.

```
┌────────────────────────────────────┐
│           DNS Query for            │
│             nb.vish.gg             │
└─────────────────┬──────────────────┘
                  │
┌─────────────────▼──────────────────┐
│        Where is the client?        │
└───────┬───────────────────┬────────┘
        │                   │
   LAN Client        External Client
        │                   │
        ▼                   ▼
┌────────────────┐  ┌────────────────┐
│ AdGuard Home   │  │ Cloudflare     │
│ (Calypso +     │  │ DNS            │
│  Atlantis)     │  │                │
│ Returns:       │  │ Returns:       │
│ 100.85.21.51   │  │ 104.21.73.214  │
│ (NPM Tailscale)│  │ (CF proxy)     │
└───────┬────────┘  └───────┬────────┘
        │                   │
        ▼                   ▼
┌────────────────┐  ┌────────────────┐
│ NPM (local)    │  │ Cloudflare     │
│ matrix-ubuntu  │  │ → WAN IP       │
│ :443 ~1ms      │  │ → NPM          │
└───────┬────────┘  │ ~50ms          │
        │           └───────┬────────┘
        ▼                   ▼
┌────────────────────────────────────┐
│          Backend Service           │
│     (same result, faster path)     │
└────────────────────────────────────┘
```

## Prerequisites

NPM is now on matrix-ubuntu (192.168.0.154) listening on standard ports 80/443/81. The migration from Calypso was completed on 2026-03-20.

| Port | Status |
|------|--------|
| 80:80 | **Active** |
| 443:443 | **Active** |
| 81:81 | **Active** (Admin UI) |

## Implementation Steps

### Step 1: Move NPM to Standard Ports -- DONE

NPM migrated from Calypso to matrix-ubuntu (192.168.0.154) on 2026-03-20. Compose file: `hosts/vms/matrix-ubuntu/nginx-proxy-manager.yaml`. Host nginx on matrix-ubuntu has been disabled (`systemctl disable nginx`); NPM now handles mastodon.vish.gg, mx.vish.gg, and mm.crista.love directly.

Router port forwards updated:
- `WAN:443 → 192.168.0.154:443`
- `WAN:80 → 192.168.0.154:80`

### Step 2: Configure AdGuard DNS Rewrites -- DONE

AdGuard DNS rewrites configured on both Calypso (http://192.168.0.250:9080) and Atlantis (http://192.168.0.200:9080). Wildcard entries point to NPM's Tailscale IP:

| Domain | Answer | Notes |
|--------|--------|-------|
| `*.vish.gg` | `100.85.21.51` | All vish.gg domains → NPM Tailscale IP |
| `*.thevish.io` | `100.85.21.51` | All thevish.io domains → NPM Tailscale IP |
| `*.crista.love` | `100.85.21.51` | All crista.love domains → NPM Tailscale IP |

These three wildcards cover all 36 proxy hosts. AdGuard resolves matching queries locally instead of forwarding to upstream DNS.

**Exceptions**: these domains need direct IPs (not NPM), added as specific overrides:

| Domain | Answer | Reason |
|--------|--------|--------|
| `mx.vish.gg` | `192.168.0.154` | Matrix federation needs direct access on port 8448 |
| `derp.vish.gg` | `192.168.0.250` | DERP relay: direct IP, no CF proxy |
| `derp-atl.vish.gg` | `192.168.0.200` | Atlantis DERP relay |
| `headscale.vish.gg` | `192.168.0.250` | Headscale control: direct access |
| `turn.thevish.io` | `192.168.0.200` | TURN/STUN needs direct UDP |

**.tail.vish.gg overrides**: specific rewrites to override the wildcard for Tailscale-specific subdomains.

Specific entries take priority over wildcards in AdGuard.

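The precedence rule can be modelled offline. The sketch below is an illustration only, not AdGuard's implementation: `REWRITES` copies the entries from the two tables in this step, `resolve_rewrite` is a hypothetical helper, and the wildcard pass assumes that `*.example.com` matches subdomains but not the bare `example.com` (worth verifying against the AdGuard version in use).

```bash
# Rewrite table as "pattern answer" pairs, copied from the tables in this step.
REWRITES='*.vish.gg 100.85.21.51
*.thevish.io 100.85.21.51
*.crista.love 100.85.21.51
mx.vish.gg 192.168.0.154
derp.vish.gg 192.168.0.250
derp-atl.vish.gg 192.168.0.200
headscale.vish.gg 192.168.0.250
turn.thevish.io 192.168.0.200'

# Hypothetical model of the lookup order: exact entry first, then wildcard.
# Prints the answer, or prints nothing and returns 1 when no rule matches.
resolve_rewrite() {
  name=$1
  # Pass 1: an exact entry wins over any wildcard.
  answer=$(printf '%s\n' "$REWRITES" | awk -v n="$name" '$1 == n { print $2; exit }')
  if [ -n "$answer" ]; then
    echo "$answer"
    return 0
  fi
  # Pass 2: "*.suffix" matches any longer name that ends in ".suffix".
  printf '%s\n' "$REWRITES" | awk -v n="$name" '
    substr($1, 1, 2) == "*." {
      suffix = substr($1, 2)
      if (length(n) > length(suffix) &&
          substr(n, length(n) - length(suffix) + 1) == suffix) {
        print $2; found = 1; exit
      }
    }
    END { if (!found) exit 1 }'
}

resolve_rewrite nb.vish.gg   # prints 100.85.21.51 (wildcard)
resolve_rewrite mx.vish.gg   # prints 192.168.0.154 (specific override)
```

Running the same names through `dig` against each AdGuard instance should give matching answers if the live rewrites agree with the tables.
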
### Step 3: Set AdGuard as LAN DNS Server -- DONE

Router (Archer BE800) DHCP configured with dual AdGuard DNS:

1. **Primary DNS:** `192.168.0.250` (Calypso AdGuard)
2. **Secondary DNS:** `192.168.0.200` (Atlantis AdGuard, backup)

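A client can confirm that it actually received both resolvers from DHCP. This is a rough sketch under assumptions: `has_adguard_dns` is a hypothetical helper that reads `resolv.conf`-style text from stdin, and the example feeds it inline sample data rather than a real host's configuration.

```bash
# Hypothetical check: does resolv.conf-style input list both AdGuard resolvers?
# Reads stdin so it can be fed /etc/resolv.conf or any other source.
has_adguard_dns() {
  servers=$(awk '$1 == "nameserver" { print $2 }')
  for want in 192.168.0.250 192.168.0.200; do
    printf '%s\n' "$servers" | grep -qxF "$want" || {
      echo "missing resolver: $want"
      return 1
    }
  done
  echo "both AdGuard resolvers present"
}

# Inline sample input (illustrative, not captured from a real client):
printf 'nameserver 192.168.0.250\nnameserver 192.168.0.200\n' | has_adguard_dns
```

On a real client this would be `has_adguard_dns < /etc/resolv.conf`. Hosts that use a local stub resolver (for example systemd-resolved) may list only a loopback address there, in which case the upstream servers have to be read from that resolver's own status output instead.
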
### Step 4: Configure Atlantis AdGuard (Backup DNS) -- DONE

Same DNS rewrites added to Atlantis's AdGuard instance (http://192.168.0.200:9080) as backup:

- Same wildcard rewrites as Calypso (pointing to `100.85.21.51`)
- Reachable at `192.168.0.200`

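Because the rewrites are maintained by hand on two instances, they can drift apart. One way to spot that, sketched here with a hypothetical `check_drift` helper, is to ask both resolvers for the same names and compare the answers. It relies only on `dig +short NAME @SERVER` and the two resolver IPs from Step 3.

```bash
# Hypothetical drift check: compare answers from Calypso and Atlantis.
# Prints one line per mismatch and returns 1 if any name differs.
check_drift() {
  status=0
  for name in "$@"; do
    a=$(dig +short "$name" @192.168.0.250 | tail -n 1)
    b=$(dig +short "$name" @192.168.0.200 | tail -n 1)
    if [ "$a" != "$b" ]; then
      echo "DRIFT $name: calypso=$a atlantis=$b"
      status=1
    fi
  done
  return "$status"
}

# Usage (run from a LAN host):
#   check_drift nb.vish.gg mx.vish.gg turn.thevish.io
```

Including at least one wildcard-covered name and each exception domain in the argument list exercises both kinds of rewrite.
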
### Step 5: Test

```bash
# Verify local resolution (query Calypso's AdGuard directly)
dig nb.vish.gg @192.168.0.250
# Expected: 100.85.21.51 (NPM Tailscale IP, from the AdGuard rewrite)

# Verify external resolution still works
dig nb.vish.gg @1.1.1.1
# Expected: 104.21.73.214 (Cloudflare proxy)

# Test HTTPS access via local DNS (pin the name to the IP AdGuard returns)
curl -s --resolve "nb.vish.gg:443:100.85.21.51" https://nb.vish.gg/ -o /dev/null -w "%{http_code} %{time_total}s\n"
# Expected: 200 in ~0.05s (vs ~0.15s through Cloudflare)

# Test all domains resolve locally
for domain in nb.vish.gg gf.vish.gg git.vish.gg sso.vish.gg dash.vish.gg; do
  ip=$(dig +short "$domain" @192.168.0.250 | tail -1)
  echo "$domain → $ip"
done
```

## SSL Considerations

**Resolved:** NPM now uses **Let's Encrypt wildcard certificates** (DNS challenge via Cloudflare API) instead of Cloudflare Origin certs. This means:

- Certs are trusted by all browsers, whether traffic comes through Cloudflare or directly via LAN
- No browser warnings for split-horizon DNS LAN access
- Certs auto-renew via NPM's built-in Let's Encrypt integration

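To confirm which certificate the LAN path serves, the issuer line can be inspected. The helper below is a sketch: `issuer_kind` is hypothetical, it only pattern-matches on text, and the exact issuer strings (including the `CloudFlare Origin` spelling) are assumptions about what `openssl x509 -noout -issuer` prints.

```bash
# Hypothetical helper: label an issuer line from `openssl x509 -noout -issuer`.
# The matched substrings are assumptions about the issuer text.
issuer_kind() {
  case "$1" in
    *"Let's Encrypt"*)
      echo "publicly trusted (Let's Encrypt)" ;;
    *"CloudFlare Origin"*|*"Cloudflare Origin"*)
      echo "Cloudflare Origin (not browser-trusted)" ;;
    *)
      echo "unknown issuer" ;;
  esac
}

# Usage against the LAN path (requires network access to NPM):
#   issuer=$(openssl s_client -connect 100.85.21.51:443 -servername nb.vish.gg \
#              </dev/null 2>/dev/null | openssl x509 -noout -issuer)
#   issuer_kind "$issuer"
```

A Cloudflare Origin result on the LAN path would mean a proxy host is still bound to the old certificate.
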
## What Changes for Each Path

### LAN Client
```
Browser → nb.vish.gg
  → AdGuard DNS: 100.85.21.51 (NPM Tailscale IP)
  → NPM (matrix-ubuntu:443) → SSL termination (LE wildcard cert)
  → Proxy to backend (192.168.0.210:8443)
  → Response (~1ms total DNS+proxy)
```

### External Client
```
Browser → nb.vish.gg
  → Cloudflare DNS: 104.21.73.214
  → Cloudflare proxy → WAN IP → Router
  → NPM (matrix-ubuntu:443) → SSL termination
  → Proxy to backend (192.168.0.210:8443)
  → Response (~50ms total)
```

### Internet Down
```
Browser → nb.vish.gg
  → AdGuard DNS: 100.85.21.51 (cached/local)
  → NPM (matrix-ubuntu:443) → SSL termination
  → Proxy to backend
  → Response (services still work!)
```

## Current NPM Proxy Hosts (for reference)

All 36 domains that would benefit from split-horizon:

### vish.gg (27 domains)
| Domain | Backend |
|--------|---------|
| actual.vish.gg | calypso:8304 |
| cal.vish.gg | atlantis:12852 |
| dash.vish.gg | atlantis:7575 |
| dav.vish.gg | calypso:8612 |
| docs.vish.gg | calypso:8777 |
| gf.vish.gg | homelab-vm:3300 |
| git.vish.gg | calypso:3052 |
| headscale.vish.gg | calypso:8085 |
| kuma.vish.gg | rpi5:3001 |
| mastodon.vish.gg | matrix-ubuntu:3000 |
| mx.vish.gg | matrix-ubuntu:8082 |
| nb.vish.gg | homelab-vm:8443 |
| npm.vish.gg | calypso:81 |
| ntfy.vish.gg | homelab-vm:8081 |
| ollama.vish.gg | atlantis:11434 |
| ost.vish.gg | calypso:3000 |
| paperless.vish.gg | calypso:8777 |
| pt.vish.gg | atlantis:10000 |
| pw.vish.gg | atlantis:4080 |
| rackula.vish.gg | calypso:3891 |
| retro.vish.gg | calypso:8025 |
| rx.vish.gg | calypso:9751 |
| rxdl.vish.gg | calypso:9753 |
| scrutiny.vish.gg | homelab-vm:8090 |
| sf.vish.gg | calypso:8611 |
| sso.vish.gg | calypso:9000 |
| wizarr.vish.gg | atlantis:5690 |

### thevish.io (5 domains)
| Domain | Backend |
|--------|---------|
| binterest.thevish.io | homelab-vm:21544 |
| hoarder.thevish.io | homelab-vm:3482 |
| joplin.thevish.io | atlantis:22300 |
| matrix.thevish.io | matrix-ubuntu:8081 |
| meet.thevish.io | atlantis:5443 |

### crista.love (3 domains)
| Domain | Backend |
|--------|---------|
| crista.love | guava:28888 |
| cocalc.crista.love | guava:8080 |
| mm.crista.love | matrix-ubuntu:8065 |

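## Rollback

If something breaks, use either of these options:

1. Change router DHCP DNS back to `1.1.1.1` / `8.8.8.8`
2. Remove the DNS rewrites from AdGuard

Either way, all traffic reverts to the Cloudflare path. Removing the rewrites takes effect as soon as clients drop their cached answers; changing the DHCP DNS servers takes effect once clients renew their lease or reconnect.
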
## Related Documentation

- [NPM Migration](npm-migration-jan2026.md): Reverse proxy configuration
- [Authentik SSO](authentik-sso.md): Forward auth depends on NPM routing
- [Cloudflare DNS](cloudflare-dns.md): External DNS records
- [Image Update Guide](../admin/IMAGE_UPDATE_GUIDE.md): Mentions Gitea/NPM as bootstrap dependencies