Sanitized mirror from private repository - 2026-03-20 09:03:20 UTC
docs/infrastructure/split-horizon-dns.md
# Split-Horizon DNS Implementation Guide

Last updated: 2026-03-20

## Problem

All DNS queries for `*.vish.gg`, `*.thevish.io`, and `*.crista.love` currently resolve to Cloudflare proxy IPs (104.21.x.x), even when the client is on the same LAN as the services. This means:

1. **Hairpin NAT:** LAN traffic goes out to Cloudflare and back in through the router
2. **Internet dependency:** if the WAN link goes down, LAN services are unreachable by domain
3. **Added latency:** ~50ms round trip through Cloudflare vs ~1ms on the LAN
4. **Cloudflare bottleneck:** all traffic is proxied through CF even when unnecessary

## Solution

Use AdGuard Home on Calypso as a **split-horizon DNS resolver** that returns local IPs for homelab domains when queried from the LAN, while external clients continue to use Cloudflare.

```
┌──────────────────────────────────┐
│          DNS Query for           │
│            nb.vish.gg            │
└───────────────┬──────────────────┘
                │
┌───────────────▼──────────────────┐
│       Where is the client?       │
└──────┬────────────────┬──────────┘
       │                │
  LAN Client     External Client
       │                │
       ▼                ▼
┌──────────────┐ ┌──────────────┐
│ AdGuard Home │ │  Cloudflare  │
│  (Calypso)   │ │     DNS      │
│              │ │              │
│ Returns:     │ │ Returns:     │
│ 192.168.0.250│ │ 104.21.73.214│
│ (NPM local)  │ │ (CF proxy)   │
└──────┬───────┘ └──────┬───────┘
       │                │
       ▼                ▼
┌──────────────┐ ┌──────────────┐
│ NPM (local)  │ │ Cloudflare   │
│ calypso:443  │ │ → WAN IP     │
│ ~1ms         │ │ → NPM        │
└──────┬───────┘ │ ~50ms        │
       │         └──────┬───────┘
       ▼                ▼
┌─────────────────────────────────┐
│         Backend Service         │
│   (same result, faster path)    │
└─────────────────────────────────┘
```

## Prerequisites

Before implementing split-horizon DNS, NPM must listen on the standard ports (80/443) so that LAN clients can reach it without specifying a port. Currently NPM uses temporary ports left over from the migration:

| Current (host:container) | Target (host:container) |
|--------------------------|-------------------------|
| 8880:80 | **80:80** |
| 8443:443 | **443:443** |
| 81:81 | 81:81 (unchanged) |

## Implementation Steps

### Step 1: Move NPM to Standard Ports

The NPM compose file at `hosts/synology/calypso/nginx-proxy-manager.yaml` has a comment noting that the ports are temporary. To change them:

1. **Stop Synology's built-in nginx** from binding ports 80/443 (if active):
   - DSM → Control Panel → Login Portal → Web Services → change port from 80/443 to 5000/5001
   - Or via SSH: `sudo synosystemctl stop nginx`

2. **Update the compose file:**
   ```yaml
   ports:
     - "80:80"    # HTTP
     - "443:443"  # HTTPS
     - "81:81"    # Admin UI
   ```

3. **Update the router port forwarding:**
   - Change `WAN:443 → 192.168.0.250:8443` to `WAN:443 → 192.168.0.250:443`
   - Change `WAN:80 → 192.168.0.250:8880` to `WAN:80 → 192.168.0.250:80`

4. **Redeploy NPM:** push the compose change to git; CI deploys it automatically.

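Before redeploying, it can help to confirm that nothing on Calypso still holds ports 80 and 443. The following is a minimal sketch, not part of the existing tooling: `port_in_use` is a hypothetical helper that reads listener output on stdin and assumes the local address is in the fourth column, which is the layout of both `ss -ltn` and `netstat -tln`.

```bash
# Hypothetical helper: succeeds (exit 0) if any listener in the piped
# `ss -ltn` / `netstat -tln` output is bound to the given port.
port_in_use() {
  # Column 4 is the local address, e.g. 0.0.0.0:80 or [::]:443;
  # the port is whatever follows the last colon.
  awk -v p="$1" '
    { n = split($4, parts, ":"); if (parts[n] == p) found = 1 }
    END { exit !found }
  '
}

# Example (run on Calypso; assumes `ss` is installed, otherwise use netstat):
#   for port in 80 443; do
#     if ss -ltn | port_in_use "$port"; then
#       echo "port $port is still in use"
#     fi
#   done
```

If either port is reported as in use, the new `80:80` / `443:443` mappings will likely fail to bind when the container starts.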
### Step 2: Configure AdGuard DNS Rewrites

In AdGuard Home on Calypso (http://192.168.0.250:9080), go to **Filters → DNS rewrites** and add wildcard entries:

| Domain | Answer | Notes |
|--------|--------|-------|
| `*.vish.gg` | `192.168.0.250` | All vish.gg domains → NPM on Calypso |
| `*.thevish.io` | `192.168.0.250` | All thevish.io domains → NPM on Calypso |
| `*.crista.love` | `192.168.0.250` | All crista.love domains → NPM on Calypso |

These three wildcards cover all 36 proxy hosts. AdGuard answers matching queries locally instead of forwarding them to upstream DNS. The bare apex `crista.love` is also a proxy host; a `*.crista.love` wildcard may not match the apex itself, so add an explicit `crista.love` → `192.168.0.250` rewrite if it does not resolve locally.

**Exceptions:** these domains must resolve to the host that runs the service rather than relying on the wildcard. Add them as specific overrides:

| Domain | Answer | Reason |
|--------|--------|--------|
| `mx.vish.gg` | `192.168.0.154` | Matrix federation needs direct access on port 8448 |
| `derp.vish.gg` | `192.168.0.250` | DERP relay: direct IP, no CF proxy |
| `derp-atl.vish.gg` | `192.168.0.200` | Atlantis DERP relay |
| `headscale.vish.gg` | `192.168.0.250` | Headscale control: direct access |
| `turn.thevish.io` | `192.168.0.200` | TURN/STUN needs direct UDP |

Specific entries take priority over wildcards in AdGuard.

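The rewrites above can also be added without the UI. The sketch below assumes AdGuard Home's HTTP API exposes `POST /control/rewrite/add` with a JSON body of `domain` and `answer` and uses basic auth; check this against your AdGuard Home version before relying on it. `rewrite_payload`, `add_rewrite`, and the `ADGUARD_*` variables are hypothetical names introduced for illustration.

```bash
# Assumed defaults; ADGUARD_USER / ADGUARD_PASS must be set by the caller.
: "${ADGUARD_URL:=http://192.168.0.250:9080}"

# Build the JSON body for one rewrite entry (domain, answer).
rewrite_payload() {
  printf '{"domain":"%s","answer":"%s"}\n' "$1" "$2"
}

# Send one rewrite to AdGuard Home (assumed endpoint: /control/rewrite/add).
add_rewrite() {
  rewrite_payload "$1" "$2" |
    curl -fsS -u "$ADGUARD_USER:$ADGUARD_PASS" \
      -H 'Content-Type: application/json' \
      -d @- "$ADGUARD_URL/control/rewrite/add"
}

# Example:
#   for d in '*.vish.gg' '*.thevish.io' '*.crista.love'; do
#     add_rewrite "$d" 192.168.0.250
#   done
#   add_rewrite mx.vish.gg 192.168.0.154
```

Scripting the entries makes it easier to apply the identical set to a second AdGuard instance by changing only `ADGUARD_URL`.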
### Step 3: Set AdGuard as LAN DNS Server

Configure the router (Archer BE800) to hand out AdGuard's IP as the DNS server via DHCP:

1. **Router admin** → DHCP Settings → DNS Server
2. Set Primary DNS: `192.168.0.250` (Calypso/AdGuard)
3. Set Secondary DNS: `192.168.68.100` (NUC/AdGuard, backup)

Alternatively, configure individual devices: set the DNS server to `192.168.0.250` in `/etc/resolv.conf` or in the device's network settings.

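To confirm that a Linux client actually picked up the new resolver, the first `nameserver` entry can be read back. This is a small sketch; `first_nameserver` is a hypothetical helper, and on hosts running systemd-resolved `/etc/resolv.conf` usually shows a local stub address (such as `127.0.0.53`) rather than the upstream server.

```bash
# Hypothetical helper: print the first nameserver from a resolv.conf-style
# file (defaults to /etc/resolv.conf).
first_nameserver() {
  awk '$1 == "nameserver" { print $2; exit }' "${1:-/etc/resolv.conf}"
}

# Example:
#   [ "$(first_nameserver)" = "192.168.0.250" ] || echo "client is not using AdGuard yet"
```

Clients only pick up the router's new DHCP setting when they renew their lease, so a stale value here usually means the lease has not been refreshed.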
### Step 4: Configure NUC AdGuard (Backup DNS)

Add the same DNS rewrites to the NUC's AdGuard instance so that it returns identical answers when used as the backup resolver:

- Same wildcard rewrites (and exception overrides) as Calypso
- Reachable at `192.168.68.100` or `100.72.55.21` (Tailscale)

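Because the two AdGuard instances are configured by hand, their rewrites can drift apart. A rough way to spot that is to ask both resolvers for the same names and report differences. `compare_resolvers` below is a hypothetical helper that relies only on `dig +short`; the domain list in the example is illustrative.

```bash
# Hypothetical helper: query each domain against two resolvers and print
# a line for every domain whose final answer differs.
# Usage: compare_resolvers <resolver-a> <resolver-b> <domain>...
compare_resolvers() {
  a="$1"; b="$2"; shift 2
  for d in "$@"; do
    ra=$(dig +short "$d" "@$a" | tail -1)
    rb=$(dig +short "$d" "@$b" | tail -1)
    [ "$ra" = "$rb" ] || echo "MISMATCH $d: $a=$ra $b=$rb"
  done
}

# Example:
#   compare_resolvers 192.168.0.250 192.168.68.100 nb.vish.gg mx.vish.gg turn.thevish.io
```

No output means both resolvers agree for every name checked; including at least one exception domain such as `mx.vish.gg` verifies that the overrides were copied as well as the wildcards.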
### Step 5: Test

```bash
# Verify local resolution
dig nb.vish.gg @192.168.0.250
# Expected: 192.168.0.250 (NPM local IP)

# Verify external resolution still works
dig nb.vish.gg @1.1.1.1
# Expected: 104.21.73.214 (Cloudflare proxy)

# Test HTTPS directly against NPM (--resolve pins the IP, so DNS is not consulted).
# Add -k while NPM still serves the Cloudflare Origin cert, or curl fails cert verification.
curl -s --resolve "nb.vish.gg:443:192.168.0.250" https://nb.vish.gg/ -o /dev/null -w "%{http_code} %{time_total}s\n"
# Expected: 200 in ~0.05s (vs ~0.15s through Cloudflare)

# Test that all domains resolve locally
for domain in nb.vish.gg gf.vish.gg git.vish.gg sso.vish.gg dash.vish.gg; do
  ip=$(dig +short "$domain" @192.168.0.250 | tail -1)
  echo "$domain → $ip"
done
```

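When checking many domains, it is easier to label each answer than to read raw IPs. The sketch below is an illustration only: `classify_ip` is a hypothetical helper, and its `104.21.*` pattern covers just the Cloudflare range quoted in this guide, not Cloudflare's full published address list.

```bash
# Hypothetical helper: label an IPv4 answer as LAN, Cloudflare, or other.
classify_ip() {
  case "$1" in
    "") echo none ;;                                   # no answer returned
    10.*|192.168.*|172.1[6-9].*|172.2[0-9].*|172.3[01].*)
      echo lan ;;                                      # RFC 1918 private ranges
    104.21.*) echo cloudflare ;;                       # range quoted in this guide (assumption)
    *) echo other ;;
  esac
}

# Example:
#   for domain in nb.vish.gg gf.vish.gg git.vish.gg; do
#     ip=$(dig +short "$domain" @192.168.0.250 | tail -1)
#     echo "$domain → $ip ($(classify_ip "$ip"))"
#   done
```

Any domain that is still labelled `cloudflare` when queried against `192.168.0.250` is not covered by a rewrite.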
## SSL Considerations

NPM currently serves the **Cloudflare Origin Certificate** for `*.vish.gg` (valid until 2041). That certificate behaves differently on the two paths:

- **Through Cloudflare:** Cloudflare's proxy trusts the Origin CA, and browsers only ever see Cloudflare's publicly trusted edge certificate.
- **Direct to NPM (LAN):** the browser sees the origin cert itself and shows an untrusted-certificate warning, because the Cloudflare Origin CA is not in public trust stores.

So split-horizon DNS works at the network level, but LAN clients will hit certificate warnings until the certificate is changed.

**Fix options:**

1. **Use Let's Encrypt certs in NPM** instead of Cloudflare Origin: trusted everywhere, works for both paths
2. **Accept the warning** for LAN-only access (add an exception in the browser)
3. **Keep the origin cert and add the Cloudflare Origin CA root** to each LAN client's trust store. The Cloudflare SSL mode ("Full" vs "Full (strict)") only controls how Cloudflare validates the origin, so changing it does not help LAN clients by itself

**Recommended:** Switch to Let's Encrypt with the DNS challenge (Cloudflare API) for the wildcard certs. NPM supports this natively. This gives you certs trusted by both Cloudflare and direct LAN connections.

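To see which certificate NPM presents on the direct path, the issuer can be read with `openssl`. This is a sketch under assumptions: `cert_issuer` and `issuer_kind` are hypothetical helpers, and the issuer substrings matched below ("CloudFlare ... Origin", "Let's Encrypt") are assumed spellings that should be compared against the real output.

```bash
# Print issuer and expiry of the cert served at <ip>:443 for <hostname> (SNI).
cert_issuer() {
  openssl s_client -connect "$1:443" -servername "$2" </dev/null 2>/dev/null |
    openssl x509 -noout -issuer -enddate
}

# Roughly classify an issuer line (case-insensitive substring match; the
# matched strings are assumptions, not an authoritative list).
issuer_kind() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    *cloudflare*origin*) echo cloudflare-origin ;;
    *"let's encrypt"*)   echo lets-encrypt ;;
    *)                   echo unknown ;;
  esac
}

# Example:
#   issuer_kind "$(cert_issuer 192.168.0.250 nb.vish.gg | head -1)"
```

A result of `cloudflare-origin` on the LAN path means browsers will still warn; after switching NPM to Let's Encrypt the same check should report `lets-encrypt`.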
## What Changes for Each Path

### LAN Client (after implementation)

```
Browser → nb.vish.gg
  → AdGuard DNS: 192.168.0.250
  → NPM (calypso:443) → SSL termination
  → Proxy to backend (192.168.0.210:8443)
  → Response (~1ms total DNS+proxy)
```

### External Client (unchanged)

```
Browser → nb.vish.gg
  → Cloudflare DNS: 104.21.73.214
  → Cloudflare proxy → WAN IP → Router
  → NPM (calypso:443) → SSL termination
  → Proxy to backend (192.168.0.210:8443)
  → Response (~50ms total)
```

### Internet Down (new capability)

```
Browser → nb.vish.gg
  → AdGuard DNS: 192.168.0.250 (cached/local)
  → NPM (calypso:443) → SSL termination
  → Proxy to backend
  → Response (services still work!)
```

## Current NPM Proxy Hosts (for reference)

All 36 domains that would benefit from split-horizon:

### vish.gg (27 domains)

| Domain | Backend |
|--------|---------|
| actual.vish.gg | calypso:8304 |
| cal.vish.gg | atlantis:12852 |
| dash.vish.gg | atlantis:7575 |
| dav.vish.gg | calypso:8612 |
| docs.vish.gg | calypso:8777 |
| gf.vish.gg | homelab-vm:3300 |
| git.vish.gg | calypso:3052 |
| headscale.vish.gg | calypso:8085 |
| kuma.vish.gg | rpi5:3001 |
| mastodon.vish.gg | matrix-ubuntu:3000 |
| mx.vish.gg | matrix-ubuntu:8082 |
| nb.vish.gg | homelab-vm:8443 |
| npm.vish.gg | calypso:81 |
| ntfy.vish.gg | homelab-vm:8081 |
| ollama.vish.gg | atlantis:11434 |
| ost.vish.gg | calypso:3000 |
| paperless.vish.gg | calypso:8777 |
| pt.vish.gg | atlantis:10000 |
| pw.vish.gg | atlantis:4080 |
| rackula.vish.gg | calypso:3891 |
| retro.vish.gg | calypso:8025 |
| rx.vish.gg | calypso:9751 |
| rxdl.vish.gg | calypso:9753 |
| scrutiny.vish.gg | homelab-vm:8090 |
| sf.vish.gg | calypso:8611 |
| sso.vish.gg | calypso:9000 |
| wizarr.vish.gg | atlantis:5690 |

### thevish.io (5 domains)

| Domain | Backend |
|--------|---------|
| binterest.thevish.io | homelab-vm:21544 |
| hoarder.thevish.io | homelab-vm:3482 |
| joplin.thevish.io | atlantis:22300 |
| matrix.thevish.io | matrix-ubuntu:8081 |
| meet.thevish.io | atlantis:5443 |

### crista.love (3 domains)

| Domain | Backend |
|--------|---------|
| crista.love | guava:28888 |
| cocalc.crista.love | guava:8080 |
| mm.crista.love | matrix-ubuntu:8065 |

## Rollback

If something breaks:

1. Change the router's DHCP DNS back to `1.1.1.1` / `8.8.8.8`
2. Or remove the DNS rewrites from AdGuard
3. Traffic reverts to the Cloudflare path once clients renew their DHCP lease (option 1) or their cached DNS answers expire (option 2)

## Related Documentation

- [NPM Migration](npm-migration-jan2026.md): Reverse proxy configuration
- [Authentik SSO](authentik-sso.md): Forward auth depends on NPM routing
- [Cloudflare DNS](cloudflare-dns.md): External DNS records
- [Image Update Guide](../admin/IMAGE_UPDATE_GUIDE.md): Mentions Gitea/NPM as bootstrap dependencies