Sanitized mirror from private repository - 2026-04-06 21:14:57 UTC
docs/infrastructure/headscale-migration-guide.md
# Headscale Migration Guide

## Overview

This homelab uses a self-hosted [Headscale](https://github.com/juanfont/headscale) instance instead of Tailscale cloud. Headscale is a drop-in open-source replacement for the Tailscale control server.

- **Headscale server**: `https://headscale.vish.gg:8443`
- **MagicDNS suffix**: `tail.vish.gg` (e.g. `atlantis.tail.vish.gg`)
- **Login**: Authentik SSO at `sso.vish.gg`, with username `vish` or email `admin@thevish.io`
- **Hosted on**: Calypso (`192.168.0.250`), managed via Docker

---
## Connecting a New Device

### Linux (Ubuntu / Debian)

1. Install Tailscale if it is not already installed:
   ```bash
   curl -fsSL https://tailscale.com/install.sh | sh
   ```

2. Connect to Headscale:
   ```bash
   sudo tailscale up \
     --login-server=https://headscale.vish.gg:8443 \
     --accept-routes \
     --force-reauth
   ```

3. A browser auth URL will be printed. Open it and log in with Authentik SSO.

4. If DNS doesn't resolve `headscale.vish.gg` (e.g. a fresh machine with no AdGuard), add a temporary hosts entry first:
   ```bash
   echo '184.23.52.14 headscale.vish.gg' | sudo tee -a /etc/hosts
   # Run tailscale up, then clean up:
   sudo sed -i '/headscale.vish.gg/d' /etc/hosts
   ```

5. If the machine was previously on Tailscale cloud and `tailscale up` complains about non-default flags, it prints the exact command with all required flags. Copy and run that command.

> **Note**: After registration, an admin must approve the node and, if the original Tailscale IP should be preserved, fix the IP (see the Admin section below).

---
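
The temporary hosts entry from step 4 of the Linux instructions can be sketched as two pure functions over the file's text. This is only an illustration: `add_hosts_entry` and `remove_hosts_entry` are hypothetical helper names, not part of any tool used in this guide. Unlike the `sed` one-liner, which deletes every line containing the string, the removal here only drops lines where the name appears as a hostname field.

```python
def _hostnames(line: str) -> list:
    """Return the hostname fields of one hosts-file line (comments ignored)."""
    fields = line.split("#", 1)[0].split()
    return fields[1:]  # field 0 is the IP address


def add_hosts_entry(hosts_text: str, ip: str, name: str) -> str:
    """Append 'ip name' unless an existing line already maps that name."""
    for line in hosts_text.splitlines():
        if name in _hostnames(line):
            return hosts_text  # already present, leave the text unchanged
    if hosts_text and not hosts_text.endswith("\n"):
        hosts_text += "\n"
    return hosts_text + f"{ip} {name}\n"


def remove_hosts_entry(hosts_text: str, name: str) -> str:
    """Drop every line that maps the given name; keep all other lines."""
    kept = [
        line
        for line in hosts_text.splitlines(keepends=True)
        if name not in _hostnames(line)
    ]
    return "".join(kept)
```

A caller would read `/etc/hosts`, pass its contents through `add_hosts_entry(text, "184.23.52.14", "headscale.vish.gg")`, write the result back as root, run `tailscale up`, and then write back the output of `remove_hosts_entry`.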
### Windows

1. Download and install Tailscale from https://tailscale.com/download/windows

2. Open **PowerShell as Administrator** and run:
   ```powershell
   tailscale up --login-server=https://headscale.vish.gg:8443 --accept-routes --force-reauth
   ```

3. A browser window will open. Log in with Authentik SSO (`vish` / `admin@thevish.io`).

4. If it shows a "mention all non-default flags" error, copy the exact command it provides, add `--login-server=https://headscale.vish.gg:8443 --force-reauth` to it, and run the result.

> **Important**: Always include `--accept-routes` on Windows; otherwise subnet routes (e.g. `192.168.0.x`) won't be reachable.

---
### iOS (iPhone / iPad)

1. Install **Tailscale** from the App Store.

2. Open the app → tap your **account icon** (top right) → **Log in**

3. Tap the `···` menu (top right of the login screen) → **Use custom coordination server**

4. Enter: `https://headscale.vish.gg:8443` → **Save**

5. Log in with Authentik SSO, using username `vish` or email `admin@thevish.io`

> **Note**: `.vish.local` hostnames do NOT work on iOS, because iOS intercepts `.local` for mDNS and never forwards those lookups to DNS. Use Tailscale IPs (`100.x.x.x`) or MagicDNS names (`hostname.tail.vish.gg`) instead.

---
### macOS

1. Install Tailscale from the App Store or https://tailscale.com/download/mac

2. **Option A (GUI)**: Click the Tailscale menu bar icon → Preferences → hold `Option` while clicking "Log in" to enter a custom server URL → enter `https://headscale.vish.gg:8443`

3. **Option B (CLI)**:
   ```bash
   sudo tailscale up \
     --login-server=https://headscale.vish.gg:8443 \
     --accept-routes \
     --force-reauth
   ```

4. Log in with Authentik SSO when the browser opens.

> **Note**: As on iOS, `.vish.local` hostnames won't resolve on macOS when remote. Use `hostname.tail.vish.gg` or the Tailscale IP instead.

---
### GL.iNet Routers (OpenWrt)

1. SSH into the router.

2. Add a hosts entry (since GL routers don't use AdGuard):
   ```bash
   echo '184.23.52.14 headscale.vish.gg' >> /etc/hosts
   ```

3. Run `tailscale up`. It will fail and print the required flags. Copy the exact command it prints and append:
   ```
   --login-server=https://headscale.vish.gg:8443 --auth-key=<preauth-key> --force-reauth
   ```
   Get a pre-auth key from an admin (see the Admin section below).

4. If advertising subnet routes, add `--advertise-routes=<subnet>` to the command.

---
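The per-platform commands above all combine the same handful of flags, so a small builder can make the combinations explicit. This is a sketch only: `tailscale_up_args` and `as_shell` are hypothetical helper names, and the builder covers just the flags that appear in this guide. When `tailscale up` reports the "mention all non-default flags" error, the command it prints remains the authoritative starting point.

```python
import shlex
from typing import List, Optional

LOGIN_SERVER = "https://headscale.vish.gg:8443"  # Headscale URL from this guide


def tailscale_up_args(
    accept_routes: bool = True,
    auth_key: Optional[str] = None,
    advertise_routes: Optional[List[str]] = None,
    login_server: str = LOGIN_SERVER,
) -> List[str]:
    """Build the argv for `tailscale up` against the Headscale server."""
    args = ["tailscale", "up", f"--login-server={login_server}"]
    if accept_routes:
        args.append("--accept-routes")
    if auth_key:
        # Pre-auth key from an admin; replaces the browser login.
        args.append(f"--auth-key={auth_key}")
    if advertise_routes:
        # Several subnets are passed as one comma-separated value.
        args.append("--advertise-routes=" + ",".join(advertise_routes))
    args.append("--force-reauth")
    return args


def as_shell(args: List[str]) -> str:
    """Render the argv as a single copy-pasteable shell line."""
    return shlex.join(args)
```

For a router that advertises its LAN, `as_shell(tailscale_up_args(auth_key="<preauth-key>", advertise_routes=["192.168.12.0/24"]))` yields a single line to paste into the SSH session; the placeholder key is quoted by `shlex.join` and must be replaced with a real one.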
### Home Assistant (Tailscale Add-on)

> **Note**: HA Green does not expose SSH by default. Use the WebSocket API approach below,
> which works fully remotely via a Tailscale-connected hop host.

**Remote migration steps** (no physical access required):

1. Reach HA via a hop host on the same LAN (e.g. jellyfish at `100.69.121.120`):
   ```bash
   ssh lulu@100.69.121.120
   curl http://192.168.12.202:8123/api/  # confirm HA is reachable
   ```

2. If the add-on was previously authenticated to Tailscale cloud, it will refuse the
   `--login-server` change with: `can't change --login-server without --force-reauth`.
   **Fix**: uninstall and reinstall the add-on via the Supervisor API to clear `tailscaled.state`.
   Send these messages over the HA WebSocket API (`supervisor/api` command). Each message also needs a unique, increasing integer `id` field, omitted here:
   ```json
   {"type": "supervisor/api", "endpoint": "/addons/a0d7b954_tailscale/uninstall", "method": "post"}
   {"type": "supervisor/api", "endpoint": "/addons/a0d7b954_tailscale/install", "method": "post"}
   ```

3. Set options before starting:
   ```json
   {"type": "supervisor/api", "endpoint": "/addons/a0d7b954_tailscale/options", "method": "post",
    "data": {"options": {"login_server": "https://headscale.vish.gg:8443", "accept_dns": false}}}
   ```

4. Start the add-on via the `hassio.addon_start` service, then read the logs:
   ```
   GET http://192.168.12.202:8123/api/hassio/addons/a0d7b954_tailscale/logs
   ```
   Look for: `AuthURL is https://headscale.vish.gg:8443/register/<key>`

5. Register on Calypso:
   ```bash
   docker exec headscale headscale nodes register --user vish --key <key-from-log>
   ```

6. Fix the IP via SQLite (see "Preserve Original Tailscale IP" in the Admin section below) and restart Headscale.

---
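The Home Assistant steps above boil down to three `supervisor/api` messages plus picking the registration key out of the add-on log. The sketch below builds those messages and parses the log line. It assumes the WebSocket connection has already been opened and authenticated with a long-lived access token, and it does no networking itself. `supervisor_call`, `migration_messages`, and `registration_key` are hypothetical helper names.

```python
import json
import re
from typing import Any, Dict, Iterator, Optional

ADDON = "a0d7b954_tailscale"  # add-on slug used in this guide


def supervisor_call(msg_id: int, endpoint: str, method: str = "post",
                    data: Optional[Dict[str, Any]] = None) -> str:
    """Serialize one supervisor/api command; HA expects a unique id per message."""
    msg: Dict[str, Any] = {
        "id": msg_id,
        "type": "supervisor/api",
        "endpoint": endpoint,
        "method": method,
    }
    if data is not None:
        msg["data"] = data
    return json.dumps(msg)


def migration_messages(login_server: str) -> Iterator[str]:
    """Yield the uninstall, install, and options messages in order."""
    yield supervisor_call(1, f"/addons/{ADDON}/uninstall")
    yield supervisor_call(2, f"/addons/{ADDON}/install")
    yield supervisor_call(
        3,
        f"/addons/{ADDON}/options",
        data={"options": {"login_server": login_server, "accept_dns": False}},
    )


# Matches the log line quoted in step 4 and captures the trailing key.
AUTH_URL = re.compile(r"AuthURL is https://\S+/register/(\S+)")


def registration_key(log_text: str) -> Optional[str]:
    """Return the key from the first AuthURL line, or None if absent."""
    match = AUTH_URL.search(log_text)
    return match.group(1) if match else None
```

The value returned by `registration_key` is what goes into `headscale nodes register --user vish --key <key-from-log>` on Calypso.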
## Admin: Registering a New Node

After a node connects, an admin needs to work through the steps below as applicable:

### 1. Generate a Pre-Auth Key (optional, avoids browser auth)

```bash
ssh -p 62000 Vish@192.168.0.250
sudo /volume1/@appstore/REDACTED_APP_PASSWORD/usr/bin/docker exec headscale \
  headscale preauthkeys create --user 1 --expiration 1h
```

Pass `--auth-key=<key>` to `tailscale up` instead of using browser auth.
### 2. Check Registered Nodes

```bash
sudo /volume1/@appstore/REDACTED_APP_PASSWORD/usr/bin/docker exec headscale headscale nodes list
```
### 3. Preserve Original Tailscale IP (if migrating from Tailscale cloud)

Headscale v0.28+ removed the `--ipv4` flag, so set the IP directly in the SQLite database and then restart Headscale:

```bash
sudo sqlite3 /volume1/@docker/volumes/headscale-data/_data/db.sqlite \
  "UPDATE nodes SET ipv4='100.x.x.x' WHERE id=<node-id>;"
sudo /volume1/@appstore/REDACTED_APP_PASSWORD/usr/bin/docker restart headscale
```
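Because a raw `UPDATE` will accept a typo or an address another node already holds, a guarded version may be worth sketching. The function below assumes only what the SQL above shows, namely a `nodes` table with `id` and `ipv4` columns; the rest of the Headscale schema is not relied on. `set_node_ipv4` is a hypothetical helper name, and the `100.64.0.0/10` check reflects the CGNAT range that Tailscale addresses come from.

```python
import ipaddress
import sqlite3

CGNAT = ipaddress.ip_network("100.64.0.0/10")  # Tailscale address range


def set_node_ipv4(conn: sqlite3.Connection, node_id: int, ipv4: str) -> None:
    """Set nodes.ipv4 for one node after basic sanity checks."""
    addr = ipaddress.ip_address(ipv4)  # raises ValueError on malformed input
    if addr not in CGNAT:
        raise ValueError(f"{ipv4} is outside {CGNAT}")
    clash = conn.execute(
        "SELECT id FROM nodes WHERE ipv4 = ? AND id != ?", (ipv4, node_id)
    ).fetchone()
    if clash:
        raise ValueError(f"{ipv4} is already assigned to node {clash[0]}")
    cur = conn.execute(
        "UPDATE nodes SET ipv4 = ? WHERE id = ?", (ipv4, node_id)
    )
    if cur.rowcount != 1:
        raise LookupError(f"no node with id {node_id}")
    conn.commit()
```

It would be called with a connection to a copy of `db.sqlite`, or to the live file while the container is stopped, for example `set_node_ipv4(conn, 19, "100.112.186.90")`, followed by the same `docker restart headscale`.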
### 4. Rename a Node

```bash
sudo /volume1/@appstore/REDACTED_APP_PASSWORD/usr/bin/docker exec headscale \
  headscale nodes rename -i <id> <new-name>
```
### 5. Approve Subnet Routes

Routes advertised by nodes must be explicitly approved:

```bash
sudo /volume1/@appstore/REDACTED_APP_PASSWORD/usr/bin/docker exec headscale \
  headscale nodes approve-routes -i <node-id> -r <subnet>
# e.g. -r 192.168.0.0/24
```

Check all routes (in v0.28, routes are embedded in the node JSON output):

```bash
sudo /volume1/@appstore/REDACTED_APP_PASSWORD/usr/bin/docker exec headscale \
  headscale nodes list --output json | python3 -c "
import sys, json
# Print every node that advertises at least one route
for n in json.load(sys.stdin):
    r = n.get('available_routes', [])
    a = n.get('approved_routes', [])
    if r: print(n['given_name'], 'available:', r, 'approved:', a)
"
```

---
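The inline route check above prints every advertising node. A variant that reports only routes still waiting for approval might look like the sketch below. It relies on the same JSON fields as the one-liner (`given_name`, `available_routes`, `approved_routes`); `unapproved_routes` is a hypothetical helper name.

```python
import json
from typing import Dict, List


def unapproved_routes(nodes_json: str) -> Dict[str, List[str]]:
    """Map node name -> routes that are advertised but not yet approved."""
    pending: Dict[str, List[str]] = {}
    for node in json.loads(nodes_json):
        # Either field may be missing or null for nodes without routes.
        available = node.get("available_routes") or []
        approved = set(node.get("approved_routes") or [])
        missing = [route for route in available if route not in approved]
        if missing:
            pending[node["given_name"]] = missing
    return pending
```

Fed with the output of `headscale nodes list --output json`, an empty result means nothing is waiting; otherwise each entry names a node and the subnets to pass to `approve-routes`.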
## DNS Notes

- **MagicDNS**: Headscale pushes `192.168.0.250` (Calypso AdGuard) as the DNS server to all tailnet clients
- **AdGuard rewrites**: `*.vish.local` names resolve to their Tailscale IPs via AdGuard rewrites on Calypso
- **`.vish.local` on iOS/macOS**: Does NOT work remotely, because iOS/macOS intercept `.local` for mDNS. Use `hostname.tail.vish.gg` instead
- **External DNS**: `headscale.vish.gg` resolves to `184.23.52.14` (home WAN) externally and to `192.168.0.250` internally via an AdGuard rewrite
## Uptime Kuma Monitoring

Kuma runs on **pi-5** (`100.77.151.40`) inside the `uptime-kuma` container. DB at `/app/data/kuma.db`.

### Monitor groups and hosts

| Group | Host | Tailscale IP |
|-------|------|-------------|
| Homelab | `homelab.tail.vish.gg` | `100.67.40.126` |
| Atlantis | `atlantis.tail.vish.gg` | `100.83.230.112` |
| Calypso | `calypso.tail.vish.gg` | `100.103.48.78` |
| Concord_NUC | `vish-concord-nuc.tail.vish.gg` | `100.72.55.21` |
| Setillo | `setillo.tail.vish.gg` | `100.125.0.20` |
| Proxmox_NUC | `pve.tail.vish.gg` | `100.87.12.28` |
| Guava | `truenas-scale.tail.vish.gg` | `100.75.252.64` |
| Seattle | `seattle.tail.vish.gg` | `100.82.197.124` |
| Raspberry Pi 5 | `100.77.151.40` | `100.77.151.40` |
### Firewall rules required for Kuma (pi-5 = `100.77.151.40`)

Kuma polls via Tailscale IP. Each host with a ts-input/ts-forward chain needs ACCEPT rules for pi-5:

- **Homelab VM**: Rules in `iptables-legacy` ts-input/ts-forward for pi-5 on all monitored ports. Persisted via `netfilter-persistent`.
- **Concord NUC**: Same as the Homelab VM: ts-input/ts-forward ACCEPT rules for pi-5 on monitored ports.
- **Seattle**: UFW rule `ufw allow from 100.77.151.40 to any port 8444`
- **Calypso/Atlantis/Setillo**: No ts-input blocking, because Tailscale runs in userspace mode on Synology.

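
The exact rules are not recorded in this guide, so the following is only a sketch of how the ACCEPT rules for pi-5 could be generated. It assumes the monitored services are TCP and that the rules are inserted at the top of each chain with `-I`. The chain names and the `iptables-legacy` binary come from the list above; the port numbers in the usage note are placeholders, and `accept_rules` is a hypothetical helper name.

```python
from typing import Iterable, List, Sequence

KUMA_IP = "100.77.151.40"  # pi-5, the Uptime Kuma host


def accept_rules(
    ports: Iterable[int],
    source_ip: str = KUMA_IP,
    chains: Sequence[str] = ("ts-input", "ts-forward"),
    binary: str = "iptables-legacy",
) -> List[List[str]]:
    """Return one argv per (chain, port) pair allowing TCP from source_ip."""
    port_list = list(ports)  # allow generators; we iterate once per chain
    rules: List[List[str]] = []
    for chain in chains:
        for port in port_list:
            rules.append([
                binary, "-I", chain,
                "-s", source_ip,
                "-p", "tcp", "--dport", str(port),
                "-j", "ACCEPT",
            ])
    return rules
```

Printing `" ".join(rule)` for each entry of `accept_rules([3300, 8081])` gives four commands to review and run as root before saving them with `netfilter-persistent`.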
### Duplicate service naming

Services that exist on both Atlantis and Calypso use prefixes:

- `[ATL] Sonarr`, `[ATL] Radarr`, etc. for Atlantis
- `[CAL] Sonarr`, `[CAL] Radarr`, etc. for Calypso
### AdGuard DNS fix for `*.tail.vish.gg` on pi-5

Pi-5's Docker daemon was using `100.100.100.100` (Tailscale MagicDNS), but AdGuard on Calypso was forwarding `*.vish.gg` to Cloudflare, which returned stale IPs. This was fixed by adding a private upstream to the AdGuard config at `/volume1/docker/adguard/config/AdGuardHome.yaml`:

```yaml
upstream_dns:
  - "[/tail.vish.gg/]100.100.100.100"
```

---
## NPM Proxy Host Gotcha: Same-Subnet LAN IPs

**Problem**: NPM on Calypso (`192.168.0.250`) cannot reach Docker-published ports on other hosts that are on the same LAN subnet (`192.168.0.x`).

**Root cause**: When the `Tailscale_outbound_connections` DSM task runs `tailscale configure-host` on Calypso, it installs kernel netfilter hooks. After this, traffic from Docker containers on Calypso to a LAN IP on the same subnet bypasses the DNAT rules on the destination host (same-subnet traffic doesn't go through PREROUTING on the target), so the destination's containers are unreachable via their published ports.

**Fix**: Always use the **Tailscale IP** as the `forward_host` in NPM for services running in Docker on other hosts, not the LAN IP.

| Host | Use this in NPM (not LAN IP) |
|------|------------------------------|
| Homelab VM | `100.67.40.126` |
| Guava / TrueNAS | `100.75.252.64` |
| Atlantis | `100.83.230.112` |

**Why it worked pre-Headscale**: Before the migration, Tailscale on Calypso ran in pure userspace mode without kernel netfilter hooks. NPM's outbound packets took the normal kernel path and hit the destination's Docker DNAT rules correctly. The `configure-host` task (which installs kernel hooks) is required for Headscale's subnet routing to work, and it introduced this side effect.

**Known affected proxy hosts** (already fixed to Tailscale IPs):

- `gf.vish.gg` → `100.67.40.126:3300` (Grafana)
- `ntfy.vish.gg` → `100.67.40.126:8081` (NTFY)
- `hoarder.thevish.io` → `100.67.40.126:3482` (Karakeep)
- `binterest.thevish.io` → `100.67.40.126:21544` (Binternet)
- `crista.love` → `100.75.252.64:28888` (Guava nginx/static site)

---
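An audit for the NPM gotcha above can be reduced to one check: does a proxy host's `forward_host` sit in the home LAN subnet? The sketch below takes a plain mapping of domain to `forward_host`. How that mapping is exported from NPM is left open, and `lan_forward_hosts` is a hypothetical helper name.

```python
import ipaddress
from typing import Dict, List

HOME_LAN = ipaddress.ip_network("192.168.0.0/24")  # Calypso's LAN subnet


def lan_forward_hosts(proxy_hosts: Dict[str, str]) -> List[str]:
    """Return the domains whose forward_host is a literal IP in the home LAN."""
    flagged: List[str] = []
    for domain, forward_host in proxy_hosts.items():
        try:
            addr = ipaddress.ip_address(forward_host)
        except ValueError:
            continue  # a hostname rather than an IP literal; not checked here
        if addr in HOME_LAN:
            flagged.append(domain)
    return flagged
```

Entries that point at Calypso itself (`192.168.0.250`) are flagged too, although the gotcha only concerns Docker services on other hosts, so the result is a list to review rather than a list of definite faults.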
## DERP Relay Servers

Three DERP relay regions are configured for redundancy:

| Region | Code | Host | DERP Port | STUN Port | Notes |
|--------|------|------|-----------|-----------|-------|
| 900 | home-cal | headscale.vish.gg:8443 | 8443 | none | Headscale built-in, LAN only |
| 901 | sea | derp-sea.vish.gg:8444 | 8444 | 3478 | Seattle VPS |
| 902 | home-atl | derp-atl.vish.gg:8445 | 8445 | 3480 | Atlantis NAS, added for redundancy |

> **Important**: Tailscale public DERP servers (sfo, nyc, etc.) are disabled. Headscale nodes cannot authenticate through Tailscale's infrastructure. All relay traffic goes through regions 900, 901, or 902.
### DERP Infrastructure Notes

- `derp-sea.vish.gg` → Seattle VPS (`YOUR_WAN_IP`), derper container at `hosts/vms/seattle/derper.yaml`
- `derp-atl.vish.gg` → Home public IP (`184.23.52.14`), router forwards `8445/tcp` + `3480/udp` to Atlantis (`192.168.0.200`)
  - Container deployed as **Portainer stack ID 688** on Atlantis (from `hosts/synology/atlantis/derper.yaml`)
  - TLS cert at `/volume1/docker/derper-atl/certs/live/derp-atl.vish.gg/` (flat `.crt`/`.key` layout required by derper)
  - Cloudflare credentials at `/volume1/docker/derper-atl/secrets/cloudflare.ini`
  - Cert auto-renewed monthly (1st of month, 03:00) by the `derper-atl-cert-renewer` sidecar container
    (certbot/dns-cloudflare + supercronic; logs at `/volume1/docker/derper-atl/certs/renew.log`)
- Port 3478/udp: coturn/Jitsi on Atlantis, do not use
- Port 3479/udp: coturn/Matrix TURN on matrix-ubuntu, do not use
- `derpmap.yaml` lives at `hosts/synology/calypso/derpmap.yaml` in the repo; it must be manually synced to `/volume1/docker/headscale/config/derpmap.yaml` on Calypso after changes
## Subnet Routes in Use

| Subnet | Advertised by | Approved |
|--------|--------------|---------|
| 192.168.0.0/24 | calypso (primary), atlantis | ✅ |
| 192.168.68.0/22 | vish-concord-nuc | ✅ |
| 192.168.69.0/24 | setillo | ✅ |
| 192.168.12.0/24 | gl-mt3000 | ✅ |

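
One detail in the table is easy to miss: `192.168.69.0/24` lies inside `192.168.68.0/22`, which spans `192.168.68.0` to `192.168.71.255`. The sketch below finds such overlaps with the standard `ipaddress` module and shows which route would win for a given address, assuming clients prefer the longest matching prefix. That preference is an assumption here and has not been verified against this tailnet. `overlapping_routes` and `most_specific` are hypothetical helper names.

```python
import ipaddress
from typing import Iterable, List, Optional, Tuple

ROUTES = ["192.168.0.0/24", "192.168.68.0/22", "192.168.69.0/24", "192.168.12.0/24"]


def overlapping_routes(routes: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (inner, outer) pairs where inner is contained in outer."""
    # Shortest prefixes first, so any container precedes what it contains.
    nets = sorted((ipaddress.ip_network(r) for r in routes),
                  key=lambda n: n.prefixlen)
    pairs = []
    for i, outer in enumerate(nets):
        for inner in nets[i + 1:]:
            if inner != outer and inner.subnet_of(outer):
                pairs.append((str(inner), str(outer)))
    return pairs


def most_specific(routes: Iterable[str], address: str) -> Optional[str]:
    """Return the longest-prefix route covering address, or None."""
    addr = ipaddress.ip_address(address)
    matches = [n for n in map(ipaddress.ip_network, routes) if addr in n]
    if not matches:
        return None
    return str(max(matches, key=lambda n: n.prefixlen))
```

With the table's routes, `overlapping_routes(ROUTES)` reports the setillo subnet as nested inside the vish-concord-nuc one, and `most_specific(ROUTES, "192.168.69.10")` picks the `/24`.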
## Node Inventory

| ID | Hostname | Tailscale IP | Status |
|----|----------|-------------|--------|
| 1 | headscale-test | 100.64.0.1 | test LXC |
| 2 | seattle (vmi2076105) | 100.82.197.124 | Seattle VPS |
| 3 | matrix-ubuntu | 100.85.21.51 | |
| 4 | pi-5 | 100.77.151.40 | |
| 5 | vish-concord-nuc | 100.72.55.21 | |
| 6 | setillo | 100.125.0.20 | |
| 7 | pve | 100.87.12.28 | |
| 8 | truenas-scale | 100.75.252.64 | Guava/TrueNAS |
| 9 | ipad-pro | 100.68.71.48 | |
| 10 | iphone16-pro-max | 100.79.252.108 | |
| 11 | atlantis | 100.83.230.112 | |
| 12 | calypso | 100.103.48.78 | Runs headscale |
| 13 | homelab | 100.67.40.126 | |
| 14 | uqiyoe | 100.124.91.52 | Windows laptop |
| 15 | jellyfish | 100.69.121.120 | Remote location |
| 16 | gl-mt3000 | 100.126.243.15 | Remote router |
| 17 | gl-be3600 | 100.105.59.123 | Home router |
### Still to migrate (offline nodes)

Run `tailscale up --login-server=https://headscale.vish.gg:8443 --force-reauth` when they come online:

- kevinlaptop (`100.89.160.65`)
- mah-pc (`100.121.22.51`)
- shinku-ryuu (`100.98.93.15`)
- vish-mint (`100.115.169.43`)
- vishdebian (`100.86.60.62`)
- mastodon-rocky (`100.111.200.21`)
- nvidia-shield (`100.89.79.99`)
- pi-5-kevin (`100.123.246.75`)
- rocky9-playground (`100.105.250.128`)
- samsung-sm-x510 (`100.72.118.117`)
- sd (`100.83.141.1`)
- bluecrownpassionflower (`100.110.25.127`)
- glkvm (`100.64.137.1`)
- google-pixel-10-pro (`100.122.119.40`)
### Home Assistant: Migrated ✅

**Device**: Home Assistant Green at `192.168.12.202:8123` (jellyfish remote location)

**Tailscale IP**: `100.112.186.90` (preserved) | **Node ID**: 19 | **MagicDNS**: `homeassistant.tail.vish.gg`

**Migration completed** remotely (no physical access needed) via:

1. HA WebSocket API (`ws://192.168.12.202:8123/api/websocket`) proxied through jellyfish (`100.69.121.120`)
2. Supervisor `addon_configs` API to set `login_server: https://headscale.vish.gg:8443`
3. Uninstalling and reinstalling the Tailscale add-on to clear the stale `tailscaled.state`
   (necessary because of the `can't change --login-server without --force-reauth` error)
4. Add-on registered against Headscale, with the auth URL approved via `headscale nodes register`
5. IP updated via SQLite: `UPDATE nodes SET ipv4='100.112.186.90' WHERE id=19;`

**Current add-on config**:

```json
{ "login_server": "https://headscale.vish.gg:8443", "accept_dns": false }
```

**Uptime Kuma monitor**: `[JLF] Home Assistant` (ID 5) → `homeassistant.tail.vish.gg:8123`

**HA API token** (expires 2028-06-07):
`eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiIxMzA1ZTE0NDg2ZGY0NDExYmMyOGEwZTY3ZmUyMTc3NyIsImlhdCI6MTc3MzA1MjkzNywiZXhwIjoyMDg4NDEyOTM3fQ.hzqjg7ALTdTDkMJS9Us-RUetQ309Nmfzx4gXevRRlp8` <!-- pragma: allowlist secret -->

---

## Outstanding TODOs

| Priority | Task | Notes |
|----------|------|-------|
| Low | **Migrate offline nodes** | 14 nodes still on Tailscale cloud (listed under "Still to migrate"); migrate when they come online |
| Info | **NPM proxy hosts audit** | Going forward, always use Tailscale IPs in NPM for Docker services on other LAN hosts (see NPM section above) |