Tailscale Mesh Connectivity Test

Last tested: 2026-03-31

Test Method

tailscale ping was run from each SSH-accessible host to every other online Headscale node. Each ping tests WireGuard tunnel establishment and reports whether the connection is direct (peer-to-peer) or relayed via DERP.
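As an illustration of how one ping result maps to a matrix cell, here is a minimal Python sketch. It assumes the usual `pong from <node> (<ip>) via <endpoint> in <latency>` line printed by tailscale ping, where a relayed reply shows `DERP(<region>)` as the endpoint. The function names, host names, and addresses are hypothetical and not part of the original test tooling.

```python
import re
from typing import Optional

# Example lines this pattern is meant to match (hosts and IPs are made up):
#   pong from atlantis (100.64.0.1) via DERP(nyc) in 31ms
#   pong from atlantis (100.64.0.1) via 192.168.0.200:41641 in 2ms
PONG_RE = re.compile(r"^pong from (\S+) \(([^)]+)\).* via (\S+) in (\S+)$")


def classify_pong(line: str) -> Optional[str]:
    """Return "R" for a DERP-relayed pong, "D" for a direct one, else None."""
    match = PONG_RE.match(line.strip())
    if match is None:
        return None
    via = match.group(3)
    return "R" if via.startswith("DERP(") else "D"


def classify_ping_output(output: str) -> Optional[str]:
    """Classify a whole ping run by its last pong line.

    A run can start relayed and then upgrade to direct, so the last
    pong is taken as the settled connection type.
    """
    result = None
    for line in output.splitlines():
        kind = classify_pong(line)
        if kind is not None:
            result = kind
    return result
```

A run whose output contains no pong line at all yields `None`, which would correspond to a failure rather than a D or R cell.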

Results

All 10 online hosts can reach all other online hosts. No failures.

Connection Type Matrix

D = direct, R = DERP relay, - = self, blank = no result recorded

From \ To      Atlantis  Calypso  Pi-5  Homelab-VM  Matrix-Ubuntu  Setillo  NUC  Guava  Seattle  PVE
Atlantis       -         D        D     D           D              D        D    D      D        D
Calypso        D         -        D     D           D              R        D    D      D        D
Pi-5           D         D        -     D           D              D        D    D      D        D
Homelab-VM     D         D        D     -           D              R        D    D      D        D
Matrix-Ubuntu  (tested inbound)
Setillo        D         D        D     R           R              -
NUC            D         D        D     D           R              D        -    R      D        R
Guava          (no CLI)
Seattle        D         D        D     D           D              D        R    D      -        D
PVE            D         D        D     D           D              D        D    D      D        -

Notes

  • Atlantis/Calypso: Tailscale binary at /var/packages/Tailscale/target/bin/tailscale (Synology package)
  • Setillo: Tailscale binary at /usr/local/bin/tailscale
  • Guava: Tailscale runs via TrueNAS built-in management; no tailscale CLI in PATH. Confirmed reachable via inbound pings from all other hosts.
  • Matrix-Ubuntu: SSH via LAN IP (192.168.0.154) was unreliable during testing due to table 52 LAN interception (since fixed). Confirmed reachable via Tailscale IP and inbound pings.
  • DERP relays (NUC ↔ some peers, Setillo ↔ homelab/matrix-ubuntu): Normal for nodes behind different NATs. Adds 15-60ms latency but does not affect reliability. Connections may upgrade to direct over time.

ICMP Ping Notes

Standard ICMP ping from the OS (not tailscale ping) fails for Atlantis, Calypso, and Setillo because those hosts have ICMP blocked at the firewall level. This does not indicate a connectivity problem — tailscale ping and SSH both work.

Tailscale Health Warnings

After fixes applied on 2026-03-31:

Host           Health
Homelab-VM     none
Pi-5           none
NUC            none
Seattle        none
PVE            none (was "--accept-routes is false"; fixed)
Matrix-Ubuntu  none (was "--accept-routes is false"; fixed)
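One way to collect these warnings per host is to read the `Health` array from `tailscale status --json`. The Python sketch below only parses JSON that has already been captured. The function name and the sample warning text are assumptions for illustration, not output recorded during this test.

```python
import json
from typing import List


def health_warnings(status_json: str) -> List[str]:
    """Return the health warnings from `tailscale status --json` output.

    An absent or null "Health" field is treated as "no warnings".
    """
    status = json.loads(status_json)
    return list(status.get("Health") or [])


# Hypothetical sample resembling the warning seen on PVE before the fix.
sample = '{"BackendState": "Running", "Health": ["--accept-routes is false"]}'
print(health_warnings(sample) or "none")
```

An empty list corresponds to a "none" entry in the table above.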

Fixes Applied (2026-03-31)

LAN Routing (table 52 interception)

  1. Pi-5: Enabled --accept-routes, added LAN routing rule (priority 5200), persistent via dispatcher script + cron
  2. Matrix-Ubuntu: Enabled --accept-routes, added LAN routing rule (priority 5200), persistent via dispatcher script + cron. Enabling --accept-routes without the rule broke LAN connectivity (SSH via 192.168.0.154 timed out).
  3. PVE: Enabled --accept-routes, added LAN routing rule (priority 5200), persistent via cron @reboot

See LAN Routing Fix for full details on the table 52 issue.
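The exact rule is documented in the LAN Routing Fix page, not here. As a rough sketch of its likely shape: a policy rule at priority 5200 that sends LAN-destined traffic to the main table is evaluated before Tailscale's own table 52 rule, assuming that rule sits at a higher-numbered priority (5270 on a default Linux install). The subnet 192.168.0.0/24 is an assumption inferred from the LAN addresses in this document, and the helper name is hypothetical.

```python
import ipaddress
from typing import List


def lan_rule_argv(lan_cidr: str, priority: int = 5200) -> List[str]:
    """Build an `ip rule add` command that routes LAN traffic via table main.

    ip_network() raises ValueError for malformed input or a CIDR with host
    bits set, so a typo does not turn into a bogus routing rule.
    """
    network = ipaddress.ip_network(lan_cidr)
    return [
        "ip", "rule", "add",
        "to", str(network),
        "priority", str(priority),
        "lookup", "main",
    ]


# Assumed LAN subnet; substitute the real one before use.
print(" ".join(lan_rule_argv("192.168.0.0/24")))
```

The function only builds the argument list. Running it (for example from a dispatcher script or an @reboot cron entry, as described above) and making it idempotent are left out of this sketch.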

Kuma monitors

  • Switched all 20 Calypso monitors from Tailscale IP (100.103.48.78) to LAN IP (192.168.0.250) in the Kuma SQLite DB. Pi-5 (where Kuma runs) is on the same LAN, so using Tailscale IPs added unnecessary fragility.
  • Added LAN-based monitors for Rustdesk (ID:124) and Rackula (ID:125).
  • Fixed corrupted accepted_statuscodes_json field ([200-299]["200-299"]) that prevented the Kuma UI from loading.
  • Fixed ntfy notifications by setting primaryBaseURL to https://kuma.vish.gg — the "view" action button was missing a URL.
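For the accepted_statuscodes_json fix, a small validation pass can catch bad values before the UI trips over them. The Python sketch below assumes a valid value is a JSON array of strings such as `["200-299"]`. The function name is hypothetical, and reading the values out of the Kuma SQLite DB is not shown.

```python
import json


def is_valid_statuscodes(raw: str) -> bool:
    """Return True if raw parses as a JSON array containing only strings."""
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


for candidate in ('["200-299"]', '[200-299]'):
    print(candidate, is_valid_statuscodes(candidate))
```

The unquoted form `[200-299]` is rejected because it is not valid JSON at all, which is consistent with a corrupted field preventing the UI from loading.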

Calypso Tailscale 5-minute disconnect (root cause)

Symptom: Calypso's disco key rotated every 5 minutes, dropping all peer WireGuard sessions for ~30 seconds.

Root cause: A cron job in /etc/crontab ran /usr/local/bin/tailscale-fix.sh every 5 minutes (*/5 * * * *). The script checked for the tailscale0 TUN device, but Calypso runs Tailscale in --tun=userspace-networking mode (Synology DSM7), which has no TUN device. The script also checked tailscale status --json, which returned an empty state when run as the tailscale user. As a result, at least one check failed on every run, and every 5 minutes:

  1. Script detects "tailscale0 missing" or "state empty"
  2. Runs configure-host + full service restart via synosystemctl
  3. Re-authenticates with tailscale up --reset
  4. New disco key generated → all peers tear down and re-establish connections

Fix: Rewrote /usr/local/bin/tailscale-fix.sh to check that the tailscaled process is running and that tailscale ping can reach a known peer (Atlantis). Both checks work in TUN and userspace-networking modes.
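The logic of the rewritten check can be sketched as follows. This is a Python illustration of the two conditions, not the actual tailscale-fix.sh. It assumes `pgrep` is available on the host, that the peer is addressable as `atlantis`, and it reuses the Synology binary path from the Notes section. `--until-direct=false` is passed so that a DERP-relayed pong still counts as healthy.

```python
import subprocess
from typing import Callable, Sequence

# Synology package path from the Notes section; differs on other hosts.
TAILSCALE = "/var/packages/Tailscale/target/bin/tailscale"


def _run_ok(argv: Sequence[str]) -> bool:
    """Run a command quietly and report whether it exited with status 0."""
    try:
        proc = subprocess.run(
            list(argv),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=15,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return False
    return proc.returncode == 0


def tailscale_healthy(
    peer: str = "atlantis",  # assumed hostname of the known peer
    run_ok: Callable[[Sequence[str]], bool] = _run_ok,
) -> bool:
    """Healthy means: tailscaled is running AND a known peer answers a ping.

    Neither check looks for a tailscale0 device, so the result is the same
    in TUN mode and in userspace-networking mode.
    """
    if not run_ok(["pgrep", "-x", "tailscaled"]):
        return False
    return run_ok(
        [TAILSCALE, "ping", "--c", "1", "--timeout", "5s",
         "--until-direct=false", peer]
    )
```

Only when this returns False would a restart be justified. The `run_ok` parameter exists so the decision logic can be exercised without a live Tailscale install.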

Additional changes on Calypso (not the root cause but good hygiene):

  • Disabled Docker IPv6 on all bridge interfaces via sysctl (77 routes removed)
  • Updated dockerd.json with "ipv6": false, "ip6tables": false (persistent after Docker restart)
  • Added TS_DEBUG_NETMON_SKIP_INTERFACE_REGEXPS and TS_DEBUG_NETMON_NO_ROUTE_MONITORING env vars to Tailscale startup script
  • Added /etc/hosts entry: 192.168.0.250 headscale.vish.gg to avoid hairpin NAT for control plane