Sanitized mirror from private repository - 2026-04-19 09:52:01 UTC
# LAN Routing Fix: Tailscale Table 52 LAN Interception
## Problem
Hosts on the `192.168.0.0/24` LAN that run Tailscale directly on the host have their local traffic intercepted by Tailscale's policy routing table 52. Instead of going directly over the physical link (10GbE on Guava and Atlantis), traffic is routed through the WireGuard tunnel via Calypso's advertised `192.168.0.0/24` subnet route.
### Root Cause
Calypso (Headscale node ID:12) advertises `192.168.0.0/24` as a subnet route so remote nodes (Moon, Seattle, NUC) can reach LAN devices over Tailscale. However, machines that are **already on** that LAN also accept this route into Tailscale's routing table 52 (ip rule priority 5270), causing local traffic to hairpin through the tunnel.
Diagnosis:
```bash
# Shows traffic going through tailscale0 instead of the physical NIC
ip route get 192.168.0.200
# → 192.168.0.200 dev tailscale0 table 52 src 100.75.252.64
# Table 52 has the LAN subnet routed through Tailscale
ip route show table 52 | grep 192.168.0
# → 192.168.0.0/24 dev tailscale0
```
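The diagnosis above can be wrapped in a small helper for scripting. This is a minimal sketch, not part of the deployed fix: `route_dev` and `is_hairpinned` are hypothetical names, and the parsing assumes the `dev <name>` field that `ip route get` prints.

```bash
# Print the egress device from `ip route get` output read on stdin.
route_dev() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Succeed (exit 0) if the destination currently routes via tailscale0.
is_hairpinned() {
    [ "$(ip route get "$1" | route_dev)" = "tailscale0" ]
}

# Example (hypothetical usage):
#   is_hairpinned 192.168.0.200 && echo "LAN traffic is going through the tunnel"
```

Keeping the parser separate from the `ip` call makes it possible to check it against saved output without touching the live routing tables.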
### Affected Hosts
Any host on `192.168.0.0/24` with `--accept-routes` enabled is affected. Calypso advertises the LAN subnet so that remote nodes can reach it, but hosts already on the LAN must not route LAN traffic through the tunnel.
| Host | LAN IP | Physical NIC | Status |
|---|---|---|---|
| Guava (TrueNAS) | 192.168.0.100 | enp1s0f0np0 (10GbE) | **Fixed** — TrueNAS POSTINIT script |
| homelab-vm | 192.168.0.210 | ens18 | **Fixed** — systemd service |
| Pi-5 | 192.168.0.66 | eth0 | **Fixed** (2026-03-31) — dispatcher script + cron |
| Matrix-Ubuntu | 192.168.0.154 | ens3 | **Fixed** (2026-03-31) — dispatcher script + cron |
| PVE | 192.168.0.205 | vmbr0 | **Fixed** (2026-03-31) — cron @reboot |
| Atlantis | 192.168.0.200 | eth2/ovs_eth2 (10GbE) | Not affected (`--accept-routes` off) |
| Calypso | 192.168.0.250 | ovs_eth2 | Not affected (`--accept-routes` off) |
| NUC | 192.168.68.100 | eno1 | Not affected (different subnet) |
### Measured Impact (Guava → Atlantis)
| Route | Throughput | Retransmits |
|---|---|---|
| Before fix (via Tailscale) | 1.39 Gbps | 6,891 |
| After fix (direct LAN) | **7.61 Gbps** | 5,066 |
**5.5x improvement** — from WireGuard-encapsulated tunnel to direct 10GbE.
## Fix Applied
Add an ip policy rule at priority 5200 (before Tailscale's table 52 at 5270) that forces LAN traffic to use the main routing table, which routes via the physical NIC:
```bash
sudo ip rule add to 192.168.0.0/24 lookup main priority 5200
```
This means that any traffic destined for `192.168.0.0/24` is looked up in the main table first. The main table has `192.168.0.0/24 dev <physical-nic>`, so traffic goes direct. The rule matches only LAN destinations, so Tailscale traffic to `100.x.x.x` nodes is unaffected.
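The fix depends on the LAN rule having a lower priority number than the rule that sends traffic to table 52. That ordering can be checked mechanically from `ip rule show` output. This is a sketch, assuming rule lines of the form `5200: from all to 192.168.0.0/24 lookup main`; `lan_rule_precedes_table52` is a hypothetical name.

```bash
# Read `ip rule show` output on stdin. Succeed only if a "lookup main" rule for
# the given subnet exists and sorts before the first "lookup 52" rule (if any).
lan_rule_precedes_table52() {
    awk -v subnet="$1" '
        { prio = $1; sub(/:$/, "", prio) }
        /lookup main/ && index($0, "to " subnet " ") && lan == "" { lan = prio + 0 }
        /lookup 52( |$)/ && ts == "" { ts = prio + 0 }
        END { exit !(lan != "" && (ts == "" || lan < ts)) }
    '
}

# Example (hypothetical usage):
#   ip rule show | lan_rule_precedes_table52 192.168.0.0/24 && echo "ordering OK"
```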
### Verification
```bash
# Should show physical NIC, not tailscale0
ip route get 192.168.0.200
# Should get sub-1ms ping
ping -c 3 192.168.0.200
# Confirm rule is in place
ip rule show | grep 5200
```
### Revert
```bash
sudo ip rule del to 192.168.0.0/24 lookup main priority 5200
```
## Persistence
Each host uses the persistence method appropriate to its OS:
### Guava (TrueNAS SCALE)
Init script added via TrueNAS API (ID: 2):
- **Type:** COMMAND
- **When:** POSTINIT
- **Command:** `ip rule add to 192.168.0.0/24 lookup main priority 5200`
- **Comment:** Bypass Tailscale routing for LAN traffic (direct 10GbE)
Manage via TrueNAS UI: **System → Advanced → Init/Shutdown Scripts**
### homelab-vm (Ubuntu 24.04)
Systemd service at `/etc/systemd/system/lan-route-fix.service`:
```ini
[Unit]
Description=Ensure LAN traffic bypasses Tailscale routing table
After=network-online.target tailscaled.service
Wants=network-online.target
[Service]
Type=oneshot
ExecStart=/sbin/ip rule add to 192.168.0.0/24 lookup main priority 5200
ExecStop=/sbin/ip rule del to 192.168.0.0/24 lookup main priority 5200
RemainAfterExit=yes
[Install]
WantedBy=multi-user.target
```
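Enabled with `sudo systemctl enable lan-route-fix.service`. Note that `enable` alone only takes effect at the next boot; add `--now` to start the service immediately.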
### Pi-5 (Raspberry Pi OS) and Matrix-Ubuntu (Ubuntu 24.04)
Dispatcher script at `/etc/networkd-dispatcher/routable.d/50-tailscale-lan`:
```bash
#!/bin/bash
# Add the LAN bypass rule only if it is not already present (idempotent).
if ! ip rule show | grep -q "5200.*192.168.0.0/24"; then
    ip rule add to 192.168.0.0/24 lookup main priority 5200
fi
```
Plus a belt-and-suspenders `@reboot` cron entry:
```
@reboot /bin/bash /etc/networkd-dispatcher/routable.d/50-tailscale-lan
```
### PVE (Proxmox VE)
Root crontab `@reboot` entry:
```
@reboot /sbin/ip rule add to 192.168.0.0/24 lookup main priority 5200 2>/dev/null
```
## Adding a New LAN Host
If a new host is added to `192.168.0.0/24` with Tailscale and `--accept-routes`:
1. Apply the fix: `sudo ip rule add to 192.168.0.0/24 lookup main priority 5200`
2. Verify: `ip route get 192.168.0.200` should show the physical NIC, not `tailscale0`
3. Make persistent using one of the methods above
4. Update this document
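Steps 1 and 2 can be combined into one idempotent helper. The following is a sketch under stated assumptions, not a script deployed on any host: `apply_lan_fix` is a hypothetical name, the defaults mirror the values used in this document, and `ip rule add` needs root.

```bash
# Add the LAN bypass rule if it is missing, then confirm that a probe address
# no longer routes via tailscale0. Arguments: [subnet] [probe-ip] [priority].
apply_lan_fix() {
    local subnet="${1:-192.168.0.0/24}"
    local probe="${2:-192.168.0.200}"
    local prio="${3:-5200}"

    # Skip the add when a rule with this priority and destination already exists.
    if ! ip rule show | grep -q "^${prio}:.*to ${subnet} "; then
        ip rule add to "$subnet" lookup main priority "$prio" || return 1
    fi

    case "$(ip route get "$probe")" in
        *"dev tailscale0"*)
            echo "FAIL: $probe still routes via tailscale0" >&2
            return 1 ;;
        *)
            echo "OK: $probe does not route via tailscale0" ;;
    esac
}

# Example (hypothetical usage, as root):
#   apply_lan_fix 192.168.0.0/24 192.168.0.200 5200
```

The helper only covers the runtime rule. Persistence (step 3) still has to be set up with one of the per-OS methods above.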
## Notes
- Remote nodes (Moon, Seattle, NUC, Setillo) that are **not** on `192.168.0.0/24` are unaffected — they correctly use Calypso's subnet route to reach LAN devices via Tailscale.
- The Synology boxes (Atlantis, Calypso) have `--accept-routes` disabled and use Open vSwitch bridging, so they are not affected.
- The `--accept-routes` flag also pulls in `192.168.68.0/22` and `192.168.69.0/24` routes (from NUC's subnet advertisement), but these don't conflict with the primary LAN.
- Enabling `--accept-routes` without the priority 5200 rule silently reroutes LAN traffic: outbound packets leave via `tailscale0` instead of the physical NIC. This degrades throughput (see Measured Impact) and can break connectivity when replies do not return via the expected path.
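The note above that the NUC-advertised routes do not conflict with the primary LAN can be checked with a CIDR overlap test. This is a minimal IPv4-only sketch in plain shell arithmetic; `ip_to_int` and `cidr_overlap` are hypothetical helper names.

```bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<EOF
$1
EOF
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed if two CIDR blocks overlap: mask both network addresses with the
# shorter of the two prefixes and compare the results.
cidr_overlap() {
    local n1="${1%/*}" p1="${1#*/}" n2="${2%/*}" p2="${2#*/}"
    local p=$(( p1 < p2 ? p1 : p2 ))
    local mask=$(( p == 0 ? 0 : (0xFFFFFFFF << (32 - p)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$n1") & mask )) -eq $(( $(ip_to_int "$n2") & mask )) ]
}

# Example (hypothetical usage):
#   cidr_overlap 192.168.0.0/24 192.168.68.0/22 || echo "no overlap with the LAN"
```

Neither `192.168.68.0/22` nor `192.168.69.0/24` overlaps `192.168.0.0/24`, so the priority 5200 rule does not need to cover them. The two NUC ranges do overlap each other, because the /22 spans `192.168.68.0` through `192.168.71.255`.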