Sanitized mirror from private repository - 2026-03-30 00:10:29 UTC

Gitea Mirror Bot
2026-03-30 00:10:29 +00:00
commit 8664c8417c
1280 changed files with 331217 additions and 0 deletions


@@ -0,0 +1,108 @@
# LAN Routing Fix: Tailscale 10GbE Throughput Issue
## Problem
Hosts with host-level Tailscale on the `192.168.0.0/24` LAN have their local traffic intercepted by Tailscale's policy routing table 52. Instead of going directly over the physical 10GbE link, traffic gets routed through the WireGuard tunnel via Calypso's advertised `192.168.0.0/24` subnet route.
### Root Cause
Calypso (Headscale node ID:12) advertises `192.168.0.0/24` as a subnet route so remote nodes (Moon, Seattle, NUC) can reach LAN devices over Tailscale. However, machines that are **already on** that LAN also accept this route into Tailscale's routing table 52 (ip rule priority 5270), causing local traffic to hairpin through the tunnel.
Diagnosis:
```bash
# Shows traffic going through tailscale0 instead of the physical NIC
ip route get 192.168.0.200
# → 192.168.0.200 dev tailscale0 table 52 src 100.75.252.64
# Table 52 has the LAN subnet routed through Tailscale
ip route show table 52 | grep 192.168.0
# → 192.168.0.0/24 dev tailscale0
```
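The diagnosis above can also be scripted. This is a minimal sketch, assuming `ip route get` prints the egress interface right after the `dev` keyword, as in the sample output; the helper names `route_dev` and `is_hairpinned` are hypothetical and not part of any existing tooling.

```bash
# Print the egress interface named after "dev" in `ip route get` output (read from stdin).
route_dev() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Succeed (exit 0) if the route read from stdin leaves via tailscale0.
is_hairpinned() {
  [ "$(route_dev)" = "tailscale0" ]
}

# Usage on a live host:
#   ip route get 192.168.0.200 | is_hairpinned && echo "LAN traffic is using the tunnel"
```

Parsing from stdin keeps the helpers usable against saved command output as well as on a live host.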
### Affected Hosts
| Host | LAN IP | Physical NIC | Affected? |
|---|---|---|---|
| Guava (TrueNAS) | 192.168.0.100 | enp1s0f0np0 (10GbE) | **YES** — fixed |
| homelab-vm | 192.168.0.210 | ens18 | **YES** — fixed |
| Atlantis | 192.168.0.200 | eth2/ovs_eth2 (10GbE) | No (Synology OVS) |
| Calypso | 192.168.0.250 | ovs_eth2 | No (Synology OVS) |
| Pi-5 | 192.168.0.66 | eth0 | No (not accepting route) |
| NUC | 192.168.68.100 | eno1 | No (different subnet) |
### Measured Impact (Guava → Atlantis)
| Route | Throughput | Retransmits |
|---|---|---|
| Before fix (via Tailscale) | 1.39 Gbps | 6,891 |
| After fix (direct LAN) | **7.61 Gbps** | 5,066 |
**5.5x improvement** — from WireGuard-encapsulated tunnel to direct 10GbE.
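The improvement factor is simply the ratio of the two throughput figures in the table. A small sketch of the arithmetic, with `speedup` as a hypothetical helper name:

```bash
# Ratio of two throughput figures, rounded to one decimal place.
speedup() {
  awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", after / before }'
}

speedup 1.39 7.61   # Gbps before and after the fix; prints 5.5
```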
## Fix Applied
Add an ip policy rule at priority 5200 (before Tailscale's table 52 at 5270) that forces LAN traffic to use the main routing table, which routes via the physical NIC:
```bash
sudo ip rule add to 192.168.0.0/24 lookup main priority 5200
```
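Policy rules are evaluated in ascending priority order, so traffic destined to `192.168.0.0/24` matches this rule and is looked up in the main table before Tailscale's rule for table 52 is reached. The main table has `192.168.0.0/24 dev <physical-nic>`, so traffic goes direct. Tailscale traffic to `100.x.x.x` nodes does not match the rule and is unaffected.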
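Running `ip rule add` a second time either fails with "File exists" or, on older kernels, adds a duplicate rule, so it can help to check first. The following is a sketch only, assuming `ip rule show` prints rules in its usual `PRIO: from all to CIDR lookup main` form; `has_lan_rule` and `ensure_lan_rule` are hypothetical helper names.

```bash
LAN_CIDR="192.168.0.0/24"
RULE_PRIO=5200

# Succeed if `ip rule show` output on stdin already contains the bypass rule.
has_lan_rule() {
  awk -v p="${RULE_PRIO}:" -v c="$LAN_CIDR" '
    $1 == p && $2 == "from" && $3 == "all" && $4 == "to" && $5 == c &&
    $6 == "lookup" && $7 == "main" { found = 1 }
    END { exit !found }'
}

# Add the rule only when it is missing (needs root on a live host).
ensure_lan_rule() {
  ip rule show | has_lan_rule ||
    ip rule add to "$LAN_CIDR" lookup main priority "$RULE_PRIO"
}
```

Calling `ensure_lan_rule` repeatedly should then leave exactly one rule at priority 5200.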
### Verification
```bash
# Should show physical NIC, not tailscale0
ip route get 192.168.0.200
# Should get sub-1ms ping
ping -c 3 192.168.0.200
# Confirm rule is in place
ip rule show | grep 5200
```
### Revert
```bash
sudo ip rule del to 192.168.0.0/24 lookup main priority 5200
```
## Persistence
### Guava (TrueNAS)
Init script added via TrueNAS API (ID: 2):
- **Type:** COMMAND
- **When:** POSTINIT
- **Command:** `ip rule add to 192.168.0.0/24 lookup main priority 5200`
- **Comment:** Bypass Tailscale routing for LAN traffic (direct 10GbE)
Manage via TrueNAS UI: **System → Advanced → Init/Shutdown Scripts**
### homelab-vm (Ubuntu 24.04)
Systemd service at `/etc/systemd/system/lan-route-fix.service`:
```ini
[Unit]
Description=Ensure LAN traffic bypasses Tailscale routing table
After=network-online.target tailscaled.service
Wants=network-online.target
[Service]
Type=oneshot
ExecStart=/sbin/ip rule add to 192.168.0.0/24 lookup main priority 5200
ExecStop=/sbin/ip rule del to 192.168.0.0/24 lookup main priority 5200
RemainAfterExit=yes
[Install]
WantedBy=multi-user.target
```
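Enabled with `sudo systemctl enable lan-route-fix.service`. Note that `enable` only arranges for the unit to run at boot; `sudo systemctl start lan-route-fix.service` (or `enable --now`) applies the rule immediately.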
## Notes
- Remote nodes (Moon, Seattle, NUC) that are **not** on `192.168.0.0/24` are unaffected — they correctly use Calypso's subnet route to reach LAN devices via Tailscale.
- If a new host is added to the LAN with host-level Tailscale, the same fix will need to be applied.
- The Synology boxes (Atlantis, Calypso) use Open vSwitch bridging and don't exhibit this issue.


@@ -0,0 +1,79 @@
# SSH Mesh — Key-Based Authentication Across All Hosts
All Tailscale-connected hosts can SSH to each other using key authentication (ed25519 keys, except the RSA key on pve).
No passwords are needed.
## Participating Hosts
| Host | User | Tailscale IP | SSH Port | Key |
|------|------|-------------|----------|-----|
| homelab-vm | homelab | 100.67.40.126 | 22 | admin@thevish.io |
| atlantis | vish | 100.83.230.112 | 60000 | vish@atlantis |
| calypso | Vish | 100.103.48.78 | 62000 | calypso access |
| guava | vish | 100.75.252.64 | 22 | vish@guava |
| setillo | vish | 100.125.0.20 | 22 | setillo-key |
| pi-5 | vish | 100.77.151.40 | 22 | vish@pi-5 |
| nuc | vish | 100.72.55.21 | 22 | vish@nuc |
| moon | vish | 100.64.0.6 | 22 | vish@moon |
| seattle | root | 100.82.197.124 | 22 | root@seattle |
| matrix-ubuntu | test | 100.85.21.51 | 22 | test@matrix-ubuntu |
| jellyfish | lulu | 100.69.121.120 | 22 | lulu@jellyfish |
| pve | root | 100.87.12.28 | 22 | root@pve (RSA) |
| gl-mt3000 | root | 100.126.243.15 | 22 | (admin key only) |
| gl-be3600 | root | 100.105.59.123 | 22 | root@gl-be3600 |
The **admin key** (`admin@thevish.io` from homelab-vm) is present on every host.
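For a quick manual check outside Ansible, the table rows can be turned into non-interactive SSH probes. This is a sketch, not the playbook's method: `ssh_check_cmd` and `print_mesh_checks` are hypothetical names, only three rows are copied from the table, and it assumes an OpenSSH client (`-n`, `BatchMode`, and `ConnectTimeout` are OpenSSH options).

```bash
# Print the non-interactive check command for one table row: USER IP PORT.
# BatchMode=yes makes ssh fail instead of prompting for a password.
ssh_check_cmd() {
  printf 'ssh -n -o BatchMode=yes -o ConnectTimeout=5 -p %s %s@%s true\n' "$3" "$1" "$2"
}

# Three rows copied from the table above (user, Tailscale IP, port).
MESH_SAMPLE='vish 100.83.230.112 60000
Vish 100.103.48.78 62000
vish 100.75.252.64 22'

# Print one check command per row; review the output, then pipe it to `sh` to run it.
print_mesh_checks() {
  printf '%s\n' "$MESH_SAMPLE" | while read -r user ip port; do
    ssh_check_cmd "$user" "$ip" "$port"
  done
}
```

Printing the commands rather than executing them keeps the helper safe to run anywhere and makes the non-default ports visible.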
## Ansible Playbook
Manage the mesh with `ansible/playbooks/ssh_mesh.yml`:
```bash
# Distribute keys to all hosts (collect + push)
ansible-playbook -i inventory.yml playbooks/ssh_mesh.yml --tags distribute
# Verify connectivity from localhost
ansible-playbook -i inventory.yml playbooks/ssh_mesh.yml --tags verify
# Generate missing keys + distribute
ansible-playbook -i inventory.yml playbooks/ssh_mesh.yml -e "generate_missing=true"
```
The `ssh_mesh` group in `inventory.yml` defines which hosts participate.
## Adding a New Host
1. Add the host to `ansible/inventory.yml` under the appropriate group and to the `ssh_mesh` children
2. Run the playbook with key generation:
```bash
ansible-playbook -i inventory.yml playbooks/ssh_mesh.yml -e "generate_missing=true"
```
3. This will generate a key on the new host if needed, collect all keys, and distribute them everywhere
## Notes
- **Synology NAS (Atlantis/Calypso/Setillo)**: Home directory must be `chmod 755` or stricter. With the default `StrictModes yes`, sshd refuses key auth if the home directory is group- or world-writable. DSM can reset permissions on reboot.
- **OpenWrt routers (MT3000/BE3600)**: Use dropbear SSH, not OpenSSH. Keys must be in both `/etc/dropbear/authorized_keys` AND `/root/.ssh/authorized_keys`. Key auth works but `ssh -o` flags differ slightly.
- **GL-BE3600 in repeater mode**: SSH port 22 is accessible via Tailscale only — LAN SSH is blocked by the repeater firewall. Use `100.105.59.123` not `192.168.68.1`.
- **TrueNAS (Guava)**: Home directory is at `/mnt/data/vish-home/vish/`, not `/home/vish/`.
- **pi-5-kevin**: Frequently offline — will fail verification but has keys distributed.
- **homelab-vm**: SSH config historically uses password auth to itself; key auth works to all other hosts.
- **rsync to Atlantis**: rsync from homelab-vm to Atlantis fails (Synology SSH subsystem issue). Use `scp -O -r -P 60000` instead, or pull from Atlantis.
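Because DSM can reset home directory permissions, the Synology note above lends itself to a small check. A minimal sketch, assuming `ls -ld` prints the conventional ten-character mode string; `home_perms_ok` is a hypothetical helper name.

```bash
# Succeed if the directory is neither group- nor world-writable.
home_perms_ok() {
  # Characters 6 and 9 of the `ls -ld` mode string are the group and other write bits.
  case "$(ls -ld "$1" | cut -c6,9)" in
    *w*) return 1 ;;
    *) return 0 ;;
  esac
}

# Usage on a Synology host after a reboot:
#   home_perms_ok "$HOME" || chmod 755 "$HOME"
```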
## Router Tailscale Auto-Start
Both GL.iNet routers have init scripts to auto-connect to Headscale on boot:
**GL-MT3000** (`/etc/init.d/tailscale-up`, START=81):
```sh
tailscale up --accept-routes --login-server=https://headscale.vish.gg:8443 --accept-dns=false --advertise-routes=192.168.12.0/24
```
**GL-BE3600** (`/etc/init.d/tailscale-up`, START=99):
- Waits for network connectivity (repeater mode needs WiFi first)
- Polls every 2s for up to 120s before running `tailscale up`
- Advertises `192.168.68.0/22,192.168.8.0/24`
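The wait-then-connect behaviour described for the GL-BE3600 could look roughly like the sketch below. It is not the actual init script: `wait_until` is a hypothetical helper, the `ping` probe is an assumed connectivity test, and the `tailscale up` flags in the usage comment are assembled from values on this page, so the real script may pass others.

```sh
# Run a probe command every INTERVAL seconds until it succeeds or TIMEOUT seconds pass.
# Usage: wait_until TIMEOUT INTERVAL COMMAND [ARGS...]
wait_until() {
  timeout="$1"; interval="$2"; shift 2
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 0
}

# Usage in an init script (probe and flags are assumptions):
#   wait_until 120 2 ping -c 1 -W 2 headscale.vish.gg &&
#     tailscale up --login-server=https://headscale.vish.gg:8443 \
#       --advertise-routes=192.168.68.0/22,192.168.8.0/24
```

The helper uses only POSIX shell features, so it should run under the BusyBox `ash` shell that OpenWrt ships.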
Update script on both: `/root/update-tailscale.sh` (Admon's GL.iNet updater, use `--force` for non-interactive).
## Established 2026-03-23, updated 2026-03-24