Guava SMB Incident — 2026-03-14

Affected host: guava (TrueNAS SCALE, 100.75.252.64 / 192.168.0.100)
Affected client: shinku-ryuu (Windows, 192.168.0.3)
Symptoms: all SMB shares on guava unreachable from shinku after a guava reboot


Root Causes (two separate issues)

1. Tailscale app was STOPPED after reboot

Guava's Tailscale was running as an orphaned host process rather than as the managed TrueNAS app. On reboot the orphan was gone, and the app did not start because it had been left in the STOPPED state.

Why it was stopped: The app had been upgraded from v1.3.30 → v1.4.2. The new version's startup script ran tailscale up but failed because the stored state had --accept-dns=false while the app config had accept_dns: true — a mismatch that requires --reset. The app exited, leaving the old manually-started daemon running until the next reboot.
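The failure mode can be modeled roughly like this. This is an illustrative sketch of Tailscale's "changed settings need --reset" behavior, not its actual code; the function and dict names are hypothetical:

```python
# Illustrative model of the v1.4.2 startup failure: the node's stored state
# disagreed with the app config, and `tailscale up` refuses to silently
# change settings unless --reset is passed.
stored_state = {"accept_dns": False}   # what the node had saved on disk
app_config   = {"accept_dns": True}    # what the upgraded app passed to `tailscale up`

def tailscale_up(flags: dict, state: dict, reset: bool = False) -> dict:
    """Hypothetical: fail on a flag mismatch unless reset is requested."""
    mismatched = {k for k in flags if k in state and state[k] != flags[k]}
    if mismatched and not reset:
        raise RuntimeError(f"flag mismatch {mismatched}: rerun with --reset")
    return dict(flags)  # --reset (or agreement) lets the new flags take effect

try:
    tailscale_up(app_config, stored_state)          # what the upgraded app did: fails
except RuntimeError as e:
    print(e)

tailscale_up(app_config, stored_state, reset=True)  # what Fix 1's reset: true allows
```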

2. Tailscale accept_routes: true caused SMB replies to route via tunnel

After fixing the app startup, shinku still couldn't reach guava on the LAN. The cause:

  • Calypso advertises 192.168.0.0/24 as a subnet route via Tailscale
  • Guava had accept_routes: true — it installed Calypso's 192.168.0.0/24 route into Tailscale's policy routing table (table 52, priority 5270)
  • When shinku sent a TCP SYN to guava port 445, it arrived on enp1s0f0np0
  • Guava's reply looked up 192.168.0.3 in the routing table — hit table 52 first — and sent the reply out via tailscale0 instead of the LAN
  • The reply never reached shinku; the connection timed out
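The reply-path selection above can be sketched as a toy model of Linux policy routing. The priorities and interface names mirror the values observed on guava; the lookup logic is a simplification, not the kernel's implementation:

```python
from ipaddress import ip_address, ip_network

# Toy model of guava's routing: tables are consulted in rule-priority order.
# Table 52 (Tailscale, rule priority 5270) is checked before main (32766).
tables = [
    (5270,  "table52", [("192.168.0.0/24", "tailscale0")]),   # installed via accept_routes
    (32766, "main",    [("192.168.0.0/24", "enp1s0f0np0")]),  # the real LAN route
]

def egress_interface(dst: str) -> str:
    """Return the interface chosen by the first matching route, in priority order."""
    for _prio, _name, routes in sorted(tables):
        for cidr, dev in routes:
            if ip_address(dst) in ip_network(cidr):
                return dev
    return "unreachable"

# Guava's SMB reply to shinku (192.168.0.3) matches table 52 first:
print(egress_interface("192.168.0.3"))  # tailscale0 — not the LAN interface
```

Deleting the table 52 entry (Fix 2) makes the lookup fall through to the main table and the reply goes out enp1s0f0np0 again.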

This also affected shinku: it had accept_routes: true as well, so it was routing traffic destined for 192.168.0.100 via Calypso's Tailscale tunnel rather than its local Ethernet interface.


Fixes Applied

Fix 1 — Tailscale app startup config

Updated the TrueNAS app config to match the node's actual desired state:

sudo midclt call app.update tailscale '{"values": {"tailscale": {
  "accept_dns": false,
  "accept_routes": false,
  "advertise_exit_node": true,
  "advertise_routes": [],
  "auth_key": "...",
  "auth_once": true,
  "hostname": "truenas-scale",
  "reset": true
}}}'

Key changes:

  • accept_dns: false — matches the running state stored in Tailscale's state dir
  • accept_routes: false — prevents guava from pulling in subnet routes from other nodes (see Fix 2)
  • reset: true — clears the flag mismatch that was causing tailscale up to fail

Saved in: /mnt/.ix-apps/app_configs/tailscale/versions/1.4.2/user_config.yaml
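The resulting block in user_config.yaml should look roughly like this. The key names come from the midclt call above; the surrounding YAML structure is an assumption, and the auth key is omitted:

```yaml
# Sketch of the relevant section of user_config.yaml (structure assumed)
tailscale:
  accept_dns: false
  accept_routes: false
  advertise_exit_node: true
  advertise_routes: []
  # auth_key omitted
  auth_once: true
  hostname: truenas-scale
  reset: true
```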

Fix 2 — Remove stale subnet routes from guava's routing table

After updating the app config, the stale routes persisted in table 52 and had to be removed manually:

sudo ip route del 192.168.0.0/24 dev tailscale0 table 52
sudo ip route del 192.168.12.0/24 dev tailscale0 table 52
sudo ip route del 192.168.68.0/22 dev tailscale0 table 52
sudo ip route del 192.168.69.0/24 dev tailscale0 table 52

With accept_routes: false now saved, these routes will not reappear on next reboot.

Fix 3 — Disable accept_routes on shinku

Shinku was also accepting Calypso's 192.168.0.0/24 route (metric 0 via Tailscale, beating Ethernet 3's metric 256):

# Before fix — traffic to 192.168.0.100 went via Tailscale
# Destination       Gateway            Metric   Interface
192.168.0.0/24      100.100.100.100    0        Tailscale

# After fix — traffic goes via the local LAN
192.168.0.0/24      0.0.0.0            256      Ethernet 3

Fixed by running on shinku:

tailscale up --accept-routes=false --login-server=https://headscale.vish.gg:8443
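The route selection on shinku can be sketched the same way: both routes cover the same /24, so the one with the lower metric wins. An illustrative model, not the actual Windows stack:

```python
# Illustrative route selection on shinku: equal prefix length (/24),
# so the route with the lowest metric is chosen.
routes = [
    {"cidr": "192.168.0.0/24", "metric": 0,   "iface": "Tailscale"},   # before fix
    {"cidr": "192.168.0.0/24", "metric": 256, "iface": "Ethernet 3"},
]

best = min(routes, key=lambda r: r["metric"])
print(best["iface"])  # Tailscale — wins while accept-routes is on

# After `tailscale up --accept-routes=false` the Tailscale route disappears:
remaining = [r for r in routes if r["iface"] != "Tailscale"]
print(min(remaining, key=lambda r: r["metric"])["iface"])  # Ethernet 3
```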

Fix 4 — SMB password reset and credential cache

The SMB password for vish on guava was changed via the TrueNAS web UI. Windows had stale credentials cached. Fixed by:

  1. Clearing the Windows Credential Manager entry for 192.168.0.100 (cmdkey /delete:192.168.0.100)
  2. Re-mapping the shares from an interactive PowerShell session on shinku (e.g. net use I: \\192.168.0.100\guava_turquoise /user:vish)

SMB Share Layout on Guava

Windows drive   Share               Path on guava
I:              guava_turquoise     /mnt/data/guava_turquoise
J:              photos              /mnt/data/photos
K:              data                /mnt/data/passionfruit
L:              website             /mnt/data/website
M:              jellyfin            /mnt/data/jellyfin
N:              truenas-exporters   /mnt/data/truenas-exporters
Q:              iso                 /mnt/data/iso

All shares use vish as the SMB user. Credentials stored in Windows Credential Manager under 192.168.0.100.


Diagnosis Commands

# Check Tailscale app state on guava
ssh guava "sudo midclt call app.query '[[\"name\",\"=\",\"tailscale\"]]' | python3 -c 'import sys,json; a=json.load(sys.stdin)[0]; print(a[\"name\"], a[\"state\"])'"

# Check for rogue subnet routes in Tailscale's routing table
ssh guava "ip route show table 52 | grep 192.168"

# Check tailscale container logs
ssh guava "sudo docker logs \$(sudo docker ps | grep tailscale | awk '{print \$1}' | head -1) 2>&1 | tail -20"

# Check SMB audit log for auth failures on guava
ssh guava "sudo journalctl -u smbd --since '1 hour ago' --no-pager | grep -i 'wrong_password\|STATUS'"

# Check which Tailscale peer is advertising a given subnet (run on any node)
tailscale status --json | python3 -c "
import sys, json
d = json.load(sys.stdin)
for peer in d.get('Peer', {}).values():
    routes = peer.get('PrimaryRoutes') or []
    if routes:
        print(peer['HostName'], routes)
"
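Building on the script above, a small helper (hypothetical; it assumes the same Peer/PrimaryRoutes fields from tailscale status --json) can flag peers whose advertised subnets overlap a local LAN — the exact condition that bit guava and shinku:

```python
from ipaddress import ip_network

def overlapping_peers(status: dict, local_cidr: str) -> list:
    """Return (hostname, route) pairs whose advertised route overlaps local_cidr."""
    local = ip_network(local_cidr)
    hits = []
    for peer in status.get("Peer", {}).values():
        for route in peer.get("PrimaryRoutes") or []:
            if ip_network(route).overlaps(local):
                hits.append((peer["HostName"], route))
    return hits

# Sample dict shaped like `tailscale status --json` output (fabricated for illustration)
sample = {"Peer": {
    "n1": {"HostName": "calypso", "PrimaryRoutes": ["192.168.0.0/24"]},
    "n2": {"HostName": "offsite", "PrimaryRoutes": ["10.10.0.0/16"]},
}}
print(overlapping_peers(sample, "192.168.0.0/24"))  # [('calypso', '192.168.0.0/24')]
```

Any hit on a node that sits physically inside that LAN is a candidate for accept_routes: false.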

Prevention

  • Guava: accept_routes: false is now saved in the TrueNAS app config — will survive reboots
  • Shinku: --accept-routes=false set via tailscale up — survives reboots
  • General rule: Hosts on the same LAN as the subnet-advertising node (Calypso → 192.168.0.0/24) should have accept_routes: false, or the advertised subnet should be scoped to only nodes that need remote access to that LAN
  • TrueNAS app upgrades: After upgrading the Tailscale app version, always check the new user_config.yaml to ensure accept_dns, accept_routes, and other flags match the node's actual running state. If unsure, set reset: true once to clear any stale state, then set it back to false