Homelab Claude Code Instructions
Deployment
- When deploying services, always verify the target host before proceeding. Confirm which host a service should run on and check for port conflicts with existing services.
- Check `ss -tlnp | grep <port>` on the target host before deploying.
- Hosts: atlantis (Synology NAS, media/arr), calypso (Synology, DNS/SSO), olares (K3s, GPU), nuc (lightweight), rpi5 (Kuma), homelab-vm (monitoring/dashboard), guava (TrueNAS), seattle (remote), matrix-ubuntu (NPM/CrowdSec).
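A plain `grep <port>` can also match PIDs or longer port numbers that contain the same digits, so one option is to compare the port field exactly. The sketch below assumes the usual `ss -tln` column layout (local address and port in the fourth column). The function name `port_in_listing` and the port in the usage comment are illustrative, not something defined in this repo.

```shell
# Succeeds (exit 0) if PORT is a listening local port in `ss -tln` output read from stdin.
# Example: ssh atlantis 'ss -tln' | port_in_listing 8080 && echo "port already in use"
port_in_listing() {
  awk -v p="$1" '
    # The port is whatever follows the last ":" of the local address column.
    { n = split($4, parts, ":"); if (parts[n] == p) found = 1 }
    END { exit !found }
  '
}
```

Splitting on the last colon keeps IPv6 listeners such as `[::]:8080` working the same way as IPv4 ones.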
Configuration Management
- Before modifying config files (YAML, JSON, etc.), always create a backup copy first.
- Never use sed for complex YAML edits — use a proper parser or manual editing to avoid duplicate keys and corruption.
- For YAML changes, validate with `python3 -c "import yaml; yaml.safe_load(open('file.yaml'))"` after editing.
- Never empty or overwrite a config file without reading it first.
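The backup rule above could be wrapped in a small helper so it is hard to skip. This is a minimal sketch: the name `backup_config` and the `.bak.<timestamp>` suffix are assumptions, not an existing convention in this repo.

```shell
# Copies FILE to FILE.bak.<timestamp> and prints the backup path.
# Fails if FILE does not exist, so an edit never starts without a backup.
backup_config() {
  file="$1"
  [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
  backup="$file.bak.$(date +%Y%m%d-%H%M%S)"
  cp -p "$file" "$backup" && printf '%s\n' "$backup"
}
```

A typical sequence would be `backup_config file.yaml`, then the edit, then the `yaml.safe_load` check from the list above.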
Homelab SSH & Networking
- For homelab SSH operations: if MCP SSH times out on large outputs, fall back to Bash with `ssh` directly.
- Always use the correct Tailscale/LAN IP for each host. When Ollama or other services aren't on localhost, check the memory or ask for the correct endpoint before guessing.
- After making infrastructure changes (Tailscale, DNS, networking), always verify connectivity from affected hosts before marking complete.
- Never run a second instance of a network daemon (tailscaled, etc.) — it will break host networking.
- homelab-vm IS localhost — never SSH into it, use local commands.
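One way to make the homelab-vm rule hard to violate is to route every command through a single wrapper. The sketch below assumes the other host names resolve through the local SSH config; the function name `run_on` and the 10-second connect timeout are illustrative choices.

```shell
# Runs a command on HOST: locally for homelab-vm, over ssh for everything else.
# Example: run_on calypso uname -a
run_on() {
  host="$1"
  shift
  if [ "$host" = "homelab-vm" ]; then
    # homelab-vm is this machine, so never open an SSH session to it.
    "$@"
  else
    # BatchMode avoids hanging on a password prompt; ConnectTimeout bounds the connect phase.
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$host" "$@"
  fi
}
```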
Heterogeneous Host Awareness
- Before installing/running anything on a remote host, probe the environment first: `uname -a`, `which <binary>`, `mount | grep noexec`, `sudo -n true`. Adapt or propose alternatives instead of failing then pivoting.
- Tailscale binary paths differ across hosts (Synology, GL.iNet, k3s, standard Linux); verify with `which tailscale` before assuming.
- Synology `/tmp` is `noexec`; stage scripts in `/volume1` or user home.
- Synology has no `git` and no SFTP subsystem; use ssh-pipe (`cat file | ssh host 'cat > dest'`) and prefix docker commands with `sudo /usr/local/bin/docker`.
- GL.iNet travel routers wipe config on firmware update; reapply watchdog/Tailscale config after every flash.
- uqiyoe is Windows: use `dir`/`del`/`rmdir`, not `ls`/`rm`. SSH user is `vish`, not `homelab`.
- Check architecture (`uname -m`) before downloading binaries; the fleet has mixed amd64/arm64.
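`uname -m` reports kernel-style names (`x86_64`, `aarch64`), while release artifacts are usually labelled `amd64`/`arm64`. A small mapping like the sketch below avoids downloading the wrong binary; the function name `normalize_arch` is hypothetical, and any other architecture is deliberately treated as an error rather than guessed.

```shell
# Maps `uname -m` output to the amd64/arm64 labels used by most release downloads.
# Example: arch=$(normalize_arch "$(ssh rpi5 uname -m)") || exit 1
normalize_arch() {
  case "$1" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}
```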
Long-Running Commands
- Set explicit, short timeouts on SSH/Bash commands. Default 30s, max 120s for known-slow ops.
- For potentially slow operations (find on NAS, large rsync, apt upgrade): run with `run_in_background: true` and poll, or scope tightly with `-maxdepth`/path filters.
- Never run unbounded `find /` on NAS or Synology hosts; always anchor to a specific path.
- For destructive/mutating ops (rsync, dd, rm -rf, db edits): dry-run first, verify checksums/counts, take a backup before applying. Don't trust silent successes: `rsync` once truncated 70 GB to 74 MB without erroring.
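Because a zero exit status is not proof that a copy is complete, a post-copy comparison is worth scripting. The sketch below uses POSIX `cksum`, which prints a CRC and a byte count, so a truncated file fails on size alone. The function name `same_content` is an assumption, and `cksum` is not a cryptographic hash; swap in a stronger checksum tool if that matters.

```shell
# Succeeds only if both files have the same CRC and the same byte count.
# Example: same_content /volume1/src/db.sqlite /volume1/backup/db.sqlite || echo "copy differs"
same_content() {
  # Reading via stdin keeps the file name out of cksum's output, so the strings are comparable.
  [ "$(cksum < "$1")" = "$(cksum < "$2")" ]
}
```

For rsync itself, running once with `--dry-run` and reviewing the planned transfers before the real run covers the dry-run step.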
Debugging Discipline
- Before changing anything to "fix" an issue, list the top 2–3 candidate root causes ranked by likelihood with one diagnostic per candidate. Run the diagnostics first, share results, then propose a fix. Don't patch the visible symptom (e.g., disabling a Kuma monitor) before confirming the underlying cause.
Verification Discipline
- After deploying or fixing a service, verify end-to-end before declaring done: curl the endpoint, check Kuma status, tail logs for >60s of clean uptime.
- Kuma `accepted_statuscodes` must be quoted strings in JSON: `["200-299"]`, not `[200-299]` (parse error otherwise).
- Commit and push documentation changes in the same session as the infra change; don't leave docs lagging behind reality.
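The quoting rule for `accepted_statuscodes` can be checked before sending anything to Kuma. This sketch assumes `python3` is available (it is already used for YAML validation in this file); the function name `is_valid_json` is hypothetical.

```shell
# Succeeds if stdin parses as JSON, using only the Python standard library.
is_valid_json() {
  python3 -c 'import json, sys; json.load(sys.stdin)' 2>/dev/null
}

# The quoted form parses; the unquoted range does not.
printf '%s' '["200-299"]' | is_valid_json && echo "quoted form: valid"
printf '%s' '[200-299]'   | is_valid_json || echo "unquoted form: rejected"
```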
LLM Services
- When working with LLM model deployments (Ollama, vLLM), always verify: 1) GPU access, 2) context length meets the consumer's requirements, 3) tool-calling support if needed.
- Ollama is at `http://192.168.0.145:31434` (Olares LAN NodePort), NOT localhost.
- HAMI vGPU on Olares causes ffmpeg segfaults; do NOT request `nvidia.com/gpu` resources, use `runtimeClassName: nvidia` directly.
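To illustrate the GPU rule, the sketch below prints a bare Pod manifest that selects the NVIDIA runtime through `runtimeClassName` and leaves out any GPU resource request. It assumes a RuntimeClass named `nvidia` exists on the cluster, as the note above implies. The function name and its arguments are placeholders, and real Olares apps may be packaged differently.

```shell
# Prints a minimal Pod manifest for NAME running IMAGE with the nvidia runtime class.
# Deliberately omits a resources section so the HAMI vGPU scheduler is not involved.
# Example: gpu_pod_manifest transcoder example/image:tag | kubectl apply -f -
gpu_pod_manifest() {
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: $1
spec:
  runtimeClassName: nvidia
  containers:
    - name: $1
      image: $2
EOF
}
```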
Olares (K3s)
- Olares admission webhook blocks hostNetwork and reverts custom NetworkPolicies.
- Use Calico GlobalNetworkPolicy for LAN access (it can't be overridden by the webhook).
- The Olares proxy adds ~100ms latency — use direct LAN NodePorts for streaming/high-throughput services.
- Marketplace app patches (NFS mounts, GPU) are lost on app updates — re-apply after updates.
Git & Commits
- Never add Co-Authored-By lines to git commits.
- Always run `detect-secrets scan --baseline .secrets.baseline` before committing if secrets baseline exists.
- Use `pragma: allowlist secret` comments for intentional secrets in private repo files.
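Both commit rules can be checked mechanically, for example from a local hook. The sketch below is one possible shape; the function names are hypothetical and nothing here is wired into this repo's hooks.

```shell
# Succeeds if the commit message file contains a Co-Authored-By trailer (case-insensitive).
has_coauthor_trailer() {
  grep -qi '^co-authored-by:' "$1"
}

# Runs the secrets scan only when a baseline exists in the current directory.
scan_if_baseline() {
  if [ -f .secrets.baseline ]; then
    detect-secrets scan --baseline .secrets.baseline
  fi
}
```

In a `commit-msg` hook, `has_coauthor_trailer "$1" && exit 1` would reject a message that carries the trailer.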
Documentation
- After completing each task, immediately update the relevant documentation in the repo and commit with a descriptive message before moving to the next task.
- Key docs: `docs/services/individual/dashboard.md`, `docs/services/individual/olares.md`, `scripts/README.md`.
Portainer
- API uses `X-API-Key` header (NOT Bearer token).
- Portainer URL: `http://100.83.230.112:10000` (Tailscale IP).
- Endpoints: atlantis=2, calypso=443397, nuc=443398, homelab=443399, rpi5=443395.
- GitOps stacks use Gitea token for auth — if redeploy fails with "authentication required", credentials need re-entry in Portainer UI.
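The endpoint IDs above are easy to mistype, so a lookup helper may be useful when scripting against the API. In this sketch the function name and the `PORTAINER_API_KEY` environment variable are assumptions; the IDs are copied from the list above.

```shell
# Maps a host name to its Portainer endpoint ID.
# Example (PORTAINER_API_KEY is a hypothetical variable holding the key):
#   curl -s -H "X-API-Key: $PORTAINER_API_KEY" \
#     "http://100.83.230.112:10000/api/endpoints/$(portainer_endpoint_id nuc)/docker/containers/json"
portainer_endpoint_id() {
  case "$1" in
    atlantis) echo 2 ;;
    calypso)  echo 443397 ;;
    nuc)      echo 443398 ;;
    homelab)  echo 443399 ;;
    rpi5)     echo 443395 ;;
    *) echo "unknown endpoint: $1" >&2; return 1 ;;
  esac
}
```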
Dashboard
- Dashboard runs at `http://homelab.tail.vish.gg:3100` (Next.js on port 3100, FastAPI API on port 18888).
- API proxied through Next.js rewrites: frontend calls `/api/*`, which routes to localhost:18888.
- 16 glassmorphism themes with Exo 2 font.
- To rebuild: `cd dashboard/ui && rm -rf .next && BACKEND_URL=http://localhost:18888 npm run build && cp -r .next/static .next/standalone/.next/static && cp -r public .next/standalone/public`.