homelab-optimized/CLAUDE.md
Gitea Mirror Bot 79fa379a2f
Sanitized mirror from private repository - 2026-04-25 06:50:29 UTC


Homelab Claude Code Instructions

Deployment

  • When deploying services, always verify the target host before proceeding. Confirm which host a service should run on and check for port conflicts with existing services.
  • Run ss -tlnp | grep <port> on the target host before deploying; an empty result means no TCP listener matched that port.
  • Hosts: atlantis (Synology NAS, media/arr), calypso (Synology, DNS/SSO), olares (K3s, GPU), nuc (lightweight), rpi5 (Kuma), homelab-vm (monitoring/dashboard), guava (TrueNAS), seattle (remote), matrix-ubuntu (NPM/CrowdSec).
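
A plain grep for a port number also matches longer ports that contain it (310 matches 3100). The sketch below is one way to check the local-address column exactly. It assumes the usual Linux ss -tlnp column layout; the sample output and the port_in_use helper are hypothetical, not part of this repo.

```python
def port_in_use(ss_output: str, port: int) -> bool:
    """Return True if a LISTEN line in `ss -tlnp` output binds exactly this port."""
    for line in ss_output.splitlines():
        fields = line.split()
        # Assumed columns: State Recv-Q Send-Q Local-Address:Port Peer-Address:Port Process
        if len(fields) >= 4 and fields[0] == "LISTEN" and fields[3].endswith(f":{port}"):
            return True
    return False


# Hypothetical sample output, for illustration only.
SAMPLE = """State  Recv-Q Send-Q Local Address:Port Peer Address:Port Process
LISTEN 0      4096         0.0.0.0:3100      0.0.0.0:*     users:(("node",pid=812,fd=21))
LISTEN 0      128             [::]:22           [::]:*     users:(("sshd",pid=401,fd=4))
"""

print(port_in_use(SAMPLE, 3100))  # the port is taken in the sample
print(port_in_use(SAMPLE, 310))   # a substring of 3100 does not count as a conflict
```

Feed it the captured output of ss -tlnp from the target host rather than from wherever the command happens to run.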

Configuration Management

  • Before modifying config files (YAML, JSON, etc.), always create a backup copy first.
  • Never use sed for complex YAML edits — use a proper parser or manual editing to avoid duplicate keys and corruption.
  • For YAML changes, validate with python3 -c "import yaml; yaml.safe_load(open('file.yaml'))" after editing.
  • Never empty or overwrite a config file without reading it first.
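
The backup-then-validate routine above could be wrapped roughly as follows. This is a sketch, not an existing script in this repo: backup_config and validate_config are hypothetical names, the timestamped .bak suffix is an assumption, and the YAML branch assumes PyYAML is installed (the same module the one-liner above uses).

```python
import json
import shutil
import time
from pathlib import Path


def backup_config(path: str) -> Path:
    """Copy the file next to itself with a timestamp suffix and return the backup path."""
    src = Path(path)
    backup = src.with_name(f"{src.name}.bak.{time.strftime('%Y%m%d-%H%M%S')}")
    shutil.copy2(src, backup)  # copy2 also preserves mtime and permission bits
    return backup


def validate_config(path: str) -> None:
    """Raise if the file does not parse. YAML needs PyYAML; JSON uses the stdlib."""
    text = Path(path).read_text()
    if path.endswith((".yaml", ".yml")):
        import yaml  # PyYAML, assumed to be installed on the host
        yaml.safe_load(text)
    else:
        json.loads(text)
```

Call backup_config before editing and validate_config after. Note that PyYAML's safe_load keeps the last value of a duplicate key instead of raising, so a clean parse rules out syntax errors but not the duplicate keys a careless sed edit can introduce.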

Homelab SSH & Networking

  • For homelab SSH operations: if MCP SSH times out on large outputs, fall back to Bash with ssh directly.
  • Always use the correct Tailscale/LAN IP for each host. When Ollama or other services aren't on localhost, check the memory or ask for the correct endpoint before guessing.
  • After making infrastructure changes (Tailscale, DNS, networking), always verify connectivity from affected hosts before marking complete.
  • Never run a second instance of a network daemon (tailscaled, etc.) — it will break host networking.
  • homelab-vm IS localhost — never SSH into it, use local commands.

Heterogeneous Host Awareness

  • Before installing/running anything on a remote host, probe the environment first: uname -a, which <binary>, mount | grep noexec, sudo -n true. Adapt or propose alternatives instead of failing then pivoting.
  • Tailscale binary paths differ across hosts (Synology, GL.iNet, k3s, standard Linux) — verify with which tailscale before assuming.
  • Synology /tmp is noexec — stage scripts in /volume1 or user home.
  • Synology has no git and no SFTP subsystem. Copy files with an ssh pipe (cat file | ssh host 'cat > dest'), and invoke Docker as sudo /usr/local/bin/docker.
  • GL.iNet travel routers wipe config on firmware update — reapply watchdog/Tailscale config after every flash.
  • uqiyoe is Windows — use dir/del/rmdir, not ls/rm. SSH user is vish, not homelab.
  • Check architecture (uname -m) before downloading binaries; the fleet has mixed amd64/arm64.
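
Two of the probes above produce output that is easy to misread by eye. The sketch below shows one way to interpret them: mapping uname -m to the amd64/arm64 names used by most release downloads, and checking whether a mount point carries noexec. The helper names and the sample mount line are hypothetical, and the parser assumes the Linux mount output format "<dev> on <mountpoint> type <fs> (<opts>)".

```python
ARCH_ALIASES = {"x86_64": "amd64", "amd64": "amd64", "aarch64": "arm64", "arm64": "arm64"}


def release_arch(uname_m: str) -> str:
    """Map `uname -m` output to the amd64/arm64 naming used by most release assets."""
    key = uname_m.strip().lower()
    if key not in ARCH_ALIASES:
        raise ValueError(f"unrecognised architecture: {uname_m!r}")
    return ARCH_ALIASES[key]


def is_noexec(mount_output: str, mountpoint: str = "/tmp") -> bool:
    """Return True if `mount` output lists the mountpoint with the noexec option."""
    for line in mount_output.splitlines():
        parts = line.split()
        if len(parts) >= 6 and parts[1] == "on" and parts[2] == mountpoint:
            return "noexec" in parts[5].strip("()").split(",")
    return False  # mountpoint not listed as a separate mount


# Hypothetical mount line, for illustration only.
SAMPLE_MOUNT = "tmpfs on /tmp type tmpfs (rw,nosuid,nodev,noexec,relatime)"
print(release_arch("aarch64"), is_noexec(SAMPLE_MOUNT))
```

An unknown architecture raises instead of guessing, which matches the "adapt or propose alternatives" rule: stop and ask before downloading a binary that may not run.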

Long-Running Commands

  • Set explicit, short timeouts on SSH/Bash commands. Default 30s, max 120s for known-slow ops.
  • For potentially slow operations (find on NAS, large rsync, apt upgrade): run with run_in_background: true and poll, or scope tightly with -maxdepth/path filters.
  • Never run unbounded find / on NAS or Synology hosts — always anchor to a specific path.
  • For destructive/mutating ops (rsync, dd, rm -rf, db edits): dry-run first, verify checksums/counts, take a backup before applying. Don't trust silent successes — rsync once truncated 70 GB to 74 MB without erroring.
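
The timeout rule above can be enforced in one place rather than remembered per command. A minimal sketch, assuming commands are launched through Python's subprocess module; run_bounded is a hypothetical helper, and the 30s/120s values are the defaults stated above.

```python
import subprocess

DEFAULT_TIMEOUT = 30   # seconds, the default from the rule above
MAX_TIMEOUT = 120      # hard ceiling for known-slow operations


def run_bounded(argv, timeout=DEFAULT_TIMEOUT):
    """Run a command with a capped timeout; return (returncode, stdout) or (None, reason)."""
    timeout = min(timeout, MAX_TIMEOUT)
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left running
        return None, f"timed out after {timeout}s"
    return proc.returncode, proc.stdout


# Example: an rsync dry run first (paths are placeholders, not real hosts):
# run_bounded(["rsync", "-a", "--dry-run", "src/", "host:dest/"], timeout=120)
```

A None return code distinguishes "gave up waiting" from "ran and failed", which matters when deciding whether to retry in the background or to investigate.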

Debugging Discipline

  • Before changing anything to "fix" an issue, list the top 2-3 candidate root causes ranked by likelihood, with one diagnostic per candidate. Run the diagnostics first, share the results, then propose a fix. Don't patch the visible symptom (e.g., disabling a Kuma monitor) before confirming the underlying cause.

Verification Discipline

  • After deploying or fixing a service, verify end-to-end before declaring done: curl the endpoint, check Kuma status, and tail the logs until they show more than 60s of clean uptime.
  • Kuma accepted_statuscodes must be quoted strings in JSON: ["200-299"], not [200-299] (parse error otherwise).
  • Commit and push documentation changes in the same session as the infra change — don't leave docs lagging behind reality.
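
The quoting rule for accepted_statuscodes can be checked with any JSON parser. The snippet below is a small illustration using Python's json module; only the accepted_statuscodes key comes from the note above, and the payload is otherwise a stripped-down stand-in, not a full Kuma monitor definition.

```python
import json

# Quoted range: parses as a list containing one string.
quoted = json.loads('{"accepted_statuscodes": ["200-299"]}')

# Unquoted range: not valid JSON, because "-" cannot follow a number inside an array.
try:
    json.loads('{"accepted_statuscodes": [200-299]}')
    unquoted_parses = True
except json.JSONDecodeError:
    unquoted_parses = False

print(quoted["accepted_statuscodes"], unquoted_parses)
```

Building the payload as a Python dict and serialising it with json.dumps avoids the problem entirely, since the range is a string from the start.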

LLM Services

  • When working with LLM model deployments (Ollama, vLLM), always verify: 1) GPU access, 2) context length meets the consumer's requirements, 3) tool-calling support if needed.
  • Ollama is at http://192.168.0.145:31434 (Olares LAN NodePort), NOT localhost.
  • HAMI vGPU on Olares causes ffmpeg segfaults — do NOT request nvidia.com/gpu resources, use runtimeClassName: nvidia directly.
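
Before pointing a consumer at the endpoint above, it can help to confirm that it answers and serves the expected model. A minimal sketch using Ollama's /api/tags route, which lists installed models; the helper names are hypothetical and the 30s timeout is an assumption borrowed from the default in Long-Running Commands.

```python
import json
import urllib.request

OLLAMA_URL = "http://192.168.0.145:31434"  # Olares LAN NodePort from the note above


def parse_model_names(tags_json: str) -> list[str]:
    """Extract model names from an /api/tags response body."""
    return [m["name"] for m in json.loads(tags_json).get("models", [])]


def list_models(base_url: str = OLLAMA_URL, timeout: float = 30.0) -> list[str]:
    """Fetch the installed models; raises URLError if the endpoint is unreachable."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return parse_model_names(resp.read().decode())
```

This only proves reachability and model presence. GPU access, context length, and tool-calling support still need their own checks.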

Olares (K3s)

  • Olares admission webhook blocks hostNetwork and reverts custom NetworkPolicies.
  • Use Calico GlobalNetworkPolicy for LAN access (it can't be overridden by the webhook).
  • The Olares proxy adds ~100ms latency — use direct LAN NodePorts for streaming/high-throughput services.
  • Marketplace app patches (NFS mounts, GPU) are lost on app updates — re-apply after updates.

Git & Commits

  • Never add Co-Authored-By lines to git commits.
  • Always run detect-secrets scan --baseline .secrets.baseline before committing if a secrets baseline exists.
  • Use pragma: allowlist secret comments for intentional secrets in private repo files.

Documentation

  • After completing each task, immediately update the relevant documentation in the repo and commit with a descriptive message before moving to the next task.
  • Key docs: docs/services/individual/dashboard.md, docs/services/individual/olares.md, scripts/README.md.

Portainer

  • API uses X-API-Key header (NOT Bearer token).
  • Portainer URL: http://100.83.230.112:10000 (Tailscale IP).
  • Endpoints: atlantis=2, calypso=443397, nuc=443398, homelab=443399, rpi5=443395.
  • GitOps stacks use Gitea token for auth — if redeploy fails with "authentication required", credentials need re-entry in Portainer UI.
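
The header rule and endpoint IDs above combine into a request like the one sketched here. It uses Portainer's Docker proxy route (/api/endpoints/<id>/docker/...) to list containers; containers_request is a hypothetical helper, and the API key is supplied by the caller rather than stored in the file.

```python
import urllib.request

PORTAINER_URL = "http://100.83.230.112:10000"  # Tailscale IP from the note above
ENDPOINT_IDS = {
    "atlantis": 2,
    "calypso": 443397,
    "nuc": 443398,
    "homelab": 443399,
    "rpi5": 443395,
}


def containers_request(host: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a container-list request for one Portainer endpoint."""
    url = f"{PORTAINER_URL}/api/endpoints/{ENDPOINT_IDS[host]}/docker/containers/json"
    # Portainer API keys go in X-API-Key, not in an Authorization: Bearer header.
    return urllib.request.Request(url, headers={"X-API-Key": api_key})


# To send it: urllib.request.urlopen(containers_request("nuc", key), timeout=30)
```

An unknown host name raises KeyError immediately, which is preferable to silently querying the wrong endpoint.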

Dashboard

  • Dashboard runs at http://homelab.tail.vish.gg:3100 (Next.js on port 3100, FastAPI API on port 18888).
  • API proxied through Next.js rewrites — frontend calls /api/* which routes to localhost:18888.
  • 16 glassmorphism themes with Exo 2 font.
  • To rebuild: cd dashboard/ui && rm -rf .next && BACKEND_URL=http://localhost:18888 npm run build && cp -r .next/static .next/standalone/.next/static && cp -r public .next/standalone/public.