Sanitized mirror from private repository - 2026-04-06 21:14:57 UTC
docs/admin/mcp-usage-guide.md
# MCP Tool Usage Guide: When and Why

**For Vesper (AI assistant) reference**

This guide explains when to use MCP tools vs other approaches, and how each tool category helps in practice.

---

## The Core Principle

Use the **most targeted tool available**. MCP tools are purpose-built for the homelab: they handle auth, error formatting, and homelab-specific context automatically. Bash + curl is a fallback when no MCP tool exists.

```
MCP tool available?     → Use MCP
No MCP but known API?   → Use bash + curl/httpx
Needs complex logic?    → Use bash + python3
On a remote host?       → Use ssh_exec or homelab_ssh_exec
```

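
The cascade above can be read as a first-match-wins check. The following is only a sketch of that reading: the function name and keyword arguments are hypothetical, and returning `None` when nothing matches is an assumption, since the guide does not name a default.

```python
def choose_approach(*, mcp_tool=False, known_api=False,
                    complex_logic=False, remote_host=False):
    """Return the first approach from the guide's cascade that applies."""
    if mcp_tool:
        return "MCP"
    if known_api:
        return "bash + curl/httpx"
    if complex_logic:
        return "bash + python3"
    if remote_host:
        return "ssh_exec or homelab_ssh_exec"
    # Assumption: the guide gives no default, so signal "undecided".
    return None


# An MCP tool wins even when a known API is also available.
print(choose_approach(mcp_tool=True, known_api=True))
```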
---
## Decision Tree by Task

### "Check if a service is running"
→ `check_url` for HTTP services
→ `list_containers` + `get_container_logs` for Docker containers
→ `ssh_exec` + `systemctl status` for systemd services

### "Deploy a config change"
1. Edit the compose file in the repo (Write tool)
2. `git commit` + `git push` (bash)
3. `redeploy_stack` to trigger GitOps pull

### "Something broke: diagnose it"
→ `get_container_logs` first (fastest)
→ `check_portainer` for overall health
→ `prometheus_query` for metrics
→ `ssh_exec` for deep investigation

### "Add a new service"
1. Write compose file (Write tool)
2. `cloudflare_create_dns_record` for public DNS
3. `adguard_add_rewrite` if it needs a specific LAN override
4. `npm_list_proxy_hosts` + bash NPM API call for reverse proxy
5. `kuma_add_monitor` + `kuma_restart` for uptime monitoring
6. `authentik_list_applications` to check whether SSO is needed

### "Add a new Tailscale node"
1. `headscale_create_preauth_key` to generate an auth key
2. Run `tailscale up --login-server=... --authkey=...` on the new host (`ssh_exec`)
3. `headscale_list_nodes` to confirm it registered
4. `adguard_add_rewrite` for `hostname.tail.vish.gg → <tailscale_ip>`
5. `kuma_add_monitor` for monitoring

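
Step 2 passes a secret and a URL on a remote command line, so quoting matters. The sketch below builds that command as a shell-safe string. The helper name is hypothetical, the login server URL and key are placeholders (the guide elides both), and the flag spellings are copied from step 2 rather than checked against a particular Tailscale release.

```python
import shlex


def tailscale_up_command(login_server, auth_key):
    """Build the `tailscale up` command from step 2 as one shell-safe string."""
    argv = [
        "tailscale",
        "up",
        f"--login-server={login_server}",
        f"--authkey={auth_key}",
    ]
    # shlex.join quotes any argument that contains shell metacharacters.
    return shlex.join(argv)


# Placeholder values: substitute the real Headscale URL and the pre-auth key.
print(tailscale_up_command("https://headscale.example.invalid",
                           "PREAUTH_KEY_PLACEHOLDER"))
```

The resulting string is what would be handed to `ssh_exec`. Building it from an argument list avoids hand-quoting mistakes if a value ever contains spaces or shell metacharacters.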
### "Fix a DNS issue"
1. `adguard_list_rewrites`: check current overrides
2. Check if the wildcard `*.vish.gg → 100.85.21.51` is causing interference
3. `adguard_add_rewrite` for a specific override before the wildcard
4. `cloudflare_list_dns_records` to verify public DNS

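
To see why a specific override fixes wildcard interference, here is a small model of rewrite matching. It assumes exact entries win over wildcard entries, which is the behavior the steps above rely on. The function name, the `(pattern, answer)` list shape, and the `192.0.2.10` override are hypothetical; only the wildcard entry comes from this guide.

```python
from fnmatch import fnmatchcase


def resolve_rewrite(hostname, rewrites):
    """Return the rewrite answer for hostname, or None if nothing matches.

    Exact entries are checked first; wildcard entries such as "*.vish.gg"
    are consulted only when no exact entry matches (assumed precedence).
    """
    for pattern, answer in rewrites:
        if "*" not in pattern and pattern == hostname:
            return answer
    for pattern, answer in rewrites:
        if "*" in pattern and fnmatchcase(hostname, pattern):
            return answer
    return None


rewrites = [
    ("*.vish.gg", "100.85.21.51"),   # wildcard from this guide
    ("sso.vish.gg", "192.0.2.10"),   # hypothetical specific override
]

print(resolve_rewrite("sso.vish.gg", rewrites))  # specific override applies
print(resolve_rewrite("git.vish.gg", rewrites))  # falls through to the wildcard
```

Any `*.vish.gg` name without its own entry lands on the wildcard target, which is the interference step 2 asks about.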
### "Fix an Authentik SSO redirect loop"
1. `authentik_list_providers` to find the provider PK
2. `authentik_set_provider_cookie_domain` → set `vish.gg`
3. Check that the NPM advanced config sets the `X-Original-URL` header

### "Fix a cert issue"
1. `npm_list_certs`: identify cert IDs and expiry
2. `npm_get_proxy_host`: check which cert a host is using
3. `npm_update_cert`: swap to correct cert
4. **Never reuse an existing npm-N ID** when adding new certs

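
One way to follow the "never reuse" rule is to pick an ID above every ID already in use, instead of filling gaps that may once have belonged to a cert. This is a sketch only: the helper name is hypothetical, and the list of `npm-N` names stands in for whatever `npm_list_certs` actually returns, whose format this guide does not specify.

```python
import re


def next_free_cert_id(existing_names):
    """Return an npm-N name one above the highest N already in use."""
    used = []
    for name in existing_names:
        match = re.fullmatch(r"npm-(\d+)", name)
        if match:
            used.append(int(match.group(1)))
    # max + 1 (not the lowest gap) so a previously used ID is never recycled.
    return f"npm-{max(used, default=0) + 1}"


# Hypothetical listing with gaps; the gaps are deliberately left alone.
existing = ["npm-1", "npm-2", "npm-7", "npm-12"]
print(next_free_cert_id(existing))
```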
---
## Tool Category Quick Reference

### When `check_portainer` is useful
- Session start: quick health check before doing anything
- After a redeploy: confirm stacks came up
- Investigating "something seems slow"

### When `list_containers` / `get_container_logs` are useful
- A service is showing errors in the browser
- A stack was redeployed and isn't responding
- Checking if a container is actually running (not just the stack)

### When `adguard_list_rewrites` is essential
Any time a service is unreachable from inside the LAN/Tailscale network:
- The `*.vish.gg → 100.85.21.51` wildcard can intercept services
- Portainer, Authentik token exchange, and GitOps polling all need correct DNS
- Always check AdGuard before assuming network/firewall issues

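
Alongside listing the rewrites, it can help to confirm what a name actually resolves to from the host in question. The sketch below wraps the standard library resolver so the check can be exercised without network access. The helper name and the expected IP are hypothetical, and the stub resolver only simulates a name that falls through to the wildcard.

```python
import socket


def resolves_to(hostname, expected_ip, resolver=socket.gethostbyname):
    """Return (matches, actual_ip); actual_ip is None if resolution fails."""
    try:
        actual = resolver(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False, None
    return actual == expected_ip, actual


def wildcard_stub(hostname):
    """Stand-in resolver that answers with the wildcard target from this guide."""
    return "100.85.21.51"


# 192.0.2.10 is a made-up expected address for illustration.
print(resolves_to("sso.vish.gg", "192.0.2.10", resolver=wildcard_stub))
```

Dropping the `resolver` argument uses the real system resolver, so the answer reflects whatever DNS server the current host is configured to use.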
### When `npm_*` tools save time
- Diagnosing SSL cert mismatches (cert ID → domain mapping)
- Checking if a proxy host is enabled and what it forwards to
- Swapping certs after LE renewal

### When `headscale_*` tools are needed
- Onboarding a new machine to the tailnet
- Diagnosing connectivity issues (is the node online?)
- Rotating auth keys for automated nodes

### When `authentik_*` tools are needed
- Adding SSO to a new service (check existing providers, create a new one)
- Fixing redirect loops (`cookie_domain`)
- Updating dashboard tile URLs after service migrations

### When `cloudflare_*` tools are needed
- New public-facing service needs a domain
- Migrating a service to a different host IP
- Checking whether a proxied vs unproxied record is the issue

### When `kuma_*` tools are needed
- New service deployed → add monitor so we know if it goes down
- Service moved to different URL → update existing monitor
- Organising monitors into host groups for clarity

### When `prometheus_query` helps
- Checking resource usage before/after a change
- Diagnosing "host seems slow" (CPU, memory, disk)
- Confirming a service is being scraped correctly

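
For the cases above, a few PromQL expressions cover most of the ground. This sketch assumes the hosts are scraped via node_exporter (hence the `node_*` metric names) and that `prometheus_query` accepts a raw PromQL string; neither is stated in this guide. The URL helper shows the equivalent call against the standard Prometheus HTTP API as a bash fallback; the base URL is a placeholder.

```python
from urllib.parse import urlencode

# PromQL for the "host seems slow" checks; metric names assume node_exporter.
NODE_QUERIES = {
    "cpu_busy_percent":
        '100 - (avg by (instance) '
        '(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)',
    "memory_available_bytes": "node_memory_MemAvailable_bytes",
    "root_disk_available_bytes": 'node_filesystem_avail_bytes{mountpoint="/"}',
    "scrape_up": "up",  # 1 when the target is being scraped successfully
}


def instant_query_url(base_url, promql):
    """Build a Prometheus instant-query URL (/api/v1/query) for a PromQL string."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Placeholder base URL: substitute the real Prometheus address.
print(instant_query_url("http://prometheus.example.invalid:9090",
                        NODE_QUERIES["scrape_up"]))
```

Running the same expression before and after a change gives a direct comparison for the first bullet.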
### When `ssh_exec` is the right choice
- The task requires commands not exposed by any MCP tool
- Editing config files directly on a host
- Running host-specific tools (sqlite3, docker compose, certbot)
- Anything that needs interactive investigation

---
## MCP vs Bash: Specific Examples

| Task | Use MCP | Use Bash |
|------|---------|----------|
| List all Headscale nodes | `headscale_list_nodes` | Only if MCP fails |
| Get container logs | `get_container_logs` | Only for very long tails |
| Add DNS rewrite | `adguard_add_rewrite` | Never (MCP handles auth) |
| Check cert on a proxy host | `npm_get_proxy_host` | Only if debugging nginx conf |
| Run SQL on Kuma DB | `kuma_add_monitor` / `kuma_set_parent` | Only for complex queries |
| Redeploy a stack | `redeploy_stack` | Direct Portainer API if MCP times out |
| SSH to a host | `ssh_exec` | `bash + ssh` for interactive sessions |
| Edit a compose file | Write tool + git | Never edit directly on host |
| Check SABnzbd queue | `sabnzbd_queue` | Only if troubleshooting API |
| List all DNS records | `cloudflare_list_dns_records` | Only for bulk operations |

---

## Homelab-Specific Gotchas MCP Tools Handle

### AdGuard wildcard DNS
The `*.vish.gg → 100.85.21.51` wildcard means many `*.vish.gg` domains resolve to matrix-ubuntu's Tailscale IP internally. `adguard_list_rewrites` quickly shows which services have specific overrides and which rely on the wildcard. Always check this before blaming a network issue.

### NPM cert IDs
Each cert in NPM has a numeric ID (npm-1 through npm-12+). `npm_list_certs` shows the mapping. Overwriting an existing npm-N with a different cert breaks every proxy host using that ID; this happened once and took down all `*.vish.gg` services. Checking `npm_list_certs` before adding a cert prevents this.

### Portainer endpoint IDs
Portainer has 5 endpoints with numeric IDs. The MCP tools accept names (`atlantis`, `calypso`, etc.) and resolve them internally, so there is no need to remember IDs.

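
The name-to-ID resolution described above might look roughly like the sketch below. It is an illustration, not the MCP server's code: the function name is hypothetical, the numeric IDs are invented, and only the two endpoint names given in this guide are listed.

```python
# Invented IDs for illustration; the real mapping lives in the MCP server.
ENDPOINT_IDS = {
    "atlantis": 1,
    "calypso": 2,
}


def resolve_endpoint(name_or_id):
    """Accept an endpoint name or a numeric ID and return the numeric ID."""
    if isinstance(name_or_id, int):
        return name_or_id
    key = str(name_or_id).strip().lower()
    if key.isdigit():
        return int(key)
    try:
        return ENDPOINT_IDS[key]
    except KeyError:
        raise ValueError(f"unknown Portainer endpoint: {name_or_id!r}") from None


print(resolve_endpoint("Atlantis"))
```

Failing loudly on an unknown name is a deliberate choice in this sketch: silently defaulting to some endpoint would send a redeploy to the wrong host.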
### Kuma requires restart
Every DB change to Uptime Kuma requires a container restart because Kuma caches config in memory. `kuma_restart` is always the last step after `kuma_add_monitor` or `kuma_set_parent`.

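
When several Kuma changes are batched, the rule above comes down to "one restart, at the very end". A minimal sketch of that ordering follows. The plan format (a list of tool name and argument pairs) and the argument names are hypothetical, since this guide does not document the tools' parameters.

```python
def plan_kuma_changes(changes):
    """Return the call sequence with exactly one kuma_restart as the last step.

    changes is a list of (tool_name, kwargs) pairs. An empty list needs no
    restart, so it yields an empty plan.
    """
    if not changes:
        return []
    # Drop any restarts the caller included, then add a single one at the end.
    plan = [step for step in changes if step[0] != "kuma_restart"]
    plan.append(("kuma_restart", {}))
    return plan


# Hypothetical arguments, for illustration only.
plan = plan_kuma_changes([
    ("kuma_add_monitor", {"name": "demo-service"}),
    ("kuma_set_parent", {"monitor": "demo-service", "parent": "demo-host"}),
])
print([tool for tool, _ in plan])
```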
### Authentik token exchange needs correct DNS
When Portainer (on Atlantis) tries to exchange an OAuth code for a token, it calls `sso.vish.gg`. If AdGuard resolves that to the wrong IP, the exchange times out silently. Always verify DNS before debugging OAuth flows.

---

**Last updated:** 2026-03-21