# OpenCode

**AI-Powered Coding Agent CLI**

## Service Overview

| Property | Value |
|----------|-------|
| **Service Name** | opencode |
| **Category** | AI / Development |
| **Hosts** | homelab VM (192.168.0.210), moon (100.64.0.6) |
| **Install** | `curl -fsSL https://opencode.ai/install \| bash` |
| **Config** | `~/.config/opencode/opencode.json` |
| **LLM Backend** | Olares vLLM (Qwen3 30B) |

## Purpose

OpenCode is an interactive CLI coding agent (similar to Claude Code) that connects to local LLM backends for AI-assisted software engineering. It runs on developer workstations and connects to the Olares Kubernetes appliance for GPU-accelerated inference.

## Architecture

```
Developer Host (homelab VM / moon)
└── opencode CLI
    └── HTTPS → Olares (192.168.0.145)
        └── vLLM / Ollama (RTX 5090 Max-Q)
            └── Qwen3 30B / GPT-OSS 20B
```

## Installation

```bash
# Install opencode
curl -fsSL https://opencode.ai/install | bash

# Create config directory
mkdir -p ~/.config/opencode

# Run in any project directory
cd ~/my-project
opencode
```

## Configuration

Config location: `~/.config/opencode/opencode.json`

### Configured Providers

| Provider | Model | Backend | Context | Tool Calling |
|----------|-------|---------|---------|--------------|
| `olares` (default) | Qwen3 30B A3B | vLLM | 40k tokens | Yes (Hermes parser) |
| `olares-gptoss` | GPT-OSS 20B | vLLM | 65k tokens | Yes |
| `olares-qwen35` | Qwen3.5 27B Q4_K_M | Ollama | 65k tokens | No (avoid for OpenCode) |

### Switching Models

Edit the `"model"` field in `opencode.json`:

```json
"model": "olares//models/qwen3-30b"
```

Available values:

- `olares//models/qwen3-30b` — recommended, supports tool calling
- `olares-gptoss//models/gpt-oss-20b` — larger context, experimental

### Loop Prevention

OpenCode can get stuck in tool-call loops with smaller models.
These settings cap iterations:

```json
"mode": {
  "build": {
    "steps": 25,
    "permission": { "doom_loop": "deny" }
  },
  "plan": {
    "steps": 15,
    "permission": { "doom_loop": "deny" }
  }
}
```

- `steps` — maximum agentic iterations before forcing a text response
- `doom_loop: "deny"` — stop immediately when a loop is detected

### MCP Integration

The homelab MCP server is configured on the homelab VM:

```json
"mcp": {
  "homelab": {
    "type": "local",
    "command": ["python3", "/path/to/homelab-mcp/server.py"],
    "enabled": true
  }
}
```

## Host-Specific Setup

### homelab VM (192.168.0.210)

- **User**: homelab
- **Binary**: `~/.opencode/bin/opencode`
- **Config**: `~/.config/opencode/opencode.json`
- **MCP**: homelab MCP server enabled
- **Providers**: all three configured

### moon (100.64.0.6 via Tailscale)

- **User**: moon (access via `ssh moon`, then `sudo -i su - moon`)
- **Binary**: `~/.opencode/bin/opencode`
- **Config**: `~/.config/opencode/opencode.json`
- **Providers**: Qwen3 30B only

## Requirements

- **Tool-calling support required** — OpenCode sends tool definitions with every request. Models without tool-call chat templates (e.g., Ollama Qwen3.5) return 400 errors.
- **Large context needed** — OpenCode's system prompt plus tool definitions consume roughly 15-20k tokens, so models with less than 32k context will fail.
- **vLLM recommended** — start vLLM with `--enable-auto-tool-choice --tool-call-parser hermes` for reliable function calling.

## Troubleshooting

| Error | Cause | Fix |
|-------|-------|-----|
| `bad request` / 400 | Model doesn't support tools, or context exceeded | Switch to a vLLM model with tool calling |
| `max_tokens negative` | Context window too small for system prompt | Increase `max_model_len` on vLLM |
| Stuck in loops | Model keeps retrying failed tool calls | Set `doom_loop: "deny"` and reduce `steps` |
| Connection refused | LLM endpoint down or auth blocking | Check vLLM pod status, verify auth bypass |
| Slow responses | Dense model on limited GPU | Use a MoE model (Qwen3 30B) for faster inference |
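Several of the failure modes in the table above (400s from a non-tool-calling provider, missing loop caps) can be caught by inspecting `opencode.json` before launch. The sketch below is a minimal, unofficial sanity check based only on the config fragments shown on this page; the script name, `check_config` function, and warning messages are illustrative and not part of OpenCode itself.

```python
#!/usr/bin/env python3
"""Sanity-check an opencode.json against the conventions in this runbook.

Assumes the schema fragments shown above (model, mode.*.steps,
mode.*.permission.doom_loop). Not an official OpenCode tool.
"""
import json
import sys
from pathlib import Path

# Providers known to support tool calling, per the Configured Providers table.
TOOL_CALLING_PROVIDERS = {"olares", "olares-gptoss"}


def check_config(cfg: dict) -> list[str]:
    """Return a list of human-readable warnings (empty list = looks fine)."""
    warnings = []

    # Model IDs in this setup look like "provider//models/name".
    model = cfg.get("model", "")
    provider = model.split("//", 1)[0] if "//" in model else ""
    if not provider:
        warnings.append(f"model {model!r} is not in provider//path form")
    elif provider not in TOOL_CALLING_PROVIDERS:
        warnings.append(
            f"provider {provider!r} has no tool calling; OpenCode will get 400s"
        )

    # Loop-prevention caps from the Loop Prevention section.
    for mode_name, mode in cfg.get("mode", {}).items():
        if "steps" not in mode:
            warnings.append(f"mode {mode_name!r} has no 'steps' cap")
        if mode.get("permission", {}).get("doom_loop") != "deny":
            warnings.append(f"mode {mode_name!r} does not deny doom_loop")

    return warnings


if __name__ == "__main__":
    path = Path(sys.argv[1]) if len(sys.argv) > 1 else (
        Path.home() / ".config/opencode/opencode.json"
    )
    problems = check_config(json.loads(path.read_text()))
    for p in problems:
        print("WARN:", p)
    sys.exit(1 if problems else 0)
```

Running it before switching models (e.g., `python3 check_opencode.py` on either host) exits non-zero with a warning per violated requirement, which is cheaper than discovering a 400 mid-session.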