# OpenCode

AI-Powered Coding Agent CLI

## Service Overview
| Property | Value |
|---|---|
| Service Name | `opencode` |
| Category | AI / Development |
| Hosts | homelab VM (192.168.0.210), moon (100.64.0.6) |
| Install | `curl -fsSL https://opencode.ai/install \| bash` |
| Config | `~/.config/opencode/opencode.json` |
| LLM Backend | Olares vLLM (Qwen3 30B) |
## Purpose
OpenCode is an interactive CLI coding agent (similar to Claude Code) that connects to local LLM backends for AI-assisted software engineering. It runs on developer workstations and connects to the Olares Kubernetes appliance for GPU-accelerated inference.
## Architecture

```
Developer Host (homelab VM / moon)
└── opencode CLI
    └── HTTPS → Olares (192.168.0.145)
        └── vLLM / Ollama (RTX 5090 Max-Q)
            └── Qwen3 30B / GPT-OSS 20B
```
## Installation

```shell
# Install opencode
curl -fsSL https://opencode.ai/install | bash

# Create config directory
mkdir -p ~/.config/opencode

# Run in any project directory
cd ~/my-project
opencode
```
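The installer places the binary under `~/.opencode/bin` (see the host notes in Host-Specific Setup), which a fresh shell may not have on `PATH`. A quick check:

```shell
# Ensure the install location is on PATH before launching opencode.
export PATH="$HOME/.opencode/bin:$PATH"
command -v opencode >/dev/null || echo "opencode not on PATH -- check the install"
```

Add the `export` line to your shell profile if the installer did not do so already.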
## Configuration

Config location: `~/.config/opencode/opencode.json`
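For orientation, a minimal sketch of a provider entry for the Olares vLLM backend described above. The field layout follows OpenCode's provider schema as best understood here; the port (8000) and the npm package name are assumptions — compare against the working config on the hosts below before copying.

```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "olares": {
      "npm": "@ai-sdk/openai-compatible",
      "options": { "baseURL": "https://192.168.0.145:8000/v1" },
      "models": {
        "/models/qwen3-30b": { "name": "Qwen3 30B A3B" }
      }
    }
  },
  "model": "olares//models/qwen3-30b"
}
```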
### Configured Providers

| Provider | Model | Backend | Context | Tool Calling |
|---|---|---|---|---|
| `olares` (default) | Qwen3 30B A3B | vLLM | 40k tokens | Yes (Hermes parser) |
| `olares-gptoss` | GPT-OSS 20B | vLLM | 65k tokens | Yes |
| `olares-qwen35` | Qwen3.5 27B Q4_K_M | Ollama | 65k tokens | No (avoid for OpenCode) |
### Switching Models

Edit the `"model"` field in `opencode.json`:

```json
"model": "olares//models/qwen3-30b"
```

Available values:

- `olares//models/qwen3-30b` — recommended, supports tool calling
- `olares-gptoss//models/gpt-oss-20b` — larger context, experimental
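The switch can also be scripted, which is handy when flipping between models from a shell. A sketch using python3 (present on the homelab VM for the MCP server); `CONFIG` points at a temp copy here so the example is self-contained — use `~/.config/opencode/opencode.json` on a real host:

```shell
# Rewrite the "model" field in an opencode.json without hand-editing.
CONFIG=$(mktemp)
printf '{"model": "olares-gptoss//models/gpt-oss-20b"}\n' > "$CONFIG"
python3 - "$CONFIG" <<'EOF'
import json, sys
path = sys.argv[1]
with open(path) as f:
    cfg = json.load(f)
cfg["model"] = "olares//models/qwen3-30b"   # target model from the list above
with open(path, "w") as f:
    json.dump(cfg, f, indent=2)
EOF
grep '"model"' "$CONFIG"
```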
## Loop Prevention

OpenCode can get stuck in tool-call loops with smaller models. These settings cap iterations:

```json
"mode": {
  "build": {
    "steps": 25,
    "permission": { "doom_loop": "deny" }
  },
  "plan": {
    "steps": 15,
    "permission": { "doom_loop": "deny" }
  }
}
```

- `steps` — maximum agentic iterations before forcing a text response
- `doom_loop: "deny"` — stop immediately when a loop is detected
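The `steps` cap is a plain iteration guard. An illustrative sketch (not OpenCode's source) of the behavior: a model that keeps requesting the same failing tool call is cut off after `max_steps` rounds and forced to answer in text.

```shell
# Simulate the "steps" cap: the stand-in model below always requests
# another tool call, so the cap is what terminates the loop.
max_steps=25
step=0
result=""
while [ "$step" -lt "$max_steps" ]; do
    action="retry-tool-call"   # stand-in for "model requested a tool call"
    [ -z "$action" ] && { result="final text answer"; break; }
    step=$((step + 1))
done
[ -z "$result" ] && result="forced text response"   # cap exhausted
echo "$result"
```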
## MCP Integration

The homelab MCP server is configured on the homelab VM:

```json
"mcp": {
  "homelab": {
    "type": "local",
    "command": ["python3", "/path/to/homelab-mcp/server.py"],
    "enabled": true
  }
}
```
## Host-Specific Setup

### homelab VM (192.168.0.210)

- User: `homelab`
- Binary: `~/.opencode/bin/opencode`
- Config: `~/.config/opencode/opencode.json`
- MCP: homelab MCP server enabled
- All 3 providers configured
### moon (100.64.0.6 via Tailscale)

- User: `moon` (access via `ssh moon`, then `sudo -i su - moon`)
- Binary: `~/.opencode/bin/opencode`
- Config: `~/.config/opencode/opencode.json`
- Qwen3 30B provider only
## Requirements

- Tool calling support required — OpenCode sends tools with every request. Models without tool-call templates (e.g., Ollama Qwen3.5) return 400 errors
- Large context needed — OpenCode's system prompt + tool definitions use ~15-20k tokens. Models with less than 32k context will fail
- vLLM recommended — use `--enable-auto-tool-choice --tool-call-parser hermes` for reliable function calling
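A vLLM launch sketch matching those flags. The Hugging Face model name and the exact `--max-model-len` value (sized to the 40k-token context above) are assumptions for this deployment:

```shell
vllm serve Qwen/Qwen3-30B-A3B \
  --max-model-len 40960 \
  --enable-auto-tool-choice \
  --tool-call-parser hermes
```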
## Troubleshooting

| Error | Cause | Fix |
|---|---|---|
| `bad request` / 400 | Model doesn't support tools, or context exceeded | Switch to a vLLM model with tool calling |
| `max_tokens` negative | Context window too small for system prompt | Increase `max_model_len` on vLLM |
| Stuck in loops | Model keeps retrying failed tool calls | Set `doom_loop: "deny"` and reduce `steps` |
| Connection refused | LLM endpoint down or auth blocking | Check vLLM pod status, verify auth bypass |
| Slow responses | Dense model on limited GPU | Use MoE model (Qwen3 30B) for faster inference |