# OpenCode

**AI-Powered Coding Agent CLI**

## Service Overview

| Property | Value |
|----------|-------|
| **Service Name** | opencode |
| **Category** | AI / Development |
| **Hosts** | homelab VM (192.168.0.210), moon (100.64.0.6) |
| **Install** | `curl -fsSL https://opencode.ai/install \| bash` |
| **Config** | `~/.config/opencode/opencode.json` |
| **LLM Backend** | Olares vLLM (Qwen3 30B) |

## Purpose

OpenCode is an interactive CLI coding agent (similar to Claude Code) that uses local LLM backends for AI-assisted software engineering. It runs on developer workstations and connects to the Olares Kubernetes appliance for GPU-accelerated inference.

## Architecture

```
Developer Host (homelab VM / moon)
└── opencode CLI
    └── HTTPS → Olares (192.168.0.145)
        └── vLLM / Ollama (RTX 5090 Max-Q)
            └── Qwen3 30B / GPT-OSS 20B
```

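As a quick sanity check of the path above, you can hit the backend's model listing. The host comes from the diagram, but port 8000 and the `/v1` path are assumptions (vLLM's OpenAI-compatible defaults); adjust `OLARES_BASE` to the actual Olares ingress:

```shell
# List models on the inference endpoint (host from the diagram;
# port 8000 and the /v1 path are assumptions -- vLLM defaults).
BASE="${OLARES_BASE:-http://192.168.0.145:8000}"
curl -sf -m 5 "$BASE/v1/models" || echo "endpoint unreachable: $BASE"
```

A successful response is a JSON list of served model IDs; the fallback message means the endpoint is down, unreachable, or behind auth (see Troubleshooting).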
## Installation

```bash
# Install opencode
curl -fsSL https://opencode.ai/install | bash

# Create config directory
mkdir -p ~/.config/opencode

# Run in any project directory
cd ~/my-project
opencode
```

## Configuration

Config location: `~/.config/opencode/opencode.json`

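The fragments in the subsections that follow all live in this one file. Assembled, a minimal config might look like the sketch below; the top-level layout is an assumption pieced together from those fragments, not a verified schema:

```json
{
  "model": "olares//models/qwen3-30b",
  "mode": {
    "build": { "steps": 25, "permission": { "doom_loop": "deny" } },
    "plan": { "steps": 15, "permission": { "doom_loop": "deny" } }
  },
  "mcp": {
    "homelab": {
      "type": "local",
      "command": ["python3", "/path/to/homelab-mcp/server.py"],
      "enabled": true
    }
  }
}
```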
### Configured Providers

| Provider | Model | Backend | Context | Tool Calling |
|----------|-------|---------|---------|--------------|
| `olares` (default) | Qwen3 30B A3B | vLLM | 40k tokens | Yes (Hermes parser) |
| `olares-gptoss` | GPT-OSS 20B | vLLM | 65k tokens | Yes |
| `olares-qwen35` | Qwen3.5 27B Q4_K_M | Ollama | 65k tokens | No (avoid for OpenCode) |

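A back-of-envelope view of why the context column matters: OpenCode's own system prompt and tool definitions (see Requirements) consume roughly 15-20k tokens before any conversation starts. Taking the pessimistic 20k figure for the two tool-calling providers:

```shell
# Rough usable-conversation budget per provider. The 20k overhead
# figure comes from the Requirements section; all numbers are approximations.
overhead=20000
for entry in "olares:40000" "olares-gptoss:65000"; do
  name=${entry%%:*}; ctx=${entry##*:}
  echo "$name: $((ctx - overhead)) tokens left for conversation"
done
```

This is also why a 32k-context model is effectively the floor: after ~20k of overhead, anything smaller leaves almost no room for the actual session.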
### Switching Models

Edit the `"model"` field in `opencode.json`:

```json
"model": "olares//models/qwen3-30b"
```

Available values:

- `olares//models/qwen3-30b` — recommended, supports tool calling
- `olares-gptoss//models/gpt-oss-20b` — larger context, experimental

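Since `opencode.json` is plain JSON, the swap can be scripted. A minimal sketch using `sed`, run against a throwaway copy here so the live config is untouched:

```shell
# Switch the configured model by rewriting the "model" field.
# Operates on a temp copy; point at ~/.config/opencode/opencode.json for real.
cfg=$(mktemp)
printf '{ "model": "olares//models/qwen3-30b" }\n' > "$cfg"
sed -i 's#"model": "[^"]*"#"model": "olares-gptoss//models/gpt-oss-20b"#' "$cfg"
grep -o '"model": "[^"]*"' "$cfg"
rm -f "$cfg"
```

`#` is used as the `sed` delimiter because the model IDs contain `/`.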
### Loop Prevention

OpenCode can get stuck in tool-call loops with smaller models. These settings cap iterations:

```json
"mode": {
  "build": {
    "steps": 25,
    "permission": { "doom_loop": "deny" }
  },
  "plan": {
    "steps": 15,
    "permission": { "doom_loop": "deny" }
  }
}
```

- `steps` — max agentic iterations before forcing a text response
- `doom_loop: "deny"` — stop immediately when a loop is detected

### MCP Integration

The homelab MCP server is configured on the homelab VM:

```json
"mcp": {
  "homelab": {
    "type": "local",
    "command": ["python3", "/path/to/homelab-mcp/server.py"],
    "enabled": true
  }
}
```

## Host-Specific Setup

### homelab VM (192.168.0.210)

- **User**: homelab
- **Binary**: `~/.opencode/bin/opencode`
- **Config**: `~/.config/opencode/opencode.json`
- **MCP**: homelab MCP server enabled
- **Providers**: all 3 configured

### moon (100.64.0.6 via Tailscale)

- **User**: moon (access via `ssh moon`, then `sudo -i su - moon`)
- **Binary**: `~/.opencode/bin/opencode`
- **Config**: `~/.config/opencode/opencode.json`
- **Providers**: Qwen3 30B only

## Requirements

- **Tool calling support required** — OpenCode sends tool definitions with every request. Models without a tool-call chat template (e.g., Ollama Qwen3.5) return 400 errors
- **Large context needed** — OpenCode's system prompt plus tool definitions use ~15-20k tokens; models with less than 32k context will fail
- **vLLM recommended** — use `--enable-auto-tool-choice --tool-call-parser hermes` for reliable function calling

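For reference, the flags above slot into a `vllm serve` invocation roughly like this. The model path and `--max-model-len` value are assumptions for illustration, not the exact Olares deployment, so the command is printed rather than executed:

```shell
# Print the assumed launch command rather than executing it
# (the real deployment runs inside the Olares Kubernetes pod, and
# the model path / --max-model-len here are illustrative guesses).
cat <<'EOF'
vllm serve Qwen/Qwen3-30B-A3B \
  --max-model-len 40960 \
  --enable-auto-tool-choice \
  --tool-call-parser hermes
EOF
```

`--enable-auto-tool-choice` lets vLLM decide when to emit tool calls, and `--tool-call-parser hermes` matches the Hermes-style tool-call format that Qwen3 models emit.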
## Troubleshooting

| Error | Cause | Fix |
|-------|-------|-----|
| `bad request` / 400 | Model doesn't support tools, or context exceeded | Switch to a vLLM model with tool calling |
| `max_tokens negative` | Context window too small for the system prompt | Increase `max_model_len` on vLLM |
| Stuck in loops | Model keeps retrying failed tool calls | Set `doom_loop: "deny"` and reduce `steps` |
| Connection refused | LLM endpoint down or auth blocking | Check vLLM pod status, verify auth bypass |
| Slow responses | Dense model on limited GPU | Use a MoE model (Qwen3 30B) for faster inference |