# OpenCode

AI-Powered Coding Agent CLI

## Service Overview

| Property | Value |
|---|---|
| Service Name | `opencode` |
| Category | AI / Development |
| Hosts | homelab VM (192.168.0.210), moon (100.64.0.6) |
| Install | `curl -fsSL https://opencode.ai/install \| bash` |
| Config | `~/.config/opencode/opencode.json` |
| LLM Backend | Olares vLLM (Qwen3 30B) |

## Purpose

OpenCode is an interactive CLI coding agent (similar to Claude Code) that connects to local LLM backends for AI-assisted software engineering. It runs on developer workstations and connects to the Olares Kubernetes appliance for GPU-accelerated inference.

## Architecture

```
Developer Host (homelab VM / moon)
  └── opencode CLI
        └── HTTPS → Olares (192.168.0.145)
              └── vLLM / Ollama (RTX 5090 Max-Q)
                    └── Qwen3 30B / GPT-OSS 20B
```

## Installation

```bash
# Install opencode
curl -fsSL https://opencode.ai/install | bash

# Create config directory
mkdir -p ~/.config/opencode

# Run in any project directory
cd ~/my-project
opencode
```

## Configuration

Config location: `~/.config/opencode/opencode.json`

### Configured Providers

| Provider | Model | Backend | Context | Tool Calling |
|---|---|---|---|---|
| `olares` (default) | Qwen3 30B A3B | vLLM | 40k tokens | Yes (Hermes parser) |
| `olares-gptoss` | GPT-OSS 20B | vLLM | 65k tokens | Yes |
| `olares-qwen35` | Qwen3.5 27B Q4_K_M | Ollama | 65k tokens | No (avoid for OpenCode) |

### Switching Models

Edit the `"model"` field in `opencode.json`:

```json
"model": "olares//models/qwen3-30b"
```

Available values:

- `olares//models/qwen3-30b` — recommended, supports tool calling
- `olares-gptoss//models/gpt-oss-20b` — larger context, experimental
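For orientation, a minimal sketch of how the default model and its provider entry might fit together in `opencode.json`. This is an assumption-laden illustration, not a copy of the live config: the `baseURL` host/port and the `npm`/`options`/`models` field names follow OpenCode's OpenAI-compatible provider convention and should be checked against the actual file.

```json
{
  "model": "olares//models/qwen3-30b",
  "provider": {
    "olares": {
      "npm": "@ai-sdk/openai-compatible",
      "options": {
        "baseURL": "http://192.168.0.145:8000/v1"
      },
      "models": {
        "/models/qwen3-30b": {}
      }
    }
  }
}
```

The model identifier `olares//models/qwen3-30b` is read as `<provider>//<model-id>`, which is why the provider key and the model key must match the two halves of the string.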

## Loop Prevention

OpenCode can get stuck in tool call loops with smaller models. These settings cap iterations:

```json
"mode": {
  "build": {
    "steps": 25,
    "permission": { "doom_loop": "deny" }
  },
  "plan": {
    "steps": 15,
    "permission": { "doom_loop": "deny" }
  }
}
```

- `steps` — max agentic iterations before forcing a text response
- `doom_loop: "deny"` — stop immediately when a loop is detected

## MCP Integration

The homelab MCP server is configured on the homelab VM:

```json
"mcp": {
  "homelab": {
    "type": "local",
    "command": ["python3", "/path/to/homelab-mcp/server.py"],
    "enabled": true
  }
}
```

## Host-Specific Setup

### homelab VM (192.168.0.210)

- User: `homelab`
- Binary: `~/.opencode/bin/opencode`
- Config: `~/.config/opencode/opencode.json`
- MCP: homelab MCP server enabled
- All 3 providers configured

### moon (100.64.0.6 via Tailscale)

- User: `moon` (access via `ssh moon`, then `sudo -i su - moon`)
- Binary: `~/.opencode/bin/opencode`
- Config: `~/.config/opencode/opencode.json`
- Qwen3 30B provider only

## Requirements

- **Tool calling support required** — OpenCode sends tool definitions with every request. Models without tool-call chat templates (e.g., Ollama Qwen3.5) return 400 errors.
- **Large context needed** — OpenCode's system prompt plus tool definitions consume roughly 15-20k tokens, so models with less than 32k context will fail.
- **vLLM recommended** — use `--enable-auto-tool-choice --tool-call-parser hermes` for reliable function calling.
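The flags above slot into a vLLM launch command roughly as follows. This is a sketch only: the model path and the `--max-model-len` value (chosen to match the 40k context listed for the `olares` provider) are assumptions and should be taken from the actual deployment on Olares.

```bash
# Hedged example: model path and context length are placeholders,
# not copied from the live Olares deployment.
vllm serve Qwen/Qwen3-30B-A3B \
  --max-model-len 40960 \
  --enable-auto-tool-choice \
  --tool-call-parser hermes
```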

## Troubleshooting

| Error | Cause | Fix |
|---|---|---|
| `bad request` / 400 | Model doesn't support tools, or context exceeded | Switch to a vLLM model with tool calling |
| `max_tokens` negative | Context window too small for system prompt | Increase `max_model_len` on vLLM |
| Stuck in loops | Model keeps retrying failed tool calls | Set `doom_loop: "deny"` and reduce `steps` |
| Connection refused | LLM endpoint down or auth blocking | Check vLLM pod status, verify auth bypass |
| Slow responses | Dense model on limited GPU | Use a MoE model (Qwen3 30B) for faster inference |