Sanitized mirror from private repository - 2026-03-21 06:37:51 UTC
# OpenCode
**AI-Powered Coding Agent CLI**
## Service Overview
| Property | Value |
|----------|-------|
| **Service Name** | opencode |
| **Category** | AI / Development |
| **Hosts** | homelab VM (192.168.0.210), moon (100.64.0.6) |
| **Install** | `curl -fsSL https://opencode.ai/install \| bash` |
| **Config** | `~/.config/opencode/opencode.json` |
| **LLM Backend** | Olares vLLM (Qwen3 30B) |
## Purpose
OpenCode is an interactive CLI coding agent (similar to Claude Code) that connects to local LLM backends for AI-assisted software engineering. It runs on developer workstations and connects to the Olares Kubernetes appliance for GPU-accelerated inference.
## Architecture
```
Developer Host (homelab VM / moon)
└── opencode CLI
└── HTTPS → Olares (192.168.0.145)
└── vLLM / Ollama (RTX 5090 Max-Q)
└── Qwen3 30B / GPT-OSS 20B
```
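The inference hop above is a standard OpenAI-compatible HTTP API, so connectivity can be checked without OpenCode at all. A sketch of such a request follows; the `:8000` port and `/v1` path are assumptions about how Olares exposes vLLM, so adjust them to the actual deployment:

```shell
# Build a minimal chat-completions payload for the Qwen3 30B model
# (the model id matches the provider table in this doc).
payload='{"model":"/models/qwen3-30b","messages":[{"role":"user","content":"hello"}],"max_tokens":32}'
echo "$payload"
# Send it to the assumed Olares vLLM endpoint (uncomment to run):
# curl -s http://192.168.0.145:8000/v1/chat/completions \
#   -H 'Content-Type: application/json' -d "$payload"
```

A `200` with a JSON body here, but errors from OpenCode, points at configuration rather than the backend.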
## Installation
```bash
# Install opencode
curl -fsSL https://opencode.ai/install | bash
# Create config directory
mkdir -p ~/.config/opencode
# Run in any project directory
cd ~/my-project
opencode
```
## Configuration
Config location: `~/.config/opencode/opencode.json`
### Configured Providers
| Provider | Model | Backend | Context | Tool Calling |
|----------|-------|---------|---------|-------------|
| `olares` (default) | Qwen3 30B A3B | vLLM | 40k tokens | Yes (Hermes parser) |
| `olares-gptoss` | GPT-OSS 20B | vLLM | 65k tokens | Yes |
| `olares-qwen35` | Qwen3.5 27B Q4_K_M | Ollama | 65k tokens | No (avoid for OpenCode) |
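For orientation, a provider entry in `opencode.json` might look like the sketch below. It assumes OpenCode's OpenAI-compatible provider schema (`npm` package plus `options.baseURL`); the URL, port, and display name are illustrative values, not copied from the live config:

```json
"provider": {
  "olares": {
    "npm": "@ai-sdk/openai-compatible",
    "options": { "baseURL": "http://192.168.0.145:8000/v1" },
    "models": {
      "/models/qwen3-30b": { "name": "Qwen3 30B A3B" }
    }
  }
}
```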
### Switching Models
Edit the `"model"` field in `opencode.json`:
```json
"model": "olares//models/qwen3-30b"
```
Available values:
- `olares//models/qwen3-30b` — recommended, supports tool calling
- `olares-gptoss//models/gpt-oss-20b` — larger context, experimental
### Loop Prevention
OpenCode can get stuck in tool-call loops with smaller models. These settings cap agentic iterations:
```json
"mode": {
"build": {
"steps": 25,
"permission": { "doom_loop": "deny" }
},
"plan": {
"steps": 15,
"permission": { "doom_loop": "deny" }
}
}
```
- `steps` — max agentic iterations before forcing text response
- `doom_loop: "deny"` — immediately stop when loop detected
### MCP Integration
The homelab MCP server is configured on the homelab VM:
```json
"mcp": {
"homelab": {
"type": "local",
"command": ["python3", "/path/to/homelab-mcp/server.py"],
"enabled": true
}
}
```
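OpenCode also supports remote MCP servers, which would avoid a local Python checkout on each host. A hedged sketch, assuming the `"type": "remote"` form of the same `mcp` schema (the URL is a placeholder, not a real homelab endpoint):

```json
"mcp": {
  "homelab-remote": {
    "type": "remote",
    "url": "https://mcp.example.internal/mcp",
    "enabled": true
  }
}
```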
## Host-Specific Setup
### homelab VM (192.168.0.210)
- **User**: homelab
- **Binary**: `~/.opencode/bin/opencode`
- **Config**: `~/.config/opencode/opencode.json`
- **MCP**: homelab MCP server enabled
- **All 3 providers configured**
### moon (100.64.0.6 via Tailscale)
- **User**: moon (access via `ssh moon`, then `sudo -i su - moon`)
- **Binary**: `~/.opencode/bin/opencode`
- **Config**: `~/.config/opencode/opencode.json`
- **Qwen3 30B provider only**
## Requirements
- **Tool-calling support required** — OpenCode sends tool definitions with every request; models without a tool-call chat template (e.g., the Ollama Qwen3.5 build) return 400 errors
- **Large context needed** — OpenCode's system prompt plus tool definitions consume roughly 15-20k tokens, so models with under 32k of context will fail
- **vLLM recommended** — launch with `--enable-auto-tool-choice --tool-call-parser hermes` for reliable function calling
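Putting the last two points together, a vLLM launch for the Qwen3 30B provider might look like the sketch below. The model path and `--max-model-len` mirror the 40k-token provider above; GPU-memory flags are omitted and will depend on the card:

```shell
# Assemble the serve command; --enable-auto-tool-choice and
# --tool-call-parser hermes enable Hermes-style function calling.
vllm_cmd="vllm serve /models/qwen3-30b \
  --enable-auto-tool-choice \
  --tool-call-parser hermes \
  --max-model-len 40960"
echo "$vllm_cmd"
# Run it on the Olares host (uncomment):
# eval "$vllm_cmd"
```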
## Troubleshooting
| Error | Cause | Fix |
|-------|-------|-----|
| `bad request` / 400 | Model doesn't support tools, or context exceeded | Switch to a vLLM model with tool calling |
| `max_tokens negative` | Context window too small for the system prompt | Increase `max_model_len` on vLLM |
| Stuck in loops | Model keeps retrying failed tool calls | Set `doom_loop` to `"deny"` and reduce `steps` |
| Connection refused | LLM endpoint down or auth blocking | Check vLLM pod status, verify auth bypass |
| Slow responses | Dense model on limited GPU | Use MoE model (Qwen3 30B) for faster inference |