Sanitized mirror from private repository - 2026-03-21 06:37:51 UTC
# OpenCode
**AI-Powered Coding Agent CLI**
## Service Overview
| Property | Value |
|----------|-------|
| **Service Name** | opencode |
| **Category** | AI / Development |
| **Hosts** | homelab VM (192.168.0.210), moon (100.64.0.6) |
| **Install** | `curl -fsSL https://opencode.ai/install \| bash` |
| **Config** | `~/.config/opencode/opencode.json` |
| **LLM Backend** | Olares vLLM (Qwen3 30B) |
## Purpose
OpenCode is an interactive CLI coding agent (similar to Claude Code) that connects to local LLM backends for AI-assisted software engineering. It runs on developer workstations and connects to the Olares Kubernetes appliance for GPU-accelerated inference.
## Architecture
```
Developer Host (homelab VM / moon)
└── opencode CLI
└── HTTPS → Olares (192.168.0.145)
└── vLLM / Ollama (RTX 5090 Max-Q)
└── Qwen3 30B / GPT-OSS 20B
```
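The inference hop above is a standard OpenAI-compatible HTTP API, so connectivity can be checked without OpenCode at all. A sketch of such a request follows; the `:8000` port and `/v1` path are assumptions about how Olares exposes vLLM, so adjust them to the actual deployment:

```shell
# Build a minimal chat-completions payload for the Qwen3 30B model
# (the model id matches the provider table in this doc).
payload='{"model":"/models/qwen3-30b","messages":[{"role":"user","content":"hello"}],"max_tokens":32}'
echo "$payload"
# Send it to the assumed Olares vLLM endpoint (uncomment to run):
# curl -s http://192.168.0.145:8000/v1/chat/completions \
#   -H 'Content-Type: application/json' -d "$payload"
```

A `200` with a JSON body here, but errors from OpenCode, points at configuration rather than the backend.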
## Installation
```bash
# Install opencode
curl -fsSL https://opencode.ai/install | bash
# Create config directory
mkdir -p ~/.config/opencode
# Run in any project directory
cd ~/my-project
opencode
```
## Configuration
Config location: `~/.config/opencode/opencode.json`
### Configured Providers
| Provider | Model | Backend | Context | Tool Calling |
|----------|-------|---------|---------|-------------|
| `olares` (default) | Qwen3 30B A3B | vLLM | 40k tokens | Yes (Hermes parser) |
| `olares-gptoss` | GPT-OSS 20B | vLLM | 65k tokens | Yes |
| `olares-qwen35` | Qwen3.5 27B Q4_K_M | Ollama | 65k tokens | No (avoid for OpenCode) |
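For orientation, a provider entry in `opencode.json` might look like the sketch below. It assumes OpenCode's OpenAI-compatible provider schema (`npm` package plus `options.baseURL`); the URL, port, and display name are illustrative values, not copied from the live config:

```json
"provider": {
  "olares": {
    "npm": "@ai-sdk/openai-compatible",
    "options": { "baseURL": "http://192.168.0.145:8000/v1" },
    "models": {
      "/models/qwen3-30b": { "name": "Qwen3 30B A3B" }
    }
  }
}
```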
### Switching Models
Edit the `"model"` field in `opencode.json`:
```json
"model": "olares//models/qwen3-30b"
```
Available values:
- `olares//models/qwen3-30b` — recommended, supports tool calling
- `olares-gptoss//models/gpt-oss-20b` — larger context, experimental
### Loop Prevention
OpenCode can get stuck in tool-call loops with smaller models. These settings cap agentic iterations:
```json
"mode": {
"build": {
"steps": 25,
"permission": { "doom_loop": "deny" }
},
"plan": {
"steps": 15,
"permission": { "doom_loop": "deny" }
}
}
```
- `steps` — max agentic iterations before forcing text response
- `doom_loop: "deny"` — immediately stop when loop detected
### MCP Integration
The homelab MCP server is configured on the homelab VM:
```json
"mcp": {
"homelab": {
"type": "local",
"command": ["python3", "/path/to/homelab-mcp/server.py"],
"enabled": true
}
}
```
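OpenCode also supports remote MCP servers, which would avoid a local Python checkout on each host. A hedged sketch, assuming the `"type": "remote"` form of the same `mcp` schema (the URL is a placeholder, not a real homelab endpoint):

```json
"mcp": {
  "homelab-remote": {
    "type": "remote",
    "url": "https://mcp.example.internal/mcp",
    "enabled": true
  }
}
```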
## Host-Specific Setup
### homelab VM (192.168.0.210)
- **User**: homelab
- **Binary**: `~/.opencode/bin/opencode`
- **Config**: `~/.config/opencode/opencode.json`
- **MCP**: homelab MCP server enabled
- **All 3 providers configured**
### moon (100.64.0.6 via Tailscale)
- **User**: moon (access via `ssh moon`, then `sudo -i su - moon`)
- **Binary**: `~/.opencode/bin/opencode`
- **Config**: `~/.config/opencode/opencode.json`
- **Qwen3 30B provider only**
## Requirements
- **Tool-calling support required** — OpenCode sends tool definitions with every request; models without a tool-call chat template (e.g., the Ollama Qwen3.5 build) return 400 errors
- **Large context needed** — OpenCode's system prompt plus tool definitions consume roughly 15-20k tokens, so models with under 32k of context will fail
- **vLLM recommended** — launch with `--enable-auto-tool-choice --tool-call-parser hermes` for reliable function calling
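Putting the last two points together, a vLLM launch for the Qwen3 30B provider might look like the sketch below. The model path and `--max-model-len` mirror the 40k-token provider above; GPU-memory flags are omitted and will depend on the card:

```shell
# Assemble the serve command; --enable-auto-tool-choice and
# --tool-call-parser hermes enable Hermes-style function calling.
vllm_cmd="vllm serve /models/qwen3-30b \
  --enable-auto-tool-choice \
  --tool-call-parser hermes \
  --max-model-len 40960"
echo "$vllm_cmd"
# Run it on the Olares host (uncomment):
# eval "$vllm_cmd"
```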
## Troubleshooting
| Error | Cause | Fix |
|-------|-------|-----|
| `bad request` / 400 | Model doesn't support tools, or context exceeded | Switch to a vLLM model with tool calling |
| `max_tokens negative` | Context window too small for the system prompt | Increase `max_model_len` on vLLM |
| Stuck in loops | Model keeps retrying failed tool calls | Set `doom_loop` to `"deny"` and reduce `steps` |
| Connection refused | LLM endpoint down or auth blocking | Check vLLM pod status, verify auth bypass |
| Slow responses | Dense model on limited GPU | Use MoE model (Qwen3 30B) for faster inference |