# OpenHands - AI Coding Agent

OpenHands is an autonomous AI software development agent that can execute code, browse the web, and interact with your development environment.
## Deployment Options

### Option 1: CLI Mode (Recommended) ✅
The CLI runs directly on the host machine without Docker sandbox containers. This is the recommended approach for homelab setups.
**Why CLI is better for homelab:**
- No Docker-in-Docker networking issues
- More private (see Privacy Considerations)
- Simpler setup and maintenance
- Works reliably on Linux hosts
#### Installation
```bash
# Install uv (fast Python package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh
source ~/.local/bin/env

# Install OpenHands CLI
uv tool install openhands --python 3.12
```
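If the `openhands` command is not found afterwards, the shell usually just hasn't picked up `~/.local/bin` yet. A small generic check (a sketch, not from the OpenHands docs):

```shell
# Report whether a command is reachable on PATH; used here to confirm
# the binary installed by `uv tool install` is visible to the shell.
check_on_path() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 not on PATH; try: source ~/.local/bin/env"
    fi
}

# On a fresh shell this typically prints the PATH hint until you
# source ~/.local/bin/env as shown above.
check_on_path openhands
```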
#### Configuration

Create a wrapper script for easy usage:
```bash
cat > ~/openhands-cli.sh << 'EOF'
#!/bin/bash
export PATH=$HOME/.local/bin:$PATH
export LLM_MODEL=anthropic/claude-sonnet-4-20250514
export LLM_API_KEY=REDACTED_API_KEY

if [ $# -eq 0 ]; then
    openhands --override-with-envs --always-approve
else
    openhands --override-with-envs "$@"
fi
EOF
chmod +x ~/openhands-cli.sh
```
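The wrapper's branching can be illustrated in isolation. In this sketch `run` is a stand-in for the real `openhands` binary that just echoes its arguments, so you can see which flags each invocation style produces:

```shell
#!/bin/bash
# Stand-in for the openhands binary: echoes its arguments instead of running.
run() { echo "openhands $*"; }

# Same dispatch as the wrapper: no args -> auto-approving interactive mode;
# any args -> forwarded verbatim after --override-with-envs.
dispatch() {
    if [ $# -eq 0 ]; then
        run --override-with-envs --always-approve
    else
        run --override-with-envs "$@"
    fi
}

dispatch                      # prints: openhands --override-with-envs --always-approve
dispatch --headless -t demo   # prints: openhands --override-with-envs --headless -t demo
```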
#### Usage
```bash
# Interactive TUI mode
~/openhands-cli.sh

# Headless mode (for scripts/automation)
~/openhands-cli.sh --headless -t "Write a Python script that lists files"

# Headless mode with the task read from a file
~/openhands-cli.sh --headless -f task.txt

# Resume a conversation
~/openhands-cli.sh --resume <conversation-id>
```
### Option 2: Docker GUI Mode (Has Issues)
The Docker-based GUI spawns runtime containers dynamically. On Linux, these containers cannot resolve `host.docker.internal`, causing MCP (Model Context Protocol) failures.

**Known issues:**
- Runtime containers fail to connect back to the main container
- `host.docker.internal` is not resolvable in spawned containers
- Error: `Server error '500 Internal Server Error' for url 'http://host.docker.internal:XXXXX/api/conversations'`
If you still want to try Docker GUI, the compose file is at:

`hosts/vms/homelab-vm/openhands.yaml`
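One Linux workaround that may be worth trying (an assumption, not verified in this setup) is mapping `host.docker.internal` to the Docker host gateway in that compose file, which Docker supports via the `host-gateway` special value:

```yaml
# Hypothetical addition to the main service in openhands.yaml:
# makes host.docker.internal resolve to the host's gateway IP on Linux.
services:
  openhands:
    extra_hosts:
      - "host.docker.internal:host-gateway"
```

Note this only affects containers defined in the compose file; the runtime containers OpenHands spawns dynamically may not inherit it, which is exactly where the failures above occur.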
## Privacy Considerations

### CLI vs Docker GUI Privacy
| Aspect | CLI Mode | Docker GUI Mode |
|---|---|---|
| Code execution | Runs on host directly | Runs in isolated containers |
| Network isolation | None (host network) | Partial (container network) |
| Data exposure | Full host access | Limited to mounted volumes |
| API calls | Direct to LLM provider | Direct to LLM provider |
Both modes send your code/prompts to the LLM provider (Anthropic, OpenAI, etc.) unless you use a local model.
### What Data Leaves Your Network?
When using cloud LLMs (Claude, GPT-4, etc.):
- Your prompts and task descriptions
- Code snippets you ask it to analyze/modify
- File contents it reads to complete tasks
- Command outputs
To keep everything local, you need a local LLM.
## Running Fully Local (Maximum Privacy)
For complete privacy, run OpenHands with a local LLM. No data leaves your network.
### Option A: Ollama (Easiest)
1. **Install Ollama** (if not already running):

   ```bash
   # On homelab VM or dedicated machine
   curl -fsSL https://ollama.com/install.sh | sh

   # Pull a capable coding model
   ollama pull deepseek-coder-v2:16b

   # Or for more capability (needs ~32GB RAM):
   ollama pull qwen2.5-coder:32b
   ```

2. **Configure OpenHands CLI for Ollama:**

   ```bash
   cat > ~/openhands-local.sh << 'EOF'
   #!/bin/bash
   export PATH=$HOME/.local/bin:$PATH
   export LLM_MODEL=ollama/deepseek-coder-v2:16b
   export LLM_BASE_URL=http://localhost:11434
   export LLM_API_KEY=ollama  # Required but not used

   openhands --override-with-envs "$@"
   EOF
   chmod +x ~/openhands-local.sh
   ```

3. **Run:**

   ```bash
   ~/openhands-local.sh --headless -t "Create a hello world Python script"
   ```
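Since `--override-with-envs` reads its configuration from the environment, a quick way to confirm what the CLI will see is to print the exported variables before running it (a generic sketch; the values are the examples from above):

```shell
# Export the same variables the wrapper script sets, then list them
# so you can verify the configuration OpenHands will pick up.
export LLM_MODEL=ollama/deepseek-coder-v2:16b
export LLM_BASE_URL=http://localhost:11434
export LLM_API_KEY=ollama

env | grep '^LLM_' | sort
```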
### Option B: Use Existing Ollama Stack
If you already have Ollama running (e.g., on Atlantis), point OpenHands to it:
```bash
export LLM_MODEL=ollama/deepseek-coder-v2:16b
export LLM_BASE_URL=http://atlantis.local:11434
export LLM_API_KEY=ollama
```
### Recommended Local Models for Coding
| Model | VRAM Needed | Quality | Speed |
|---|---|---|---|
| `deepseek-coder-v2:16b` | ~12GB | Good | Fast |
| `qwen2.5-coder:32b` | ~24GB | Better | Medium |
| `codellama:34b` | ~26GB | Good | Medium |
| `deepseek-coder:33b` | ~26GB | Better | Slower |
### Option C: Local vLLM/text-generation-inference
For maximum performance with local models:
```yaml
# docker-compose for vLLM
version: '3.8'
services:
  vllm:
    image: vllm/vllm-openai:latest
    runtime: nvidia
    ports:
      - "8000:8000"
    volumes:
      - ~/.cache/huggingface:/root/.cache/huggingface
    command: >
      --model deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct
      --trust-remote-code
      --max-model-len 32768
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```
Then configure OpenHands:

```bash
export LLM_MODEL=openai/deepseek-coder-v2
export LLM_BASE_URL=http://localhost:8000/v1
export LLM_API_KEY=dummy
```
### Privacy Comparison Summary
| Setup | Privacy Level | Performance | Cost |
|---|---|---|---|
| Claude/GPT-4 API | ❌ Low (data sent to cloud) | ⚡ Excellent | 💰 Pay per use |
| Ollama + small model | ✅ High (fully local) | 🐢 Good | 🆓 Free |
| vLLM + large model | ✅ High (fully local) | ⚡ Very Good | 🆓 Free (needs GPU) |
## Troubleshooting

### CLI won't start
```bash
# Ensure PATH includes local bin
export PATH=$HOME/.local/bin:$PATH

# Reinstall if needed
uv tool install openhands --python 3.12 --force
```
### "Headless mode requires existing settings"

Use the `--override-with-envs` flag to bypass the settings requirement:

```bash
openhands --headless --override-with-envs -t "your task"
```
### Local model is slow
- Use a smaller model (7B-16B parameters)
- Ensure you have enough RAM/VRAM
- Consider quantized models (Q4_K_M, Q5_K_M)
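As a rough rule of thumb (an approximation of my own, not from the OpenHands docs), a Q4-quantized model needs on the order of 0.6 bytes per parameter plus a couple of GB of overhead for context, which is roughly where figures like ~12GB for a 16B model come from:

```shell
# Back-of-envelope VRAM estimate for a Q4 quant: ~5/8 byte per parameter
# plus ~2 GB overhead. Integer math, so treat the result as a rough guide.
estimate_q4_gb() {
    params_billion=$1
    echo $(( params_billion * 5 / 8 + 2 ))
}

estimate_q4_gb 16   # prints 12 (GB), in line with the table entry for 16B
estimate_q4_gb 32   # prints 22 (GB)
```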
### Ollama connection refused
```bash
# Check if Ollama is running
systemctl status ollama

# Start it
sudo systemctl start ollama

# Or run manually
ollama serve
```
## Related Services

- Ollama (`Atlantis/ollama/`) - Local LLM inference
- Perplexica (`homelab_vm/perplexica.yaml`) - AI-powered search (docs)