# AnythingLLM

**Local RAG Document Assistant**

## Service Overview

| Property | Value |
|----------|-------|
| **Service Name** | anythingllm |
| **Host** | atlantis |
| **Category** | AI |
| **Docker Image** | `mintplexlabs/anythingllm:latest` |
| **Compose File** | `hosts/synology/atlantis/anythingllm/docker-compose.yml` |
| **Port** | 3101 |
| **URL** | `http://192.168.0.200:3101` |

## Purpose

AnythingLLM is a self-hosted, local-first document assistant powered by RAG (Retrieval-Augmented Generation). It indexes documents into a vector database, then uses a local LLM to answer questions with context from those documents.

Primary use cases:

- Semantic search across all Paperless-NGX documents (355 docs as of 2026-03-15)
- Natural language Q&A over the document library ("find my 2024 property tax assessment")
- Document summarization ("summarize my medical records")

## Architecture

```
AnythingLLM (atlantis:3101)
├── Embedder: built-in all-MiniLM-L6-v2 (CPU, runs locally)
├── Vector DB: built-in LanceDB (no external service)
├── LLM: Olares qwen3-coder:latest (30B MoE, RTX 5090)
│   └── Endpoint: https://a5be22681.vishinator.olares.com/v1
└── Documents: Paperless-NGX archive (mounted read-only)
```

## Configuration

Configuration is done through the web UI on first launch at `http://192.168.0.200:3101`.

### LLM Provider Setup

| Setting | Value |
|---------|-------|
| **Provider** | Generic OpenAI |
| **Base URL** | `https://a5be22681.vishinator.olares.com/v1` |
| **Model** | `qwen3-coder:latest` |
| **Token Limit** | 65536 |
| **API Key** | (leave blank or any string; Olares auth is bypassed for this endpoint) |

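The settings above map onto a standard OpenAI-style chat completions call: with the Generic OpenAI provider, AnythingLLM POSTs to `{Base URL}/chat/completions` using the configured model name. A hedged sketch of such a request body (the messages are illustrative, not what AnythingLLM literally sends):

```json
{
  "model": "qwen3-coder:latest",
  "messages": [
    {"role": "system", "content": "Answer using the retrieved document context."},
    {"role": "user", "content": "Find my 2024 property tax assessment."}
  ]
}
```
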
### Embedding Setup

| Setting | Value |
|---------|-------|
| **Provider** | AnythingLLM (built-in) |
| **Model** | all-MiniLM-L6-v2 |

No external embedding service is needed; the embedder runs on CPU inside the container.

### Vector Database

| Setting | Value |
|---------|-------|
| **Provider** | LanceDB (built-in) |

No external vector DB service is needed; data is stored in the container's storage volume.

## Volumes

| Container Path | Host Path | Purpose |
|----------------|-----------|---------|
| `/app/server/storage` | `/volume2/metadata/docker/anythingllm/storage` | Config, vector DB, user data |
| `/documents/paperless-archive` | `/volume1/archive/paperless/backup_2026-03-15/media/documents/archive` | OCR'd Paperless PDFs (read-only) |
| `/documents/paperless-originals` | `/volume1/archive/paperless/backup_2026-03-15/media/documents/originals` | Original Paperless uploads (read-only) |

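The real compose file lives at `hosts/synology/atlantis/anythingllm/docker-compose.yml`; the following is only a minimal sketch consistent with the image, port, and volume mappings documented above, not a copy of that file (the container-side port 3001 is an assumption based on the image's usual default):

```yaml
services:
  anythingllm:
    image: mintplexlabs/anythingllm:latest
    container_name: anythingllm
    ports:
      - "3101:3001"   # host 3101 -> assumed container default 3001
    volumes:
      - /volume2/metadata/docker/anythingllm/storage:/app/server/storage
      - /volume1/archive/paperless/backup_2026-03-15/media/documents/archive:/documents/paperless-archive:ro
      - /volume1/archive/paperless/backup_2026-03-15/media/documents/originals:/documents/paperless-originals:ro
    restart: unless-stopped
```

The Paperless mounts are read-only (`:ro`) so AnythingLLM can never modify the backup.
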
## Document Import

After initial setup via the UI:

1. Create a workspace (e.g., "Documents")
2. Open the workspace and click the upload/document icon
3. Browse to `/documents/paperless-archive`; these are OCR'd PDFs with searchable text
4. Select all files and embed them into the workspace
5. AnythingLLM will chunk, embed, and index all documents

The archive directory contains 339 OCR'd PDFs; the originals directory has 355 files (including non-PDF formats that Tika processed).

## Paperless-NGX Backup

The documents served to AnythingLLM come from a Paperless-NGX backup taken on 2026-03-15:

| Property | Value |
|----------|-------|
| **Source** | calypso `/volume1/docker/paperlessngx/` |
| **Destination** | atlantis `/volume1/archive/paperless/backup_2026-03-15/` |
| **Size** | 1.6 GB |
| **Documents** | 355 total (339 with OCR archive) |
| **Previous backup** | `/volume1/archive/paperless/paperless_backup_2025-12-03.tar.gz` |

## Dependencies

- **Olares** must be running with `qwen3-coder:latest` loaded. We standardized AnythingLLM and Perplexica on `qwen3-coder` to avoid VRAM swap cycles REDACTED_APP_PASSWORD use `qwen3:32b`. See [AI Integrations](../../admin/ai-integrations.md) for the migration history.
- The Olares endpoint must be accessible from the atlantis LAN (192.168.0.145)
- No dependency on atlantis Ollama (stopped; not needed)

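A quick way to verify the endpoint dependency from any LAN host (a sketch; `/models` is the standard OpenAI-compatible model-listing route, and `curl` availability is assumed):

```shell
# Olares OpenAI-compatible endpoint from the Architecture section.
OLARES_BASE="https://a5be22681.vishinator.olares.com/v1"

# List served models; expect qwen3-coder:latest to appear in the JSON output.
curl -s --max-time 10 "$OLARES_BASE/models" || echo "endpoint unreachable from this host"
```

If the model list comes back but chats still fail, check the pod status per the Troubleshooting table below.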
## Troubleshooting

| Issue | Cause | Fix |
|-------|-------|-----|
| LLM responses fail | Olares `qwen3-coder:latest` not running | Check `ssh olares "sudo kubectl get pods -n ollamaserver-shared"` and scale up if needed |
| Slow embedding | Expected on CPU (Ryzen V1780B) | Initial 355-doc ingestion may take a while; subsequent queries are fast |
| Empty search results | Documents not yet embedded | Check workspace → documents tab and ensure files are uploaded and embedded |
| 502 from Olares endpoint | Model loading / pod restarting | Wait 2-3 min, then check Olares pod status |