# AnythingLLM

Local RAG Document Assistant

## Service Overview
| Property | Value |
|---|---|
| Service Name | anythingllm |
| Host | atlantis |
| Category | AI |
| Docker Image | mintplexlabs/anythingllm:latest |
| Compose File | hosts/synology/atlantis/anythingllm/docker-compose.yml |
| Port | 3101 |
| URL | http://192.168.0.200:3101 |
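The compose file lives at the path in the table above; its exact contents are not reproduced here. A minimal sketch consistent with this page (image, host port, and volume paths are from this document; the internal port 3001, restart policy, and read-only flags are assumptions to verify against the real file):

```yaml
services:
  anythingllm:
    image: mintplexlabs/anythingllm:latest
    container_name: anythingllm
    ports:
      - "3101:3001"   # AnythingLLM serves on 3001 inside the container (assumption)
    volumes:
      - /volume2/metadata/docker/anythingllm/storage:/app/server/storage
      - /volume1/archive/paperless/backup_2026-03-15/media/documents/archive:/documents/paperless-archive:ro
      - /volume1/archive/paperless/backup_2026-03-15/media/documents/originals:/documents/paperless-originals:ro
    restart: unless-stopped   # assumption
```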
## Purpose
AnythingLLM is a self-hosted, local-first document assistant powered by RAG (Retrieval-Augmented Generation). It indexes documents into a vector database, then uses a local LLM to answer questions with context from those documents.
Primary use cases:
- Semantic search across all Paperless-NGX documents (355 docs as of 2026-03-15)
- Natural language Q&A over document library ("find my 2024 property tax assessment")
- Document summarization ("summarize my medical records")
## Architecture

```
AnythingLLM (atlantis:3101)
├── Embedder: built-in all-MiniLM-L6-v2 (CPU, runs locally)
├── Vector DB: built-in LanceDB (no external service)
├── LLM: Olares qwen3:32b (30B, RTX 5090)
│   └── Endpoint: https://a5be22681.vishinator.olares.com/v1
└── Documents: Paperless-NGX archive (mounted read-only)
```
## Configuration

Configuration is done through the web UI on first launch at http://192.168.0.200:3101.

### LLM Provider Setup
| Setting | Value |
|---|---|
| Provider | Generic OpenAI |
| Base URL | https://a5be22681.vishinator.olares.com/v1 |
| Model | qwen3:32b |
| Token Limit | 65536 |
| API Key | (leave blank or any string — Olares auth is bypassed for this endpoint) |
### Embedding Setup
| Setting | Value |
|---|---|
| Provider | AnythingLLM (built-in) |
| Model | all-MiniLM-L6-v2 |
No external embedding service needed. Runs on CPU inside the container.
### Vector Database
| Setting | Value |
|---|---|
| Provider | LanceDB (built-in) |
No external vector DB service needed. Data stored in the container volume.
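The three settings sections above can also be pinned as environment variables so they survive a container rebuild. A sketch of the relevant .env entries, using variable names as they appear in AnythingLLM's .env.example (treat the exact names as assumptions and verify against your image version; the UI remains the authoritative config):

```shell
# LLM: generic OpenAI-compatible endpoint on Olares
LLM_PROVIDER='generic-openai'
GENERIC_OPEN_AI_BASE_PATH='https://a5be22681.vishinator.olares.com/v1'
GENERIC_OPEN_AI_MODEL_PREF='qwen3:32b'
GENERIC_OPEN_AI_MODEL_TOKEN_LIMIT=65536
GENERIC_OPEN_AI_API_KEY='not-needed'   # Olares auth is bypassed for this endpoint

# Embedder and vector DB: built-in, nothing external
EMBEDDING_ENGINE='native'
VECTOR_DB='lancedb'
```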
## Volumes

| Container Path | Host Path | Purpose |
|---|---|---|
| /app/server/storage | /volume2/metadata/docker/anythingllm/storage | Config, vector DB, user data |
| /documents/paperless-archive | /volume1/archive/paperless/backup_2026-03-15/media/documents/archive | OCR'd Paperless PDFs (read-only) |
| /documents/paperless-originals | /volume1/archive/paperless/backup_2026-03-15/media/documents/originals | Original Paperless uploads (read-only) |
## Document Import

After initial setup via the UI:

- Create a workspace (e.g., "Documents")
- Open the workspace and click the upload/document icon
- Browse to /documents/paperless-archive (these are OCR'd PDFs with searchable text)
- Select all files and embed them into the workspace
- AnythingLLM will chunk, embed, and index all documents
The archive directory contains 339 OCR'd PDFs; the originals directory has 355 files (including non-PDF formats that Tika processed).
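Those counts can be sanity-checked against the backup on atlantis before embedding. A small helper (paths come from the Volumes table; the function itself is illustrative):

```shell
# count_files: count regular files under a directory tree.
count_files() {
  find "$1" -type f | wc -l | tr -d ' '
}

# Expected on atlantis (counts from this page):
#   count_files /volume1/archive/paperless/backup_2026-03-15/media/documents/archive    -> 339
#   count_files /volume1/archive/paperless/backup_2026-03-15/media/documents/originals  -> 355
```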
## Paperless-NGX Backup
The documents served to AnythingLLM come from a Paperless-NGX backup taken 2026-03-15:
| Property | Value |
|---|---|
| Source | calypso /volume1/docker/paperlessngx/ |
| Destination | atlantis /volume1/archive/paperless/backup_2026-03-15/ |
| Size | 1.6 GB |
| Documents | 355 total (339 with OCR archive) |
| Previous backup | /volume1/archive/paperless/paperless_backup_2025-12-03.tar.gz |
## Dependencies
- Olares must be running with qwen3:32b loaded (the only model on that box)
- Olares endpoint must be accessible from atlantis LAN (192.168.0.145)
- No dependency on atlantis Ollama (stopped — not needed)
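A quick reachability check for the Olares dependency, runnable from atlantis. The wrapper is a sketch; /v1/models is the standard OpenAI-compatible model-listing route, assumed to exist on this endpoint:

```shell
# check_endpoint: succeed only if the URL answers with a non-error status.
check_endpoint() {
  curl -sf --max-time 10 -o /dev/null "$1"
}

# Confirm Olares answers before digging into AnythingLLM itself:
# check_endpoint https://a5be22681.vishinator.olares.com/v1/models \
#   && echo "Olares LLM endpoint reachable" \
#   || echo "Olares LLM endpoint DOWN"
```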
## Troubleshooting
| Issue | Cause | Fix |
|---|---|---|
| LLM responses fail | Olares qwen3:32b not running | Check: ssh olares "sudo kubectl get pods -n ollamaserver-shared" and scale up if needed |
| Slow embedding | Expected on CPU (Ryzen V1780B) | Initial 355-doc ingestion may take a while; subsequent queries are fast |
| Empty search results | Documents not yet embedded | Check workspace → documents tab, ensure files are uploaded and embedded |
| 502 from Olares endpoint | Model loading / pod restarting | Wait 2-3 min, check Olares pod status |
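For the 502 case, waiting out a model reload can be scripted instead of retried by hand. A sketch (the 2-3 minute window comes from the table above; the retry helper itself is illustrative):

```shell
# retry MAX DELAY CMD...: run CMD until it succeeds, up to MAX attempts,
# sleeping DELAY seconds between tries. Returns non-zero if all attempts fail.
retry() {
  local max=$1 delay=$2
  shift 2
  local i
  for ((i = 1; i <= max; i++)); do
    if "$@"; then
      return 0
    fi
    if [ "$i" -lt "$max" ]; then
      sleep "$delay"
    fi
  done
  return 1
}

# Poll the Olares endpoint for up to ~3 minutes while the pod restarts:
# retry 6 30 curl -sf https://a5be22681.vishinator.olares.com/v1/models > /dev/null
```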