# Perplexica - AI-Powered Search Engine

Perplexica is a self-hosted, AI-powered search engine that combines traditional web search with large language models to provide intelligent, conversational search results.

## Overview

| Setting | Value |
|---------|-------|
| **Host** | Homelab VM (192.168.0.210) |
| **Port** | 4785 |
| **Image** | `itzcrazykns1337/perplexica:latest` |
| **Web UI** | http://192.168.0.210:4785 |
| **Settings** | http://192.168.0.210:4785/settings |
| **Stack File** | `hosts/vms/homelab-vm/perplexica.yaml` |

## Features

- **AI-Powered Search**: Combines web search with LLM reasoning
- **Multiple LLM Support**: OpenAI, Ollama, Anthropic, Gemini, Groq, LM Studio
- **Integrated SearXNG**: Self-hosted search engine for privacy
- **Media Search**: Automatic image and video search
- **Chat History**: Persistent conversation storage
- **Custom System Instructions**: Personalize AI behavior

|
## Current Configuration

### LLM Providers

Perplexica is currently configured with:

1. **Transformers** (Built-in)
   - Embedding models for semantic search
   - No external API needed

2. **Ollama - Atlantis** (Primary)
   - Base URL: `http://192.168.0.200:11434` (local network)
   - Public URL: `https://ollama.vish.gg` (Cloudflare proxy - may time out)
   - Available models:
     - `qwen2.5:3b` - Fast, efficient
     - `qwen2.5:1.5b` - Very fast, lightweight
     - `llama3.2:3b` - Good balance
     - `mistral:7b` - Strong reasoning
     - `codellama:7b` - For code-related searches
     - And 15+ more models

3. **Ollama - Seattle** (Secondary/Backup)
   - Base URL: `http://100.82.197.124:11434` (Tailscale VPN)
   - Hosted on a Contabo VPS (CPU-only inference)
   - Available models:
     - `qwen2.5:1.5b` - Fast, lightweight
   - Purpose: load distribution and redundancy
   - See: `hosts/vms/seattle/README-ollama.md`

### SearXNG Integration

Perplexica includes a built-in SearXNG instance:

- Runs internally on port 8080
- Aggregates results from multiple search engines
- Provides privacy-focused web search
- Starts automatically with the container
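
The bundled instance can be queried directly when debugging search issues. A quick sketch (the query URL is an assumption based on SearXNG defaults; JSON output only works if enabled in SearXNG's settings, plain HTML works regardless):

```shell
# Ask the internal SearXNG (port 8080 inside the container) for results.
# Append "&format=json" only if JSON output is enabled in SearXNG's settings.
docker exec perplexica curl -s "http://localhost:8080/search?q=kubernetes"
```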

## Setup & Deployment

### Docker Compose

```yaml
services:
  perplexica:
    image: itzcrazykns1337/perplexica:latest
    container_name: perplexica
    ports:
      - "4785:3000"
    environment:
      - OLLAMA_BASE_URL=http://192.168.0.200:11434
    volumes:
      - perplexica-data:/home/perplexica/data
    restart: unless-stopped

volumes:
  perplexica-data:
```

**Important:** The `OLLAMA_BASE_URL` environment variable selects which Ollama instance Perplexica talks to; the example above points at the Atlantis instance on the local network.

**Current Configuration (February 2026):**

- Using Seattle Ollama: `http://100.82.197.124:11434` (via Tailscale)
- Distributes LLM inference load to the Contabo VPS
- CPU-only inference (~8-12 tokens/second)
- Zero additional cost (the VPS is already running)
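
Before redeploying after a URL change, it is worth confirming the target endpoint actually answers. A minimal check (swap in whichever base URL `perplexica.yaml` currently uses):

```shell
# Probe the configured Ollama endpoint; /api/version is a cheap, always-on route.
OLLAMA_BASE_URL="http://100.82.197.124:11434"
if curl -fsS --max-time 5 "$OLLAMA_BASE_URL/api/version"; then
  echo "Ollama reachable"
else
  echo "Ollama unreachable - check Tailscale / firewall"
fi
```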

### Recent Fixes

#### January 2026 - Networking Simplification

The configuration was simplified to resolve networking issues:

**Before (❌ Had Issues):**

```yaml
services:
  perplexica:
    extra_hosts:
      - "host.docker.internal:host-gateway"
    network_mode: bridge
```

**After (✅ Working):**

```yaml
services:
  perplexica:
    # Uses the default bridge network
    # No extra_hosts needed
```

**What Changed:**

- Removed the `extra_hosts` entry (not needed for external Ollama access)
- Removed the explicit `network_mode: bridge` (the default is used anyway)
- The simplified networking plays better with container DNS

#### February 2026 - Cloudflare Timeout Fix

Fixed LLM query timeouts by switching to the local Ollama URL:

**Problem:**

- Using `https://ollama.vish.gg` caused Cloudflare 524 timeouts
- LLM queries took longer than Cloudflare's proxy timeout allows
- Searches got stuck in the "answering" state indefinitely

**Solution:**

```yaml
environment:
  - OLLAMA_BASE_URL=http://192.168.0.200:11434
```

**Result:**

- Direct local-network connection to Ollama
- No Cloudflare proxy delays
- Fast, reliable LLM responses
- Searches complete successfully

## Configuration

### Adding LLM Providers

1. Navigate to http://192.168.0.210:4785/settings
2. Click "Model Providers"
3. Add a new provider:

#### Example: Ollama Seattle (Secondary Instance)

```json
{
  "name": "Ollama Seattle",
  "type": "ollama",
  "baseURL": "http://100.82.197.124:11434",
  "apiKey": ""
}
```

Benefits:

- Load distribution across multiple Ollama instances
- Redundancy if the primary Ollama is down
- Access to models hosted on the seattle VM

#### Example: Local LM Studio

```json
{
  "name": "LM Studio",
  "type": "lmstudio",
  "baseURL": "http://100.98.93.15:1234",
  "apiKey": "lm-studio"
}
```

#### Example: OpenAI

```json
{
  "name": "OpenAI",
  "type": "openai",
  "apiKey": "sk-...",
  "baseURL": "https://api.openai.com/v1"
}
```

### Custom System Instructions

Add personalized behavior in Settings → Personalization:

```
Respond in a friendly and concise tone.
Format answers as bullet points when appropriate.
Focus on technical accuracy for programming questions.
```

## Usage

### Basic Search

1. Open http://192.168.0.210:4785
2. Enter your search query
3. Select a search mode:
   - **Web Search** - General internet search
   - **Academic** - Research papers and publications
   - **YouTube** - Video search
   - **Code** - GitHub and programming resources

### Advanced Features

**Auto Media Search:**

- Automatically finds relevant images and videos
- Enable in Settings → Preferences

**Weather Widget:**

- Shows current weather on the homepage
- Toggle in Settings → Preferences

**News Widget:**

- Recent news headlines on the homepage
- Toggle in Settings → Preferences

## Data Persistence

Perplexica stores its data in a Docker volume:

```bash
# Location: perplexica-data volume
/home/perplexica/data/
├── config.json   # App configuration & LLM providers
└── db.sqlite     # Chat history and conversations
```

### Backup

```bash
# Backup perplexica data
docker run --rm -v perplexica-data:/data -v $(pwd):/backup alpine tar czf /backup/perplexica-backup.tar.gz /data

# Restore
docker run --rm -v perplexica-data:/data -v $(pwd):/backup alpine tar xzf /backup/perplexica-backup.tar.gz -C /
```

## API Access

### Configuration API

```bash
# Get current configuration
curl http://192.168.0.210:4785/api/config

# Returns LLM providers, preferences, etc.
```

### Search API

```bash
# Perform a search (requires authentication)
curl -X POST http://192.168.0.210:4785/api/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "what is kubernetes",
    "mode": "web"
  }'
```
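
The same call from Python, sketched with only the standard library. The payload fields mirror the curl example above; the exact request/response schema may vary between Perplexica versions, so treat this as a starting point rather than a definitive client:

```python
import json
from urllib import request

PERPLEXICA_URL = "http://192.168.0.210:4785/api/search"

def build_payload(query: str, mode: str = "web") -> dict:
    """Build the request body used by the curl example above."""
    return {"query": query, "mode": mode}

def search(query: str, mode: str = "web", timeout: int = 120) -> dict:
    """POST a search and return the decoded JSON response."""
    req = request.Request(
        PERPLEXICA_URL,
        data=json.dumps(build_payload(query, mode)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())
```

The generous timeout is deliberate: LLM answers can take well over 100 seconds, which is the same behavior that caused the Cloudflare 524s described above.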

## Monitoring

### Container Status

```bash
# Check if running
docker ps | grep perplexica

# View logs
docker logs perplexica

# Follow logs
docker logs -f perplexica
```

### Health Check

```bash
# Test HTTP response
curl -I http://192.168.0.210:4785

# Expected: HTTP/1.1 200 OK
```

## Troubleshooting

### SearXNG Errors

If you see `ERROR:searx.engines` in the logs:

```bash
# Check the internal SearXNG
docker exec perplexica curl http://localhost:8080

# These errors are normal and non-critical:
# - "loading engine ahmia failed" (Tor engine)
# - "loading engine torch failed" (Tor engine)
# - "X-Forwarded-For nor X-Real-IP header is set"
```

### LLM Connection Issues

**Problem:** "Failed to connect to LLM provider"

**Solution:**

1. Verify the provider URL is reachable from inside the container
2. Check that the API key is correct
3. For Ollama, ensure the models are pulled:

```bash
curl https://ollama.vish.gg/api/tags
```

### Container Won't Start

```bash
# Check logs for errors
docker logs perplexica

# Common issues:
# - Port 4785 already in use
# - Volume mount permissions
# - Database corruption (delete and recreate the volume)
```

### Database Issues

If the chat history is corrupted:

```bash
# Stop the container
docker stop perplexica

# Reset the database (back it up first if you want to keep anything)
docker run --rm -v perplexica-data:/data alpine rm /data/db.sqlite

# Restart (a new database is created automatically)
docker start perplexica
```

## Privacy Considerations

### What Data Leaves Your Network?

When using external LLM APIs (OpenAI, Anthropic, etc.), the following leaves your network:

- Your search queries
- Chat history sent for context
- Search results fed to the LLM

### Keeping Everything Local

For maximum privacy, use local models:

1. **Use Ollama** (as currently configured)
   - Both Ollama instances are self-hosted; even `https://ollama.vish.gg` just proxies back to your own hardware (use `http://192.168.0.200:11434` to keep traffic entirely on the LAN)
   - No data is sent to external APIs
   - All processing happens on your hardware

2. **Use LM Studio** (Tailscale network)
   - `http://100.98.93.15:1234/v1`
   - Local inference
   - Private to your network

## Performance Tips

1. **Choose an appropriate model:**
   - `qwen2.5:1.5b` - fastest, for basic queries
   - `qwen2.5:3b` - good balance
   - `mistral:7b` - better quality, slower

2. **Disable auto media search** if you don't need images/videos

3. **Use SearXNG directly** for simple searches (bypasses the AI entirely)

4. **Limit the context window** in system instructions to reduce token usage
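
To compare models empirically rather than by rule of thumb, Ollama's `/api/generate` response reports `eval_count` (tokens generated) and `eval_duration` (nanoseconds), from which throughput follows directly. A small helper, assuming those two standard response fields:

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Throughput from Ollama's /api/generate metrics
    (eval_duration is reported in nanoseconds)."""
    return eval_count / (eval_duration_ns / 1e9)

# e.g. 120 tokens generated over 10 seconds of eval time -> 12.0 tokens/s,
# in line with the ~8-12 tokens/second seen on the CPU-only Seattle instance
```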

## Integration Ideas

### Home Assistant

Create a custom command to search via Perplexica:

```yaml
# In Home Assistant configuration.yaml
shell_command:
  perplexica_search: 'curl -X POST http://192.168.0.210:4785/api/search -H "Content-Type: application/json" -d "{\"query\": \"{{ query }}\"}"'
```

### Alfred/Raycast (macOS)

Create a workflow to search directly from your launcher.

### Custom Dashboard

Embed the search interface in your homelab dashboard:

```html
<iframe src="http://192.168.0.210:4785" width="100%" height="800px"></iframe>
```

## Updates

### Manual Update

```bash
# Pull the latest image
docker pull itzcrazykns1337/perplexica:latest

# Recreate the container (normally GitOps handles this)
docker compose -f hosts/vms/homelab-vm/perplexica.yaml up -d
```

### Automatic Updates

Managed via GitOps + Watchtower:

- GitOps polls the repo every 5 minutes
- Watchtower updates `:latest` images automatically
- No manual intervention needed
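
For reference, a minimal Watchtower service definition. This is a sketch only, since the actual instance lives in its own stack file and its flags may differ:

```yaml
services:
  watchtower:
    image: containrrr/watchtower:latest
    container_name: watchtower
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    command: --cleanup --interval 300  # check every 5 minutes, prune old images
    restart: unless-stopped
```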

## Related Services

- **Ollama** (`Atlantis/ollama/`) - Local LLM inference
- **OpenHands** (`homelab_vm/openhands.yaml`) - AI coding agent
- **Redlib** (`homelab_vm/redlib.yaml`) - Reddit privacy frontend
- **SearXNG** (built into Perplexica) - Privacy-focused search

## References

- [Perplexica GitHub](https://github.com/ItzCrazyKns/Perplexica)
- [SearXNG Documentation](https://docs.searxng.org/)
- [Ollama Models](https://ollama.com/library)

---

**Status:** ✅ Fully operational

**Last Updated:** February 2026

**Maintained By:** GitOps (Portainer)