Sanitized mirror from private repository - 2026-04-18 11:19:59 UTC
docs/guides/PERPLEXICA_TROUBLESHOOTING.md

# Perplexica Performance Troubleshooting

## Issue Summary

Perplexica search queries were taking roughly 10 minutes each with CPU-based Ollama inference on the Seattle VM.

## Timeline of Solutions Attempted

### 1. Initial Setup: Seattle Ollama with Qwen2.5:1.5b

- **Result**: ~10 minutes per search query
- **Problem**: CPU inference was too slow; Seattle's load average hit 9.82, with Ollama using 937% CPU
- **Metrics**:
  - Chat requests: 16-28 seconds each
  - Generate requests: 2+ minutes each

### 2. Switched to TinyLlama:1.1b

- **Model size**: 608MB (vs. 940MB for Qwen2.5)
- **Speed**: ~12 seconds per response
- **Improvement**: roughly 50x faster than Qwen2.5
- **Quality**: noticeably lower-quality responses
- **Status**: works, but still slow

### 3. Switched to Groq API (Current)

- **Model**: llama-3.3-70b-versatile
- **Speed**: ~0.4 seconds per response
- **Quality**: excellent (70B model)
- **Cost**: free tier (30 req/min, 14,400 req/day)
- **Status**: configured, but the user reports it is not working

## Current Configuration

### Perplexica Config (`config.json`)

```json
{
  "version": 1,
  "setupComplete": true,
  "modelProviders": [
    {
      "id": "groq-provider",
      "name": "Groq",
      "type": "openai",
      "config": {
        "baseURL": "https://api.groq.com/openai/v1",
        "apiKey": "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"
      },
      "chatModels": [
        {
          "name": "llama-3.3-70b-versatile",
          "key": "llama-3.3-70b-versatile"
        }
      ]
    },
    {
      "id": "seattle-ollama",
      "name": "Seattle Ollama",
      "type": "ollama",
      "config": {
        "baseURL": "http://100.82.197.124:11434"
      },
      "chatModels": [
        {
          "name": "tinyllama:1.1b",
          "key": "tinyllama:1.1b"
        }
      ],
      "embeddingModels": [
        {
          "name": "nomic-embed-text:latest",
          "key": "nomic-embed-text:latest"
        }
      ]
    }
  ],
  "REDACTED_APP_PASSWORD": "llama-3.3-70b-versatile",
  "defaultEmbeddingModel": "nomic-embed-text:latest"
}
```
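
Before loading a new `config.json`, it can save a restart cycle to sanity-check the file with `jq`. A minimal sketch, assuming `jq` is available and the candidate file is staged at `/tmp/config.json` (the trimmed sample written below is illustrative, not the full config):

```bash
# Stage a trimmed sample config, then verify syntax and shape with jq.
# In practice you would point these checks at your real staged config.json.
cat > /tmp/config.json <<'EOF'
{
  "modelProviders": [
    {
      "id": "groq-provider",
      "type": "openai",
      "config": { "baseURL": "https://api.groq.com/openai/v1" }
    }
  ],
  "defaultEmbeddingModel": "nomic-embed-text:latest"
}
EOF

# jq -e exits non-zero on invalid JSON or a null/false result,
# so this catches both syntax errors and a missing provider entry.
jq -e '.modelProviders[] | select(.id == "groq-provider") | .config.baseURL' /tmp/config.json \
  && echo "config OK"
```

If the provider entry is missing or the JSON is malformed, the `jq -e` step fails and "config OK" is never printed.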

### Seattle Ollama Models

```bash
ssh seattle "docker exec ollama-seattle ollama list"
```

Available models:

- `tinyllama:1.1b` (608MB) - fast CPU inference
- `qwen2.5:1.5b` (940MB) - slow, but better quality
- `nomic-embed-text:latest` (261MB) - for embeddings

## Performance Comparison

| Configuration | Chat Speed | Quality | Notes |
|--------------|------------|---------|-------|
| Qwen2.5 1.5B (Seattle CPU) | 10 minutes | Good | CPU overload, unusable |
| TinyLlama 1.1B (Seattle CPU) | 12 seconds | Basic | Usable but slow |
| Llama 3.3 70B (Groq API) | 0.4 seconds | Excellent | Best option |
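
As a sanity check, the speedup factors implied by the table can be reproduced with shell arithmetic (the figures are taken from the table above; integer math is done in milliseconds):

```bash
# Rough speedup ratios from the measurements above.
qwen_ms=600000   # ~10 minutes per response
tiny_ms=12000    # ~12 seconds per response
groq_ms=400      # ~0.4 seconds per response

echo "TinyLlama vs Qwen2.5: $(( qwen_ms / tiny_ms ))x"   # 50x
echo "Groq vs TinyLlama:    $(( tiny_ms / groq_ms ))x"   # 30x
echo "Groq vs Qwen2.5:      $(( qwen_ms / groq_ms ))x"   # 1500x
```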

## Common Issues

### Issue: "nomic-embed-text:latest does not support chat"

- **Cause**: the config lists an embedding model as a chat model
- **Fix**: ensure embedding models appear only in the `embeddingModels` array
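
This misconfiguration can be caught mechanically by scanning the config for chat-model keys that look like embedding models. A rough `jq` sketch (the `embed` substring heuristic is an assumption, not an official Perplexica rule, and the sample config below is illustrative):

```bash
# Write a small sample config so the check is self-contained.
cat > /tmp/check-config.json <<'EOF'
{
  "modelProviders": [
    {
      "id": "seattle-ollama",
      "chatModels": [
        { "key": "tinyllama:1.1b" },
        { "key": "nomic-embed-text:latest" }
      ],
      "embeddingModels": [ { "key": "nomic-embed-text:latest" } ]
    }
  ]
}
EOF

# Any output here means an embedding-style model is listed under chatModels.
jq -r '.modelProviders[].chatModels[]?.key | select(test("embed"))' /tmp/check-config.json
```

With this sample, the filter prints `nomic-embed-text:latest`, flagging the bad entry; a clean config produces no output.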

### Issue: Browser shows old model selections

- **Cause**: browser cache
- **Fix**: hard-refresh (Ctrl+F5) and close all open Perplexica tabs

### Issue: Database retains old conversations

- **Fix**: clear the database:

```bash
docker run --rm -v perplexica-data:/data alpine rm -f /data/db.sqlite
docker restart perplexica
```

### Issue: Config reverts after restart

- **Cause**: the config lives in a Docker volume, not in a git-tracked file
- **Fix**: update the config inside the volume:

```bash
docker run --rm -v perplexica-data:/data -v /tmp:/tmp alpine cp /tmp/config.json /data/config.json
```
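
After copying, it's worth confirming the file inside the volume matches what was staged. A sketch of the check using `diff`, with local directories standing in for the staging area and the volume mount (paths are illustrative; against the real volume you would run `diff` via the same `alpine` container):

```bash
# Local directories simulate /tmp staging and the perplexica-data volume,
# so the check itself is reproducible without Docker.
staging=/tmp/cfg-staging
volume=/tmp/cfg-volume
mkdir -p "$staging" "$volume"

echo '{"version": 1}' > "$staging/config.json"
cp "$staging/config.json" "$volume/config.json"

# diff exits 0 only when the two copies are byte-identical.
diff "$staging/config.json" "$volume/config.json" && echo "config in sync"
```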

## Testing

### Test SearXNG (from inside the container)

```bash
docker exec perplexica curl -s "http://localhost:8080/search?q=test&format=json" | jq '.results | length'
```

### Test Seattle Ollama

```bash
curl -s http://100.82.197.124:11434/api/tags | jq '.models[].name'
```

### Test Groq API

```bash
curl -s https://api.groq.com/openai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b-versatile",
    "messages": [{"role": "user", "content": "Test"}],
    "max_tokens": 50
  }' | jq -r '.choices[0].message.content'
```
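
Since the free tier is capped at 30 requests/minute, scripted use of this endpoint should tolerate HTTP 429 responses. A minimal retry wrapper with exponential backoff, sketched around a placeholder command (the try count and delay schedule are arbitrary choices; in practice you would wrap the `curl` call above, run with `--fail` so a 429 yields a non-zero exit):

```bash
# Retry a command with exponential backoff, for transient failures such as
# Groq rate-limit (429) responses when curl is run with --fail.
retry() {
  local max_tries=5 delay=1 n
  for (( n = 1; n <= max_tries; n++ )); do
    "$@" && return 0
    echo "attempt $n failed; sleeping ${delay}s" >&2
    sleep "$delay"
    delay=$(( delay * 2 ))
  done
  return 1
}

# Placeholder stands in for the real call, e.g.:
#   retry curl -s --fail https://api.groq.com/openai/v1/chat/completions ...
retry true && echo "request succeeded"
```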

### Check Perplexica Config

```bash
docker run --rm -v perplexica-data:/data alpine cat /data/config.json | jq .
```

## Recommendations

1. **Use Groq for chat** (~0.4s response time, excellent quality)
2. **Use Seattle Ollama for embeddings** (`nomic-embed-text:latest`)
3. **Keep TinyLlama as a fallback** (in case Groq rate limits are hit)
4. **Monitor Groq rate limits** (30 req/min on the free tier)

## Alternative Solutions

If Groq doesn't work:

1. **OpenRouter API**: similar to Groq, with multiple models available
2. **Anthropic Claude**: via API (paid)
3. **Local GPU**: move Ollama to a GPU-enabled host
4. **Accept slow performance**: use TinyLlama with ~12s responses

## Status

- ✅ Groq API key configured
- ✅ Groq API responding in ~0.4s
- ✅ Config updated in Perplexica
- ❌ User reports the web UI still isn't working (needs investigation)

## Next Steps

1. Test from the web UI and capture the exact error message
2. Check the browser console for JavaScript errors
3. Check Perplexica logs during a failed search
4. Verify the Groq API calls in the browser's network tab
5. Consider switching to a different LLM provider if Groq proves incompatible