# Reactive Resume v5 - AI Model Configuration Guide

## 🤖 Current AI Setup

### Ollama Configuration
- **Model**: `llama3.2:3b`
- **Provider**: `ollama`
- **Endpoint**: `http://ollama:11434` (internal)
- **External API**: `http://192.168.0.250:11434`

## 📋 Model Details for Reactive Resume v5

### Environment Variables
Add these to the `environment` section of your `docker-compose.yml`:

```yaml
environment:
  # AI Integration (Ollama) - v5 uses the OpenAI-compatible API
  OPENAI_API_KEY: "ollama"                    # Dummy key; local Ollama ignores it
  OPENAI_BASE_URL: "http://ollama:11434/v1"   # Ollama's OpenAI-compatible endpoint
  OPENAI_MODEL: "llama3.2:3b"                 # Model name
```
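Since these variables point the app at Ollama's OpenAI-compatible `/v1` route rather than the native API, it is worth confirming that route specifically responds. A minimal sketch (the address assumes the external API above; `/v1/models` is Ollama's OpenAI-style model-list endpoint):

```shell
# Hedged reachability check for the /v1 route v5 actually uses.
# Prints the model list, or a notice when the NAS is unreachable from here.
BASE_URL="http://192.168.0.250:11434/v1"
OUT=$(curl -s --max-time 5 "${BASE_URL}/models" || echo "Ollama not reachable at ${BASE_URL}")
echo "${OUT}"
```

If this fails while the native `/api/generate` test below succeeds, the problem is the OpenAI-compatibility layer or the `OPENAI_BASE_URL` value, not Ollama itself.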

### Model Specifications

#### llama3.2:3b
- **Size**: ~2GB download
- **Parameters**: 3 billion
- **Context Length**: 128K tokens (Ollama loads a much smaller window by default)
- **Use Case**: General text generation, resume assistance
- **Performance**: Fast inference on CPU
- **Memory**: ~4GB RAM during inference

## 🔧 Alternative Models

If you want to use a different model, here are some recommended options:

### Lightweight Options (< 4GB RAM)
```yaml
# Fastest, smallest
OLLAMA_MODEL: "llama3.2:1b"   # ~1GB, very fast

# Balanced performance
OLLAMA_MODEL: "llama3.2:3b"   # ~2GB, good quality (current)

# Better quality, still reasonable
OLLAMA_MODEL: "qwen2.5:3b"    # ~2GB, good for professional text
```

### High-Quality Options (8GB+ RAM)
```yaml
# Better reasoning (llama3.2 tops out at 3b for text; llama3.1 covers the 8b tier)
OLLAMA_MODEL: "llama3.1:8b"   # ~4.7GB, higher quality

# Excellent for professional content
OLLAMA_MODEL: "qwen2.5:7b"    # ~4GB, great for business writing

# Best quality (if you have the resources)
OLLAMA_MODEL: "qwen2.5:14b"   # ~9GB, excellent quality
```

### Specialized Models
```yaml
# Code-focused (good for tech resumes)
OLLAMA_MODEL: "codellama:7b"  # ~4GB, code-aware

# Instruction-following
OLLAMA_MODEL: "mistral:7b"    # ~4GB, good at following prompts
```

## 🚀 Model Management Commands

### Pull New Models
```bash
# Pull a different model
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker exec Resume-OLLAMA-V5 ollama pull qwen2.5:3b"

# List available models
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker exec Resume-OLLAMA-V5 ollama list"

# Remove unused models
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker exec Resume-OLLAMA-V5 ollama rm llama3.2:1b"
```

### Change Active Model
1. Update `OLLAMA_MODEL` in `docker-compose.yml`
2. Redeploy: `./deploy.sh restart`
3. Pull the new model if needed: `./deploy.sh setup-ollama`
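Step 1 can be scripted. A minimal sketch, run here against a throwaway copy so it is safe anywhere; point the `sed` at the real `docker-compose.yml` on Calypso when switching for real (the key name is taken from the steps above):

```shell
# Rewrite the OLLAMA_MODEL value in a compose file (throwaway copy for demo).
cat > /tmp/compose-snippet.yml <<'EOF'
environment:
  OLLAMA_MODEL: "llama3.2:3b"
EOF
NEW_MODEL="qwen2.5:3b"
sed -i "s/OLLAMA_MODEL: \"[^\"]*\"/OLLAMA_MODEL: \"${NEW_MODEL}\"/" /tmp/compose-snippet.yml
grep OLLAMA_MODEL /tmp/compose-snippet.yml
```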

## 🧪 Testing AI Features

### Direct API Test
```bash
# Test the AI API directly
curl -X POST http://192.168.0.250:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2:3b",
    "prompt": "Write a professional summary for a software engineer with 5 years experience in Python and React",
    "stream": false
  }'
```

### Expected Response
```json
{
  "model": "llama3.2:3b",
  "created_at": "2026-02-16T10:00:00.000Z",
  "response": "Experienced Software Engineer with 5+ years of expertise in full-stack development using Python and React. Proven track record of building scalable web applications...",
  "done": true
}
```

## 🎯 AI Features in Reactive Resume v5

### 1. Resume Content Suggestions
- **Trigger**: Click the "AI Assist" button in any text field
- **Function**: Suggests professional content based on context
- **Model Usage**: Generates 2-3 sentence suggestions

### 2. Job Description Analysis
- **Trigger**: Paste a job description into the "Job Match" feature
- **Function**: Analyzes requirements and suggests skill additions
- **Model Usage**: Extracts key requirements and matches them to your profile

### 3. Skills Optimization
- **Trigger**: "Optimize Skills" button in the Skills section
- **Function**: Suggests relevant skills based on experience
- **Model Usage**: Analyzes work history and recommends skills

### 4. Cover Letter Generation
- **Trigger**: "Generate Cover Letter" in the Documents section
- **Function**: Creates a personalized cover letter
- **Model Usage**: Combines resume data with the job description to generate the letter

## 📊 Performance Tuning

### Model Performance Comparison
| Model | Size | Speed | Quality | RAM Usage | Best For |
|-------|------|-------|---------|-----------|----------|
| llama3.2:1b | 1GB | Very Fast | Good | 2GB | Quick suggestions |
| llama3.2:3b | 2GB | Fast | Very Good | 4GB | **Recommended** |
| qwen2.5:3b | 2GB | Fast | Very Good | 4GB | Professional content |
| llama3.1:8b | 4.7GB | Medium | Excellent | 8GB | High quality |
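The sizes in the table follow a rough rule of thumb rather than anything Ollama publishes: a Q4-quantized model weighs about 0.6 bytes per parameter on disk, and inference needs roughly twice the file size in RAM. Both multipliers are approximations for back-of-envelope planning only:

```shell
# Estimate download size and RAM need from parameter count (rough heuristic).
params_billion=3
file_gb=$(awk "BEGIN {printf \"%.1f\", ${params_billion} * 0.6}")
ram_gb=$(awk "BEGIN {printf \"%.0f\", ${file_gb} * 2}")
echo "~${file_gb}GB download, ~${ram_gb}GB RAM"   # → ~1.8GB download, ~4GB RAM
```

For 3 billion parameters this lands close to the table's 2GB download / 4GB RAM figures for `llama3.2:3b`.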

### Optimization Settings
```yaml
# In docker-compose.yml, for the Ollama service
environment:
  OLLAMA_HOST: "0.0.0.0"
  OLLAMA_KEEP_ALIVE: "5m"         # Keep the model loaded for 5 minutes
  OLLAMA_MAX_LOADED_MODELS: "1"   # Only keep one model in memory
  OLLAMA_NUM_PARALLEL: "1"        # Number of parallel requests
```

## 🔍 Troubleshooting AI Issues

### Model Not Loading
```bash
# Check if the model exists
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker exec Resume-OLLAMA-V5 ollama list"

# Pull the model manually
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker exec Resume-OLLAMA-V5 ollama pull llama3.2:3b"

# Check Ollama logs
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker logs Resume-OLLAMA-V5"
```

### Slow AI Responses
1. **Check CPU usage**: run `htop` on Calypso
2. **Reduce model size**: switch to `llama3.2:1b`
3. **Increase keep-alive**: set `OLLAMA_KEEP_ALIVE: "30m"` so the model stays resident between requests

### AI Features Not Appearing in UI
1. **Check environment variables**: ensure `AI_PROVIDER=ollama` is set
2. **Verify connectivity**: test the API endpoint from the app container
3. **Check app logs**: look for AI-related errors

### Memory Issues
```bash
# Check memory usage
ssh Vish@192.168.0.250 -p 62000 "free -h"
```

If memory is low, switch to a smaller model in `docker-compose.yml`:
```yaml
OLLAMA_MODEL: "llama3.2:1b"   # Uses ~2GB instead of ~4GB
```

## 🔄 Model Updates

### Updating to Newer Models
1. **Check available models**: https://ollama.ai/library
2. **Pull the new model**: `ollama pull model-name`
3. **Update the compose file**: change the `OLLAMA_MODEL` value
4. **Restart services**: `./deploy.sh restart`

### Model Versioning
```yaml
# Pin to a specific quantization build
OLLAMA_MODEL: "llama3.2:3b-q4_0"   # Specific quantization

# Track the default tag (picks up changes when re-pulled)
OLLAMA_MODEL: "llama3.2:3b"        # Latest build of this tag
```

## 📈 Monitoring AI Performance

### Metrics to Watch
- **Response Time**: should be < 10s for most prompts
- **Memory Usage**: monitor RAM consumption
- **Model Load Time**: the first request after an idle period takes longer while the model reloads
- **Error Rate**: check for failed AI requests

### Performance Commands
```bash
# Check AI API health
curl http://192.168.0.250:11434/api/tags

# Monitor resource usage
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker stats Resume-OLLAMA-V5"

# Check AI request logs
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker logs Resume-ACCESS-V5 | grep -i ollama"
```
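The `/api/tags` health check returns JSON that is hard to eyeball. A small sketch to turn it into readable lines; the sample payload below mirrors the shape of Ollama's `/api/tags` response so the snippet runs anywhere, and on Calypso you would save the real `curl` output to the same path instead:

```shell
# Sample /api/tags payload; replace with real output via:
#   curl -s http://192.168.0.250:11434/api/tags > /tmp/tags-sample.json
cat > /tmp/tags-sample.json <<'EOF'
{"models": [{"name": "llama3.2:3b", "size": 2019393189}]}
EOF

# Print each installed model with its approximate size in GB.
python3 -c '
import json
for m in json.load(open("/tmp/tags-sample.json"))["models"]:
    print(m["name"], round(m["size"] / 1e9, 1), "GB")
'   # → llama3.2:3b 2.0 GB
```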

---

**Current Configuration**: llama3.2:3b (Recommended)
**Last Updated**: 2026-02-16
**Performance**: ✅ Optimized for Calypso hardware