# Reactive Resume v5 - AI Model Configuration Guide

## 🤖 Current AI Setup

### Ollama Configuration

- **Model**: `llama3.2:3b`
- **Provider**: `ollama`
- **Endpoint**: `http://ollama:11434` (internal)
- **External API**: `http://192.168.0.250:11434`

## 📋 Model Details for Reactive Resume v5

### Environment Variables

Add these to your `docker-compose.yml` environment section:

```yaml
environment:
  # AI Integration (Ollama) - v5 uses the OpenAI-compatible API
  OPENAI_API_KEY: "ollama"                   # Dummy key for local Ollama
  OPENAI_BASE_URL: "http://ollama:11434/v1"  # Ollama OpenAI-compatible endpoint
  OPENAI_MODEL: "llama3.2:3b"                # Model name
```

### Model Specifications

#### llama3.2:3b

- **Size**: ~2GB download
- **Parameters**: 3 billion
- **Context Length**: up to 128K tokens (Ollama serves a smaller window by default)
- **Use Case**: General text generation, resume assistance
- **Performance**: Fast inference on CPU
- **Memory**: ~4GB RAM during inference

## 🔧 Alternative Models

If you want to use a different model, here are recommended options:

### Lightweight Options (< 4GB RAM)

```yaml
# Fastest, smallest
OPENAI_MODEL: "llama3.2:1b"  # ~1GB, very fast

# Balanced performance
OPENAI_MODEL: "llama3.2:3b"  # ~2GB, good quality (current)

# Better quality, still reasonable
OPENAI_MODEL: "qwen2.5:3b"   # ~2GB, good for professional text
```

### High-Quality Options (8GB+ RAM)

```yaml
# Better reasoning (Llama 3.2 ships only 1b/3b text models, so step up to 3.1)
OPENAI_MODEL: "llama3.1:8b"  # ~5GB, higher quality

# Excellent for professional content
OPENAI_MODEL: "qwen2.5:7b"   # ~4GB, great for business writing

# Best quality (if you have the resources; needs ~12GB RAM)
OPENAI_MODEL: "qwen2.5:14b"  # ~9GB, excellent quality
```

### Specialized Models

```yaml
# Code-focused (good for tech resumes)
OPENAI_MODEL: "codellama:7b"  # ~4GB, code-aware

# Instruction-following
OPENAI_MODEL: "mistral:7b"    # ~4GB, good at following prompts
```

## 🚀 Model Management Commands

### Pull New Models

```bash
# Pull a different model
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker exec Resume-OLLAMA-V5 ollama pull \
  qwen2.5:3b"

# List available models
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker exec Resume-OLLAMA-V5 ollama list"

# Remove unused models
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker exec Resume-OLLAMA-V5 ollama rm llama3.2:1b"
```

### Change Active Model

1. Update `OPENAI_MODEL` in `docker-compose.yml`
2. Redeploy: `./deploy.sh restart`
3. Pull the new model if needed: `./deploy.sh setup-ollama`

## 🧪 Testing AI Features

### Direct API Test

```bash
# Test the AI API directly
curl -X POST http://192.168.0.250:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2:3b",
    "prompt": "Write a professional summary for a software engineer with 5 years experience in Python and React",
    "stream": false
  }'
```

### Expected Response

```json
{
  "model": "llama3.2:3b",
  "created_at": "2026-02-16T10:00:00.000Z",
  "response": "Experienced Software Engineer with 5+ years of expertise in full-stack development using Python and React. Proven track record of building scalable web applications...",
  "done": true
}
```

## 🎯 AI Features in Reactive Resume v5

### 1. Resume Content Suggestions

- **Trigger**: Click the "AI Assist" button in any text field
- **Function**: Suggests professional content based on context
- **Model Usage**: Generates 2-3 sentence suggestions

### 2. Job Description Analysis

- **Trigger**: Paste a job description into the "Job Match" feature
- **Function**: Analyzes requirements and suggests skill additions
- **Model Usage**: Extracts key requirements and matches them to your profile

### 3. Skills Optimization

- **Trigger**: "Optimize Skills" button in the Skills section
- **Function**: Suggests relevant skills based on experience
- **Model Usage**: Analyzes work history and recommends skills
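Each of these features ultimately boils down to a chat-completion request against the OpenAI-compatible endpoint configured earlier. The sketch below builds a hypothetical skills-suggestion payload (the prompt text is illustrative, not Reactive Resume's internal prompt) and leaves the live `curl` call commented out so the request body can be inspected first:

```bash
# Hypothetical skills-suggestion request via the OpenAI-compatible API.
# The prompt below is illustrative; the app's real prompts are internal.
URL="http://192.168.0.250:11434/v1/chat/completions"

payload=$(cat <<'EOF'
{
  "model": "llama3.2:3b",
  "messages": [
    {"role": "user", "content": "Suggest five skills to list on the resume of a Python and React engineer with 5 years of experience."}
  ]
}
EOF
)

echo "$payload"  # inspect the request body

# Uncomment to send it to the live endpoint:
# curl -s -X POST "$URL" -H "Content-Type: application/json" -d "$payload"
```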
### 4. Cover Letter Generation

- **Trigger**: "Generate Cover Letter" in the Documents section
- **Function**: Creates a personalized cover letter
- **Model Usage**: Combines resume data with the job description to generate the letter

## 📊 Performance Tuning

### Model Performance Comparison

| Model | Size | Speed | Quality | RAM Usage | Best For |
|-------|------|-------|---------|-----------|----------|
| llama3.2:1b | 1GB | Very fast | Good | 2GB | Quick suggestions |
| llama3.2:3b | 2GB | Fast | Very good | 4GB | **Recommended** |
| qwen2.5:3b | 2GB | Fast | Very good | 4GB | Professional content |
| llama3.1:8b | 5GB | Medium | Excellent | 8GB | High quality |

### Optimization Settings

```yaml
# In docker-compose.yml for the Ollama service
environment:
  OLLAMA_HOST: "0.0.0.0"
  OLLAMA_KEEP_ALIVE: "5m"        # Keep the model loaded for 5 minutes
  OLLAMA_MAX_LOADED_MODELS: "1"  # Only keep one model in memory
  OLLAMA_NUM_PARALLEL: "1"       # Number of parallel requests
```

## 🔍 Troubleshooting AI Issues

### Model Not Loading

```bash
# Check if the model exists
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker exec Resume-OLLAMA-V5 ollama list"

# Pull the model manually
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker exec Resume-OLLAMA-V5 ollama pull llama3.2:3b"

# Check Ollama logs
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker logs Resume-OLLAMA-V5"
```

### Slow AI Responses

1. **Check CPU usage**: `htop` on Calypso
2. **Reduce model size**: Switch to `llama3.2:1b`
3. **Increase keep-alive**: Set `OLLAMA_KEEP_ALIVE: "30m"`

### AI Features Not Appearing in UI

1. **Check environment variables**: Ensure the `OPENAI_*` variables from this guide are set
2. **Verify connectivity**: Test the API endpoint from the app container
3. **Check app logs**: Look for AI-related errors

### Memory Issues

```bash
# Check memory usage
ssh Vish@192.168.0.250 -p 62000 "free -h"
```

If memory is low, switch to a smaller model:

```yaml
OPENAI_MODEL: "llama3.2:1b"  # Uses ~2GB instead of 4GB
```

## 🔄 Model Updates

### Updating to Newer Models
1. **Check available models**: https://ollama.ai/library
2. **Pull the new model**: `ollama pull model-name`
3. **Update the compose file**: Change the `OPENAI_MODEL` value
4. **Restart services**: `./deploy.sh restart`

### Model Versioning

```yaml
# Pin to a specific quantization
OPENAI_MODEL: "llama3.2:3b-q4_0"

# Use the default tag (picks up the latest build when re-pulled)
OPENAI_MODEL: "llama3.2:3b"
```

## 📈 Monitoring AI Performance

### Metrics to Watch

- **Response time**: Should be under 10s for most prompts
- **Memory usage**: Monitor RAM consumption
- **Model load time**: The first request after idle takes longer
- **Error rate**: Check for failed AI requests

### Performance Commands

```bash
# Check AI API health
curl http://192.168.0.250:11434/api/tags

# Monitor resource usage
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker stats Resume-OLLAMA-V5"

# Check AI request logs
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker logs Resume-ACCESS-V5 | grep -i ollama"
```

---

**Current Configuration**: llama3.2:3b (Recommended)

**Last Updated**: 2026-02-16

**Performance**: ✅ Optimized for Calypso hardware
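The `/api/tags` health check can also be wrapped in a small helper that verifies the configured model has actually been pulled. A sketch: the `model_loaded` function name and the sample payload below are illustrative; in production you would pipe in live output from `curl -s http://192.168.0.250:11434/api/tags`.

```bash
# model_loaded: reads an Ollama /api/tags JSON response on stdin and reports
# whether the given model name appears in it.
model_loaded() {
  local model="$1"
  # /api/tags returns {"models":[{"name":"llama3.2:3b",...},...]}
  if grep -q "\"name\":\"${model}\""; then
    echo "OK: ${model} is available"
  else
    echo "MISSING: ${model} has not been pulled"
  fi
}

# Offline demo with a sample payload.
# Live use: curl -s http://192.168.0.250:11434/api/tags | model_loaded llama3.2:3b
sample='{"models":[{"name":"llama3.2:3b","size":2019393189}]}'
echo "$sample" | model_loaded "llama3.2:3b"  # prints: OK: llama3.2:3b is available
```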