Reactive Resume v5 - AI Model Configuration Guide

🤖 Current AI Setup

Ollama Configuration

  • Model: llama3.2:3b
  • Provider: ollama
  • Endpoint: http://ollama:11434 (internal)
  • External API: http://192.168.0.250:11434

📋 Model Details for Reactive Resume v5

Environment Variables

Add these to your docker-compose.yml environment section:

environment:
  # AI Integration (Ollama) - v5 uses OpenAI-compatible API
  OPENAI_API_KEY: "ollama"  # Dummy key for local Ollama
  OPENAI_BASE_URL: "http://ollama:11434/v1"  # Ollama OpenAI-compatible endpoint
  OPENAI_MODEL: "llama3.2:3b"  # Model name
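
With those variables set, the app-facing path can be smoke-tested directly. A minimal sketch, assuming the external API address above; the request body is validated locally first so quoting mistakes surface before anything touches the network:

```shell
# Build the request body up front; a malformed body fails here, not at the server.
BODY='{"model":"llama3.2:3b","messages":[{"role":"user","content":"Say hello in one sentence."}]}'
echo "$BODY" | python3 -m json.tool >/dev/null && echo "request body ok"

# POST to the OpenAI-compatible route Ollama exposes. The Bearer token is the
# dummy OPENAI_API_KEY value ("ollama"); Ollama does not actually check it.
curl -s --connect-timeout 3 http://192.168.0.250:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ollama" \
  -d "$BODY" || true
```

A successful reply comes back in the standard OpenAI chat-completion shape, which is exactly what Reactive Resume v5 consumes.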

Model Specifications

llama3.2:3b

  • Size: ~2GB download
  • Parameters: 3 billion
  • Context Length: up to 128K tokens supported by the model (Ollama serves a smaller default context window unless num_ctx is raised)
  • Use Case: General text generation, resume assistance
  • Performance: Fast inference on CPU
  • Memory: ~4GB RAM during inference

🔧 Alternative Models

If you want to use different models, here are recommended options:

Lightweight Options (< 4GB RAM)

# Fastest, smallest
OPENAI_MODEL: "llama3.2:1b"     # ~1GB, very fast

# Balanced performance
OPENAI_MODEL: "llama3.2:3b"     # ~2GB, good quality (current)

# Better quality, still reasonable
OPENAI_MODEL: "qwen2.5:3b"      # ~2GB, good for professional text

High-Quality Options (8GB+ RAM)

# Better reasoning
OPENAI_MODEL: "llama3.1:8b"     # ~4.7GB, higher quality (llama3.2 only ships 1b/3b text models)

# Excellent for professional content
OPENAI_MODEL: "qwen2.5:7b"      # ~4.7GB, great for business writing

# Best quality (if you have the resources)
OPENAI_MODEL: "qwen2.5:14b"     # ~9GB, excellent quality

Specialized Models

# Code-focused (good for tech resumes)
OPENAI_MODEL: "codellama:7b"    # ~3.8GB, code-aware

# Instruction-following
OPENAI_MODEL: "mistral:7b"      # ~4.1GB, good at following prompts

🚀 Model Management Commands

Pull New Models

# Pull a different model
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker exec Resume-OLLAMA-V5 ollama pull qwen2.5:3b"

# List available models
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker exec Resume-OLLAMA-V5 ollama list"

# Remove unused models
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker exec Resume-OLLAMA-V5 ollama rm llama3.2:1b"

Change Active Model

  1. Update OPENAI_MODEL in docker-compose.yml
  2. Redeploy: ./deploy.sh restart
  3. Pull new model if needed: ./deploy.sh setup-ollama
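
The three steps above can be sketched as one snippet. This is a hypothetical helper: it assumes docker-compose.yml sits in the current directory and uses the OPENAI_MODEL key from the environment section earlier; adjust paths for your layout.

```shell
# Hypothetical model-swap helper; run from the stack directory.
NEW_MODEL="qwen2.5:3b"

# Step 1: rewrite the model line in place (keeps a .bak backup).
[ -f docker-compose.yml ] && \
  sed -i.bak "s|OPENAI_MODEL: \"[^\"]*\"|OPENAI_MODEL: \"${NEW_MODEL}\"|" docker-compose.yml \
  || echo "docker-compose.yml not found here"
echo "model set to ${NEW_MODEL}"

# Steps 2-3: redeploy, then pull the model if it is not cached yet.
# ./deploy.sh restart
# ./deploy.sh setup-ollama
```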

🧪 Testing AI Features

Direct API Test

# Test the AI API directly
curl -X POST http://192.168.0.250:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2:3b",
    "prompt": "Write a professional summary for a software engineer with 5 years experience in Python and React",
    "stream": false
  }'

Expected Response

{
  "model": "llama3.2:3b",
  "created_at": "2026-02-16T10:00:00.000Z",
  "response": "Experienced Software Engineer with 5+ years of expertise in full-stack development using Python and React. Proven track record of building scalable web applications...",
  "done": true
}
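
For scripting (health checks, cron probes), the generated text can be pulled out of that JSON with a one-liner. Shown here against a canned response in the same shape, so it runs without the server; pipe `curl -s ...` into it for live use:

```shell
# Sample response matching the shape above; replace the echo with the real curl.
RESPONSE='{"model":"llama3.2:3b","response":"Experienced Software Engineer with 5+ years...","done":true}'
echo "$RESPONSE" | python3 -c 'import json,sys; print(json.load(sys.stdin)["response"])'
```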

🎯 AI Features in Reactive Resume v5

1. Resume Content Suggestions

  • Trigger: Click "AI Assist" button in any text field
  • Function: Suggests professional content based on context
  • Model Usage: Generates 2-3 sentence suggestions

2. Job Description Analysis

  • Trigger: Paste job description in "Job Match" feature
  • Function: Analyzes requirements and suggests skill additions
  • Model Usage: Extracts key requirements and matches to profile

3. Skills Optimization

  • Trigger: "Optimize Skills" button in Skills section
  • Function: Suggests relevant skills based on experience
  • Model Usage: Analyzes work history and recommends skills

4. Cover Letter Generation

  • Trigger: "Generate Cover Letter" in Documents section
  • Function: Creates personalized cover letter
  • Model Usage: Uses resume data + job description to generate letter

📊 Performance Tuning

Model Performance Comparison

| Model       | Size  | Speed     | Quality   | RAM Usage | Best For             |
|-------------|-------|-----------|-----------|-----------|----------------------|
| llama3.2:1b | 1GB   | Very Fast | Good      | 2GB       | Quick suggestions    |
| llama3.2:3b | 2GB   | Fast      | Very Good | 4GB       | Recommended          |
| qwen2.5:3b  | 2GB   | Fast      | Very Good | 4GB       | Professional content |
| llama3.1:8b | 4.7GB | Medium    | Excellent | 8GB       | High quality         |

Optimization Settings

# In docker-compose.yml for Ollama service
environment:
  OLLAMA_HOST: "0.0.0.0"
  OLLAMA_KEEP_ALIVE: "5m"        # Keep model loaded for 5 minutes
  OLLAMA_MAX_LOADED_MODELS: "1"  # Only keep one model in memory
  OLLAMA_NUM_PARALLEL: "1"       # Number of parallel requests
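
Whether the keep-alive window is doing its job can be checked against Ollama's /api/ps endpoint, which lists the models currently resident in memory and when they will be unloaded (host address assumed from the external API above):

```shell
# List models currently loaded in RAM; an empty "models" array means the
# keep-alive window has expired and the next request pays the load cost again.
PS_JSON=$(curl -s --connect-timeout 3 http://192.168.0.250:11434/api/ps || echo '{"models":[]}')
echo "$PS_JSON"
```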

🔍 Troubleshooting AI Issues

Model Not Loading

# Check if model exists
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker exec Resume-OLLAMA-V5 ollama list"

# Pull model manually
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker exec Resume-OLLAMA-V5 ollama pull llama3.2:3b"

# Check Ollama logs
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker logs Resume-OLLAMA-V5"

Slow AI Responses

  1. Check CPU usage: htop on Calypso
  2. Reduce model size: Switch to llama3.2:1b
  3. Increase keep-alive: Set OLLAMA_KEEP_ALIVE: "30m"
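
To judge whether a change helped, time one non-streaming generation end to end before and after (host and model assumed from the current setup):

```shell
# Time a single generation; rerun after each tuning change and compare.
start=$(date +%s)
curl -s --connect-timeout 3 http://192.168.0.250:11434/api/generate \
  -d '{"model":"llama3.2:3b","prompt":"One-line summary of Python.","stream":false}' \
  >/dev/null || true
echo "request took $(( $(date +%s) - start ))s"
```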

AI Features Not Appearing in UI

  1. Check environment variables: Ensure OPENAI_BASE_URL and OPENAI_MODEL from the compose section are set on the app container
  2. Verify connectivity: Test API endpoint from app container
  3. Check app logs: Look for AI-related errors
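
Step 2 can be run non-interactively. The app container name is the one used elsewhere in this guide, and the presence of wget inside the app image is an assumption (swap in curl if the image ships that instead):

```shell
# Hit Ollama's tags endpoint from inside the app container's network namespace.
RESULT=$(ssh -o ConnectTimeout=3 -o BatchMode=yes Vish@192.168.0.250 -p 62000 \
  "sudo /usr/local/bin/docker exec Resume-ACCESS-V5 wget -qO- http://ollama:11434/api/tags" \
  || echo "connectivity check failed")
echo "$RESULT"
```

A JSON model list means the app can reach Ollama and the problem is in the app's configuration, not the network.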

Memory Issues

# Check memory usage
ssh Vish@192.168.0.250 -p 62000 "free -h"

# If low memory, switch to a smaller model
OPENAI_MODEL: "llama3.2:1b"  # Uses ~2GB instead of 4GB

🔄 Model Updates

Updating to Newer Models

  1. Check available models: https://ollama.ai/library
  2. Pull new model: ollama pull model-name
  3. Update compose file: Change OPENAI_MODEL value
  4. Restart services: ./deploy.sh restart

Model Versioning

# Pin to a specific quantization tag
OPENAI_MODEL: "llama3.2:3b-instruct-q4_K_M"  # Explicit quantization

# Use the default tag (updates only when you re-pull, not automatically)
OPENAI_MODEL: "llama3.2:3b"                  # Default quantization

📈 Monitoring AI Performance

Metrics to Watch

  • Response Time: Should be < 10s for most prompts
  • Memory Usage: Monitor RAM consumption
  • Model Load Time: First request after idle takes longer
  • Error Rate: Check for failed AI requests
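
The first three metrics can be sampled with a small probe loop; the first iteration after an idle period also exposes the model load time (host and model assumed from the current setup):

```shell
# Three timed generations; iteration 1 after idle includes model load time.
for i in 1 2 3; do
  start=$(date +%s)
  curl -s --connect-timeout 3 http://192.168.0.250:11434/api/generate \
    -d '{"model":"llama3.2:3b","prompt":"ping","stream":false}' >/dev/null || true
  echo "request $i: $(( $(date +%s) - start ))s"
done
```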

Performance Commands

# Check AI API health
curl http://192.168.0.250:11434/api/tags

# Monitor resource usage
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker stats --no-stream Resume-OLLAMA-V5"

# Check AI request logs
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker logs Resume-ACCESS-V5 | grep -i ollama"

Current Configuration: llama3.2:3b (Recommended)
Last Updated: 2026-02-16
Performance: Optimized for Calypso hardware