Reactive Resume v5 - AI Model Configuration Guide

🤖 Current AI Setup

Ollama Configuration

  • Model: llama3.2:3b
  • Provider: ollama
  • Endpoint: http://ollama:11434 (internal)
  • External API: http://192.168.0.250:11434

📋 Model Details for Reactive Resume v5

Environment Variables

Add these to your docker-compose.yml environment section:

environment:
  # AI Integration (Ollama) - v5 uses OpenAI-compatible API
  OPENAI_API_KEY: "ollama"  # Dummy key for local Ollama
  OPENAI_BASE_URL: "http://ollama:11434/v1"  # Ollama OpenAI-compatible endpoint
  OPENAI_MODEL: "llama3.2:3b"  # Model name
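
With those variables set, the app-facing path can be smoke-tested directly. A minimal sketch, assuming the external API address above; the request body is validated locally first so quoting mistakes surface before anything touches the network:

```shell
# Build the request body up front; a malformed body fails here, not at the server.
BODY='{"model":"llama3.2:3b","messages":[{"role":"user","content":"Say hello in one sentence."}]}'
echo "$BODY" | python3 -m json.tool >/dev/null && echo "request body ok"

# POST to the OpenAI-compatible route Ollama exposes. The Bearer token is the
# dummy OPENAI_API_KEY value ("ollama"); Ollama does not actually check it.
curl -s --connect-timeout 3 http://192.168.0.250:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ollama" \
  -d "$BODY" || true
```

A successful reply comes back in the standard OpenAI chat-completion shape, which is exactly what Reactive Resume v5 consumes.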

Model Specifications

llama3.2:3b

  • Size: ~2GB download
  • Parameters: 3 billion
  • Context Length: up to 128K tokens supported by the model (Ollama serves a smaller default context window unless num_ctx is raised)
  • Use Case: General text generation, resume assistance
  • Performance: Fast inference on CPU
  • Memory: ~4GB RAM during inference

🔧 Alternative Models

If you want to use different models, here are recommended options:

Lightweight Options (< 4GB RAM)

# Fastest, smallest
OPENAI_MODEL: "llama3.2:1b"     # ~1GB, very fast

# Balanced performance
OPENAI_MODEL: "llama3.2:3b"     # ~2GB, good quality (current)

# Better quality, still reasonable
OPENAI_MODEL: "qwen2.5:3b"      # ~2GB, good for professional text

High-Quality Options (8GB+ RAM)

# Better reasoning
OPENAI_MODEL: "llama3.1:8b"     # ~4.7GB, higher quality (llama3.2 only ships 1b/3b text models)

# Excellent for professional content
OPENAI_MODEL: "qwen2.5:7b"      # ~4.7GB, great for business writing

# Best quality (if you have the resources)
OPENAI_MODEL: "qwen2.5:14b"     # ~9GB, excellent quality

Specialized Models

# Code-focused (good for tech resumes)
OPENAI_MODEL: "codellama:7b"    # ~3.8GB, code-aware

# Instruction-following
OPENAI_MODEL: "mistral:7b"      # ~4.1GB, good at following prompts

🚀 Model Management Commands

Pull New Models

# Pull a different model
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker exec Resume-OLLAMA-V5 ollama pull qwen2.5:3b"

# List available models
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker exec Resume-OLLAMA-V5 ollama list"

# Remove unused models
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker exec Resume-OLLAMA-V5 ollama rm llama3.2:1b"

Change Active Model

  1. Update OPENAI_MODEL in docker-compose.yml
  2. Redeploy: ./deploy.sh restart
  3. Pull new model if needed: ./deploy.sh setup-ollama
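
The three steps above can be sketched as one snippet. This is a hypothetical helper: it assumes docker-compose.yml sits in the current directory and uses the OPENAI_MODEL key from the environment section earlier; adjust paths for your layout.

```shell
# Hypothetical model-swap helper; run from the stack directory.
NEW_MODEL="qwen2.5:3b"

# Step 1: rewrite the model line in place (keeps a .bak backup).
[ -f docker-compose.yml ] && \
  sed -i.bak "s|OPENAI_MODEL: \"[^\"]*\"|OPENAI_MODEL: \"${NEW_MODEL}\"|" docker-compose.yml \
  || echo "docker-compose.yml not found here"
echo "model set to ${NEW_MODEL}"

# Steps 2-3: redeploy, then pull the model if it is not cached yet.
# ./deploy.sh restart
# ./deploy.sh setup-ollama
```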

🧪 Testing AI Features

Direct API Test

# Test the AI API directly
curl -X POST http://192.168.0.250:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2:3b",
    "prompt": "Write a professional summary for a software engineer with 5 years experience in Python and React",
    "stream": false
  }'

Expected Response

{
  "model": "llama3.2:3b",
  "created_at": "2026-02-16T10:00:00.000Z",
  "response": "Experienced Software Engineer with 5+ years of expertise in full-stack development using Python and React. Proven track record of building scalable web applications...",
  "done": true
}
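
For scripting (health checks, cron probes), the generated text can be pulled out of that JSON with a one-liner. Shown here against a canned response in the same shape, so it runs without the server; pipe `curl -s ...` into it for live use:

```shell
# Sample response matching the shape above; replace the echo with the real curl.
RESPONSE='{"model":"llama3.2:3b","response":"Experienced Software Engineer with 5+ years...","done":true}'
echo "$RESPONSE" | python3 -c 'import json,sys; print(json.load(sys.stdin)["response"])'
```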

🎯 AI Features in Reactive Resume v5

1. Resume Content Suggestions

  • Trigger: Click "AI Assist" button in any text field
  • Function: Suggests professional content based on context
  • Model Usage: Generates 2-3 sentence suggestions

2. Job Description Analysis

  • Trigger: Paste job description in "Job Match" feature
  • Function: Analyzes requirements and suggests skill additions
  • Model Usage: Extracts key requirements and matches to profile

3. Skills Optimization

  • Trigger: "Optimize Skills" button in Skills section
  • Function: Suggests relevant skills based on experience
  • Model Usage: Analyzes work history and recommends skills

4. Cover Letter Generation

  • Trigger: "Generate Cover Letter" in Documents section
  • Function: Creates personalized cover letter
  • Model Usage: Uses resume data + job description to generate letter

📊 Performance Tuning

Model Performance Comparison

| Model       | Size  | Speed     | Quality   | RAM Usage | Best For             |
|-------------|-------|-----------|-----------|-----------|----------------------|
| llama3.2:1b | 1GB   | Very Fast | Good      | 2GB       | Quick suggestions    |
| llama3.2:3b | 2GB   | Fast      | Very Good | 4GB       | Recommended          |
| qwen2.5:3b  | 2GB   | Fast      | Very Good | 4GB       | Professional content |
| llama3.1:8b | 4.7GB | Medium    | Excellent | 8GB       | High quality         |

Optimization Settings

# In docker-compose.yml for Ollama service
environment:
  OLLAMA_HOST: "0.0.0.0"
  OLLAMA_KEEP_ALIVE: "5m"        # Keep model loaded for 5 minutes
  OLLAMA_MAX_LOADED_MODELS: "1"  # Only keep one model in memory
  OLLAMA_NUM_PARALLEL: "1"       # Number of parallel requests
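
Whether the keep-alive window is doing its job can be checked against Ollama's /api/ps endpoint, which lists the models currently resident in memory and when they will be unloaded (host address assumed from the external API above):

```shell
# List models currently loaded in RAM; an empty "models" array means the
# keep-alive window has expired and the next request pays the load cost again.
PS_JSON=$(curl -s --connect-timeout 3 http://192.168.0.250:11434/api/ps || echo '{"models":[]}')
echo "$PS_JSON"
```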

🔍 Troubleshooting AI Issues

Model Not Loading

# Check if model exists
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker exec Resume-OLLAMA-V5 ollama list"

# Pull model manually
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker exec Resume-OLLAMA-V5 ollama pull llama3.2:3b"

# Check Ollama logs
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker logs Resume-OLLAMA-V5"

Slow AI Responses

  1. Check CPU usage: htop on Calypso
  2. Reduce model size: Switch to llama3.2:1b
  3. Increase keep-alive: Set OLLAMA_KEEP_ALIVE: "30m"
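
To judge whether a change helped, time one non-streaming generation end to end before and after (host and model assumed from the current setup):

```shell
# Time a single generation; rerun after each tuning change and compare.
start=$(date +%s)
curl -s --connect-timeout 3 http://192.168.0.250:11434/api/generate \
  -d '{"model":"llama3.2:3b","prompt":"One-line summary of Python.","stream":false}' \
  >/dev/null || true
echo "request took $(( $(date +%s) - start ))s"
```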

AI Features Not Appearing in UI

  1. Check environment variables: Ensure OPENAI_BASE_URL and OPENAI_MODEL from the compose section are set on the app container
  2. Verify connectivity: Test API endpoint from app container
  3. Check app logs: Look for AI-related errors
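
Step 2 can be run non-interactively. The app container name is the one used elsewhere in this guide, and the presence of wget inside the app image is an assumption (swap in curl if the image ships that instead):

```shell
# Hit Ollama's tags endpoint from inside the app container's network namespace.
RESULT=$(ssh -o ConnectTimeout=3 -o BatchMode=yes Vish@192.168.0.250 -p 62000 \
  "sudo /usr/local/bin/docker exec Resume-ACCESS-V5 wget -qO- http://ollama:11434/api/tags" \
  || echo "connectivity check failed")
echo "$RESULT"
```

A JSON model list means the app can reach Ollama and the problem is in the app's configuration, not the network.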

Memory Issues

# Check memory usage
ssh Vish@192.168.0.250 -p 62000 "free -h"

# If low memory, switch to a smaller model
OPENAI_MODEL: "llama3.2:1b"  # Uses ~2GB instead of 4GB

🔄 Model Updates

Updating to Newer Models

  1. Check available models: https://ollama.ai/library
  2. Pull new model: ollama pull model-name
  3. Update compose file: Change OPENAI_MODEL value
  4. Restart services: ./deploy.sh restart

Model Versioning

# Pin to a specific quantization tag
OPENAI_MODEL: "llama3.2:3b-instruct-q4_K_M"  # Explicit quantization

# Use the default tag (updates only when you re-pull, not automatically)
OPENAI_MODEL: "llama3.2:3b"                  # Default quantization

📈 Monitoring AI Performance

Metrics to Watch

  • Response Time: Should be < 10s for most prompts
  • Memory Usage: Monitor RAM consumption
  • Model Load Time: First request after idle takes longer
  • Error Rate: Check for failed AI requests
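
The first three metrics can be sampled with a small probe loop; the first iteration after an idle period also exposes the model load time (host and model assumed from the current setup):

```shell
# Three timed generations; iteration 1 after idle includes model load time.
for i in 1 2 3; do
  start=$(date +%s)
  curl -s --connect-timeout 3 http://192.168.0.250:11434/api/generate \
    -d '{"model":"llama3.2:3b","prompt":"ping","stream":false}' >/dev/null || true
  echo "request $i: $(( $(date +%s) - start ))s"
done
```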

Performance Commands

# Check AI API health
curl http://192.168.0.250:11434/api/tags

# Monitor resource usage
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker stats --no-stream Resume-OLLAMA-V5"

# Check AI request logs
ssh Vish@192.168.0.250 -p 62000 "sudo /usr/local/bin/docker logs Resume-ACCESS-V5 | grep -i ollama"

Current Configuration: llama3.2:3b (Recommended)
Last Updated: 2026-02-16
Performance: Optimized for Calypso hardware