homelab-optimized/docs/services/individual/gmail-organizer.md
Sanitized mirror from private repository - 2026-04-19 08:46:29 UTC

# Gmail Organizer
**🟢 Automation Script**
## Service Overview
| Property | Value |
|----------|-------|
| **Service Name** | gmail-organizer |
| **Host** | homelab-vm |
| **Category** | Automation / Email |
| **Difficulty** | 🟢 |
| **Language** | Python 3 |
| **Script Directory** | `scripts/gmail-organizer` |
| **LLM Backend** | Ollama (qwen3:32b) |
| **Schedule** | Every 30 minutes via cron |
## Purpose
Gmail Organizer is a local automation script that classifies incoming Gmail messages using a self-hosted LLM (qwen3:32b via Ollama), applies labels automatically, and archives low-priority mail. It connects to Gmail via IMAP using an app password, sends each email's metadata to Ollama for classification, applies an `AutoOrg/*` label, and optionally archives the email out of the inbox.
This replaces manual Gmail filters with LLM-powered classification that can understand context and intent rather than relying on simple keyword/sender rules.
## How It Works
```
Gmail INBOX (IMAP)
        │
        ▼
┌─────────────────┐     ┌──────────────────────┐
│ gmail_organizer │────▶│ Ollama (qwen3:32b)   │
│ .py             │◀────│ on Olares            │
└─────────────────┘     └──────────────────────┘
        │
        ▼
┌─────────────────┐
│ Apply label     │──▶ AutoOrg/Newsletters, AutoOrg/Receipts, etc.
│ Archive if set  │──▶ Remove from inbox (newsletters, spam, accounts)
│ Track in SQLite │──▶ processed.db (skip on next run)
└─────────────────┘
```
1. Connects to Gmail via IMAP SSL with an app password
2. Fetches the most recent N emails (default: 50 per run)
3. Skips emails already in the local SQLite tracking database
4. For each unprocessed email, extracts subject, sender, and body snippet
5. Sends the email data to Ollama for classification into one of 6 categories
6. Applies the corresponding Gmail label via IMAP `X-GM-LABELS`
7. If the category has `archive: true`, removes the email from inbox
8. Records the email as processed in SQLite to avoid re-classification
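Steps 3 and 8 amount to a small "already seen" set kept in SQLite. The sketch below illustrates that idea only; it is not the script's actual code. The table name `processed`, its columns, the helper names, and the choice of the Message-ID header as the key are all assumptions.

```python
import sqlite3

def open_tracking_db(path=":memory:"):
    """Open (or create) the tracking database. The schema here is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed ("
        " message_id TEXT PRIMARY KEY,"
        " category TEXT NOT NULL,"
        " processed_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    return conn

def is_processed(conn, message_id):
    """Step 3: return True if this message was classified on an earlier run."""
    row = conn.execute(
        "SELECT 1 FROM processed WHERE message_id = ?", (message_id,)
    ).fetchone()
    return row is not None

def mark_processed(conn, message_id, category):
    """Step 8: record the result so the next run skips this message."""
    # INSERT OR REPLACE lets a reprocessing run overwrite an older category.
    conn.execute(
        "INSERT OR REPLACE INTO processed (message_id, category) VALUES (?, ?)",
        (message_id, category),
    )
    conn.commit()

# Example with an in-memory database and a made-up Message-ID.
db = open_tracking_db()
mark_processed(db, "<abc@mail.example>", "receipts")
```

Keying on a stable identifier rather than on the IMAP sequence number matters here, because sequence numbers shift as messages are archived.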
## Categories
| Category | Gmail Label | Auto-Archive | Description |
|----------|-------------|:------------:|-------------|
| **receipts** | `AutoOrg/Receipts` | No | Purchase confirmations, invoices, payment receipts, order updates |
| **newsletters** | `AutoOrg/Newsletters` | Yes | Mailing lists, digests, blog updates, promotional content |
| **work** | `AutoOrg/Work` | No | Professional correspondence, meeting invites, project updates |
| **accounts** | `AutoOrg/Accounts` | Yes | Security alerts, password resets, 2FA notifications, login alerts |
| **spam** | `AutoOrg/Spam` | Yes | Unsolicited marketing, phishing, junk that bypassed Gmail filters |
| **personal** | `AutoOrg/Personal` | No | Friends, family, personal accounts |
Categories are fully configurable in `config.local.yaml`. You can add, remove, or rename categories and toggle archiving per category.
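As an illustration of how the category table might be consumed once the YAML has been parsed, the sketch below maps a category name to its label and archive flag. The dictionary shape mirrors the `label` / `archive` keys used in this document's config example; the function name `resolve_category` and the "unknown category returns `None`" behavior are assumptions, not the script's confirmed logic.

```python
# Hypothetical in-memory form of the `categories` section after YAML parsing
# (only three of the six categories shown).
CATEGORIES = {
    "receipts": {"label": "AutoOrg/Receipts", "archive": False},
    "newsletters": {"label": "AutoOrg/Newsletters", "archive": True},
    "personal": {"label": "AutoOrg/Personal", "archive": False},
}

def resolve_category(name, categories=CATEGORIES):
    """Return (label, archive) for a category name, or None if it is unknown."""
    entry = categories.get(name.strip().lower())
    if entry is None:
        return None
    # A missing `archive` key is treated as "do not archive", the safer default.
    return entry["label"], bool(entry.get("archive", False))
```

Returning `None` for an unknown name lets the caller decide whether to skip the message or fall back to a default category.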
## Prerequisites
- Python 3.10+ (installed on homelab-vm)
- `pyyaml` package (`pip install pyyaml`)
- A Gmail account with 2FA enabled
- A Gmail app password (see setup below)
- Access to an Ollama instance with a model loaded
## Setup
### 1. Gmail App Password
Gmail requires an app password for IMAP access (regular passwords don't work with 2FA):
1. Go to [myaccount.google.com](https://myaccount.google.com)
2. Navigate to **Security** > **2-Step Verification**
3. Scroll to the bottom and click **App passwords**
4. Name it `homelab-organizer` and click **Create**
5. Copy the 16-character password (format: `xxxx xxxx xxxx xxxx`)
6. You'll only see this once — save it securely
### 2. Configure the Script
```bash
cd ~/organized/repos/homelab/scripts/gmail-organizer
# Copy the template config
cp config.yaml config.local.yaml
# Edit with your credentials
vim config.local.yaml
```
Fill in your Gmail address and app password:
```yaml
gmail:
  email: "you@gmail.com"
  app_password: "xxxx xxxx xxxx xxxx"  # pragma: allowlist secret
ollama:
  url: "https://a5be22681.vishinator.olares.com"
  model: "qwen3:32b"
```
> **Note:** `config.local.yaml` is gitignored — your credentials stay local.
### 3. Install Dependencies
```bash
pip install pyyaml
# or if pip is externally managed:
pip install pyyaml --break-system-packages
```
### 4. Test with a Dry Run
```bash
# Classify 5 emails without applying any changes
python3 gmail_organizer.py --dry-run --limit 5 -v
```
You should see output like:
```
2026-03-22 03:51:06 INFO Connecting to Gmail as you@gmail.com
2026-03-22 03:51:07 INFO Fetched 5 message UIDs
2026-03-22 03:51:07 INFO [1/5] Classifying: Security alert (from: Google)
2026-03-22 03:51:12 INFO → accounts (AutoOrg/Accounts)
2026-03-22 03:51:12 INFO [DRY RUN] Would apply label: AutoOrg/Accounts + archive
```
### 5. Run for Real
```bash
# Process default batch (50 emails)
python3 gmail_organizer.py -v
# Process up to 1000 of the most recent emails (raise the limit for larger inboxes)
python3 gmail_organizer.py --limit 1000 -v
```
### 6. Set Up Cron (Automatic Sorting)
The cron job runs every 30 minutes to classify new emails:
```bash
crontab -e
```
Add this line:
```cron
*/30 * * * * cd /home/homelab/organized/repos/homelab/scripts/gmail-organizer && python3 gmail_organizer.py >> /tmp/gmail-organizer.log 2>&1
```
## Usage
### Command-Line Options
```
usage: gmail_organizer.py [-h] [-c CONFIG] [-n] [--reprocess] [--limit LIMIT] [-v]

Options:
  -c, --config PATH   Path to config YAML (default: config.local.yaml)
  -n, --dry-run       Classify but don't apply labels or archive
  --reprocess         Re-classify already-processed emails
  --limit N           Override batch size (default: 50)
  -v, --verbose       Debug logging
```
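For readers who want to see how such an interface is typically declared, here is a minimal `argparse` sketch that reproduces the documented flags. It is reconstructed from the usage text above, not copied from the script; in particular, using `None` as the `--limit` default (meaning "use the configured batch size") is an assumption.

```python
import argparse

def build_parser():
    """Build a parser matching the documented command-line options."""
    parser = argparse.ArgumentParser(prog="gmail_organizer.py")
    parser.add_argument("-c", "--config", default="config.local.yaml",
                        help="Path to config YAML")
    parser.add_argument("-n", "--dry-run", action="store_true",
                        help="Classify but don't apply labels or archive")
    parser.add_argument("--reprocess", action="store_true",
                        help="Re-classify already-processed emails")
    # Assumption: None means "fall back to the batch size from the config (50)".
    parser.add_argument("--limit", type=int, default=None,
                        help="Override batch size")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Debug logging")
    return parser

# Equivalent to: python3 gmail_organizer.py --dry-run --limit 5 -v
args = build_parser().parse_args(["--dry-run", "--limit", "5", "-v"])
```

Note that `argparse` exposes `--dry-run` as the attribute `args.dry_run`.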
### Common Operations
```bash
# Normal run (processes new emails only)
python3 gmail_organizer.py
# Verbose output
python3 gmail_organizer.py -v
# Preview what would happen (no changes)
python3 gmail_organizer.py --dry-run --limit 10 -v
# Re-classify everything (e.g., after changing categories or archive rules)
python3 gmail_organizer.py --reprocess --limit 1000
# Check the cron log
tail -f /tmp/gmail-organizer.log
```
### Changing Categories
Edit `config.local.yaml` to add, remove, or modify categories:
```yaml
categories:
  finance:
    label: "AutoOrg/Finance"
    description: "Bank statements, investment updates, tax documents"
    archive: false
```
After changing categories, reprocess existing emails:
```bash
python3 gmail_organizer.py --reprocess --limit 1000
```
### Changing Archive Behavior
Toggle `archive: true/false` per category in `config.local.yaml`. Archived emails are NOT deleted — they're removed from the inbox but remain accessible via the `AutoOrg/*` labels in Gmail's sidebar.
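Because an archived message is found again only through its `AutoOrg/*` label, the labeling step is what makes archiving safe. The sketch below shows one way to add a label with Python's `imaplib` and Gmail's `X-GM-LABELS` IMAP extension, which this document says the script uses. The function name is hypothetical, and how the script removes a message from the inbox is not described here, so that part is deliberately left out.

```python
import imaplib  # only needed for the real connection shown in the comment below

def apply_gmail_label(conn, uid, label):
    """Add a Gmail label to one message via the X-GM-LABELS extension.

    `conn` is expected to be a logged-in imaplib.IMAP4_SSL with a mailbox
    selected, for example:
        conn = imaplib.IMAP4_SSL("imap.gmail.com")
        conn.login(email, app_password)
        conn.select("INBOX")
    """
    # Quote the label so names containing spaces survive; escape \ and ".
    quoted = '"' + label.replace("\\", "\\\\").replace('"', '\\"') + '"'
    status, _ = conn.uid("STORE", uid, "+X-GM-LABELS", f"({quoted})")
    return status == "OK"
```

Using `conn.uid(...)` rather than a plain `STORE` addresses the message by UID, which stays valid even if other messages are moved during the same session.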
## File Structure
```
scripts/gmail-organizer/
├── gmail_organizer.py # Main script
├── config.yaml # Template config (committed to repo)
├── config.local.yaml # Your credentials (gitignored)
├── processed.db # SQLite tracking database (gitignored)
├── requirements.txt # Python dependencies
└── .gitignore # Keeps credentials and DB out of git
```
## Ollama Backend
The script uses the Ollama API at `https://a5be22681.vishinator.olares.com` running on Olares. The current model is `qwen3:32b` (30.5B parameters, Q4_K_M quantization).
The LLM prompt is minimal — it sends the email's From, Subject, and a body snippet (truncated to 2000 chars), and asks for a single-word category classification. Temperature is set to 0.1 for consistent results.
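A classification call along these lines could look like the sketch below. It assumes Ollama's `/api/generate` endpoint with streaming disabled; whether the script uses that endpoint or the chat endpoint is not stated in this document. The prompt wording and the helper names `build_prompt`, `parse_category`, and `classify` are illustrative, while the 2000-character truncation, temperature 0.1, and 60-second timeout come from this page.

```python
import json
import re
import urllib.request

CATEGORIES = ["receipts", "newsletters", "work", "accounts", "spam", "personal"]

def build_prompt(sender, subject, body, max_body=2000):
    """Assemble a short classification prompt; the wording is illustrative."""
    header = (
        "Classify this email into exactly one of these categories: "
        + ", ".join(CATEGORIES)
        + ". Reply with the category name only.\n\n"
    )
    return header + f"From: {sender}\nSubject: {subject}\n\n{body[:max_body]}"

def parse_category(reply):
    """Return the first known category mentioned in the reply, else None."""
    for word in re.findall(r"[a-z]+", reply.lower()):
        if word in CATEGORIES:
            return word
    return None

def classify(base_url, model, prompt, timeout=60):
    """POST the prompt to Ollama and map the reply onto a category."""
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for a single JSON object, not a stream
        "options": {"temperature": 0.1},
    }).encode("utf-8")
    req = urllib.request.Request(
        base_url.rstrip("/") + "/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return parse_category(json.loads(resp.read())["response"])
```

Parsing the reply leniently (rather than requiring an exact match) guards against the model adding punctuation or a short explanation around the category name.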
The Ollama instance also has `devstral-small-2:latest` available as an alternative; to use it, change `model` in the config.
## Troubleshooting
### "Config not found" error
```bash
cp config.yaml config.local.yaml
# Edit config.local.yaml with your credentials
```
### IMAP login fails
- Verify 2FA is enabled on your Google account
- Regenerate the app password if it was revoked
- Check that the email address is correct
### Ollama request fails
- Verify Ollama is running: `curl https://a5be22681.vishinator.olares.com/api/tags`
- Check that the model is available: look for `qwen3:32b` in the response
- The script has a 60-second timeout per classification
### Emails not archiving
- Check that `archive: true` is set for the category in `config.local.yaml`
- Run with `-v` to see archive actions in the log
### Re-sorting after config changes
```bash
# Clear the tracking database and reprocess
rm processed.db
python3 gmail_organizer.py --limit 1000 -v
```
### Cron not running
```bash
# Verify cron is set up
crontab -l
# Check the log
cat /tmp/gmail-organizer.log
# Test manually
cd /home/homelab/organized/repos/homelab/scripts/gmail-organizer
python3 gmail_organizer.py -v
```