Sanitized mirror from private repository - 2026-04-20 01:32:01 UTC
docs/services/individual/gmail-organizer.md
# Gmail Organizer

**🟢 Automation Script**

## Service Overview

| Property | Value |
|----------|-------|
| **Service Name** | gmail-organizer |
| **Host** | homelab-vm |
| **Category** | Automation / Email |
| **Difficulty** | 🟢 |
| **Language** | Python 3 |
| **Script Directory** | `scripts/gmail-organizer` |
| **LLM Backend** | Ollama (qwen3:32b) |
| **Schedule** | Every 30 minutes via cron |

## Purpose

Gmail Organizer is a local automation script that classifies incoming Gmail messages using a self-hosted LLM (qwen3:32b via Ollama), applies labels automatically, and archives low-priority mail. It connects to Gmail via IMAP using an app password, sends each email's metadata to Ollama for classification, applies an `AutoOrg/*` label, and optionally archives the email out of the inbox.

This replaces manual Gmail filters with LLM-powered classification that can understand context and intent rather than relying on simple keyword/sender rules.

## How It Works

```
Gmail INBOX (IMAP)
       │
       ▼
┌─────────────────┐     ┌──────────────────────┐
│ gmail_organizer │────▶│ Ollama (qwen3:32b)   │
│ .py             │◀────│ on Olares            │
└─────────────────┘     └──────────────────────┘
       │
       ▼
┌─────────────────┐
│ Apply label     │──▶ AutoOrg/Newsletters, AutoOrg/Receipts, etc.
│ Archive if set  │──▶ Remove from inbox (newsletters, spam, accounts)
│ Track in SQLite │──▶ processed.db (skip on next run)
└─────────────────┘
```

1. Connects to Gmail via IMAP SSL with an app password
2. Fetches the most recent N emails (default: 50 per run)
3. Skips emails already in the local SQLite tracking database
4. For each unprocessed email, extracts the subject, sender, and a body snippet
5. Sends the email data to Ollama for classification into one of 6 categories
6. Applies the corresponding Gmail label via IMAP `X-GM-LABELS`
7. If the category has `archive: true`, removes the email from the inbox
8. Records the email as processed in SQLite to avoid re-classification

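Step 4 above can be sketched with the standard-library `email` package. This is a minimal illustration, not the script's actual code: the function name `extract_fields` and the returned dictionary keys are assumptions, and the 2000-character limit is taken from the Ollama Backend section.

```python
import email
from email.header import decode_header, make_header

SNIPPET_CHARS = 2000  # body snippet limit mentioned in this document


def extract_fields(raw: bytes) -> dict:
    """Return subject, sender, and a plain-text body snippet (hypothetical helper)."""
    msg = email.message_from_bytes(raw)

    def header(name: str) -> str:
        # Decode RFC 2047 encoded words such as =?utf-8?q?...?=
        return str(make_header(decode_header(msg.get(name, ""))))

    body = ""
    for part in msg.walk():
        # Use the first text/plain part that is not an attachment.
        if part.get_content_type() == "text/plain" and not part.get_filename():
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            try:
                body = payload.decode(charset, errors="replace")
            except LookupError:  # unknown charset name in the message
                body = payload.decode("utf-8", errors="replace")
            break

    return {
        "subject": header("Subject"),
        "sender": header("From"),
        "snippet": body.strip()[:SNIPPET_CHARS],
    }
```

HTML-only messages would produce an empty snippet in this sketch; the real script may handle them differently.
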
## Categories

| Category | Gmail Label | Auto-Archive | Description |
|----------|-------------|:------------:|-------------|
| **receipts** | `AutoOrg/Receipts` | No | Purchase confirmations, invoices, payment receipts, order updates |
| **newsletters** | `AutoOrg/Newsletters` | Yes | Mailing lists, digests, blog updates, promotional content |
| **work** | `AutoOrg/Work` | No | Professional correspondence, meeting invites, project updates |
| **accounts** | `AutoOrg/Accounts` | Yes | Security alerts, password resets, 2FA notifications, login alerts |
| **spam** | `AutoOrg/Spam` | Yes | Unsolicited marketing, phishing, junk that bypassed Gmail filters |
| **personal** | `AutoOrg/Personal` | No | Friends, family, personal accounts |

Categories are fully configurable in `config.local.yaml`. You can add, remove, or rename categories and toggle archiving per category.

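As a rough illustration of how a category table like this could drive classification, the sketch below builds a prompt from the category descriptions and maps the model's reply back to a category name. The prompt wording, the function names `build_prompt` and `parse_category`, and the handling of `<think>` blocks are all assumptions; the actual script's prompt is not shown in this document.

```python
import re

# Two of the categories from the table above, in the shape used by the config.
CATEGORIES = {
    "receipts": {
        "label": "AutoOrg/Receipts",
        "archive": False,
        "description": "Purchase confirmations, invoices, payment receipts, order updates",
    },
    "newsletters": {
        "label": "AutoOrg/Newsletters",
        "archive": True,
        "description": "Mailing lists, digests, blog updates, promotional content",
    },
}


def build_prompt(categories: dict, sender: str, subject: str, snippet: str) -> str:
    """Assemble a single-word classification prompt (hypothetical wording)."""
    lines = ["Classify the email into exactly one category.", "Categories:"]
    for name, spec in categories.items():
        lines.append(f"- {name}: {spec['description']}")
    lines += [
        "",
        f"From: {sender}",
        f"Subject: {subject}",
        f"Body: {snippet[:2000]}",
        "",
        "Answer with the category name only.",
    ]
    return "\n".join(lines)


def parse_category(reply: str, categories: dict):
    """Return the first known category named in the reply, or None."""
    # Some models emit reasoning inside <think>...</think>; drop it if present.
    text = re.sub(r"<think>.*?</think>", "", reply, flags=re.DOTALL).lower()
    for word in re.findall(r"[a-z_-]+", text):
        if word in categories:
            return word
    return None
```

Returning `None` for an unrecognized reply leaves the fallback policy (skip, retry, or a default category) to the caller.
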
## Prerequisites

- Python 3.10+ (installed on homelab-vm)
- `pyyaml` package (`pip install pyyaml`)
- A Gmail account with 2FA enabled
- A Gmail app password (see setup below)
- Access to an Ollama instance with a model loaded

## Setup

### 1. Gmail App Password

Gmail requires an app password for IMAP access (regular passwords don't work with 2FA):

1. Go to [myaccount.google.com](https://myaccount.google.com)
2. Navigate to **Security** > **2-Step Verification**
3. Scroll to the bottom and click **App passwords**
4. Name it `homelab-organizer` and click **Create**
5. Copy the 16-character password (format: `REDACTED_APP_PASSWORD`)
6. You'll only see this once, so save it securely

### 2. Configure the Script

```bash
cd ~/organized/repos/homelab/scripts/gmail-organizer

# Copy the template config
cp config.yaml config.local.yaml

# Edit with your credentials
vim config.local.yaml
```

Fill in your Gmail address and app password:

```yaml
gmail:
  email: "you@gmail.com"
  app_password: "xxxx xxxx xxxx xxxx"  # pragma: allowlist secret

ollama:
  url: "https://a5be22681.vishinator.olares.com"
  model: "qwen3:32b"
```

> **Note:** `config.local.yaml` is gitignored, so your credentials stay local.

### 3. Install Dependencies

```bash
pip install pyyaml
# or if pip is externally managed:
pip install pyyaml --break-system-packages
```

### 4. Test with a Dry Run

```bash
# Classify 5 emails without applying any changes
python3 gmail_organizer.py --dry-run --limit 5 -v
```

You should see output like:

```
2026-03-22 03:51:06 INFO Connecting to Gmail as you@gmail.com
2026-03-22 03:51:07 INFO Fetched 5 message UIDs
2026-03-22 03:51:07 INFO [1/5] Classifying: Security alert (from: Google)
2026-03-22 03:51:12 INFO → accounts (AutoOrg/Accounts)
2026-03-22 03:51:12 INFO [DRY RUN] Would apply label: AutoOrg/Accounts + archive
```

### 5. Run for Real

```bash
# Process default batch (50 emails)
python3 gmail_organizer.py -v

# Process a larger batch (up to the 1000 most recent emails)
python3 gmail_organizer.py --limit 1000 -v
```

### 6. Set Up Cron (Automatic Sorting)

The cron job runs every 30 minutes to classify new emails:

```bash
crontab -e
```

Add this line:

```cron
*/30 * * * * cd /home/homelab/organized/repos/homelab/scripts/gmail-organizer && python3 gmail_organizer.py >> /tmp/gmail-organizer.log 2>&1
```

## Usage

### Command-Line Options

```
usage: gmail_organizer.py [-h] [-c CONFIG] [-n] [--reprocess] [--limit LIMIT] [-v]

Options:
  -c, --config PATH   Path to config YAML (default: config.local.yaml)
  -n, --dry-run       Classify but don't apply labels or archive
  --reprocess         Re-classify already-processed emails
  --limit N           Override batch size (default: 50)
  -v, --verbose       Debug logging
```

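For readers who want to see how such an interface maps to code, here is a small `argparse` sketch reconstructed from the usage text above. It is a guess at the shape of the parser, not the script's actual source; in particular, `build_parser` is a hypothetical name and the script may take its default batch size from the config file rather than from the parser.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented options (reconstruction, not the original)."""
    parser = argparse.ArgumentParser(prog="gmail_organizer.py")
    parser.add_argument("-c", "--config", default="config.local.yaml",
                        help="Path to config YAML")
    parser.add_argument("-n", "--dry-run", action="store_true",
                        help="Classify but don't apply labels or archive")
    parser.add_argument("--reprocess", action="store_true",
                        help="Re-classify already-processed emails")
    parser.add_argument("--limit", type=int, default=50,
                        help="Override batch size")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Debug logging")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```
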
### Common Operations

```bash
# Normal run (processes new emails only)
python3 gmail_organizer.py

# Verbose output
python3 gmail_organizer.py -v

# Preview what would happen (no changes)
python3 gmail_organizer.py --dry-run --limit 10 -v

# Re-classify already-processed emails, up to 1000 (e.g., after changing categories or archive rules)
python3 gmail_organizer.py --reprocess --limit 1000

# Check the cron log
tail -f /tmp/gmail-organizer.log
```

### Changing Categories

Edit `config.local.yaml` to add, remove, or modify categories:

```yaml
categories:
  finance:
    label: "AutoOrg/Finance"
    description: "Bank statements, investment updates, tax documents"
    archive: false
```

After changing categories, reprocess existing emails:

```bash
python3 gmail_organizer.py --reprocess --limit 1000
```

### Changing Archive Behavior

Toggle `archive: true/false` per category in `config.local.yaml`. Archived emails are NOT deleted: they're removed from the inbox but remain accessible via the `AutoOrg/*` labels in Gmail's sidebar.

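The label-then-archive behavior can be sketched with `imaplib` and Gmail's `X-GM-LABELS` IMAP extension. This is an assumption about how the script might do it: the helper name `apply_label` is hypothetical, and removing the `\Inbox` label is one common way to archive over IMAP, not necessarily the one the script uses.

```python
import imaplib


def apply_label(imap: imaplib.IMAP4_SSL, uid: str, label: str, archive: bool) -> None:
    """Add a Gmail label to a message and optionally archive it (hypothetical helper)."""
    # Add the AutoOrg/* label; Gmail creates the label if it does not exist.
    status, _ = imap.uid("STORE", uid, "+X-GM-LABELS", f'("{label}")')
    if status != "OK":
        raise RuntimeError(f"failed to label UID {uid}")
    if archive:
        # Assumption: dropping the \Inbox label archives the message.
        # The message stays in All Mail and under its AutoOrg/* label.
        status, _ = imap.uid("STORE", uid, "-X-GM-LABELS", r"(\Inbox)")
        if status != "OK":
            raise RuntimeError(f"failed to archive UID {uid}")


# Usage sketch (needs real credentials, so it is left commented out):
# imap = imaplib.IMAP4_SSL("imap.gmail.com")
# imap.login("you@gmail.com", "app password here")
# imap.select("INBOX")
# apply_label(imap, "42", "AutoOrg/Newsletters", archive=True)
```
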
## File Structure

```
scripts/gmail-organizer/
├── gmail_organizer.py # Main script
├── config.yaml # Template config (committed to repo)
├── config.local.yaml # Your credentials (gitignored)
├── processed.db # SQLite tracking database (gitignored)
├── requirements.txt # Python dependencies
└── .gitignore # Keeps credentials and DB out of git
```

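To make the role of `processed.db` concrete, here is a minimal sketch of a SQLite tracking table using the standard-library `sqlite3` module. The table name, column names, and helper functions are assumptions for illustration; the real schema is not documented here.

```python
import sqlite3


def open_db(path: str = "processed.db") -> sqlite3.Connection:
    """Open the tracking database and create the (hypothetical) table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS processed (
            message_id   TEXT PRIMARY KEY,
            category     TEXT NOT NULL,
            processed_at TEXT DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    return conn


def is_processed(conn: sqlite3.Connection, message_id: str) -> bool:
    """True if this message was already classified on an earlier run."""
    row = conn.execute(
        "SELECT 1 FROM processed WHERE message_id = ?", (message_id,)
    ).fetchone()
    return row is not None


def mark_processed(conn: sqlite3.Connection, message_id: str, category: str) -> None:
    """Record (or overwrite, when reprocessing) the category for a message."""
    conn.execute(
        "INSERT OR REPLACE INTO processed (message_id, category) VALUES (?, ?)",
        (message_id, category),
    )
    conn.commit()
```

Keying on a stable identifier such as the `Message-ID` header means a deleted `processed.db` simply causes everything to be classified again, which matches the re-sorting tip in Troubleshooting.
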
## Ollama Backend

The script uses the Ollama API at `https://a5be22681.vishinator.olares.com` running on Olares. The current model is `qwen3:32b` (30.5B parameters, Q4_K_M quantization).

The LLM prompt is minimal: it sends the email's From, Subject, and a body snippet (truncated to 2000 chars), and asks for a single-word category classification. Temperature is set to 0.1 for consistent results.

The Ollama instance also has `devstral-small-2:latest` available as an alternative if needed; just change `model` in the config.

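A request of this kind can be sketched against Ollama's `/api/generate` endpoint using only the standard library. Which endpoint the script actually calls is not stated in this document, so treat the endpoint choice and the function names `build_request` and `classify` as assumptions; the temperature of 0.1 and the 60-second timeout come from this document.

```python
import json
import urllib.request


def build_request(base_url: str, model: str, prompt: str,
                  temperature: float = 0.1) -> urllib.request.Request:
    """Build a non-streaming generate request for an Ollama server."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON object instead of a stream
        "options": {"temperature": temperature},
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(base_url: str, model: str, prompt: str, timeout: float = 60.0) -> str:
    """Send the prompt and return the model's raw text reply."""
    request = build_request(base_url, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"].strip()
```

Splitting payload construction from the network call keeps the request shape easy to inspect without a running Ollama instance.
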
## Troubleshooting

### "Config not found" error

```bash
cp config.yaml config.local.yaml
# Edit config.local.yaml with your credentials
```

### IMAP login fails

- Verify 2FA is enabled on your Google account
- Regenerate the app password if it was revoked
- Check that the email address is correct

### Ollama request fails

- Verify Ollama is running: `curl https://a5be22681.vishinator.olares.com/api/tags`
- Check the model is loaded: look for `qwen3:32b` in the response
- The script has a 60-second timeout per classification

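The same check can be scripted in Python if you prefer not to read the `curl` output by eye. The sketch below queries `/api/tags` and looks for the configured model name; `model_names` and `is_model_available` are hypothetical helpers, not part of the script.

```python
import json
import urllib.request


def model_names(tags: dict) -> list[str]:
    """Extract model names from a parsed /api/tags response."""
    return [entry["name"] for entry in tags.get("models", [])]


def is_model_available(base_url: str, model: str, timeout: float = 10.0) -> bool:
    """Return True if the Ollama server at base_url lists the given model."""
    url = base_url.rstrip("/") + "/api/tags"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return model in model_names(json.loads(response.read()))


if __name__ == "__main__":
    # Replace with the `ollama.url` and `ollama.model` values from your config.
    print(is_model_available("http://localhost:11434", "qwen3:32b"))
```
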
### Emails not archiving

- Check that `archive: true` is set for the category in `config.local.yaml`
- Run with `-v` to see archive actions in the log

### Re-sorting after config changes

```bash
# Clear the tracking database and reprocess
rm processed.db
python3 gmail_organizer.py --limit 1000 -v
```

### Cron not running

```bash
# Verify cron is set up
crontab -l

# Check the log
cat /tmp/gmail-organizer.log

# Test manually
cd /home/homelab/organized/repos/homelab/scripts/gmail-organizer
python3 gmail_organizer.py -v
```