Homelab Dashboard — Design Spec

Context

The homelab has 73 MCP tools, 11 automation scripts, and data scattered across 15+ services. There's no unified view — you switch between Homarr, Grafana, Portainer, and terminal logs. This dashboard consolidates everything into a single production-grade UI.

Architecture

┌─────────────────┐     ┌──────────────────┐     ┌──────────────────────┐
│  Next.js UI      │────▶│  FastAPI Backend  │────▶│  Services            │
│  (dashboard-ui)  │     │  (dashboard-api)  │     │                      │
│  Port 3000       │     │  Port 8888        │     │  Portainer (5 hosts) │
│                  │     │                   │     │  Jellyfin (olares)   │
│  - shadcn/ui     │     │  - scripts/lib/*  │     │  Ollama (olares)     │
│  - Tailwind CSS  │     │  - SQLite readers │     │  Prometheus          │
│  - SWR polling   │     │  - SSE stream     │     │  Gitea               │
│  - dark theme    │     │  - /api/* routes  │     │  Headscale           │
│                  │     │                   │     │  SQLite DBs (6)      │
└─────────────────┘     └──────────────────┘     │  expenses.csv        │
                                                   └──────────────────────┘

Docker Compose runs both containers. The scripts/ directory is mounted read-only so the Python backend can import the lib/ modules and read the SQLite DBs.

Tech Stack

| Layer         | Technology                    | Why                                  |
|---------------|-------------------------------|--------------------------------------|
| Frontend      | Next.js 15 + React 19         | Best component ecosystem             |
| UI Components | shadcn/ui + Tailwind CSS      | Production-grade, dark mode built-in |
| Data Fetching | SWR (stale-while-revalidate)  | Auto-polling with caching            |
| Real-time     | EventSource (SSE)             | Activity feed + alerts               |
| Backend       | FastAPI (Python 3.12)         | Reuses existing scripts/lib/ modules |
| Database      | SQLite (read-only) + CSV      | Existing automation data, no new DB  |
| Deployment    | Docker Compose (2 containers) | dashboard-ui + dashboard-api         |

Tabs & Content

1. Dashboard (Overview)

Quick Stats Row (5 cards, polled every 60s):

  • Total containers (sum across all Portainer endpoints) + health status
  • Hosts online (Portainer endpoint health checks)
  • GPU status (nvidia-smi via SSH to olares: temp, utilization, VRAM)
  • Emails classified today (query processed.db WHERE date = today)
  • Active alerts (count of unhealthy containers from stack-restart.db)
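
The GPU card needs temperature, utilization, and VRAM as structured values rather than raw nvidia-smi text. A minimal sketch, assuming the backend shells out over SSH and uses nvidia-smi's CSV query mode; the function names, the returned field names, and the `olares` SSH host alias are hypothetical:

```python
import subprocess

# Real nvidia-smi query fields; their order fixes the CSV column order.
QUERY = "temperature.gpu,utilization.gpu,memory.used,memory.total"
NVIDIA_SMI = ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"]


def parse_gpu_csv(output: str) -> list[dict]:
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    gpus = []
    for line in output.strip().splitlines():
        temp, util, used, total = (float(v) for v in line.split(","))
        gpus.append({
            "temp_c": temp,
            "utilization_pct": util,
            "vram_used_mib": used,    # nvidia-smi reports memory in MiB
            "vram_total_mib": total,
        })
    return gpus


def fetch_gpu_status(host: str = "olares") -> list[dict]:
    """Run nvidia-smi on the remote host over SSH (host alias is an assumption)."""
    result = subprocess.run(
        ["ssh", host, *NVIDIA_SMI],
        capture_output=True, text=True, timeout=10, check=True,
    )
    return parse_gpu_csv(result.stdout)
```

The parser raises ValueError if a field comes back as a non-numeric placeholder, so a real route would probably catch that and report the GPU as unavailable.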

Activity Feed (SSE, real-time):

  • Reads from a combined event log the API builds from:
    • /tmp/stack-restart.log (container health events)
    • /tmp/backup-validator.log (backup results)
    • /tmp/gmail-organizer-dvish.log + others (email classifications)
    • /tmp/receipt-tracker.log (expense extractions)
    • /tmp/config-drift.log (drift detections)
  • Shows most recent 20 events with color-coded dots by type
  • New events push via SSE

Jellyfin Card (polled every 30s):

  • Now playing (active sessions via Jellyfin API)
  • Library item counts (movies, TV, anime, music)

Ollama Card (polled every 60s):

  • Model status (loaded/unloaded, model name)
  • VRAM usage
  • Daily call count (parsed from automation logs)

Hosts Grid (polled every 60s):

  • 5 Portainer endpoints with container counts
  • Status indicator (green/red)
  • Click to navigate to Infrastructure tab filtered by host

2. Infrastructure

Container Table (polled every 30s):

  • All containers across all Portainer endpoints
  • Columns: Name, Host, Status, Image, Uptime
  • Filter by endpoint, search by name
  • Click to view logs (modal with last 100 lines)
  • Restart button per container

Olares Pods (polled every 30s):

  • K3s pod list from kubectl get pods -A
  • GPU processes from nvidia-smi
  • Restart deployment button

Headscale Nodes (polled every 120s):

  • Node list with online/offline status
  • Last seen timestamp
  • IP addresses

3. Media

Jellyfin Now Playing (polled every 15s):

  • Active streams with user, device, title, transcode status
  • Bandwidth indicator

Download Queues (polled every 30s):

  • Sonarr queue (upcoming episodes, download status)
  • Radarr queue (upcoming movies, download status)
  • SABnzbd queue (active downloads, speed, ETA)

Library Stats (polled every 300s):

  • Jellyfin library counts
  • Recent additions (if API supports it)

4. Automations

Email Organizer Status (polled every 120s):

  • Per-account stats: lzbellina92, dvish92, admin@thevish.io
  • Today's classifications by category (bar chart)
  • Sender cache hit rate
  • Last run time + errors

Stack Restart History (polled every 60s):

  • Table from stack-restart.db: container, endpoint, duration, action taken, LLM analysis
  • Last 7 days

Backup Status (polled every 300s):

  • Parse latest /tmp/gmail-backup-daily.log
  • OK/FAIL indicator with last run time
  • Email count backed up

Config Drift (polled every 300s):

  • Table of detected drifts (if any)
  • Last scan time

Disk Predictions (polled every 3600s):

  • Table from latest disk-predictor run
  • Volumes approaching 90% highlighted

5. Expenses

Expense Table (polled every 300s):

  • Read from data/expenses.csv
  • Columns: Date, Vendor, Amount, Currency, Order#, Account
  • Sortable, filterable
  • Running total for current month

Monthly Summary (polled every 300s):

  • Total spend this month
  • Spend by vendor (top 10)
  • Spend by category (if derivable from vendor)
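
The monthly totals and top-vendor list can be computed in one pass over the CSV. A minimal sketch, assuming the header row uses the column names listed for the expense table and that dates are ISO formatted (YYYY-MM-DD); the function name and the sample rows are made up for illustration:

```python
import csv
from collections import Counter
from decimal import Decimal
from io import StringIO


def monthly_summary(csv_file, month: str, top_n: int = 10) -> dict:
    """Total spend and top vendors for one month ('YYYY-MM').

    Currencies are not converted, so mixed-currency months need extra handling.
    """
    by_vendor: Counter = Counter()
    for row in csv.DictReader(csv_file):
        if row["Date"].startswith(month):
            # Decimal avoids float rounding errors in money sums.
            by_vendor[row["Vendor"]] += Decimal(row["Amount"])
    return {
        "month": month,
        "total": sum(by_vendor.values(), Decimal("0")),
        "top_vendors": by_vendor.most_common(top_n),
    }


# Illustrative rows only; not real data.
sample = StringIO(
    "Date,Vendor,Amount,Currency,Order#,Account\n"
    "2026-04-01,Amazon,19.99,USD,A1,example\n"
    "2026-04-03,Amazon,5.01,USD,A2,example\n"
    "2026-04-02,Newegg,100.00,USD,N1,example\n"
    "2026-03-30,Amazon,50.00,USD,A0,example\n"
)
summary = monthly_summary(sample, "2026-04")
```

With the real file, the same function would take `open("data/expenses.csv", newline="")` and the `?month=` query value.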

Subscription Audit (static, monthly):

  • Latest audit results from subscription-auditor
  • Active, dormant, marketing sender counts

FastAPI Backend Endpoints

GET  /api/health                    → backend health check

# Dashboard
GET  /api/stats/overview            → container count, host health, GPU, email count, alerts
GET  /api/activity                  → SSE stream of recent events
GET  /api/jellyfin/status           → now playing + library counts
GET  /api/ollama/status             → model, VRAM, call count

# Infrastructure
GET  /api/containers                → all containers across endpoints (?endpoint=atlantis)
GET  /api/containers/{id}/logs      → container logs (?endpoint=atlantis&tail=100)
POST /api/containers/{id}/restart   → restart container
GET  /api/olares/pods               → k3s pod list (?namespace=)
GET  /api/olares/gpu                → nvidia-smi output
GET  /api/headscale/nodes           → headscale node list

# Media
GET  /api/jellyfin/sessions         → active playback sessions
GET  /api/sonarr/queue              → download queue
GET  /api/radarr/queue              → download queue
GET  /api/sabnzbd/queue             → active downloads

# Automations
GET  /api/automations/email         → organizer stats from processed.db files
GET  /api/automations/restarts      → stack-restart history from DB
GET  /api/automations/backup        → backup log parse
GET  /api/automations/drift         → config drift status
GET  /api/automations/disk          → disk predictions

# Expenses
GET  /api/expenses                  → expenses.csv data (?month=2026-04)
GET  /api/expenses/summary          → monthly totals, top vendors
GET  /api/subscriptions             → latest subscription audit

SSE Activity Stream

The /api/activity endpoint uses Server-Sent Events:

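import json
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

@app.get("/api/activity")
async def activity_stream():
    async def event_generator():
        # Placeholder body: tail all automation log files and parse new
        # lines into structured events, then yield each as one SSE frame.
        event = {"type": "email_classified", "message": "...", "time": "..."}
        yield f"data: {json.dumps(event)}\n\n"  # blank line ends the frame
    return StreamingResponse(event_generator(), media_type="text/event-stream")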

Event types: container_health, backup, email_classified, receipt_extracted, config_drift, stack_restart, pr_review.
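
The "tail the log files" step of the stream could look like the sketch below, which follows one file and encodes events as SSE frames. The names `follow` and `sse_frame` and the one-second poll interval are assumptions; the code does not handle log rotation or truncation, and it does small blocking reads inside the event loop, which is usually acceptable for short log lines.

```python
import asyncio
import json
import os
from typing import AsyncIterator


def sse_frame(event: dict) -> str:
    """Encode one event as an SSE frame: a 'data:' line plus a blank line."""
    return f"data: {json.dumps(event)}\n\n"


async def follow(path: str, poll_seconds: float = 1.0) -> AsyncIterator[str]:
    """Yield lines appended to `path` after this call starts, like `tail -f`."""
    with open(path, "rb") as fh:
        fh.seek(0, os.SEEK_END)  # start at end of file: only new lines are streamed
        while True:
            line = fh.readline()
            if line.endswith(b"\n"):
                yield line.decode(errors="replace").rstrip("\n")
            else:
                # No complete line yet: rewind past any partial write and wait.
                fh.seek(-len(line), os.SEEK_CUR)
                await asyncio.sleep(poll_seconds)
```

An event generator would iterate `async for line in follow(...)`, turn each line into a dict with one of the event types above, and yield `sse_frame(event)`.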

Docker Compose

# dashboard/docker-compose.yml
services:
  dashboard-api:
    build: ./api
    ports:
      - "8888:8888"
    volumes:
      - ../../scripts:/app/scripts:ro        # access lib/ and SQLite DBs
      - ../../data:/app/data:ro              # expenses.csv
      - /tmp:/app/logs:ro                    # automation log files
    environment:
      - PORTAINER_URL=http://100.83.230.112:10000
      - PORTAINER_TOKEN=${PORTAINER_TOKEN}
      - OLLAMA_URL=http://192.168.0.145:31434
    restart: unless-stopped

  dashboard-ui:
    build: ./ui
    ports:
      - "3000:3000"
    environment:
      - API_URL=http://dashboard-api:8888
    depends_on:
      - dashboard-api
    restart: unless-stopped

File Structure

dashboard/
  docker-compose.yml
  api/
    Dockerfile
    requirements.txt        # fastapi, uvicorn, httpx
    main.py                 # FastAPI app
    routers/
      overview.py           # /api/stats, /api/activity
      containers.py         # /api/containers/*
      media.py              # /api/jellyfin/*, /api/sonarr/*, etc.
      automations.py        # /api/automations/*
      expenses.py           # /api/expenses/*
      olares.py             # /api/olares/*
  ui/
    Dockerfile
    package.json
    next.config.js
    tailwind.config.ts
    app/
      layout.tsx            # root layout with top nav
      page.tsx              # Dashboard tab (default)
      infrastructure/
        page.tsx
      media/
        page.tsx
      automations/
        page.tsx
      expenses/
        page.tsx
    components/
      nav.tsx               # top navigation bar
      stat-card.tsx         # quick stats cards
      activity-feed.tsx     # SSE-powered activity feed
      container-table.tsx   # sortable container list
      host-card.tsx         # host status card
      expense-table.tsx     # expense data table
      jellyfin-card.tsx     # now playing + library stats
      ollama-card.tsx       # LLM status card
    lib/
      api.ts                # fetch wrapper for backend API
      use-sse.ts            # SSE hook for activity feed

Polling Intervals

| Data              | Interval        | Rationale                            |
|-------------------|-----------------|--------------------------------------|
| Container status  | 30s             | Detect issues quickly                |
| Jellyfin sessions | 15s             | Now playing should feel live         |
| GPU / Ollama      | 60s             | Changes slowly                       |
| Email stats       | 120s            | Organizer runs every 30min           |
| Activity feed     | SSE (real-time) | Should feel instant                  |
| Expenses          | 300s            | Changes once/day at most             |
| Headscale nodes   | 120s            | Rarely changes                       |
| Disk predictions  | 3600s           | Weekly report, hourly check is plenty |

Design Tokens (Dark Theme)

Based on the approved mockup:

Background:     #0a0a1a (page), #0f172a (cards), #1e293b (borders)
Text:           #f1f5f9 (primary), #94a3b8 (secondary), #475569 (muted)
Accent:         #3b82f6 (blue, primary action)
Success:        #22c55e (green, healthy)
Warning:        #f59e0b (amber)
Error:          #ef4444 (red)
Purple:         #8b5cf6 (Ollama/AI indicators)

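Except for the custom page background (#0a0a1a), these are stock values from Tailwind's slate/blue/green/amber/red/violet palettes (for example, #0f172a is slate-900 and #3b82f6 is blue-500), so shadcn/ui theming is straightforward.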

Verification

  1. docker compose up should start both containers
  2. http://localhost:3000 loads the dashboard
  3. All 5 tabs render with real data from the homelab
  4. Activity feed updates in real-time when an automation runs
  5. Container restart button works
  6. Expenses table shows data from expenses.csv
  7. Mobile-responsive (test at 375px width)