Homelab Monitoring & Alerting Setup
Last updated: 2026-03-21
Overview
| Service | Host | Port | URL | Purpose |
|---|---|---|---|---|
| Grafana | Homelab VM | 3300 | https://gf.vish.gg | Dashboards & visualization |
| Prometheus | Homelab VM | 9090 | http://192.168.0.210:9090 | Metrics collection & storage |
| Alertmanager | Homelab VM | 9093 | http://192.168.0.210:9093 | Alert routing & dedup |
| ntfy | Homelab VM | 8081 | https://ntfy.vish.gg | Push notifications |
| Uptime Kuma | RPi 5 | 3001 | http://192.168.0.66:3001 or https://kuma.vish.gg | Uptime monitoring (97 monitors) |
| DIUN | Atlantis | n/a | ntfy topic diun | Docker image update detection |
| Scrutiny | Multiple | 8090 | http://192.168.0.210:8090 | SMART disk health |
All monitoring services on homelab-vm are deployed as monitoring-stack via Portainer GitOps from hosts/vms/homelab-vm/monitoring.yaml.
Prometheus Targets (14 active)
| Job | Target | Type | Status |
|---|---|---|---|
| atlantis-node | atlantis | node_exporter | Up |
| atlantis-snmp | atlantis | SNMP exporter | Up |
| calypso-node | calypso | node_exporter | Up |
| calypso-snmp | calypso | SNMP exporter | Up |
| concord-nuc-node | concord-nuc | node_exporter | Up |
| homelab-node | homelab-vm | node_exporter | Up |
| node_exporter | homelab-vm | node_exporter (self) | Up |
| prometheus | localhost:9090 | self-scrape | Up |
| proxmox-node | proxmox | node_exporter | Up |
| raspberry-pis | pi-5 | node_exporter | Up |
| seattle-node | seattle | node_exporter | Up |
| setillo-node | setillo | node_exporter | Up |
| setillo-snmp | setillo | SNMP exporter | Up |
| truenas-node | guava | node_exporter | Up |
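The Status column above is a snapshot. As a minimal sketch for re-checking it, the helper below tallies the health field from a Prometheus /api/v1/targets response. The function name summarize_target_health is illustrative, and the text matching assumes the compact JSON that Prometheus emits; jq would be a sturdier choice where it is installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: tally target health from a Prometheus
# /api/v1/targets response read on stdin.
# Assumes compact JSON ("health":"up" with no spaces); this is crude
# text matching, not real JSON parsing.
summarize_target_health() {
  grep -o '"health":"[a-z]*"' \
    | cut -d'"' -f4 \
    | sort \
    | uniq -c \
    | awk '{ print $2 "=" $1 }'
}

# Usage (run from a machine that can reach the homelab VM):
#   curl -s http://192.168.0.210:9090/api/v1/targets | summarize_target_health
```

Each output line has the form state=count, so any line other than up=... points at a target worth investigating.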
Grafana
- URL: https://gf.vish.gg (Authentik SSO) or http://192.168.0.210:3300
- Login: Authentik SSO (primary), local admin account (fallback)
- Dashboards: Infrastructure Overview, Node Details, Synology NAS, Node Exporter Full
- Stack: hosts/vms/homelab-vm/monitoring.yaml
ntfy Push Notifications
- Public URL: https://ntfy.vish.gg
- Local URL: http://192.168.0.210:8081
- Primary topic: homelab-alerts (subscribed by mobile app)
- Other topics: diun (image updates), homelab-deploys (CI deployments)
Send test notification
curl -X POST "http://localhost:8081/homelab-alerts" \
-H "Title: Test Alert" \
-H "Priority: 3" \
-d "Test notification from homelab"
Mobile app setup
- Install the ntfy app (Android/iOS)
- Add server: https://ntfy.vish.gg
- Subscribe to topic: homelab-alerts
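To confirm from a terminal that a message actually reached the topic, ntfy can return recently cached messages as JSON. A minimal sketch, assuming the topic is readable without authentication (an Authorization header would be needed if access control is enabled); the helper name ntfy_poll_url is illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the ntfy URL that returns cached messages
# for a topic and then closes the connection (poll=1).
# $1 = base URL, $2 = topic, $3 = how far back to look (default 10m)
ntfy_poll_url() {
  local base="${1%/}" topic="$2" since="${3:-10m}"
  printf '%s/%s/json?poll=1&since=%s\n' "$base" "$topic" "$since"
}

# Usage:
#   curl -s "$(ntfy_poll_url https://ntfy.vish.gg homelab-alerts)"
```

Each returned line is one JSON message, so the test notification sent earlier should appear there if the server cached it.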
Uptime Kuma
- URL: http://192.168.0.66:3001 or https://kuma.vish.gg (via Authentik forward auth)
- Host: RPi 5 (pi-5)
- Monitors: 97 total (27 HTTP + 70 port checks)
- Status page: https://kuma.vish.gg/status/homelab
Monitor types
- HTTP monitors: Check public URLs (https://*.vish.gg); works for all services behind NPM
- Port monitors: Check Tailscale IPs (100.x.x.x) on service ports as a direct connectivity check
Known limitations
- Port monitors use Tailscale IPs because pi-5 can't resolve .tail.vish.gg (AdGuard is not configured as its DNS server)
- Port monitors only check that a TCP connection succeeds, so Kuma reports a service as "up" even when it would answer HTTP requests with 401/302 (auth required)
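Because a port monitor only proves that something is listening, it can help to spot-check what a service actually answers over HTTP. The sketch below classifies a status code; the function names and category labels are illustrative and not part of Uptime Kuma.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify an HTTP status code so that "port is
# open" can be told apart from "service answered normally".
classify_http_status() {
  case "$1" in
    2??)                 echo "ok" ;;
    301|302|303|307|308) echo "redirect" ;;
    401|403)             echo "auth-required" ;;
    000)                 echo "no-response" ;;  # curl prints 000 when no HTTP reply arrived
    *)                   echo "error" ;;
  esac
}

# Fetch only the status code for a URL and print URL, code, and category.
probe_url() {
  local code
  code="$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$1")"
  echo "$1 $code $(classify_http_status "$code")"
}

# Usage (address and port are placeholders):
#   probe_url http://100.x.x.x:3000
```

A service that shows as up in Kuma but comes back as auth-required or redirect here is reachable, though the port check says nothing about whether the application behind the login is healthy.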
Alertmanager
- URL: http://192.168.0.210:9093
- Routes to: ntfy via webhook
- Active alerts: 0 (normal)
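One way to exercise the route end to end is to post a synthetic alert to Alertmanager's v2 API and watch for the push on the phone. A minimal sketch: the alert name, severity label, and helper names are placeholders, and whether the alert reaches ntfy depends on the routing rules in the deployed Alertmanager config, which this page does not show.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a one-alert payload for POST /api/v2/alerts.
# $1 = alertname (default TestAlert), $2 = severity label (default info)
build_test_alert() {
  local name="${1:-TestAlert}" severity="${2:-info}"
  printf '[{"labels":{"alertname":"%s","severity":"%s"},"annotations":{"summary":"Manual test alert"}}]\n' \
    "$name" "$severity"
}

# Post the payload; ALERTMANAGER_URL can override the address from this page.
send_test_alert() {
  build_test_alert "$@" | curl -fsS -X POST \
    -H "Content-Type: application/json" \
    -d @- \
    "${ALERTMANAGER_URL:-http://192.168.0.210:9093}/api/v2/alerts"
}

# Usage:
#   send_test_alert TestAlert warning
```

Since the payload carries no end time, Alertmanager should treat the alert as resolved once its resolve timeout passes, so the active alert count is expected to drop back to 0 without further action.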
Watchtower Notifications
Watchtower runs on 3 hosts with ntfy integration:
# In common/watchtower-full.yaml
WATCHTOWER_NOTIFICATIONS=shoutrrr
WATCHTOWER_NOTIFICATION_URL=ntfy://192.168.0.210:8081/homelab-alerts?scheme=http
Manual update trigger:
curl -X POST http://192.168.0.200:8083/v1/update \
-H "Authorization: Bearer watchtower-metrics-token"
DIUN (Docker Image Update Notifier)
- Host: Atlantis
- Schedule: Mondays 09:00 UTC
- Notifications: ntfy topic diun
- What it does: Scans all running container images for new upstream digests
Scrutiny (SMART Monitoring)
Scrutiny collectors run on hosts with physical drives:
- Atlantis (8x HDD + 4x NVMe)
- Calypso (2x HDD + 2x NVMe)
- RPi 5 (NVMe)
Hub dashboard: http://192.168.0.210:8090 or https://scrutiny.vish.gg
Related Documentation
- Image Update Guide: Renovate, DIUN, Watchtower
- Ansible Playbook Guide: health_check.yml, service_status.yml
- Backup Strategy: backup monitoring
- Offline & Remote Access: accessing monitoring when internet is down