Scrutiny — SMART Disk Health Monitoring
Scrutiny runs SMART health checks on physical drives and presents results in a web UI with historical trending and alerting.
Architecture
                 ┌─────────────────────────────────┐
                 │ homelab-vm (100.67.40.126)      │
                 │   scrutiny-web       :8090      │
                 │   scrutiny-influxdb  (internal) │
                 └──────────────┬──────────────────┘
                                │ collector API
         ┌──────────────────────┼──────────────────────┐
         │                      │                      │
atlantis-collector      calypso-collector      setillo-collector
concord-nuc-collector                           pi-5-collector
| Role | Host | Notes |
|---|---|---|
| Hub (web + InfluxDB) | homelab-vm | Port 8090, proxied at scrutiny.vish.gg |
| Collector | atlantis | 8-bay NAS, /dev/sda–sdh |
| Collector | calypso | 2-bay NAS, /dev/sda–sdb |
| Collector | setillo | 2-bay NAS, /dev/sda–sdb |
| Collector | concord-nuc | Intel NUC, /dev/sda (NVMe optional) |
| Collector | pi-5 | /dev/nvme0n1 (M.2 HAT) |
| Skipped | homelab-vm, seattle, matrix-ubuntu | VMs — no physical disks |
| Skipped | guava (TrueNAS) | Native TrueNAS disk monitoring |
Files
| File | Purpose |
|---|---|
| hosts/vms/homelab-vm/scrutiny.yaml | Hub (web + InfluxDB) |
| hosts/synology/atlantis/scrutiny-collector.yaml | Atlantis collector |
| hosts/synology/calypso/scrutiny-collector.yaml | Calypso collector |
| hosts/synology/setillo/scrutiny-collector.yaml | Setillo collector |
| hosts/physical/concord-nuc/scrutiny-collector.yaml | NUC collector |
| hosts/edge/rpi5-vish/scrutiny-collector.yaml | Pi-5 collector |
Deployment
Hub (homelab-vm)
Deploy via Portainer GitOps on endpoint 443399:
- Portainer → Stacks → Add stack → Git repository
- URL: https://git.vish.gg/Vish/homelab
- Compose path: hosts/vms/homelab-vm/scrutiny.yaml
Or manually:
ssh homelab
docker compose -f /path/to/scrutiny.yaml up -d
Verify:
curl http://100.67.40.126:8090/api/health
# {"success":true}
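For scripted checks, the health response can be tested rather than read by eye. The following is a minimal sketch: `health_ok` is a hypothetical helper name, and it assumes the response body contains `"success":true` as shown above.

```shell
# Return 0 only if the JSON on stdin reports "success": true.
# Uses grep rather than jq so it also works on hosts without jq installed.
health_ok() {
  grep -Eq '"success"[[:space:]]*:[[:space:]]*true'
}

# Usage (hub address taken from this document):
#   curl -fsS http://100.67.40.126:8090/api/health | health_ok && echo "hub healthy"
```

If curl fails outright, the helper sees empty input and returns non-zero, so a single exit status covers both "unreachable" and "unhealthy".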
Collectors — Synology (Atlantis, Calypso, Setillo)
Synology requires privileged: true (DSM kernel lacks nf_conntrack_netlink).
Deploy via Portainer stacks on each Synology host, or manually:
ssh atlantis
sudo /var/packages/REDACTED_APP_PASSWORD/target/usr/bin/docker compose \
-f /path/to/scrutiny-collector.yaml up -d
Important — verify drive paths first:
# List block devices on the host
lsblk -o NAME,SIZE,TYPE,MODEL
# Or for Synology:
sudo fdisk -l | grep '^Disk /dev'
Update the devices: list in the collector compose to match actual drives.
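The devices: list can be generated from the detected drives instead of typed by hand. This is only a sketch: `devices_yaml` is a hypothetical helper, and the `host:container` mapping format is an assumption about how the collector compose files are written, so compare its output with the existing file before pasting.

```shell
# Read bare device names (sda, nvme0n1, ...) on stdin, one per line,
# and print them as a compose "devices:" list that maps each host path
# to the same path inside the container.
devices_yaml() {
  echo "devices:"
  while IFS= read -r name; do
    [ -n "$name" ] || continue    # skip blank lines
    printf '  - "/dev/%s:/dev/%s"\n' "$name" "$name"
  done
}

# Usage on a Linux host: whole disks only (-d), no header row (-n).
#   lsblk -dn -o NAME,TYPE | awk '$2 == "disk" { print $1 }' | devices_yaml
```

On Synology, where lsblk may not be available, the device names would have to come from the fdisk output instead.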
Collectors — Linux (concord-nuc, pi-5)
Deploy via Portainer edge agent or manually:
ssh vish-concord-nuc
docker compose -f scrutiny-collector.yaml up -d
Verify a collector is shipping data:
docker logs scrutiny-collector --tail 20
# Should show: "Sending device summary to Scrutiny API"
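The same log check can be turned into a pass/fail test. A minimal sketch, assuming the success message is exactly the string quoted above; `collector_is_shipping` is a hypothetical helper name.

```shell
# Return 0 if the log text on stdin contains the upload message
# quoted in this document.
collector_is_shipping() {
  grep -Fq "Sending device summary to Scrutiny API"
}

# Usage: docker logs writes to both stdout and stderr, so merge them first.
#   docker logs --tail 20 scrutiny-collector 2>&1 | collector_is_shipping \
#     && echo "collector OK" || echo "no upload seen yet"
```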
DNS / Subdomain Setup
scrutiny.vish.gg is already added to the DDNS updater on Atlantis (dynamicdnsupdater.yaml).
Still needed (manual steps):
- Cloudflare DNS: add A record scrutiny.vish.gg → current public IP (proxied)
  - Or let the DDNS container create it automatically on next run
- NPM proxy host: scrutiny.vish.gg → http://100.67.40.126:8090
Validation
# Hub health
curl http://100.67.40.126:8090/api/health
# List all tracked devices after collectors run
curl -s http://100.67.40.126:8090/api/summary | jq -r '.data.summary[].device.device_name'
# Check collector logs
docker logs scrutiny-collector
# Open UI
open https://scrutiny.vish.gg
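Once every collector has reported, the number of tracked devices should match the drive counts in the Architecture table. A sketch of that comparison follows; `check_device_count` is a hypothetical helper, and the expected total of 14 (8 + 2 + 2 + 1 + 1) is an assumption derived from that table, not counting the optional NVMe drive on the NUC.

```shell
# Compare the number of non-empty lines on stdin with an expected count.
# Prints a one-line verdict and returns 0 on a match, 1 otherwise.
check_device_count() {
  local expected="$1" actual
  # grep -c prints 0 but exits 1 when nothing matches, hence "|| true".
  actual=$(grep -c . || true)
  if [ "$actual" -eq "$expected" ]; then
    echo "OK: $actual devices"
  else
    echo "MISMATCH: expected $expected, got $actual"
    return 1
  fi
}

# Usage: pipe the device-name listing from the commands above into it.
#   <device listing command> | check_device_count 14
```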
Collector Schedule
By default, collectors run a SMART scan on startup and then hourly. The schedule is controlled inside the container — no cron needed.
Troubleshooting
"permission denied" on /dev/sdX
→ Use privileged: true on Synology. On Linux, use cap_add: [SYS_RAWIO, SYS_ADMIN]. SYS_RAWIO lets smartctl query SATA drives; SYS_ADMIN is additionally required for NVMe drives.
Device not found in collector
→ Run lsblk on the host, update devices: list in the compose file, recreate the container.
Hub shows no devices
→ Check collector logs for API errors. Verify COLLECTOR_API_ENDPOINT is reachable from the collector host via Tailscale (curl http://100.67.40.126:8090/api/health).
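To run that reachability check from every collector host in one pass, something like the following could be used. It is a sketch under assumptions: `probe_from_hosts` is a hypothetical helper, it relies on SSH access and curl being present on each host, and only some of the SSH aliases appear in this document (substitute your own for the rest).

```shell
# Run the hub health check from each named host and report the result.
# RUN is the command used to reach a host. It defaults to ssh and can be
# overridden for testing (RUN=true simulates success, RUN=false failure).
HUB_URL="http://100.67.40.126:8090/api/health"

probe_from_hosts() {
  local failed=0 host
  for host in "$@"; do
    if ${RUN:-ssh} "$host" curl -fsS --max-time 5 "$HUB_URL" >/dev/null 2>&1; then
      echo "$host: hub reachable"
    else
      echo "$host: hub NOT reachable"
      failed=1
    fi
  done
  return "$failed"
}

# Usage (aliases are examples; use the ones from your own SSH config):
#   probe_from_hosts atlantis vish-concord-nuc
```

A host reported as NOT reachable either has no route to the hub or could not be reached over SSH itself, so check both before changing COLLECTOR_API_ENDPOINT.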
InfluxDB fails to start
→ The influxdb container initialises on first run; scrutiny-web depends on it but may start before it's ready. Wait ~30s and check docker logs scrutiny-influxdb.
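Instead of a fixed wait, the check can be retried until the hub answers. A minimal sketch of a generic retry loop; `retry` is a hypothetical helper, and the 10 attempts at 3 second intervals in the usage line are just one way to approximate the ~30s mentioned above.

```shell
# Retry a command until it succeeds or the attempts run out.
#   retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 if every attempt fails.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while ! "$@"; do
    if [ "$i" -ge "$attempts" ]; then
      return 1
    fi
    i=$((i + 1))
    sleep "$delay"
  done
}

# Usage: poll the hub for roughly 30 seconds after starting the stack.
#   retry 10 3 curl -fsS http://100.67.40.126:8090/api/health
```

If the retries are exhausted, docker logs scrutiny-influxdb is still the place to look for the underlying error.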