# Scrutiny — SMART Disk Health Monitoring Scrutiny runs SMART health checks on physical drives and presents results in a web UI with historical trending and alerting. ## Architecture ``` ┌─────────────────────────────────┐ │ homelab-vm (100.67.40.126) │ │ scrutiny-web :8090 │ │ scrutiny-influxdb (internal) │ └──────────────┬──────────────────┘ │ collector API ┌──────────────────────┼──────────────────────┐ │ │ │ atlantis-collector calypso-collector setillo-collector concord-nuc-collector pi-5-collector ``` | Role | Host | Notes | |------|------|-------| | Hub (web + InfluxDB) | homelab-vm | Port 8090, proxied at scrutiny.vish.gg | | Collector | atlantis | 8-bay NAS, /dev/sda–sdh | | Collector | calypso | 2-bay NAS, /dev/sda–sdb | | Collector | setillo | 2-bay NAS, /dev/sda–sdb | | Collector | concord-nuc | Intel NUC, /dev/sda (NVMe optional) | | Collector | pi-5 | /dev/nvme0n1 (M.2 HAT) | | Skipped | homelab-vm, seattle, matrix-ubuntu | VMs — no physical disks | | Skipped | guava (TrueNAS) | Native TrueNAS disk monitoring | --- ## Files | File | Purpose | |------|---------| | `hosts/vms/homelab-vm/scrutiny.yaml` | Hub (web + InfluxDB) | | `hosts/synology/atlantis/scrutiny-collector.yaml` | Atlantis collector | | `hosts/synology/calypso/scrutiny-collector.yaml` | Calypso collector | | `hosts/synology/setillo/scrutiny-collector.yaml` | Setillo collector | | `hosts/physical/concord-nuc/scrutiny-collector.yaml` | NUC collector | | `hosts/edge/rpi5-vish/scrutiny-collector.yaml` | Pi-5 collector | --- ## Deployment ### Hub (homelab-vm) Deploy via Portainer GitOps on endpoint 443399: 1. Portainer → Stacks → Add stack → Git repository 2. URL: `https://git.vish.gg/Vish/homelab` 3. Compose path: `hosts/vms/homelab-vm/scrutiny.yaml` Or manually: ```bash ssh homelab docker compose -f /path/to/scrutiny.yaml up -d ``` Verify: ```bash curl http://100.67.40.126:8090/api/health # {"success":true} ``` ### Collectors — Synology (Atlantis, Calypso, Setillo) Synology requires `privileged: true` (DSM kernel lacks `nf_conntrack_netlink`). Deploy via Portainer stacks on each Synology host, or manually: ```bash ssh atlantis sudo /var/packages/REDACTED_APP_PASSWORD/target/usr/bin/docker compose \ -f /path/to/scrutiny-collector.yaml up -d ``` **Important — verify drive paths first:** ```bash # List block devices on the host lsblk -o NAME,SIZE,TYPE,MODEL # Or for Synology: sudo fdisk -l | grep '^Disk /dev' ``` Update the `devices:` list in the collector compose to match actual drives. ### Collectors — Linux (concord-nuc, pi-5) Deploy via Portainer edge agent or manually: ```bash ssh vish-concord-nuc docker compose -f scrutiny-collector.yaml up -d ``` Verify a collector is shipping data: ```bash docker logs scrutiny-collector --tail 20 # Should show: "Sending device summary to Scrutiny API" ``` --- ## DNS / Subdomain Setup `scrutiny.vish.gg` is already added to the DDNS updater on Atlantis (`dynamicdnsupdater.yaml`). Still needed (manual steps): 1. **Cloudflare DNS**: add A record `scrutiny.vish.gg → current public IP` (proxied) - Or let the DDNS container create it automatically on next run 2. **NPM proxy host**: `scrutiny.vish.gg → http://100.67.40.126:8090` --- ## Validation ```bash # Hub health curl http://100.67.40.126:8090/api/health # List all tracked devices after collectors run curl http://100.67.40.126:8090/api/devices | jq '.data[].device_name' # Check collector logs docker logs scrutiny-collector # Open UI open https://scrutiny.vish.gg ``` --- ## Collector Schedule By default, collectors run a SMART scan on startup and then hourly. The schedule is controlled inside the container — no cron needed. --- ## Troubleshooting **"permission denied" on /dev/sdX** → Use `privileged: true` on Synology. On Linux, use `cap_add: [SYS_RAWIO, SYS_ADMIN]`. **Device not found in collector** → Run `lsblk` on the host, update `devices:` list in the compose file, recreate the container. **Hub shows no devices** → Check collector logs for API errors. Verify `COLLECTOR_API_ENDPOINT` is reachable from the collector host via Tailscale (`curl http://100.67.40.126:8090/api/health`). **InfluxDB fails to start** → The influxdb container initialises on first run; `scrutiny-web` depends on it but may start before it's ready. Wait ~30s and check `docker logs scrutiny-influxdb`.