Sanitized mirror from private repository - 2026-03-26 10:25:55 UTC
# Monitoring Stack
The production monitoring stack runs on **homelab-vm** as a single Portainer GitOps stack.
## Deployment
| Property | Value |
|----------|-------|
| **Stack name** | `monitoring-stack` |
| **Portainer stack ID** | 687 (endpoint 443399) |
| **Compose file** | `hosts/vms/homelab-vm/monitoring.yaml` |
| **Deployment method** | GitOps (Portainer pulls from `main` branch) |
## Services
| Service | Image | Port | Purpose |
|---------|-------|------|---------|
| `grafana` | `grafana/grafana-oss:12.4.0` | 3300 | Dashboards & visualization |
| `prometheus` | `prom/prometheus:latest` | 9090 | Metrics collection & storage |
| `node_exporter` | `prom/node-exporter:latest` | 9100 (host) | homelab-vm host metrics |
| `snmp_exporter` | `prom/snmp-exporter:latest` | 9116 | Synology NAS SNMP metrics |
## Access
| Service | URL |
|---------|-----|
| Grafana (external) | `https://gf.vish.gg` |
| Grafana (internal) | `http://192.168.0.210:3300` |
| Prometheus | `http://192.168.0.210:9090` |
| SNMP Exporter | `http://192.168.0.210:9116` |
## Grafana Dashboards
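All configs are embedded as Docker `configs` in `monitoring.yaml`, so no bind mounts or separate config files are needed. Two dashboards are the exception: they exist only in the Grafana database (see the Source column).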
| Dashboard | UID | Source |
|-----------|-----|--------|
| Node Details - Full Metrics *(default home)* | `node-details-v2` | DB (imported) |
| Infrastructure Overview - All Devices | `infrastructure-overview-v2` | Provisioned in monitoring.yaml |
| Synology NAS Monitoring | `synology-dashboard-v2` | Provisioned in monitoring.yaml |
| Node Exporter Full | `rYdddlPWk` | DB (imported from grafana.com) |
The home dashboard is set through the Grafana org preferences API. The setting is stored in the Grafana database, so it persists in the `grafana-data` volume.
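
As a rough illustration of the org preferences call, the sketch below builds a `PUT /api/org/preferences` request that points the home dashboard at `node-details-v2`. The bearer token (for example from a Grafana service account), the function names, and the choice of `urllib` are assumptions for illustration; this README does not say how the call was actually made. Note that a `PUT` may reset preference keys omitted from the body (theme, timezone) to their defaults, so check the current values first.

```python
import json
from urllib.request import Request, urlopen

GRAFANA_URL = "http://192.168.0.210:3300"  # internal URL from the Access table
HOME_UID = "node-details-v2"               # default home dashboard UID from the table


def build_home_dashboard_request(base_url: str, token: str, uid: str) -> Request:
    """Build (but do not send) the org preferences update request.

    `token` is a hypothetical Grafana API/service-account token.
    """
    body = json.dumps({"homeDashboardUID": uid}).encode("utf-8")
    return Request(
        url=f"{base_url}/api/org/preferences",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def set_home_dashboard(token: str) -> int:
    """Send the request to the live Grafana instance and return the HTTP status."""
    req = build_home_dashboard_request(GRAFANA_URL, token, HOME_UID)
    with urlopen(req, timeout=5) as resp:
        return resp.status
```

Keeping request construction separate from sending makes it easy to inspect the payload before touching the running instance.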
## Prometheus Scrape Targets
| Job | Target | Instance label |
|-----|--------|---------------|
| `node_exporter` | `host.docker.internal:9100` | homelab-vm |
| `homelab-node` | `100.67.40.126:9100` | homelab-vm |
| `raspberry-pis` | `100.77.151.40:9100` | pi-5 |
| `setillo-node` | `100.125.0.20:9100` | setillo |
| `calypso-node` | `100.103.48.78:9100` | calypso |
| `atlantis-node` | `100.83.230.112:9100` | atlantis |
| `concord-nuc-node` | `100.72.55.21:9100` | concord-nuc |
| `truenas-node` | `100.75.252.64:9100` | guava |
| `seattle-node` | `100.82.197.124:9100` | seattle |
| `proxmox-node` | `100.87.12.28:9100` | proxmox |
| `setillo-snmp` | `100.125.0.20:9116` | setillo (SNMP) |
| `calypso-snmp` | `100.103.48.78:9116` | calypso (SNMP) |
| `atlantis-snmp` | `100.83.230.112:9116` | atlantis (SNMP) |
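
To confirm that the jobs in the table are actually being scraped, one option is to query the Prometheus HTTP API endpoint `/api/v1/targets` and list every target whose health is not `up`. The following is a minimal sketch; the function names are hypothetical, and it assumes the `job` and `instance` labels appear on each active target as shown in the table.

```python
import json
from urllib.request import urlopen

PROMETHEUS_URL = "http://192.168.0.210:9090"  # from the Access table


def fetch_targets(base_url: str = PROMETHEUS_URL, timeout: float = 5.0) -> dict:
    """Fetch the active scrape targets from the Prometheus HTTP API."""
    with urlopen(f"{base_url}/api/v1/targets?state=active", timeout=timeout) as resp:
        return json.load(resp)


def unhealthy_targets(payload: dict) -> list[tuple[str, str, str]]:
    """Return sorted (job, instance, health) tuples for targets that are not 'up'."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return sorted(
        (
            t.get("labels", {}).get("job", "?"),
            t.get("labels", {}).get("instance", "?"),
            t.get("health", "unknown"),
        )
        for t in targets
        if t.get("health") != "up"
    )
```

Calling `unhealthy_targets(fetch_targets())` from a host that can reach Prometheus returns an empty list when every job is healthy.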
## Notes
- **Grafana 12 `kubernetesDashboards`**: This feature toggle is ON by default in Grafana 12 and causes noisy log spam. It is disabled via `GF_FEATURE_TOGGLES_DISABLE=kubernetesDashboards` in the compose file.
- **Image pinning**: Grafana is pinned to `12.4.0` to prevent unexpected breaking changes from `:latest` pulls. Prometheus and both exporters still track `:latest`.
- **Admin password**: `GF_SECURITY_ADMIN_PASSWORD` only applies on first run (empty DB). After that, use `grafana cli admin reset-admin-password` to change it.
- **DB-only dashboards**: Node Details - Full Metrics (`node-details-v2`) and Node Exporter Full (`rYdddlPWk`) are not in `monitoring.yaml`; they live only in the `grafana-data` volume and must be re-imported if the volume is deleted.
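
Because the two DB-only dashboards are lost if the volume is deleted, it may be worth exporting them periodically. The sketch below is one possible approach, not something this repository ships: it reads each dashboard through Grafana's `GET /api/dashboards/uid/<uid>` endpoint and writes a JSON body that could later be sent to `POST /api/dashboards/db`. The token, function names, and output directory are hypothetical.

```python
import json
from urllib.request import Request, urlopen

GRAFANA_URL = "http://192.168.0.210:3300"       # internal URL from the Access table
DB_ONLY_UIDS = ["node-details-v2", "rYdddlPWk"]  # UIDs from the dashboards table


def fetch_dashboard(base_url: str, token: str, uid: str) -> dict:
    """Fetch one dashboard by UID; `token` is a hypothetical API token."""
    req = Request(
        f"{base_url}/api/dashboards/uid/{uid}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urlopen(req, timeout=10) as resp:
        return json.load(resp)


def to_import_body(api_response: dict) -> dict:
    """Turn a GET response into a body suitable for a later re-import.

    The numeric `id` is instance-specific, so it is cleared; the `uid` is kept
    so links and the home dashboard preference keep working after a restore.
    """
    dashboard = dict(api_response["dashboard"])  # shallow copy, input untouched
    dashboard["id"] = None
    return {"dashboard": dashboard, "overwrite": True}


def backup_all(token: str, out_dir: str = ".") -> None:
    """Write one <uid>.json file per DB-only dashboard into out_dir."""
    for uid in DB_ONLY_UIDS:
        body = to_import_body(fetch_dashboard(GRAFANA_URL, token, uid))
        with open(f"{out_dir}/{uid}.json", "w", encoding="utf-8") as fh:
            json.dump(body, fh, indent=2)
```

The resulting files could be committed alongside `monitoring.yaml` so that a rebuilt volume can be repopulated without visiting grafana.com again.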
## Related Documentation
- `docs/services/individual/grafana.md` — full Grafana service reference
- `docs/admin/monitoring-setup.md` — monitoring stack quick reference
- `docs/admin/monitoring.md` — full monitoring & observability guide
- `hosts/vms/homelab-vm/monitoring.yaml` — compose file (source of truth)
---
**Last Updated**: 2026-03-08