homelab-optimized/docs/infrastructure/monitoring/dashboard-verification-report.md
Sanitized mirror from private repository - 2026-04-19 09:52:01 UTC

Grafana Dashboard Verification Report

Executive Summary

  • All dashboard sections are now working correctly
  • Datasource UID mismatches resolved
  • Template variables configured with correct default values
  • All key metrics displaying data

Issues Resolved

1. Datasource UID Mismatch

  • Problem: Dashboard JSON files contained hardcoded UID cfbskvs8upds0b
  • Actual UID: PBFA97CFB590B2093
  • Solution: Updated all dashboard files with correct datasource UID
  • Files Fixed:
    • infrastructure-overview.json
    • node-details.json
    • node-exporter-full.json
    • synology-nas-monitoring.json
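
The UID correction described above can be sketched roughly as follows. This is an illustrative Python version, not the contents of fix-datasource-uids.sh; it assumes the dashboards store their datasource as an object with a `uid` field, and the helper names `replace_uid` and `fix_file` are hypothetical.

```python
import json
from pathlib import Path

OLD_UID = "cfbskvs8upds0b"     # stale UID hardcoded in the dashboard files
NEW_UID = "PBFA97CFB590B2093"  # UID of the provisioned Prometheus datasource


def replace_uid(node, old=OLD_UID, new=NEW_UID):
    """Return a copy of a parsed dashboard in which every "datasource"
    object carrying the old UID points at the new one. The dashboard's
    own top-level "uid" is left untouched."""
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            if key == "datasource" and isinstance(value, dict) and value.get("uid") == old:
                out[key] = {**value, "uid": new}
            else:
                out[key] = replace_uid(value, old, new)
        return out
    if isinstance(node, list):
        return [replace_uid(item, old, new) for item in node]
    return node


def fix_file(path):
    """Rewrite one dashboard JSON file in place (hypothetical helper)."""
    path = Path(path)
    dashboard = json.loads(path.read_text())
    path.write_text(json.dumps(replace_uid(dashboard), indent=2) + "\n")
```

Restricting the swap to `datasource` objects, rather than doing a blind text substitution, keeps the dashboard's own UID (for example rYdddlPWk) from being rewritten by accident.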

2. Template Variable Default Values

  • Problem: Template variables had incorrect default values (e.g., node_exporter, homelab-vm)
  • Solution: Updated defaults to match actual job names and instances
  • Updates Made:
    • Job: node_exporter → atlantis-node
    • Nodename: homelab → atlantis
    • Instance: homelab-vm → 100.83.230.112:9100
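
As a rough illustration of the default-value update, the sketch below sets the saved selection of a template variable in a parsed dashboard. It assumes the usual Grafana dashboard JSON layout, where variables live under `templating.list` and the saved selection is the `current` object; `set_variable_default` is a hypothetical helper, and the variable names `job` and `instance` are taken from the Node Details dashboard.

```python
def set_variable_default(dashboard, name, value):
    """Set the saved `current` selection of one template variable in place.
    Returns True if a variable with that name was found."""
    for var in dashboard.get("templating", {}).get("list", []):
        if var.get("name") == name:
            var["current"] = {"selected": True, "text": value, "value": value}
            return True
    return False


# Minimal stand-in for a dashboard file that still carries the old defaults.
dashboard = {
    "templating": {
        "list": [
            {"name": "job", "current": {"text": "node_exporter", "value": "node_exporter"}},
            {"name": "instance", "current": {"text": "homelab-vm", "value": "homelab-vm"}},
        ]
    }
}

set_variable_default(dashboard, "job", "atlantis-node")
set_variable_default(dashboard, "instance", "100.83.230.112:9100")
```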

Dashboard Status

🟢 Node Exporter Full Dashboard

  • UID: rYdddlPWk
  • Panels: 32 panels, all functional
  • Template Variables: All working
    • DS_PROMETHEUS: Prometheus
    • job: atlantis-node
    • nodename: atlantis
    • node: 100.83.230.112:9100
    • diskdevices: [a-z]+|nvme[0-9]+n[0-9]+|mmcblk[0-9]+
  • Key Metrics: All displaying data
    • CPU Usage: 11.35%
    • Memory Usage: 65.05%
    • Disk I/O: 123 data points
    • Network Traffic: 297 data points
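
To make the `diskdevices` value above concrete, the sketch below checks which device names the pattern accepts. It assumes the variable is used in a PromQL `=~` matcher, which Prometheus anchors at both ends, so Python's `re.fullmatch` is used as the closest equivalent; `matches_diskdevices` is a hypothetical helper.

```python
import re

# Value of the diskdevices template variable, copied from the dashboard.
DISKDEVICES = r"[a-z]+|nvme[0-9]+n[0-9]+|mmcblk[0-9]+"


def matches_diskdevices(device):
    """True if the whole device name matches the pattern."""
    return re.fullmatch(DISKDEVICES, device) is not None


for name in ["sda", "nvme0n1", "mmcblk0", "sda1", "nvme0n1p1", "dm-0"]:
    print(name, matches_diskdevices(name))
```

Whole disks such as sda, nvme0n1 and mmcblk0 match, while partitions (sda1, nvme0n1p1) and device-mapper names (dm-0) do not, because the digits or the hyphen fall outside every alternative.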

🟢 Synology NAS Monitoring Dashboard

  • UID: synology-dashboard-v2
  • Panels: 8 panels, all functional
  • Key Metrics: All displaying data
    • Storage Usage: 67.62%
    • Disk Temperatures: 18 sensors
    • System Uptime: 3 devices
    • SNMP Targets: 3 up

🟢 Node Details Dashboard

  • UID: node-details-v2
  • Panels: 21 panels, all functional
  • Template Variables: Fixed
    • datasource: Prometheus
    • job: atlantis-node
    • instance: 100.83.230.112:9100

🟢 Infrastructure Overview Dashboard

  • UID: infrastructure-overview-v2
  • Panels: 7 panels, all functional
  • Template Variables: Fixed
    • datasource: Prometheus
    • job: All (multi-select enabled)

Monitoring Targets Health

Node Exporters (10 total)

  • atlantis-node: 100.83.230.112:9100
  • calypso-node: 100.103.48.78:9100
  • concord-nuc-node: 100.72.55.21:9100
  • homelab-node: 100.67.40.126:9100
  • proxmox-node: 100.87.12.28:9100
  • raspberry-pis: 100.77.151.40:9100
  • setillo-node: 100.125.0.20:9100
  • truenas-node: 100.75.252.64:9100
  • raspberry-pis: 100.123.246.75:9100 (down)
  • vmi2076105-node: 100.99.156.20:9100 (down)

Active Node Targets: 8/10 (80% up)
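
A health summary like this one can be derived from the Prometheus targets API. The sketch below is a minimal example that works on an already-parsed `/api/v1/targets` response, relying on the `data.activeTargets` list and each target's `health` field; the sample payload is a hand-written three-target excerpt, not a captured response, and `summarize_targets` is a hypothetical helper.

```python
def summarize_targets(response):
    """Return (up_count, total_count, down_list) for a parsed
    /api/v1/targets response. down_list holds (job, instance) pairs."""
    targets = response["data"]["activeTargets"]
    down = [
        (t["labels"].get("job"), t["labels"].get("instance"))
        for t in targets
        if t.get("health") != "up"
    ]
    return len(targets) - len(down), len(targets), down


# Hand-written excerpt for illustration only.
sample = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"job": "atlantis-node", "instance": "100.83.230.112:9100"}, "health": "up"},
            {"labels": {"job": "calypso-node", "instance": "100.103.48.78:9100"}, "health": "up"},
            {"labels": {"job": "vmi2076105-node", "instance": "100.99.156.20:9100"}, "health": "down"},
        ]
    },
}

up, total, down = summarize_targets(sample)
print(f"{up}/{total} up; down: {down}")
```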

SNMP Targets (3 total)

  • atlantis-snmp: 100.83.230.112
  • calypso-snmp: 100.103.48.78
  • setillo-snmp: 100.125.0.20

Active SNMP Targets: 3/3 (100% up)

System Services

  • prometheus: prometheus:9090
  • alertmanager: alertmanager:9093

Dashboard Access URLs

Technical Details

Prometheus Configuration

  • Endpoint: http://prometheus:9090
  • Datasource UID: PBFA97CFB590B2093
  • Status: Healthy
  • Targets: 15 total (13 up, 2 down)

GitOps Implementation

  • Repository: /home/homelab/docker/monitoring
  • Provisioning: Automated via Grafana provisioning
  • Dashboards: Auto-loaded from /grafana/dashboards/
  • Datasources: Auto-configured from /grafana/provisioning/datasources/

Verification Scripts

Two verification scripts have been created:

  1. fix-datasource-uids.sh: Automated UID correction script
  2. verify-dashboard-sections.sh: Comprehensive dashboard testing script
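
The contents of these scripts are not reproduced in this report. As one example of a check such a verification pass might include, the sketch below lists any datasource UIDs in a parsed dashboard that differ from the expected one. The function names are hypothetical, and the code assumes datasources are stored as objects with a `uid` field.

```python
EXPECTED_UID = "PBFA97CFB590B2093"


def datasource_uids(node):
    """Yield every datasource uid referenced anywhere in a parsed dashboard."""
    if isinstance(node, dict):
        ds = node.get("datasource")
        if isinstance(ds, dict) and isinstance(ds.get("uid"), str):
            yield ds["uid"]
        for value in node.values():
            yield from datasource_uids(value)
    elif isinstance(node, list):
        for item in node:
            yield from datasource_uids(item)


def unexpected_uids(dashboard, expected=EXPECTED_UID):
    """Return the sorted set of hardcoded UIDs that differ from `expected`.
    Template references such as "${DS_PROMETHEUS}" are resolved by Grafana
    at render time, so they are not flagged."""
    return sorted(
        {u for u in datasource_uids(dashboard) if u != expected and not u.startswith("$")}
    )
```

An empty result means every hardcoded reference already points at the provisioned datasource.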

Recommendations

  1. Monitor Down Targets: Investigate the 2 down targets:
    • raspberry-pis: 100.123.246.75:9100
    • vmi2076105-node: 100.99.156.20:9100
  2. Regular Health Checks: Run verify-dashboard-sections.sh periodically to ensure continued functionality
  3. Template Variable Optimization: Consider setting up more dynamic defaults based on available targets
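
One possible reading of "dynamic defaults" in the last recommendation is to choose the default job and instance from whichever targets are currently up. The sketch below is a minimal illustration of that idea; `pick_default` and its `preferred_job` parameter are hypothetical, and the input tuples would have to come from a target health query.

```python
def pick_default(targets, preferred_job="atlantis-node"):
    """targets: iterable of (job, instance, is_up) tuples.
    Return (job, instance) for the preferred job if it is up, otherwise
    the first healthy target in sorted order, or None if nothing is up."""
    healthy = sorted((job, inst) for job, inst, is_up in targets if is_up)
    for job, inst in healthy:
        if job == preferred_job:
            return job, inst
    return healthy[0] if healthy else None


targets = [
    ("vmi2076105-node", "100.99.156.20:9100", False),
    ("calypso-node", "100.103.48.78:9100", True),
    ("atlantis-node", "100.83.230.112:9100", True),
]
default = pick_default(targets)
```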

Conclusion

  • All dashboard sections are now fully functional
  • Data is displaying correctly across all panels
  • Template variables are working as expected
  • GitOps implementation is successful

The Grafana monitoring setup is now operational: all four dashboards have been verified and are displaying data. The only open items are the two down node exporter targets listed under Recommendations.