homelab-optimized/docs/infrastructure/MONITORING_ARCHITECTURE.md
Sanitized mirror from private repository - 2026-04-01 04:44:34 UTC

Homelab Monitoring Architecture

This document explains the different monitoring setups in the homelab and their purposes.

🏗️ Architecture Overview

The homelab has three distinct monitoring deployments serving different purposes:

1. Production GitOps Monitoring (Primary)

  • Location: hosts/vms/homelab-vm/monitoring.yaml
  • Deployment: Portainer GitOps on homelab-vm
  • Purpose: Production monitoring for all homelab infrastructure
  • Access: https://gf.vish.gg (with Authentik SSO)
  • Status: ACTIVE - This is the canonical monitoring stack

Features:

  • Monitors all homelab devices (Synology NAS, nodes, VMs)
  • Authentik OAuth2 SSO integration
  • Embedded dashboard configs in Docker Compose
  • Auto-provisioned datasources and dashboards
  • SNMP monitoring for Synology devices

2. Fixed Development Stack (New)

  • Location: docker/monitoring/
  • Deployment: Standalone Docker Compose
  • Purpose: Development and testing, with the known dashboard issues fixed
  • Access: http://localhost:3300 (admin/admin)
  • Status: 🔧 DEVELOPMENT - For testing and dashboard fixes

Features:

  • All dashboard datasource UIDs fixed
  • Template variables working correctly
  • Instance filters properly configured
  • Verification scripts included
  • Backup/restore functionality

3. Atlantis Legacy Setup (Deprecated)

  • Location: hosts/synology/atlantis/grafana_prometheus/
  • Deployment: Synology Docker on Atlantis
  • Purpose: Legacy monitoring setup
  • Status: 📦 ARCHIVED - Kept for reference

🔄 GitOps Workflow

Production Deployment (homelab-vm)

# GitOps automatically deploys from:
hosts/vms/homelab-vm/monitoring.yaml

# Portainer Stack Details:
# - Stack ID: 476
# - Endpoint: 443399
# - Auto-updates from git repository

Development Testing (docker/monitoring)

# Manual deployment for testing:
cd docker/monitoring
docker-compose up -d

# Verify dashboards:
./verify-dashboard-sections.sh
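
Since `docker-compose up -d` returns before the containers have finished starting, it can help to wait for Grafana before running the verification script. The sketch below polls Grafana's `/api/health` endpoint on the development port. The function names, retry count, and delay are illustrative and not part of the repository.

```shell
# Build the health-check URL for a Grafana base URL (a trailing slash is tolerated).
grafana_health_url() {
  printf '%s/api/health\n' "${1%/}"
}

# Poll the health endpoint until it answers or the attempts run out.
# Arguments: base URL, attempts (default 30), delay in seconds (default 2).
wait_for_grafana() {
  url=$(grafana_health_url "${1:-http://localhost:3300}")
  attempts=${2:-30}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP error responses.
    if curl -fsS --max-time 5 "$url" >/dev/null 2>&1; then
      echo "Grafana is up at $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "Grafana did not respond at $url" >&2
  return 1
}
```

A possible use from `docker/monitoring` would be `wait_for_grafana http://localhost:3300 && ./verify-dashboard-sections.sh`.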

📊 Dashboard Status

| Dashboard | Production (GitOps) | Development (Fixed) | Status |
|---|---|---|---|
| Infrastructure Overview | Working | Fixed | Both functional |
| Synology NAS Monitoring | ⚠️ Needs UID fix | Fixed | Dev has fixes |
| Node Exporter Full | ⚠️ Needs UID fix | Fixed | Dev has fixes |
| Node Details | ⚠️ Needs UID fix | Fixed | Dev has fixes |
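
As an illustration of what a UID fix can involve: current Grafana dashboards reference a datasource through a `"uid"` field, so one way to repoint a dashboard is to rewrite that value. The sketch below is a guess at the kind of change involved, not the method used in this repository. The helper name and the UIDs in the usage note are hypothetical.

```shell
# Print a dashboard JSON file with one "uid" value swapped for another.
# Arguments: file, old UID, new UID. The file itself is not modified.
# Assumes UIDs contain only letters, digits, '-' and '_' (safe inside sed).
# Caveat: a dashboard whose own uid equals the old value is rewritten too.
replace_datasource_uid() {
  sed "s/\(\"uid\"[[:space:]]*:[[:space:]]*\"\)$2\"/\1$3\"/g" "$1"
}
```

For example, `replace_datasource_uid dashboard.json old-uid new-uid > dashboard.fixed.json` writes the result to a new file so the change can be reviewed with `diff` first.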

🔧 Applying Fixes to Production

To apply the dashboard fixes to the production GitOps deployment:

  1. Extract fixed dashboards from docker/monitoring/grafana/dashboards/
  2. Update the embedded configs in hosts/vms/homelab-vm/monitoring.yaml
  3. Test locally using the development stack
  4. Commit changes - GitOps will auto-deploy

Example: Updating Synology Dashboard in GitOps

# 1. Extract the fixed dashboard JSON
cat docker/monitoring/grafana/dashboards/synology-nas-monitoring.json

# 2. Update the embedded config in monitoring.yaml
# Replace the dashboard_synology config content with the fixed JSON

# 3. Commit and push - GitOps handles deployment
git add hosts/vms/homelab-vm/monitoring.yaml
git commit -m "Fix Synology dashboard datasource UID in GitOps"
git push
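
If the dashboards are embedded as YAML block scalars (an assumption, since the layout of monitoring.yaml is not shown here), the JSON has to be indented to sit under its key before it is pasted in. A minimal sketch, with a hypothetical helper name and a placeholder indent width:

```shell
# Indent every non-empty line of a file by a fixed number of spaces so the
# result can be pasted under a YAML block scalar such as "content: |".
# Arguments: file, indent width (default 6, a placeholder value).
indent_for_yaml() {
  pad=$(printf "%${2:-6}s" '')
  sed "/./s/^/$pad/" "$1"
}
```

For example, `indent_for_yaml docker/monitoring/grafana/dashboards/synology-nas-monitoring.json 6` prints the indented JSON to standard output. The width should match the indentation already used in the target file.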

🚀 Deployment Commands

Production (GitOps - Automatic)

# No manual deployment needed
# Portainer GitOps auto-deploys from git repository
# Access: https://gf.vish.gg

Development (Manual)

cd docker/monitoring
docker-compose up -d
# Access: http://localhost:3300

# Legacy Atlantis setup (archived, kept for reference), from the repository root:
cd hosts/synology/atlantis/grafana_prometheus
# Deploy via Synology Docker UI

📋 Maintenance

Updating Production Dashboards

  1. Test fixes in docker/monitoring/ first
  2. Update embedded configs in hosts/vms/homelab-vm/monitoring.yaml
  3. Commit changes for GitOps auto-deployment

Backup Strategy

  • Production: Stack configuration is versioned in the Git repository that GitOps deploys from
  • Development: Use backup.sh and restore.sh scripts
  • Legacy: Manual Synology backup

🔍 Troubleshooting

Dashboard "No Data" Issues

  1. Check that the datasource UID referenced by each panel matches the UID of the Prometheus datasource configured in Grafana
  2. Verify template variables have correct queries
  3. Ensure instance filters are not empty
  4. Use development stack to test fixes first
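
For the first check, a quick way to see which UIDs a dashboard file references is a plain text scan. The sketch below avoids extra tooling. The function name is illustrative, and because it matches text rather than parsing JSON, it also returns the dashboard's own `uid`.

```shell
# List every distinct "uid" value found in a dashboard JSON file.
# This is a text match, not a JSON parse, so the dashboard's own uid
# appears in the output alongside the datasource UIDs.
list_uids() {
  grep -o '"uid"[[:space:]]*:[[:space:]]*"[^"]*"' "$1" \
    | sed 's/.*"\([^"]*\)"$/\1/' \
    | sort -u
}
```

Running `list_uids docker/monitoring/grafana/dashboards/synology-nas-monitoring.json` and comparing the output with the `uid` of the Prometheus datasource in Grafana should show whether a mismatch is the cause of an empty panel.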

GitOps Deployment Issues

  1. Check Portainer stack logs
  2. Verify git repository connectivity
  3. Ensure the Docker configs embedded in hosts/vms/homelab-vm/monitoring.yaml are valid YAML
  4. Test locally with development stack
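
The YAML check can be run locally before pushing. The sketch below assumes monitoring.yaml is a standard Compose file and uses the Compose `config -q` subcommand, which parses the file without starting anything. The function name and return codes are illustrative.

```shell
# Validate a Compose file without starting any containers.
# Returns 2 if the file is missing, 3 if no Compose CLI is installed,
# otherwise the exit status of "config -q" (0 means the file parsed cleanly).
validate_stack() {
  file=${1:-hosts/vms/homelab-vm/monitoring.yaml}
  if [ ! -f "$file" ]; then
    echo "missing file: $file" >&2
    return 2
  fi
  if command -v docker-compose >/dev/null 2>&1; then
    docker-compose -f "$file" config -q
  elif command -v docker >/dev/null 2>&1; then
    docker compose -f "$file" config -q
  else
    echo "no Docker Compose CLI found" >&2
    return 3
  fi
}
```

Run from the repository root, `validate_stack && git push` would stop the push if the file does not parse. Variables that Portainer injects at deploy time may still produce warnings locally.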