# Homelab Monitoring Architecture

This document explains the different monitoring setups in the homelab and their purposes.

## 🏗️ Architecture Overview

The homelab has **three distinct monitoring deployments** serving different purposes:

### 1. **Production GitOps Monitoring** (Primary)
- **Location**: `hosts/vms/homelab-vm/monitoring.yaml`
- **Deployment**: Portainer GitOps on homelab-vm
- **Purpose**: Production monitoring for all homelab infrastructure
- **Access**: https://gf.vish.gg (with Authentik SSO)
- **Status**: ✅ **ACTIVE** - This is the canonical monitoring stack

**Features:**
- Monitors all homelab devices (Synology NAS, nodes, VMs)
- Authentik OAuth2 SSO integration
- Embedded dashboard configs in Docker Compose
- Auto-provisioned datasources and dashboards
- SNMP monitoring for Synology devices

### 2. **Fixed Development Stack** (New)
- **Location**: `docker/monitoring/`
- **Deployment**: Standalone Docker Compose
- **Purpose**: Development and testing; carries the dashboard fixes that production still needs
- **Access**: http://localhost:3300 (admin/admin)
- **Status**: 🔧 **DEVELOPMENT** - For testing and dashboard fixes

**Features:**
- All dashboard datasource UIDs fixed
- Template variables working correctly
- Instance filters properly configured
- Verification scripts included
- Backup/restore functionality

### 3. **Atlantis Legacy Setup** (Deprecated)
- **Location**: `hosts/synology/atlantis/grafana_prometheus/`
- **Deployment**: Synology Docker on Atlantis
- **Purpose**: Legacy monitoring setup
- **Status**: 📦 **ARCHIVED** - Kept for reference

## 🔄 GitOps Workflow

### Production Deployment (homelab-vm)
```bash
# GitOps automatically deploys from:
#   hosts/vms/homelab-vm/monitoring.yaml

# Portainer Stack Details:
# - Stack ID: 476
# - Endpoint: 443399
# - Auto-updates from git repository
```

### Development Testing (docker/monitoring)
```bash
# Manual deployment for testing:
cd docker/monitoring
docker-compose up -d

# Verify dashboards:
./verify-dashboard-sections.sh
```

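Grafana may need a few seconds after `docker-compose up -d` before it accepts requests, so the verification script can fail if it runs too early. The following is a minimal sketch of a readiness check, assuming `curl` is installed and Grafana's `/api/health` endpoint is reachable on the development port. The helper name `wait_for_grafana` and its retry defaults are illustrative and not part of the repository.

```bash
# Poll Grafana's health endpoint until it answers or the attempts run out.
# wait_for_grafana is a hypothetical helper; URL and retry values are assumptions.
wait_for_grafana() {
  local url="${1:-http://localhost:3300}"
  local attempts="${2:-30}"
  local delay="${3:-2}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; -s silences progress output.
    if curl -fs "${url}/api/health" >/dev/null; then
      echo "Grafana is up after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "Grafana did not respond at ${url}" >&2
  return 1
}
```

One possible use is to chain it in front of the verification step: `wait_for_grafana && ./verify-dashboard-sections.sh`.
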
## 📊 Dashboard Status

| Dashboard | Production (GitOps) | Development (Fixed) | Status |
|-----------|---------------------|---------------------|--------|
| Infrastructure Overview | ✅ Working | ✅ Fixed | Both functional |
| Synology NAS Monitoring | ⚠️ Needs UID fix | ✅ Fixed | Dev has fixes |
| Node Exporter Full | ⚠️ Needs UID fix | ✅ Fixed | Dev has fixes |
| Node Details | ⚠️ Needs UID fix | ✅ Fixed | Dev has fixes |

## 🔧 Applying Fixes to Production

To apply the dashboard fixes to the production GitOps deployment:

1. **Extract fixed dashboards** from `docker/monitoring/grafana/dashboards/`
2. **Update the embedded configs** in `hosts/vms/homelab-vm/monitoring.yaml`
3. **Test locally** using the development stack
4. **Commit changes** - GitOps will auto-deploy

### Example: Updating Synology Dashboard in GitOps

```bash
# 1. Extract the fixed dashboard JSON
cat docker/monitoring/grafana/dashboards/synology-nas-monitoring.json

# 2. Update the embedded config in monitoring.yaml
# Replace the dashboard_synology config content with the fixed JSON

# 3. Commit and push - GitOps handles deployment
git add hosts/vms/homelab-vm/monitoring.yaml
git commit -m "Fix Synology dashboard datasource UID in GitOps"
git push
```

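Pasting a dashboard into an embedded config by hand is error-prone, because the JSON has to be indented to sit under a YAML block scalar. The sketch below shows one way to prepare the text. Both helpers are hypothetical, and the indent width of 6 is a guess at the layout of `monitoring.yaml`. The dollar-doubling step only applies if the embedded config is subject to Compose variable interpolation (for example an inline `content:` field), which is an assumption about how the file is written.

```bash
# Hypothetical helpers; neither exists in the repository.

# Prefix every line of a file with N spaces so it can sit under a YAML block scalar.
indent_for_yaml() {
  local file="$1"
  local width="${2:-6}"   # assumed indent depth; adjust to match monitoring.yaml
  local pad
  pad="$(printf '%*s' "$width" '')"
  sed "s/^/${pad}/" "$file"
}

# Double each literal "$" so Compose interpolation leaves Grafana variables alone.
escape_compose_dollars() {
  sed 's/\$/$$/g'
}
```

For example, `indent_for_yaml docker/monitoring/grafana/dashboards/synology-nas-monitoring.json 6 | escape_compose_dollars` prints text that can be pasted under the `dashboard_synology` entry. Skip the second step if the existing embedded dashboards use single `$` characters.
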
## 🚀 Deployment Commands

### Production (GitOps - Automatic)
```bash
# No manual deployment needed
# Portainer GitOps auto-deploys from git repository
# Access: https://gf.vish.gg
```

### Development (Manual)
```bash
cd docker/monitoring
docker-compose up -d
# Access: http://localhost:3300
```

### Legacy (Manual - Not Recommended)
```bash
cd hosts/synology/atlantis/grafana_prometheus
# Deploy via Synology Docker UI
```

## 📋 Maintenance

### Updating Production Dashboards
1. Test fixes in `docker/monitoring/` first
2. Update embedded configs in `hosts/vms/homelab-vm/monitoring.yaml`
3. Commit changes for GitOps auto-deployment

### Backup Strategy
- **Production**: Automated; the stack definition is versioned in the GitOps repository
- **Development**: Use the `backup.sh` and `restore.sh` scripts
- **Legacy**: Manual Synology backup

## 🔍 Troubleshooting

### Dashboard "No Data" Issues
1. Check that the datasource UID in the dashboard matches the Prometheus datasource configured in Grafana
2. Verify that template variables have correct queries
3. Ensure instance filters are not empty
4. Use the development stack to test fixes first

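The first check above can be scripted. This is a sketch rather than the repository's verification script: it assumes `jq` and `curl` are installed, that dashboards reference datasources as `{"type": ..., "uid": ...}` objects, and that the development stack is running with the `admin/admin` login listed earlier. All three function names are hypothetical.

```bash
# List the distinct datasource UIDs a dashboard JSON file refers to.
dashboard_datasource_uids() {
  jq -r '[.. | objects | .datasource? | objects | .uid | strings] | unique | .[]' "$1"
}

# List the datasource UIDs Grafana has configured (GET /api/datasources).
grafana_datasource_uids() {
  local url="${1:-http://localhost:3300}"
  curl -fs -u admin:admin "${url}/api/datasources" | jq -r '.[].uid'
}

# Print the dashboard's datasource UIDs that are absent from a known list.
# $2 is a newline-separated list of UIDs, e.g. from grafana_datasource_uids.
missing_datasource_uids() {
  local dashboard="$1"
  local known="$2"
  dashboard_datasource_uids "$dashboard" | while IFS= read -r uid; do
    printf '%s\n' "$known" | grep -qxF -- "$uid" || printf '%s\n' "$uid"
  done
}
```

Running `missing_datasource_uids docker/monitoring/grafana/dashboards/synology-nas-monitoring.json "$(grafana_datasource_uids)"` prints nothing when every UID resolves. Any UID it does print is a likely cause of "No Data" panels, although built-in entries such as Grafana's own annotation datasource may also appear.
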
### GitOps Deployment Issues
1. Check Portainer stack logs
2. Verify git repository connectivity
3. Ensure Docker configs are valid YAML
4. Test locally with development stack

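Steps 2 and 3 can be checked from a workstation before pushing. The sketch below assumes it is run from the repository root, that the Docker Compose v2 plugin is installed (`docker compose`), and that the git remote is named `origin`. The function name `preflight_gitops` is hypothetical. It tests connectivity from the local machine only, so a passing result does not prove that Portainer itself can reach the repository.

```bash
# Hypothetical pre-push check; run from the repository root.
preflight_gitops() {
  local stack="${1:-hosts/vms/homelab-vm/monitoring.yaml}"
  local remote="${2:-origin}"   # assumed remote name

  # Parse and validate the Compose file without printing the rendered config.
  if ! docker compose -f "$stack" config --quiet; then
    echo "FAIL: ${stack} is not a valid Compose file" >&2
    return 1
  fi

  # Confirm the remote answers; --exit-code fails when no matching ref is found.
  if ! git ls-remote --exit-code "$remote" HEAD >/dev/null; then
    echo "FAIL: cannot reach git remote '${remote}'" >&2
    return 1
  fi

  echo "OK: ${stack} validates and '${remote}' is reachable"
}
```

If the stack file relies on environment variables that Portainer injects, `docker compose config` may warn about unset values locally even though the production deployment is fine.
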
## 📚 Related Documentation

- [Dashboard Verification Report](docker/monitoring/dashboard-verification-report.md)
- [Synology Dashboard Fix Report](docker/monitoring/synology-dashboard-fix-report.md)
- [Development Stack README](docker/monitoring/README.md)