# GitOps Deployment Guide

This guide explains how to apply the fixed dashboard configurations to the production GitOps monitoring stack.

## 🎯 Overview

The production monitoring stack is deployed via **Portainer GitOps** on `homelab-vm` and automatically syncs from this repository. The configuration is embedded in `hosts/vms/homelab-vm/monitoring.yaml`.

## 🔧 Applying Dashboard Fixes

### Current Status

- **Production GitOps**: Uses embedded dashboard configs (may have datasource UID issues)
- **Development Stack**: Has all fixes applied (`docker/monitoring/`)

### Step-by-Step Fix Process

#### 1. Test Fixes Locally

```bash
# Deploy the fixed development stack
cd docker/monitoring
docker-compose up -d

# Verify all dashboards work
./verify-dashboard-sections.sh

# Access: http://localhost:3300 (admin/admin)
```

#### 2. Extract Fixed Dashboard JSON

```bash
# Get the fixed Synology dashboard
cat docker/monitoring/grafana/dashboards/synology-nas-monitoring.json

# Get other fixed dashboards
cat docker/monitoring/grafana/dashboards/node-exporter-full.json
cat docker/monitoring/grafana/dashboards/node-details.json
cat docker/monitoring/grafana/dashboards/infrastructure-overview.json
```

#### 3. Update GitOps Configuration

Edit `hosts/vms/homelab-vm/monitoring.yaml` and replace the embedded dashboard configs. Keep any notes outside the `content: |` block: inside a YAML block scalar, lines starting with `#` are literal text and would end up in the dashboard JSON.

```yaml
configs:
  # Replace this section with the fixed JSON
  dashboard_synology:
    # Paste the fixed JSON from docker/monitoring/grafana/dashboards/synology-nas-monitoring.json
    # Make sure to update the datasource UID to: PBFA97CFB590B2093
    content: |
      {
        ...
      }
```

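
Indentation is the easiest thing to get wrong when pasting a dashboard into the `content: |` block. The following is a minimal sketch, not part of the repository: the function name `to_block_scalar_body` is hypothetical, and the default indent of six spaces assumes the `content:` key sits four spaces deep in `monitoring.yaml`. It parses the JSON first, so stray `#` or `//` annotations are rejected before they reach production.

```python
import json
import textwrap


def to_block_scalar_body(dashboard_json: str, indent: int = 6) -> str:
    """Re-serialize dashboard JSON and indent it for a YAML `content: |` block.

    `indent` is an assumption: it must be deeper than the `content:` key itself.
    """
    # Parsing first rejects invalid JSON (including # or // annotations).
    dashboard = json.loads(dashboard_json)
    pretty = json.dumps(dashboard, indent=2)
    # Prefix every line so the whole document sits inside the block scalar.
    return textwrap.indent(pretty, " " * indent)


if __name__ == "__main__":
    sample = '{"title": "Synology NAS", "panels": []}'
    print(to_block_scalar_body(sample))
```

The printed text can be pasted directly under `content: |`; adjust `indent` if your file nests the key differently.
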
#### 4. Key Fixes to Apply

**Datasource UID Fix:** make sure every `uid` matches your Prometheus datasource UID.

```json
"datasource": {
  "type": "prometheus",
  "uid": "PBFA97CFB590B2093"
}
```

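
Dashboards usually repeat the datasource object on every panel and target, so editing by hand is error-prone. The sketch below walks a loaded dashboard and rewrites each Prometheus datasource UID in place; the helper name `set_prometheus_uid` is hypothetical, and the default UID is the one quoted in this guide, which you should confirm against your own Grafana instance. Load the dashboard with `json.load`, call the function, and write the result back with `json.dump`.

```python
from typing import Any

PROMETHEUS_UID = "PBFA97CFB590B2093"  # UID from this guide; confirm it in your Grafana


def set_prometheus_uid(node: Any, uid: str = PROMETHEUS_UID) -> int:
    """Recursively set `uid` on every Prometheus datasource object.

    Returns the number of datasource objects that were changed.
    """
    changed = 0
    if isinstance(node, dict):
        ds = node.get("datasource")
        # Only touch structured Prometheus datasources; other types are left alone.
        if isinstance(ds, dict) and ds.get("type") == "prometheus" and ds.get("uid") != uid:
            ds["uid"] = uid
            changed += 1
        for value in node.values():
            changed += set_prometheus_uid(value, uid)
    elif isinstance(node, list):
        for item in node:
            changed += set_prometheus_uid(item, uid)
    return changed


if __name__ == "__main__":
    sample = {"panels": [{"datasource": {"type": "prometheus", "uid": "old"}}]}
    print(set_prometheus_uid(sample), sample)
```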
**Template Variable Fix:** make sure each template variable has a proper `current` value.

```json
"templating": {
  "list": [
    {
      "current": {
        "selected": false,
        "text": "All",
        "value": "$__all"
      }
    }
  ]
}
```

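
A quick way to spot variables that still need this fix is to list those whose `current` block is missing or incomplete. This is a rough sketch under the assumption that "proper" means both `text` and `value` are non-empty; the function name `variables_missing_current` is hypothetical. It expects a dashboard already loaded with `json.load` and returns the names of the variables to review.

```python
from typing import Any


def variables_missing_current(dashboard: dict[str, Any]) -> list[str]:
    """Return names of template variables whose `current` lacks a text or value."""
    missing = []
    for var in dashboard.get("templating", {}).get("list", []):
        current = var.get("current") or {}
        # Assumption: a usable `current` has both a display text and a value.
        if not current.get("text") or not current.get("value"):
            missing.append(var.get("name", "<unnamed>"))
    return missing


if __name__ == "__main__":
    sample = {"templating": {"list": [{"name": "instance", "current": {}}]}}
    print(variables_missing_current(sample))
```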
**Instance Filter Fix:** replace empty instance filters with the `$instance` template variable.

```json
"targets": [
  {
    "expr": "up{instance=~\"$instance\"}",
    "legendFormat": "{{instance}}"
  }
]
```

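
To find queries that still carry an empty filter, a small scan over every `expr` can help. The sketch below assumes an "empty instance filter" looks like `instance=""` or `instance=~""`; if your broken dashboards use a different pattern, adjust the regular expression. The names `iter_exprs` and `empty_instance_filters` are hypothetical.

```python
import re
from typing import Any, Iterator

# Assumption: an empty filter is an instance matcher with an empty string.
EMPTY_INSTANCE = re.compile(r'instance\s*=~?\s*""')


def iter_exprs(node: Any) -> Iterator[str]:
    """Yield every `expr` string found under a `targets` list, at any depth."""
    if isinstance(node, dict):
        targets = node.get("targets")
        if isinstance(targets, list):
            for target in targets:
                if isinstance(target, dict) and isinstance(target.get("expr"), str):
                    yield target["expr"]
        for value in node.values():
            yield from iter_exprs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_exprs(item)


def empty_instance_filters(dashboard: dict) -> list[str]:
    """Return the queries whose instance matcher is an empty string."""
    return [expr for expr in iter_exprs(dashboard) if EMPTY_INSTANCE.search(expr)]


if __name__ == "__main__":
    sample = {"panels": [{"targets": [{"expr": 'up{instance=~""}'}]}]}
    print(empty_instance_filters(sample))
```

An empty result means no query matched the assumed pattern; it does not prove that every panel will show data.
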
#### 5. Deploy via GitOps

```bash
# Commit the updated configuration
git add hosts/vms/homelab-vm/monitoring.yaml
git commit -m "Fix dashboard datasource UIDs and template variables in GitOps

- Updated Synology NAS dashboard with correct Prometheus UID
- Fixed template variables with proper current values
- Corrected instance filters in all dashboard queries
- Verified fixes work in development stack first

Fixes applied from docker/monitoring/ development stack."

# Push to trigger GitOps deployment
git push origin main
```

#### 6. Verify Production Deployment

1. **Check Portainer**: Monitor the stack update in Portainer
2. **Access Grafana**: https://gf.vish.gg
3. **Test Dashboards**: Verify all panels show data
4. **Check Logs**: Review container logs if issues occur

## 🚨 Rollback Process

If the GitOps deployment fails, use one of the two options below. Both assume the failing change is the most recent commit on `main`.

```bash
# Option 1: revert the commit and push the rollback
git revert HEAD
git push origin main

# Option 2: restore the previous version of the file from Git history instead
git checkout HEAD~1 -- hosts/vms/homelab-vm/monitoring.yaml
git commit -m "Rollback monitoring configuration"
git push origin main
```

## 📋 Validation Checklist

Before applying to production:

- [ ] Development stack works correctly (`docker/monitoring/`)
- [ ] All dashboard panels display data
- [ ] Template variables function properly
- [ ] Instance filters are not empty
- [ ] Datasource UIDs match production Prometheus
- [ ] JSON syntax is valid (use `jq` to validate)
- [ ] Backup of current GitOps config exists

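
Where `jq` is not installed, the JSON syntax item on the checklist can be covered with the Python standard library. This is a sketch only; the function name `invalid_json_files` is hypothetical, and it checks syntax, not whether Grafana will accept the dashboard.

```python
import json
from pathlib import Path


def invalid_json_files(directory: str) -> dict[str, str]:
    """Map each unparseable *.json file in `directory` to its parse error."""
    errors = {}
    for path in sorted(Path(directory).glob("*.json")):
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            # Keep the message: it includes the line and column of the problem.
            errors[path.name] = str(exc)
    return errors
```

Calling `invalid_json_files("docker/monitoring/grafana/dashboards")` from the repository root returns an empty dict when every dashboard file parses.
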
## 🔍 Troubleshooting

### Dashboard Shows "No Data"
1. Check datasource UID matches production Prometheus
2. Verify Prometheus is accessible from Grafana container
3. Check template variable queries
4. Ensure instance filters are properly formatted

### GitOps Deployment Fails
1. Check Portainer stack logs
2. Validate YAML syntax in monitoring.yaml
3. Ensure Docker configs are properly formatted
4. Verify git repository connectivity

### Container Won't Start
1. Check Docker Compose syntax
2. Verify config file formatting
3. Check volume mounts and permissions
4. Review container logs for specific errors

## 📚 Related Files

- **Production Config**: `hosts/vms/homelab-vm/monitoring.yaml`
- **Development Stack**: `docker/monitoring/`
- **Fixed Dashboards**: `docker/monitoring/grafana/dashboards/`
- **Architecture Docs**: `MONITORING_ARCHITECTURE.md`