Sanitized mirror from private repository - 2026-03-25 09:22:06 UTC
docs/admin/GITOPS_DEPLOYMENT_GUIDE.md
# GitOps Deployment Guide

This guide explains how to apply the fixed dashboard configurations to the production GitOps monitoring stack.

## 🎯 Overview

The production monitoring stack is deployed via **Portainer GitOps** on `homelab-vm` and automatically syncs from this repository. The configuration is embedded in `hosts/vms/homelab-vm/monitoring.yaml`.

## 🔧 Applying Dashboard Fixes

### Current Status

- **Production GitOps**: Uses embedded dashboard configs (may have datasource UID issues)
- **Development Stack**: Has all fixes applied (`docker/monitoring/`)

### Step-by-Step Fix Process

#### 1. Test Fixes Locally

```bash
# Deploy the fixed development stack
cd docker/monitoring
docker-compose up -d

# Verify all dashboards work
./verify-dashboard-sections.sh

# Access: http://localhost:3300 (admin/admin)
```

#### 2. Extract Fixed Dashboard JSON

```bash
# Get the fixed Synology dashboard
cat docker/monitoring/grafana/dashboards/synology-nas-monitoring.json

# Get other fixed dashboards
cat docker/monitoring/grafana/dashboards/node-exporter-full.json
cat docker/monitoring/grafana/dashboards/node-details.json
cat docker/monitoring/grafana/dashboards/infrastructure-overview.json
```

#### 3. Update GitOps Configuration

Edit `hosts/vms/homelab-vm/monitoring.yaml` and replace the embedded dashboard configs. The pasted JSON must be indented deeper than the `content:` key so that it stays inside the block scalar:

```yaml
configs:
  # Replace this section with fixed JSON
  dashboard_synology:
    content: |
      {
        # Paste the fixed JSON from docker/monitoring/grafana/dashboards/synology-nas-monitoring.json
        # Make sure to update the datasource UID to: PBFA97CFB590B2093
      }
```
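
Indenting a whole dashboard file by hand is error-prone, so the paste step can be prepared on the command line. The following is a minimal sketch: `indent_for_yaml` is a hypothetical helper name, and the default of six spaces is an assumption that should be adjusted to wherever `content: |` sits in your copy of `monitoring.yaml`.

```bash
# indent_for_yaml FILE [SPACES]
# Print FILE with every line prefixed by SPACES spaces (default 6, an assumption),
# ready to paste under a `content: |` block scalar.
indent_for_yaml() {
  local file=$1
  local spaces=${2:-6}
  local pad
  # Build a string of $spaces blanks using printf's dynamic field width.
  pad=$(printf '%*s' "$spaces" '')
  sed "s/^/${pad}/" "$file"
}

# Example call (path taken from step 2 of this guide):
#   indent_for_yaml docker/monitoring/grafana/dashboards/synology-nas-monitoring.json 6
```

The output goes to stdout, so it can be reviewed or redirected to a scratch file before it is pasted into the GitOps configuration.
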
#### 4. Key Fixes to Apply

**Datasource UID Fix:**

```json
"datasource": {
  "type": "prometheus",
  "uid": "PBFA97CFB590B2093"  // ← Ensure this matches your Prometheus UID
}
```
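
Before editing, it can help to see which UIDs a dashboard file currently references. The sketch below uses plain `grep` and `sed`; `list_uids` is a hypothetical helper name. It matches every `"uid"` key in the file, so the dashboard's own UID (if present) is listed alongside the datasource UIDs.

```bash
# list_uids FILE
# Print each distinct value of a "uid" key found in FILE, one per line.
# Note: this is a text match, not a JSON parse, so it also reports the
# dashboard's own "uid" field, not only datasource UIDs.
list_uids() {
  grep -o '"uid"[[:space:]]*:[[:space:]]*"[^"]*"' "$1" \
    | sed 's/.*"\([^"]*\)"$/\1/' \
    | sort -u
}

# Example call (path taken from step 2 of this guide):
#   list_uids docker/monitoring/grafana/dashboards/synology-nas-monitoring.json
```

Any datasource value other than the expected Prometheus UID (`PBFA97CFB590B2093` in this guide) is a candidate for the fix above.
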
**Template Variable Fix:**

```json
"templating": {
  "list": [
    {
      "current": {
        "selected": false,
        "text": "All",
        "value": "$__all"  // ← Ensure proper current value
      }
    }
  ]
}
```

**Instance Filter Fix:**

```json
"targets": [
  {
    "expr": "up{instance=~\"$instance\"}",  // ← Fix empty instance filters
    "legendFormat": "{{instance}}"
  }
]
```
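
To spot queries that still carry an empty filter, a text search along these lines may help. It rests on an assumption about what "empty" looks like: that the dashboard JSON contains `instance=\"\"` or `instance=~\"\"` literally. The function name is hypothetical.

```bash
# find_empty_instance_filters FILE
# Print line numbers and lines of FILE that contain an empty instance matcher,
# i.e. instance=\"\" or instance=~\"\" as it appears inside a JSON string.
# Exit status is 0 when at least one match is found, 1 when there is none.
find_empty_instance_filters() {
  grep -nE 'instance=~?\\"\\"' "$1"
}

# Example call (path taken from step 2 of this guide):
#   find_empty_instance_filters docker/monitoring/grafana/dashboards/node-details.json
```

A non-zero exit status means no such pattern was found, which is the desired state before deploying.
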
#### 5. Deploy via GitOps

```bash
# Commit the updated configuration
git add hosts/vms/homelab-vm/monitoring.yaml
git commit -m "Fix dashboard datasource UIDs and template variables in GitOps

- Updated Synology NAS dashboard with correct Prometheus UID
- Fixed template variables with proper current values
- Corrected instance filters in all dashboard queries
- Verified fixes work in development stack first

Fixes applied from docker/monitoring/ development stack."

# Push to trigger GitOps deployment
git push origin main
```

#### 6. Verify Production Deployment

1. **Check Portainer**: Monitor the stack update in Portainer
2. **Access Grafana**: https://gf.vish.gg
3. **Test Dashboards**: Verify all panels show data
4. **Check Logs**: Review container logs if issues occur

## 🚨 Rollback Process

If the GitOps deployment fails, use one of the two options below (not both in sequence):

```bash
# Option A: revert the most recent commit and push the rollback
git revert HEAD
git push origin main

# Option B: restore only the monitoring config from the previous commit
# (run this instead of Option A; after a revert, HEAD~1 would point at the broken commit)
git checkout HEAD~1 -- hosts/vms/homelab-vm/monitoring.yaml
git commit -m "Rollback monitoring configuration"
git push origin main
```
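
The rollback commands above rely on git history. A file-level copy taken before editing gives a second way back. The sketch below is one way to do that; `backup_config` and the `.bak.<timestamp>` naming are hypothetical choices, not something this repository defines.

```bash
# backup_config FILE
# Copy FILE to FILE.bak.<UTC timestamp>, preserving mode and mtime,
# and print the path of the copy on success.
backup_config() {
  local src=$1
  local dest
  dest="${src}.bak.$(date -u +%Y%m%dT%H%M%SZ)"
  cp -p -- "$src" "$dest" && printf '%s\n' "$dest"
}

# Example call (path taken from this guide):
#   backup_config hosts/vms/homelab-vm/monitoring.yaml
```

Because the copy lands next to the original inside the repository, it is worth keeping it out of the commit (for example by moving it elsewhere or ignoring `*.bak.*`).
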
## 📋 Validation Checklist

Before applying to production:

- [ ] Development stack works correctly (`docker/monitoring/`)
- [ ] All dashboard panels display data
- [ ] Template variables function properly
- [ ] Instance filters are not empty
- [ ] Datasource UIDs match production Prometheus
- [ ] JSON syntax is valid (use `jq` to validate)
- [ ] Backup of current GitOps config exists
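
The JSON syntax item can be checked for all dashboards in one pass. The following is a minimal sketch with hypothetical function names. It uses `jq` as the checklist suggests; the fallback to `python3 -m json.tool` when `jq` is not installed is an assumption added here, not part of the original workflow.

```bash
# validate_json FILE
# Return 0 if FILE parses as JSON, non-zero otherwise.
validate_json() {
  if command -v jq >/dev/null 2>&1; then
    jq . "$1" >/dev/null 2>&1
  else
    # Fallback (assumption): Python's built-in JSON tool.
    python3 -m json.tool "$1" >/dev/null 2>&1
  fi
}

# validate_all FILE...
# Report each file as ok or INVALID; return 1 if any file failed.
validate_all() {
  local rc=0 f
  for f in "$@"; do
    if validate_json "$f"; then
      echo "ok $f"
    else
      echo "INVALID $f"
      rc=1
    fi
  done
  return "$rc"
}

# Example call (directory taken from this guide):
#   validate_all docker/monitoring/grafana/dashboards/*.json
```

The non-zero return code makes the check usable as a gate in a script before `git commit`.
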

## 🔍 Troubleshooting

### Dashboard Shows "No Data"

1. Check datasource UID matches production Prometheus
2. Verify Prometheus is accessible from Grafana container
3. Check template variable queries
4. Ensure instance filters are properly formatted

### GitOps Deployment Fails

1. Check Portainer stack logs
2. Validate YAML syntax in `monitoring.yaml`
3. Ensure Docker configs are properly formatted
4. Verify git repository connectivity

### Container Won't Start

1. Check Docker Compose syntax
2. Verify config file formatting
3. Check volume mounts and permissions
4. Review container logs for specific errors

## 📚 Related Files

- **Production Config**: `hosts/vms/homelab-vm/monitoring.yaml`
- **Development Stack**: `docker/monitoring/`
- **Fixed Dashboards**: `docker/monitoring/grafana/dashboards/`
- **Architecture Docs**: `MONITORING_ARCHITECTURE.md`