# GitOps Deployment Comprehensive Guide
*Last Updated: March 8, 2026*
## 🎯 Overview
This homelab infrastructure is deployed using a **GitOps methodology**, with **Portainer Enterprise Edition** as the orchestration platform. All services are defined as Docker Compose files in this Git repository and deployed automatically across multiple hosts.
## 🏗️ GitOps Architecture
### Core Components
- **Git Repository**: Source of truth for all infrastructure configurations
- **Portainer EE**: GitOps orchestration and container management (v2.33.7)
- **Docker Compose**: Service definition and deployment format
- **Multi-Host Deployment**: Services distributed across Synology NAS, VMs, and edge devices
### Current Deployment Status
**Verified Active Stacks**: 81 compose stacks across 5 endpoints, all GitOps-managed except the manually managed `gitea` stack on Calypso
**Total Containers**: 157+ containers across infrastructure
**Management Interface**: https://192.168.0.200:9443 (Portainer EE)
## 📊 Active GitOps Deployments
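All 5 endpoints are GitOps-managed; the only manually managed stack is `gitea` on Calypso. Every GitOps stack uses a canonical `hosts/` path, except Watchtower, which is deployed from the shared `common/watchtower-full.yaml`.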
### Atlantis (Primary NAS, ep=2) — 24 Stacks
| Stack Name | Config Path | Status |
|------------|-------------|--------|
| **arr-stack** | `hosts/synology/atlantis/arr-suite/docker-compose.yml` | ✅ Running |
| **audiobookshelf-stack** | `hosts/synology/atlantis/audiobookshelf.yaml` | ✅ Running |
| **baikal-stack** | `hosts/synology/atlantis/baikal/baikal.yaml` | ✅ Running |
| **calibre-stack** | `hosts/synology/atlantis/calibre.yaml` | ⏸ Stopped (intentional) |
| **dokuwiki-stack** | `hosts/synology/atlantis/dokuwiki.yml` | ✅ Running |
| **dyndns-updater-stack** | `hosts/synology/atlantis/dynamicdnsupdater.yaml` | ✅ Running |
| **fenrus-stack** | `hosts/synology/atlantis/fenrus.yaml` | ✅ Running |
| **homarr-stack** | `hosts/synology/atlantis/homarr.yaml` | ✅ Running |
| **immich-stack** | `hosts/synology/atlantis/immich/docker-compose.yml` | ✅ Running |
| **iperf3-stack** | `hosts/synology/atlantis/iperf3.yaml` | ✅ Running |
| **it_tools-stack** | `hosts/synology/atlantis/it_tools.yml` | ✅ Running |
| **jitsi-stack** | `hosts/synology/atlantis/jitsi/jitsi.yml` | ✅ Running |
| **joplin-stack** | `hosts/synology/atlantis/joplin.yml` | ✅ Running |
| **node-exporter-stack** | `hosts/synology/atlantis/grafana_prometheus/atlantis_node_exporter.yaml` | ✅ Running |
| **ollama-stack** | `hosts/synology/atlantis/ollama/docker-compose.yml` | ⏸ Stopped (intentional) |
| **syncthing-stack** | `hosts/synology/atlantis/syncthing.yml` | ✅ Running |
| **theme-park-stack** | `hosts/synology/atlantis/theme-park/theme-park.yaml` | ✅ Running |
| **vaultwarden-stack** | `hosts/synology/atlantis/vaultwarden.yaml` | ✅ Running |
| **watchtower-stack** | `common/watchtower-full.yaml` | ✅ Running |
| **youtubedl-stack** | `hosts/synology/atlantis/youtubedl.yaml` | ✅ Running |
### Calypso (Secondary NAS, ep=443397) — 23 Stacks
22 of the 23 stacks are GitOps-managed; `gitea` (id=249) is intentionally kept as a manually managed stack because it is a bootstrap dependency.
| Stack Name | Config Path | Status |
|------------|-------------|--------|
| **actual-budget-stack** | `hosts/synology/calypso/actualbudget.yml` | ✅ Running |
| **adguard-stack** | `hosts/synology/calypso/adguard.yaml` | ✅ Running |
| **apt-cacher-ng-stack** | `hosts/synology/calypso/apt-cacher-ng/apt-cacher-ng.yml` | ✅ Running |
| **arr-stack** | `hosts/synology/calypso/arr_suite_with_dracula.yml` | ✅ Running |
| **authentik-sso-stack** | `hosts/synology/calypso/authentik/docker-compose.yaml` | ✅ Running |
| **diun-stack** | `hosts/synology/calypso/diun.yaml` | ✅ Running |
| **dozzle-agent-stack** | `hosts/synology/calypso/dozzle-agent.yaml` | ✅ Running |
| **gitea** (manual) | — | ✅ Running |
| **gitea-runner-stack** | `hosts/synology/calypso/gitea-runner.yaml` | ✅ Running |
| **immich-stack** | `hosts/synology/calypso/immich/docker-compose.yml` | ✅ Running |
| **iperf3-stack** | `hosts/synology/calypso/iperf3.yml` | ✅ Running |
| **node-exporter-stack** | `hosts/synology/calypso/node-exporter.yaml` | ✅ Running |
| **openspeedtest-stack** | `hosts/synology/calypso/openspeedtest.yaml` | ✅ Running |
| **paperless-ai-stack** | `hosts/synology/calypso/paperless/paperless-ai.yml` | ✅ Running |
| **paperless-stack** | `hosts/synology/calypso/paperless/docker-compose.yml` | ✅ Running |
| **rackula-stack** | `hosts/synology/calypso/rackula.yml` | ✅ Running |
| **retro-site-stack** | `hosts/synology/calypso/retro-site.yaml` | ✅ Running |
| **rustdesk-stack** | `hosts/synology/calypso/rustdesk.yaml` | ✅ Running |
| **scrutiny-collector-stack** | `hosts/synology/calypso/scrutiny-collector.yaml` | ✅ Running |
| **seafile-new-stack** | `hosts/synology/calypso/seafile-new.yaml` | ✅ Running |
| **syncthing-stack** | `hosts/synology/calypso/syncthing.yaml` | ✅ Running |
| **watchtower-stack** | `common/watchtower-full.yaml` | ✅ Running |
| **wireguard-stack** | `hosts/synology/calypso/wireguard-server.yaml` | ✅ Running |
### Concord NUC (ep=443398) — 11 Stacks
| Stack Name | Config Path | Status |
|------------|-------------|--------|
| **adguard-stack** | `hosts/physical/concord-nuc/adguard.yaml` | ✅ Running |
| **diun-stack** | `hosts/physical/concord-nuc/diun.yaml` | ✅ Running |
| **dozzle-agent-stack** | `hosts/physical/concord-nuc/dozzle-agent.yaml` | ✅ Running |
| **dyndns-updater-stack** | `hosts/physical/concord-nuc/dyndns_updater.yaml` | ✅ Running |
| **homeassistant-stack** | `hosts/physical/concord-nuc/homeassistant.yaml` | ✅ Running |
| **invidious-stack** | `hosts/physical/concord-nuc/invidious/invidious.yaml` | ✅ Running |
| **plex-stack** | `hosts/physical/concord-nuc/plex.yaml` | ✅ Running |
| **scrutiny-collector-stack** | `hosts/physical/concord-nuc/scrutiny-collector.yaml` | ✅ Running |
| **syncthing-stack** | `hosts/physical/concord-nuc/syncthing.yaml` | ✅ Running |
| **wireguard-stack** | `hosts/physical/concord-nuc/wireguard.yaml` | ✅ Running |
| **yourspotify-stack** | `hosts/physical/concord-nuc/yourspotify.yaml` | ✅ Running |
### Homelab VM (ep=443399) — 19 Stacks
| Stack Name | Config Path | Status |
|------------|-------------|--------|
| **alerting-stack** | `hosts/vms/homelab-vm/alerting.yaml` | ✅ Running |
| **archivebox-stack** | `hosts/vms/homelab-vm/archivebox.yaml` | ✅ Running |
| **binternet-stack** | `hosts/vms/homelab-vm/binternet.yaml` | ✅ Running |
| **diun-stack** | `hosts/vms/homelab-vm/diun.yaml` | ✅ Running |
| **dozzle-agent-stack** | `hosts/vms/homelab-vm/dozzle-agent.yaml` | ✅ Running |
| **drawio-stack** | `hosts/vms/homelab-vm/drawio.yml` | ✅ Running |
| **hoarder-karakeep-stack** | `hosts/vms/homelab-vm/hoarder.yaml` | ✅ Running |
| **monitoring-stack** | `hosts/vms/homelab-vm/monitoring.yaml` | ✅ Running |
| **ntfy-stack** | `hosts/vms/homelab-vm/ntfy.yaml` | ✅ Running |
| **openhands-stack** | `hosts/vms/homelab-vm/openhands.yaml` | ✅ Running |
| **perplexica-stack** | `hosts/vms/homelab-vm/perplexica.yaml` | ✅ Running |
| **proxitok-stack** | `hosts/vms/homelab-vm/proxitok.yaml` | ✅ Running |
| **redlib-stack** | `hosts/vms/homelab-vm/redlib.yaml` | ✅ Running |
| **scrutiny-stack** | `hosts/vms/homelab-vm/scrutiny.yaml` | ✅ Running |
| **signal-api-stack** | `hosts/vms/homelab-vm/signal_api.yaml` | ✅ Running |
| **syncthing-stack** | `hosts/vms/homelab-vm/syncthing.yml` | ✅ Running |
| **watchyourlan-stack** | `hosts/vms/homelab-vm/watchyourlan.yaml` | ✅ Running |
| **watchtower-stack** | `common/watchtower-full.yaml` | ✅ Running |
| **webcheck-stack** | `hosts/vms/homelab-vm/webcheck.yaml` | ✅ Running |
### Raspberry Pi 5 (ep=443395) — 4 Stacks
| Stack Name | Config Path | Status |
|------------|-------------|--------|
| **diun-stack** | `hosts/edge/rpi5-vish/diun.yaml` | ✅ Running |
| **glances-stack** | `hosts/edge/rpi5-vish/glances.yaml` | ✅ Running |
| **portainer-agent-stack** | `hosts/edge/rpi5-vish/portainer_agent.yaml` | ✅ Running |
| **uptime-kuma-stack** | `hosts/edge/rpi5-vish/uptime-kuma.yaml` | ✅ Running |
## 🚀 GitOps Workflow
### 1. Service Definition
Services are defined using Docker Compose YAML files in the repository:
```yaml
# Example: hosts/synology/atlantis/new-service.yaml
version: '3.8'
services:
  new-service:
    image: example/service:latest
    container_name: new-service
    ports:
      - "8080:8080"
    environment:
      - ENV_VAR=value
    volumes:
      - /volume1/docker/new-service:/data
    restart: unless-stopped
```
### 2. Git Commit & Push
```bash
# Add new service configuration
git add hosts/synology/atlantis/new-service.yaml
git commit -m "Add new service deployment

- Configure new-service with proper volumes
- Set up environment variables
- Enable auto-restart policy"
# Push to trigger GitOps deployment
git push origin main
```
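Before pushing, it can help to validate only the compose files the commit touches. The following is a minimal sketch, not part of the repository: the function names are hypothetical, it assumes the commit under test is `HEAD`, and `docker compose config -q` may fail for stacks that rely on environment files not present in the working tree.

```bash
# Sketch: validate the compose files changed by the latest commit.
# filter_stack_files and validate_changed_stacks are hypothetical helpers.

# Read paths on stdin; print those that look like stack definitions
# (YAML files under hosts/ or common/).
filter_stack_files() {
  grep -E '^(hosts|common)/.*\.ya?ml$' || true
}

# Run `docker compose config -q` on each changed stack file.
# Returns non-zero if any file fails validation.
validate_changed_stacks() {
  git diff --name-only HEAD~1 HEAD | filter_stack_files | (
    rc=0
    while IFS= read -r file; do
      [ -f "$file" ] || continue   # skip files deleted by the commit
      if ! docker compose -f "$file" config -q; then
        echo "invalid: $file" >&2
        rc=1
      fi
    done
    exit "$rc"
  )
}
```

Running `validate_changed_stacks && git push origin main` from the repository root would then push only when every changed stack file parses.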
### 3. Automatic Deployment
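- Portainer monitors the Git repository for changes (by polling or webhook, depending on each stack's GitOps update settings)
- New commits trigger automatic updates of the affected stacks
- Services are redeployed on the endpoint each stack is assigned to
- Health checks verify successful deployment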
### 4. Monitoring & Verification
```bash
# Check deployment status
ssh -p 60000 vish@192.168.0.200 "sudo /usr/local/bin/docker compose ls"
# Verify service health
ssh -p 60000 vish@192.168.0.200 "sudo /usr/local/bin/docker ps | grep new-service"
```
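Because Portainer redeploys asynchronously, a check run immediately after `git push` may execute before the container exists. A small retry wrapper is one way to handle that. This is a sketch: `retry_until` is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary values.

```bash
# Sketch: retry a command until it succeeds or the attempts run out.
# Usage: retry_until <attempts> <delay_seconds> <command> [args...]
retry_until() {
  local attempts="$1"
  local delay="$2"
  local i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the final one
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (12 attempts, 10 seconds apart):
# retry_until 12 10 ssh -p 60000 vish@192.168.0.200 \
#   "sudo /usr/local/bin/docker ps --format '{{.Names}}' | grep -qx new-service"
```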
## 📁 Repository Structure for GitOps
### Host-Specific Configurations
All stacks use canonical `hosts/` paths. The root-level legacy directories (`Atlantis/`, `Calypso/`, etc.) are symlinks kept only for backwards compatibility — do not use them for new stacks.
```
homelab/
├── hosts/
│   ├── synology/
│   │   ├── atlantis/              # Synology DS1823xs+ (Primary NAS)
│   │   │   ├── arr-suite/         # Media automation stack
│   │   │   ├── immich/            # Photo management
│   │   │   ├── ollama/            # AI/LLM services
│   │   │   └── *.yaml             # Individual service configs
│   │   └── calypso/               # Synology DS723+ (Secondary NAS)
│   │       ├── authentik/         # SSO platform
│   │       ├── immich/            # Photo backup
│   │       ├── paperless/         # Document management
│   │       └── *.yaml             # Service configurations
│   ├── physical/
│   │   └── concord-nuc/           # Intel NUC (Edge Computing)
│   │       ├── homeassistant.yaml
│   │       ├── invidious/         # YouTube frontend
│   │       └── *.yaml
│   ├── vms/
│   │   └── homelab-vm/            # Proxmox VM
│   │       ├── monitoring.yaml    # Prometheus + Grafana
│   │       └── *.yaml             # Cloud service configs
│   └── edge/
│       └── rpi5-vish/             # Raspberry Pi 5 (IoT/Edge)
│           └── *.yaml
└── common/                        # Shared configurations
    └── watchtower-full.yaml       # Auto-update (all hosts)
```
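When older notes or scripts still reference a legacy root-level directory, the path can be translated mechanically. The sketch below assumes that `Atlantis/` and `Calypso/` link to `hosts/synology/atlantis/` and `hosts/synology/calypso/` respectively, as the tree above suggests. Other legacy directories are left unmapped because their names are not listed here, and `legacy_to_canonical` is a hypothetical helper.

```bash
# Sketch: translate a legacy root-level path to its canonical hosts/ path.
# Only the two legacy directories named in this guide are mapped.
legacy_to_canonical() {
  case "$1" in
    Atlantis/*) printf 'hosts/synology/atlantis/%s\n' "${1#Atlantis/}" ;;
    Calypso/*)  printf 'hosts/synology/calypso/%s\n' "${1#Calypso/}" ;;
    hosts/*|common/*) printf '%s\n' "$1" ;;   # already canonical
    *) echo "no known mapping for: $1" >&2; return 1 ;;
  esac
}
```

For example, `legacy_to_canonical Atlantis/new-service.yaml` prints `hosts/synology/atlantis/new-service.yaml`.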
### Service Categories
- **Media & Entertainment**: Plex, Jellyfin, *arr suite, Immich
- **Development & DevOps**: Gitea, Portainer, monitoring stack
- **Productivity**: Paperless-ngx, Joplin, Syncthing
- **Network & Infrastructure**: AdGuard, Nginx Proxy Manager, Authentik
- **Communication**: Stoatchat, Matrix, Jitsi
- **Utilities**: Watchtower, theme-park, IT Tools
## 🔧 Service Management Operations
### Adding a New Service
1. **Create Service Configuration**
```bash
# Create new service file under the canonical hosts/ path
cat > hosts/synology/atlantis/new-service.yaml << 'EOF'
version: '3.8'
services:
  new-service:
    image: example/service:latest
    container_name: new-service
    ports:
      - "8080:8080"
    volumes:
      - /volume1/docker/new-service:/data
    restart: unless-stopped
EOF
```
2. **Commit and Deploy**
```bash
git add hosts/synology/atlantis/new-service.yaml
git commit -m "Add new-service deployment"
git push origin main
```
3. **Verify Deployment**
```bash
# Check if stack was created
ssh -p 60000 vish@192.168.0.200 "sudo /usr/local/bin/docker compose ls | grep new-service"
# Verify container is running
ssh -p 60000 vish@192.168.0.200 "sudo /usr/local/bin/docker ps | grep new-service"
```
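Since the steps above repeat for every new service, the compose file can be scaffolded from a few parameters. This is a sketch only: `render_compose` is a hypothetical helper, it mirrors the example layout used in this guide (one container, one port, one data volume under `/volume1/docker/`), and real services will usually need further edits.

```bash
# Sketch: print a minimal compose file for a new service.
# Usage: render_compose <name> <image> <port>
render_compose() {
  local name="$1"
  local image="$2"
  local port="$3"
  cat <<EOF
version: '3.8'
services:
  ${name}:
    image: ${image}
    container_name: ${name}
    ports:
      - "${port}:${port}"
    volumes:
      - /volume1/docker/${name}:/data
    restart: unless-stopped
EOF
}
```

A call such as `render_compose new-service example/service:latest 8080 > hosts/synology/atlantis/new-service.yaml` would produce the same file as step 1.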
### Updating an Existing Service
1. **Modify Configuration**
```bash
# Edit existing service
nano hosts/synology/atlantis/existing-service.yaml
```
2. **Commit Changes**
```bash
git add hosts/synology/atlantis/existing-service.yaml
git commit -m "Update existing-service configuration

- Upgrade to latest image version
- Add new environment variables
- Update volume mounts"
git push origin main
```
3. **Monitor Update**
- Portainer will automatically pull changes
- Service will be redeployed with new configuration
- Check Portainer UI for deployment status
### Removing a Service
1. **Remove Configuration File**
```bash
git rm hosts/synology/atlantis/old-service.yaml
git commit -m "Remove old-service deployment"
git push origin main
```
2. **Manual Cleanup (if needed)**
```bash
# Remove any persistent volumes or data
ssh -p 60000 vish@192.168.0.200 "sudo rm -rf /volume1/docker/old-service"
```
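Because the cleanup command runs `rm -rf` as root, an empty or malformed service name could delete far more than intended. The sketch below builds the data path only for plain names. `stack_data_dir` is a hypothetical helper, and `/volume1/docker` is the base path used in this guide's examples.

```bash
# Sketch: resolve a service's data directory, rejecting unsafe names.
# Accepts only letters, digits, dots, underscores, and hyphens.
stack_data_dir() {
  local name="$1"
  case "$name" in
    ''|.|..|*[!A-Za-z0-9._-]*)
      echo "refusing unsafe service name: '$name'" >&2
      return 1
      ;;
  esac
  printf '/volume1/docker/%s\n' "$name"
}

# Example:
# dir=$(stack_data_dir old-service) &&
#   ssh -p 60000 vish@192.168.0.200 "sudo rm -rf '$dir'"
```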
## 🔍 Monitoring & Troubleshooting
### GitOps Health Checks
#### Check Portainer Status
```bash
# Verify Portainer is running
curl -k -s "https://192.168.0.200:9443/api/system/status"
# Check container status
ssh -p 60000 vish@192.168.0.200 "sudo /usr/local/bin/docker ps | grep portainer"
```
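The status response can also be reduced to just the version string, which makes it easy to compare against the version listed under Core Components. This sketch assumes the JSON response contains a `"Version"` string field; `portainer_version` is a hypothetical helper that uses `sed` so that `jq` is not required on the host.

```bash
# Sketch: extract the version from a Portainer status response on stdin.
# Prints nothing if no "Version" field is found.
portainer_version() {
  sed -n 's/.*"Version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example:
# curl -k -s "https://192.168.0.200:9443/api/system/status" | portainer_version
```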
#### Verify Git Sync Status
```bash
# Check if Portainer can access Git repository
# (Check via Portainer UI: Stacks → Repository sync status)
# Verify latest commits are reflected
git log --oneline -5
```
#### Monitor Stack Deployments
```bash
# List all active stacks
ssh -p 60000 vish@192.168.0.200 "sudo /usr/local/bin/docker compose ls"
# Check specific stack status
ssh -p 60000 vish@192.168.0.200 "sudo /usr/local/bin/docker compose -f /path/to/stack.yaml ps"
```
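With dozens of stacks per host, scanning the `docker compose ls` table by eye is error-prone. The sketch below filters that table down to stacks that are not fully running. It assumes the default table layout (a header row, then `NAME` and `STATUS` as the first two columns, with status values such as `running(3)`); `non_running_stacks` is a hypothetical helper.

```bash
# Sketch: read `docker compose ls -a` output on stdin and print the names
# of stacks whose status is anything other than exactly "running(N)".
# Mixed states such as "running(2), exited(1)" are reported as well.
non_running_stacks() {
  awk 'NR > 1 && $2 !~ /^running\([0-9]+\)$/ { print $1 }'
}

# Example:
# ssh -p 60000 vish@192.168.0.200 "sudo /usr/local/bin/docker compose ls -a" \
#   | non_running_stacks
```

Stacks that are stopped intentionally will appear in the output and can be ignored.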
### Common Issues & Solutions
#### Stack Deployment Fails
1. **Check YAML Syntax**
```bash
# Validate YAML syntax
yamllint hosts/synology/atlantis/service.yaml
# Check Docker Compose syntax
docker compose -f hosts/synology/atlantis/service.yaml config
```
2. **Review Portainer Logs**
```bash
ssh -p 60000 vish@192.168.0.200 "sudo /usr/local/bin/docker logs portainer"
```
3. **Check Resource Constraints**
```bash
# Verify disk space
ssh -p 60000 vish@192.168.0.200 "df -h"
# Check memory usage
ssh -p 60000 vish@192.168.0.200 "free -h"
```
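The disk check can be turned into a pass/fail signal by filtering on a usage threshold. This sketch reads POSIX-format `df -P` output, where the fifth column is the capacity percentage and the sixth is the mount point. `full_filesystems` is a hypothetical helper, the default threshold of 90% is an arbitrary choice, and mount points containing spaces are not handled.

```bash
# Sketch: read `df -P` output on stdin and print mount points whose usage
# is at or above the given percentage (default 90).
# Usage: df -P | full_filesystems [threshold]
full_filesystems() {
  awk -v limit="${1:-90}" '
    NR > 1 {
      use = $5
      sub(/%/, "", use)            # strip the percent sign
      if (use + 0 >= limit + 0) print $6, $5
    }'
}

# Example:
# ssh -p 60000 vish@192.168.0.200 "df -P" | full_filesystems 85
```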
#### Git Repository Access Issues
1. **Verify Repository URL**
2. **Check Authentication credentials**
3. **Confirm network connectivity**
#### Service Won't Start
1. **Check container logs**
```bash
ssh -p 60000 vish@192.168.0.200 "sudo /usr/local/bin/docker logs service-name"
```
2. **Verify port conflicts**
```bash
ssh -p 60000 vish@192.168.0.200 "sudo netstat -tulpn | grep :PORT"
```
3. **Check volume mounts**
```bash
ssh -p 60000 vish@192.168.0.200 "ls -la /volume1/docker/service-name"
```
## 🔐 Security Considerations
### GitOps Security Best Practices
- **Repository Access**: Secure Git repository with appropriate access controls
- **Secrets Management**: Use Docker secrets or external secret management
- **Network Security**: Services deployed on isolated Docker networks
- **Regular Updates**: Watchtower ensures containers stay updated
### Access Control
- **Portainer Authentication**: Multi-user access with role-based permissions
- **SSH Access**: Key-based authentication for server management
- **Service Authentication**: Individual service authentication where applicable
## 📈 Performance & Scaling
### Resource Monitoring
- **Container Metrics**: Monitor CPU, memory, and disk usage
- **Network Performance**: Track bandwidth and connection metrics
- **Storage Utilization**: Monitor disk space across all hosts
### Scaling Strategies
- **Horizontal Scaling**: Deploy services across multiple hosts
- **Load Balancing**: Use Nginx Proxy Manager for traffic distribution
- **Resource Optimization**: Optimize container resource limits
## 🔄 Backup & Disaster Recovery
### GitOps Backup Strategy
- **Repository Backup**: Git repository is the source of truth
- **Configuration Backup**: All service configurations version controlled
- **Data Backup**: Persistent volumes backed up separately
### Recovery Procedures
1. **Service Recovery**: Redeploy from Git repository
2. **Data Recovery**: Restore from backup volumes
3. **Full Infrastructure Recovery**: Bootstrap new hosts with GitOps
## 📚 Related Documentation
- [GITOPS_DEPLOYMENT_GUIDE.md](../GITOPS_DEPLOYMENT_GUIDE.md) - Original deployment guide
- [MONITORING_ARCHITECTURE.md](../MONITORING_ARCHITECTURE.md) - Monitoring setup
- [docs/admin/portainer-backup.md](portainer-backup.md) - Portainer backup procedures
- [docs/runbooks/add-new-service.md](../runbooks/add-new-service.md) - Service deployment runbook
## 🎯 Next Steps
### Short Term
- [ ] Set up automated GitOps health monitoring
- [ ] Create service deployment templates
- [ ] Implement automated testing for configurations
### Medium Term
- [ ] Expand GitOps to additional hosts
- [ ] Implement blue-green deployments
- [ ] Add configuration validation pipelines
### Long Term
- [ ] Migrate to Kubernetes GitOps (ArgoCD/Flux)
- [ ] Implement infrastructure as code (Terraform)
- [ ] Add automated disaster recovery testing
---
**Document Status**: ✅ Active
**Deployment Method**: GitOps via Portainer EE
**Last Verified**: March 8, 2026
**Next Review**: April 8, 2026