# 🤖 AGENTS.md - Homelab Repository Guide

*AI Agent contributor guide for Vish's homelab infrastructure repository*

## Agent Identity

- **Nickname**: Vesper

## Repository Overview

This is a **GitOps-managed homelab infrastructure** repository containing Docker Compose configurations, documentation, and automation scripts for a comprehensive homelab setup.

### Key Characteristics

- **65+ active Portainer stacks** deployed via GitOps
- **Multi-host architecture**: Atlantis, Calypso, homelab_vm, concord_nuc, raspberry-pi-5-vish
- **Production environment**: Live services with 24/7 uptime requirements
- **Comprehensive monitoring**: Prometheus, Grafana, AlertManager stack
- **Documentation-heavy**: Extensive markdown documentation with cross-references

## Repository Structure

```
homelab/
├── hosts/                    # Host-specific configurations
│   ├── Atlantis/             # Primary NAS (Synology DS1821+)
│   ├── Calypso/              # Secondary NAS/compute
│   ├── homelab_vm/           # Main VM services
│   ├── concord_nuc/          # Intel NUC services
│   └── raspberry-pi-5-vish/  # Pi-based services
├── docs/                     # Comprehensive documentation
│   ├── getting-started/      # Beginner guides
│   ├── infrastructure/       # Infrastructure docs
│   ├── services/             # Service documentation
│   ├── admin/                # Administrative guides
│   └── troubleshooting/      # Problem resolution
├── common/                   # Shared configurations
├── scripts/                  # Automation utilities
├── ansible/                  # Ansible playbooks
└── archive/                  # Deprecated configurations
```

## Critical Guidelines

### 🚨 Production Environment

- **NEVER modify production compose files** without understanding impact
- **Test changes in development** before applying to production
- **Verify GitOps compatibility** - Portainer pulls from this repo
- **Maintain service availability** - 65+ services depend on these configs

### 📝 Documentation Standards

- **Fix broken links** when found (currently ~4 remaining)
- **Update cross-references** when moving/renaming files
- **Maintain INDEX.md** as the central navigation hub
- **Use relative paths** for internal documentation links

### 🔧 GitOps Workflow

- **All changes go through Git** - Portainer auto-deploys from main branch
- **Preserve file paths** - Stacks reference specific file locations
- **Test deployments** before pushing to main
- **Monitor stack health** after changes

## Common Tasks

### Adding New Services

1. **Choose appropriate host** based on resource requirements
2. **Create docker-compose.yml** in host directory
3. **Add documentation** in `docs/services/individual/`
4. **Update service inventory** in `docs/services/`
5. **Test deployment** via Portainer
6. **Monitor service health**

### Documentation Updates

1. **Check for broken links** using link checker scripts
2. **Update INDEX.md** if adding new major sections
3. **Maintain consistent formatting** with existing docs
4. **Test all cross-references** after changes

### Infrastructure Changes

1. **Document changes** in appropriate infrastructure docs
2. **Update monitoring** if adding new hosts/services
3. **Verify backup coverage** for new systems
4. **Update network documentation** if needed

## Service Categories

### Core Infrastructure

- **Monitoring**: Prometheus, Grafana, AlertManager
- **Networking**: Nginx Proxy Manager, Pi-hole, WireGuard
- **Storage**: Syncthing, Seafile, backup services
- **Authentication**: Authentik SSO

### Media & Entertainment

- **Streaming**: Plex, Jellyfin
- **Management**: Arr suite (Sonarr, Radarr, etc.)
- **Books**: Calibre, AudioBookShelf

### Productivity

- **Communication**: Matrix, Mattermost, Mastodon
- **Documents**: Paperless-ngx, Stirling PDF
- **Development**: Gitea, OpenHands, CI/CD runners

### Home Automation

- **Platform**: Home Assistant
- **Protocols**: Zigbee2MQTT, Z-Wave
- **Monitoring**: Various IoT sensors

## Monitoring & Alerting

### Key Metrics

- **Service availability**: All services monitored via Uptime Kuma
- **System resources**: CPU, memory, disk, network
- **Container health**: Docker container status
- **Network performance**: Latency, throughput

### Alert Channels

- **NTFY**: Push notifications for critical alerts
- **Email**: Backup notification channel
- **Dashboard**: Grafana visual alerts

## Backup Strategy

### Data Protection

- **3-2-1 rule**: 3 copies, 2 different media, 1 offsite
- **Automated backups**: Daily incremental, weekly full
- **Configuration backups**: Docker volumes, configs
- **Documentation backups**: Git repository mirroring

### Recovery Procedures

- **Service restoration**: Docker stack redeployment
- **Data recovery**: Backup restoration procedures
- **Disaster recovery**: Complete infrastructure rebuild

## Security Considerations

### Access Control

- **VPN-only access**: Tailscale mesh network
- **SSO integration**: Authentik for centralized auth
- **Network segmentation**: VLANs for different service tiers
- **Regular updates**: Automated security patching

### Data Protection

- **Encryption**: Data at rest and in transit
- **Secrets management**: Docker secrets, environment variables
- **Audit logging**: Comprehensive access logging
- **Vulnerability scanning**: Regular security assessments

## Development Workflow

### Local Development

1. **Clone repository** to development environment
2. **Test changes** in isolated environment
3. **Validate compose files** using validation scripts
4. **Check documentation links** before committing

### Deployment Process

1. **Commit changes** to feature branch
2. **Test deployment** in staging environment
3. **Merge to main** after validation
4. **Monitor GitOps deployment** via Portainer
5. **Verify service health** post-deployment

## Troubleshooting

### Common Issues

- **Service startup failures**: Check logs, resource constraints
- **Network connectivity**: Verify network configuration
- **Storage issues**: Check disk space, mount points
- **Authentication problems**: Verify SSO configuration

### Diagnostic Tools

- **Portainer**: Container management and logs
- **Grafana**: Performance metrics and alerts
- **SSH access**: Direct system administration
- **Log aggregation**: Centralized logging system

## Best Practices

### Code Quality

- **Use official images** when possible
- **Pin image versions** for stability
- **Document environment variables** and volumes
- **Follow Docker best practices** for security

### Documentation

- **Keep docs current** with infrastructure changes
- **Use clear, descriptive titles** and sections
- **Include troubleshooting steps** for common issues
- **Maintain consistent formatting** across all docs

### Monitoring

- **Monitor everything** that matters to service availability
- **Set appropriate thresholds** to avoid alert fatigue
- **Document alert procedures** for quick response
- **Regular health checks** for all critical services

---

**Remember**: This is a production homelab with real users and services. Changes should be made thoughtfully with proper testing and documentation. When in doubt, ask questions and test thoroughly before deploying to production.

**Status**: ✅ Repository actively maintained with 65+ production services
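
The "Pin image versions" practice under Code Quality can be spot-checked before committing a compose file. The following is a minimal sketch, not one of this repository's validation scripts: the function names `is_pinned` and `unpinned_images` are hypothetical, and it assumes each service declares its image on a single `image:` line. It scans the raw text rather than parsing YAML, so it will likely miss unusual layouts.

```python
import re
import sys
from pathlib import Path

# Matches a compose "image:" line and captures the reference, with or without quotes.
IMAGE_LINE = re.compile(r"^\s*image:\s*[\"']?([^\s\"'#]+)")


def is_pinned(image: str) -> bool:
    """Return True if the reference has a digest or a tag other than 'latest'."""
    if "@" in image:  # digest reference such as repo@sha256:...
        return True
    last_segment = image.rsplit("/", 1)[-1]  # drop any registry host:port prefix
    if ":" not in last_segment:  # no tag given, so Docker falls back to 'latest'
        return False
    tag = last_segment.rsplit(":", 1)[1]
    return tag not in ("", "latest")


def unpinned_images(compose_text: str) -> list[str]:
    """Collect image references in compose file text that are not pinned."""
    found = []
    for line in compose_text.splitlines():
        match = IMAGE_LINE.match(line)
        if match and not is_pinned(match.group(1)):
            found.append(match.group(1))
    return found


if __name__ == "__main__":
    # Usage: python check_pins.py path/to/docker-compose.yml [more files...]
    exit_code = 0
    for path in sys.argv[1:]:
        for image in unpinned_images(Path(path).read_text()):
            print(f"{path}: unpinned image {image}")
            exit_code = 1
    sys.exit(exit_code)
```

Run against one or more compose files, the script prints each unpinned reference and exits non-zero, which would make it usable as a pre-commit check if that fits the workflow. The file name `check_pins.py` in the usage comment is illustrative only.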