🤖 AGENTS.md - Homelab Repository Guide
AI Agent contributor guide for Vish's homelab infrastructure repository
Agent Identity
- Nickname: Vesper
Repository Overview
This is a GitOps-managed homelab infrastructure repository containing Docker Compose configurations, documentation, and automation scripts for a comprehensive homelab setup.
Key Characteristics
- 65+ active Portainer stacks deployed via GitOps
- Multi-host architecture: Atlantis, Calypso, homelab_vm, concord_nuc, raspberry-pi-5-vish
- Production environment: Live services with 24/7 uptime requirements
- Comprehensive monitoring: Prometheus, Grafana, AlertManager stack
- Documentation-heavy: Extensive markdown documentation with cross-references
Repository Structure
homelab/
├── hosts/ # Host-specific configurations
│ ├── Atlantis/ # Primary NAS (Synology DS1821+)
│ ├── Calypso/ # Secondary NAS/compute
│ ├── homelab_vm/ # Main VM services
│ ├── concord_nuc/ # Intel NUC services
│ └── raspberry-pi-5-vish/ # Pi-based services
├── docs/ # Comprehensive documentation
│ ├── getting-started/ # Beginner guides
│ ├── infrastructure/ # Infrastructure docs
│ ├── services/ # Service documentation
│ ├── admin/ # Administrative guides
│ └── troubleshooting/ # Problem resolution
├── common/ # Shared configurations
├── scripts/ # Automation utilities
├── ansible/ # Ansible playbooks
└── archive/ # Deprecated configurations
Critical Guidelines
🚨 Production Environment
- NEVER modify production compose files without understanding the impact
- Test changes in development before applying to production
- Verify GitOps compatibility - Portainer pulls from this repo
- Maintain service availability - 65+ services depend on these configs
📝 Documentation Standards
- Fix broken links when found (currently ~4 remaining)
- Update cross-references when moving/renaming files
- Maintain INDEX.md as the central navigation hub
- Use relative paths for internal documentation links
🔧 GitOps Workflow
- All changes go through Git - Portainer auto-deploys from main branch
- Preserve file paths - Stacks reference specific file locations
- Test deployments before pushing to main
- Monitor stack health after changes
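Because stacks reference specific file locations, a quick pre-push check can catch a moved or renamed compose file before Portainer tries to pull it. The following is a minimal sketch, not part of the repository's tooling: the function name `missing_stack_paths` and the idea of keeping a list of stack-referenced paths are assumptions made for illustration, and the paths in the usage example are hypothetical.

```python
from pathlib import Path


def missing_stack_paths(repo_root, stack_paths):
    """Return the stack-referenced paths that are not files under repo_root.

    stack_paths is assumed to be a list of repo-relative compose file paths
    that deployed stacks point at (collected by hand or by other means).
    """
    root = Path(repo_root)
    return sorted(p for p in stack_paths if not (root / p).is_file())


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    referenced = [
        "hosts/Atlantis/example-service/docker-compose.yml",
        "hosts/Calypso/another-service/docker-compose.yml",
    ]
    for path in missing_stack_paths(".", referenced):
        print(f"missing: {path}")
```

An empty result means every listed path still resolves to a file; any path printed as missing points at a stack that would likely fail to redeploy after the change.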
Common Tasks
Adding New Services
- Choose appropriate host based on resource requirements
- Create docker-compose.yml in host directory
- Add documentation in docs/services/individual/
- Update service inventory in docs/services/
- Test deployment via Portainer
- Monitor service health
Documentation Updates
- Check for broken links using link checker scripts
- Update INDEX.md if adding new major sections
- Maintain consistent formatting with existing docs
- Test all cross-references after changes
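The link check mentioned above could look roughly like the sketch below. It is not the repository's actual link checker script (which is not shown here); the function name `broken_relative_links` is hypothetical, and the sketch only handles inline `[text](target)` links, skipping external URLs and in-page anchors.

```python
import re
from pathlib import Path

# Matches inline Markdown links such as [text](target); image links match too.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Matches targets that start with a URL scheme (https:, mailto:, ...).
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_relative_links(docs_root):
    """Return (file, target) pairs whose relative link target does not exist."""
    root = Path(docs_root)
    broken = []
    for md_file in sorted(root.rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        for target in LINK_RE.findall(text):
            # Skip external URLs and links to an anchor in the same file.
            if SCHEME_RE.match(target) or target.startswith("#"):
                continue
            # Drop a trailing "#section" before checking the file system.
            path_part = target.split("#", 1)[0]
            if not (md_file.parent / path_part).exists():
                broken.append((str(md_file.relative_to(root)), target))
    return broken


if __name__ == "__main__":
    for source, target in broken_relative_links("docs"):
        print(f"{source}: broken link -> {target}")
```

Targets are resolved relative to the file that contains them, which matches the guideline to use relative paths for internal documentation links. Reference-style links and anchor validity are out of scope for this sketch.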
Infrastructure Changes
- Document changes in appropriate infrastructure docs
- Update monitoring if adding new hosts/services
- Verify backup coverage for new systems
- Update network documentation if needed
Service Categories
Core Infrastructure
- Monitoring: Prometheus, Grafana, AlertManager
- Networking: Nginx Proxy Manager, Pi-hole, WireGuard
- Storage: Syncthing, Seafile, backup services
- Authentication: Authentik SSO
Media & Entertainment
- Streaming: Plex, Jellyfin
- Management: Arr suite (Sonarr, Radarr, etc.)
- Books: Calibre, AudioBookShelf
Productivity
- Communication: Matrix, Mattermost, Mastodon
- Documents: Paperless-ngx, Stirling PDF
- Development: Gitea, OpenHands, CI/CD runners
Home Automation
- Platform: Home Assistant
- Protocols: Zigbee2MQTT, Z-Wave
- Monitoring: Various IoT sensors
Monitoring & Alerting
Key Metrics
- Service availability: All services monitored via Uptime Kuma
- System resources: CPU, memory, disk, network
- Container health: Docker container status
- Network performance: Latency, throughput
Alert Channels
- NTFY: Push notifications for critical alerts
- Email: Backup notification channel
- Dashboard: Grafana visual alerts
Backup Strategy
Data Protection
- 3-2-1 rule: 3 copies, 2 different media, 1 offsite
- Automated backups: Daily incremental, weekly full
- Configuration backups: Docker volumes, configs
- Documentation backups: Git repository mirroring
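The 3-2-1 rule is simple arithmetic, so it can be checked mechanically against an inventory of backup copies. The sketch below is illustrative only: the `BackupCopy` type, its field names, and the example locations are assumptions, not a description of how backups are actually tracked in this homelab. The primary copy is counted as one of the three.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    location: str  # hypothetical label, e.g. a host or provider name
    medium: str    # storage medium, e.g. "nas", "usb-disk", "cloud"
    offsite: bool  # True if the copy lives outside the primary site


def satisfies_3_2_1(copies):
    """Check for at least 3 copies, on at least 2 media, with at least 1 offsite."""
    distinct_media = {copy.medium for copy in copies}
    offsite_count = sum(1 for copy in copies if copy.offsite)
    return len(copies) >= 3 and len(distinct_media) >= 2 and offsite_count >= 1


if __name__ == "__main__":
    # Hypothetical inventory, for illustration only.
    inventory = [
        BackupCopy("primary-nas", "nas", offsite=False),
        BackupCopy("secondary-nas", "nas", offsite=False),
        BackupCopy("cloud-bucket", "cloud", offsite=True),
    ]
    print(satisfies_3_2_1(inventory))
```

Removing the offsite copy from the example inventory makes the check fail on two counts: only two copies remain, and none is offsite.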
Recovery Procedures
- Service restoration: Docker stack redeployment
- Data recovery: Backup restoration procedures
- Disaster recovery: Complete infrastructure rebuild
Security Considerations
Access Control
- VPN-only access: Tailscale mesh network
- SSO integration: Authentik for centralized auth
- Network segmentation: VLANs for different service tiers
- Regular updates: Automated security patching
Data Protection
- Encryption: Data at rest and in transit
- Secrets management: Docker secrets, environment variables
- Audit logging: Comprehensive access logging
- Vulnerability scanning: Regular security assessments
Development Workflow
Local Development
- Clone repository to development environment
- Test changes in isolated environment
- Validate compose files using validation scripts
- Check documentation links before committing
Deployment Process
- Commit changes to feature branch
- Test deployment in staging environment
- Merge to main after validation
- Monitor GitOps deployment via Portainer
- Verify service health post-deployment
Troubleshooting
Common Issues
- Service startup failures: Check logs, resource constraints
- Network connectivity: Verify network configuration
- Storage issues: Check disk space, mount points
- Authentication problems: Verify SSO configuration
Diagnostic Tools
- Portainer: Container management and logs
- Grafana: Performance metrics and alerts
- SSH access: Direct system administration
- Log aggregation: Centralized logging system
Best Practices
Code Quality
- Use official images when possible
- Pin image versions for stability
- Document environment variables and volumes
- Follow Docker best practices for security
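To make the version-pinning guideline checkable, a small script can flag image references that carry no tag or use the `latest` tag. This is a rough sketch under stated assumptions: the function name `unpinned_images` is hypothetical, it scans `image:` lines with a regular expression rather than parsing YAML, and the image tags in the usage example are illustrative rather than the versions actually deployed.

```python
import re

# Captures the value of an "image:" key, ignoring optional quotes and comments.
IMAGE_RE = re.compile(r"^\s*image:\s*['\"]?([^'\"\s#]+)")


def unpinned_images(compose_text):
    """Return image references with no tag or with the 'latest' tag."""
    unpinned = []
    for line in compose_text.splitlines():
        match = IMAGE_RE.match(line)
        if not match:
            continue
        image = match.group(1)
        if "@sha256:" in image:
            continue  # pinned by digest
        # A tag follows the ':' in the last path segment; a ':' earlier in the
        # reference is a registry port (for example localhost:5000/app).
        last_segment = image.rsplit("/", 1)[-1]
        _name, separator, tag = last_segment.partition(":")
        if not separator or tag == "latest":
            unpinned.append(image)
    return unpinned


if __name__ == "__main__":
    # Illustrative compose fragment; tags are examples only.
    sample = """
services:
  grafana:
    image: grafana/grafana:10.4.2
  plex:
    image: plexinc/pms-docker
"""
    print(unpinned_images(sample))
```

The sketch treats any explicit tag other than `latest` as pinned, including a tag supplied through variable substitution such as `${TAG}`, so it is a first-pass filter rather than a guarantee.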
Documentation
- Keep docs current with infrastructure changes
- Use clear, descriptive titles and sections
- Include troubleshooting steps for common issues
- Maintain consistent formatting across all docs
Monitoring
- Monitor everything that matters to service availability
- Set appropriate thresholds to avoid alert fatigue
- Document alert procedures for quick response
- Regular health checks for all critical services
Remember: This is a production homelab with real users and services. Changes should be made thoughtfully with proper testing and documentation. When in doubt, ask questions and test thoroughly before deploying to production.
Status: ✅ Repository actively maintained with 65+ production services