# 🤖 AGENTS.md - Homelab Repository Guide

*AI Agent contributor guide for Vish's homelab infrastructure repository*

## Agent Identity

- **Nickname**: Vesper

## Repository Overview

This is a **GitOps-managed homelab infrastructure** repository containing Docker Compose configurations, documentation, and automation scripts for a comprehensive homelab setup.

### Key Characteristics
- **65+ active Portainer stacks** deployed via GitOps
- **Multi-host architecture**: Atlantis, Calypso, homelab_vm, concord_nuc, raspberry-pi-5-vish
- **Production environment**: Live services with 24/7 uptime requirements
- **Comprehensive monitoring**: Prometheus, Grafana, AlertManager stack
- **Documentation-heavy**: Extensive markdown documentation with cross-references

## Repository Structure

```
homelab/
├── hosts/                      # Host-specific configurations
│   ├── Atlantis/               # Primary NAS (Synology DS1821+)
│   ├── Calypso/                # Secondary NAS/compute
│   ├── homelab_vm/             # Main VM services
│   ├── concord_nuc/            # Intel NUC services
│   └── raspberry-pi-5-vish/    # Pi-based services
├── docs/                       # Comprehensive documentation
│   ├── getting-started/        # Beginner guides
│   ├── infrastructure/         # Infrastructure docs
│   ├── services/               # Service documentation
│   ├── admin/                  # Administrative guides
│   └── troubleshooting/        # Problem resolution
├── common/                     # Shared configurations
├── scripts/                    # Automation utilities
├── ansible/                    # Ansible playbooks
└── archive/                    # Deprecated configurations
```

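As a rough illustration of how the layout above could be walked, the sketch below groups compose files by host directory. It assumes compose files are named `docker-compose.yml` or `docker-compose.yaml` and live somewhere under `hosts/<host>/`; the function name `compose_files_by_host` is hypothetical and not part of this repository.

```python
from pathlib import Path

# Assumption: these are the only compose file names used in the repo.
COMPOSE_NAMES = {"docker-compose.yml", "docker-compose.yaml"}


def compose_files_by_host(repo_root):
    """Map each directory under hosts/ to the compose files found beneath it."""
    root = Path(repo_root)
    hosts_dir = root / "hosts"
    if not hosts_dir.is_dir():
        return {}
    found = {}
    for host in sorted(p for p in hosts_dir.iterdir() if p.is_dir()):
        # Paths are reported relative to the repo root, with forward slashes.
        found[host.name] = sorted(
            f.relative_to(root).as_posix()
            for f in host.rglob("*")
            if f.name in COMPOSE_NAMES
        )
    return found
```

Calling `compose_files_by_host(".")` from the repository root would give a quick per-host inventory to compare against the stacks registered in Portainer.
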
## Critical Guidelines

### 🚨 Production Environment
- **NEVER modify production compose files** without understanding impact
- **Test changes in development** before applying to production
- **Verify GitOps compatibility** - Portainer pulls from this repo
- **Maintain service availability** - 65+ services depend on these configs

### 📝 Documentation Standards
- **Fix broken links** when found (currently ~4 remaining)
- **Update cross-references** when moving/renaming files
- **Maintain INDEX.md** as the central navigation hub
- **Use relative paths** for internal documentation links

### 🔧 GitOps Workflow
- **All changes go through Git** - Portainer auto-deploys from main branch
- **Preserve file paths** - Stacks reference specific file locations
- **Test deployments** before pushing to main
- **Monitor stack health** after changes

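Because stacks reference specific file locations, one possible pre-push check is to confirm that every path a stack points at still exists in the working tree. The sketch below assumes you keep (or export) a list of those paths yourself; `missing_stack_files` and the example paths are hypothetical.

```python
from pathlib import Path


def missing_stack_files(repo_root, stack_paths):
    """Return the stack compose paths that no longer exist under repo_root."""
    root = Path(repo_root)
    return [p for p in stack_paths if not (root / p).is_file()]


if __name__ == "__main__":
    # Illustrative paths only; the real list would come from your stack definitions.
    paths = ["hosts/Atlantis/example/docker-compose.yml"]
    for path in missing_stack_files(".", paths):
        print(f"stack file missing: {path}")
```

An empty result means no tracked path was moved or deleted by the change.
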
## Common Tasks

### Adding New Services
1. **Choose appropriate host** based on resource requirements
2. **Create docker-compose.yml** in host directory
3. **Add documentation** in `docs/services/individual/`
4. **Update service inventory** in `docs/services/`
5. **Test deployment** via Portainer
6. **Monitor service health**

### Documentation Updates
1. **Check for broken links** using link checker scripts
2. **Update INDEX.md** if adding new major sections
3. **Maintain consistent formatting** with existing docs
4. **Test all cross-references** after changes

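The link checker scripts mentioned above are not reproduced in this guide. As a stand-in, here is a minimal sketch of how a relative-link check might work, assuming inline `[text](target)` links only (reference-style links and images are ignored). The name `broken_relative_links` is hypothetical.

```python
import re
from pathlib import Path

# Inline Markdown links: [text](target) or [text](target "title"). Images are skipped.
LINK_RE = re.compile(r"(?<!!)\[[^\]]*\]\(([^)\s]+)[^)]*\)")
# Anything starting with a URL scheme (https:, mailto:, ...) is treated as external.
SCHEME_RE = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_relative_links(docs_root):
    """Yield (markdown_file, target) for relative links whose target is missing."""
    root = Path(docs_root)
    for md in sorted(root.rglob("*.md")):
        for target in LINK_RE.findall(md.read_text(encoding="utf-8")):
            if SCHEME_RE.match(target) or target.startswith("#"):
                continue  # external URL or in-page anchor
            path_part = target.split("#", 1)[0]  # drop any #section suffix
            if not (md.parent / path_part).exists():
                yield md.relative_to(root).as_posix(), target
```

Running `list(broken_relative_links("docs"))` before and after moving files shows which cross-references a change has broken.
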
### Infrastructure Changes
1. **Document changes** in appropriate infrastructure docs
2. **Update monitoring** if adding new hosts/services
3. **Verify backup coverage** for new systems
4. **Update network documentation** if needed

## Service Categories

### Core Infrastructure
- **Monitoring**: Prometheus, Grafana, AlertManager
- **Networking**: Nginx Proxy Manager, Pi-hole, WireGuard
- **Storage**: Syncthing, Seafile, backup services
- **Authentication**: Authentik SSO

### Media & Entertainment
- **Streaming**: Plex, Jellyfin
- **Management**: Arr suite (Sonarr, Radarr, etc.)
- **Books**: Calibre, AudioBookShelf

### Productivity
- **Communication**: Matrix, Mattermost, Mastodon
- **Documents**: Paperless-ngx, Stirling PDF
- **Development**: Gitea, OpenHands, CI/CD runners

### Home Automation
- **Platform**: Home Assistant
- **Protocols**: Zigbee2MQTT, Z-Wave
- **Monitoring**: Various IoT sensors

## Monitoring & Alerting

### Key Metrics
- **Service availability**: All services monitored via Uptime Kuma
- **System resources**: CPU, memory, disk, network
- **Container health**: Docker container status
- **Network performance**: Latency, throughput

### Alert Channels
- **NTFY**: Push notifications for critical alerts
- **Email**: Backup notification channel
- **Dashboard**: Grafana visual alerts

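For the NTFY channel, a push notification is an HTTP POST to a topic URL, with optional `Title` and `Priority` headers. The sketch below shows roughly what that could look like from a script. The server URL and topic are placeholders, and the function names are hypothetical; this guide does not state which ntfy server or topics the homelab uses.

```python
import urllib.request


def build_ntfy_request(server, topic, message, title="Homelab alert", priority="high"):
    """Build (but do not send) a POST that publishes `message` to an ntfy topic."""
    url = f"{server.rstrip('/')}/{topic}"
    headers = {"Title": title, "Priority": priority}
    return urllib.request.Request(
        url, data=message.encode("utf-8"), headers=headers, method="POST"
    )


def send_alert(server, topic, message, timeout=10):
    """Send the alert and return the HTTP status code."""
    request = build_ntfy_request(server, topic, message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

For example, `send_alert("https://ntfy.example.com", "homelab-critical", "Atlantis volume degraded")` would publish one high-priority message, assuming that server and topic exist.
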
## Backup Strategy

### Data Protection
- **3-2-1 rule**: 3 copies, 2 different media, 1 offsite
- **Automated backups**: Daily incremental, weekly full
- **Configuration backups**: Docker volumes, configs
- **Documentation backups**: Git repository mirroring

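The 3-2-1 rule can be expressed as a simple check over an inventory of copies. The sketch below assumes the inventory includes the primary copy and that each copy is tagged with a media type by hand; `BackupCopy` and `satisfies_3_2_1` are hypothetical names, and the example values are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    location: str  # where the copy lives (illustrative label)
    media: str     # e.g. "nas", "usb-disk", "cloud"
    offsite: bool  # True if the copy is stored away from the primary site


def satisfies_3_2_1(copies):
    """True when there are >= 3 copies on >= 2 media types with >= 1 offsite."""
    media_types = {c.media for c in copies}
    return (
        len(copies) >= 3
        and len(media_types) >= 2
        and any(c.offsite for c in copies)
    )
```

Three copies on two NAS units in the same room would fail this check twice over: one media type and nothing offsite.
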
### Recovery Procedures
- **Service restoration**: Docker stack redeployment
- **Data recovery**: Backup restoration procedures
- **Disaster recovery**: Complete infrastructure rebuild

## Security Considerations

### Access Control
- **VPN-only access**: Tailscale mesh network
- **SSO integration**: Authentik for centralized auth
- **Network segmentation**: VLANs for different service tiers
- **Regular updates**: Automated security patching

### Data Protection
- **Encryption**: Data at rest and in transit
- **Secrets management**: Docker secrets, environment variables
- **Audit logging**: Comprehensive access logging
- **Vulnerability scanning**: Regular security assessments

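To illustrate the secrets-management bullet: inside a Linux container, Docker mounts each secret as a file under `/run/secrets/`. A helper script might read from there first and fall back to an environment variable. The sketch below assumes the environment variable is the upper-cased secret name, which is a convention chosen for this example rather than anything the repository prescribes; `read_secret` is a hypothetical helper.

```python
import os
from pathlib import Path


def read_secret(name, secrets_dir="/run/secrets", env=None):
    """Return a secret from a Docker secrets file, else from an environment variable."""
    env = os.environ if env is None else env
    secret_file = Path(secrets_dir) / name
    if secret_file.is_file():
        # Secret files often end with a newline; strip surrounding whitespace.
        return secret_file.read_text(encoding="utf-8").strip()
    value = env.get(name.upper())  # assumed naming convention
    if value is None:
        raise KeyError(f"secret {name!r} not found in {secrets_dir} or environment")
    return value
```

Preferring the file over the variable keeps the value out of `docker inspect` output and process listings where a secret is available.
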
## Development Workflow

### Local Development
1. **Clone repository** to development environment
2. **Test changes** in isolated environment
3. **Validate compose files** using validation scripts
4. **Check documentation links** before committing

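The validation scripts referenced in step 3 are not shown here. One way such a check could be approximated is by asking Docker Compose itself to parse the file with `docker compose config --quiet`, which prints nothing on success and exits non-zero on a syntax or schema error. The wrapper below is a sketch under that assumption; the function names are hypothetical, and it needs the `docker` CLI with the Compose plugin on `PATH`.

```python
import subprocess


def compose_config_command(compose_file):
    """Command that asks Docker Compose to parse a file without printing it."""
    return ["docker", "compose", "-f", str(compose_file), "config", "--quiet"]


def validate_compose(compose_file):
    """Return (ok, stderr) for one compose file.

    Raises FileNotFoundError if the docker CLI is not installed.
    """
    proc = subprocess.run(
        compose_config_command(compose_file), capture_output=True, text=True
    )
    return proc.returncode == 0, proc.stderr.strip()
```

Note that this only confirms the file parses; it does not prove the stack will start, so it complements rather than replaces a test deployment.
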
### Deployment Process
1. **Commit changes** to feature branch
2. **Test deployment** in staging environment
3. **Merge to main** after validation
4. **Monitor GitOps deployment** via Portainer
5. **Verify service health** post-deployment

## Troubleshooting

### Common Issues
- **Service startup failures**: Check logs, resource constraints
- **Network connectivity**: Verify network configuration
- **Storage issues**: Check disk space, mount points
- **Authentication problems**: Verify SSO configuration

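For the storage bullet, a quick disk-space check can be scripted with the standard library. This is a minimal sketch: the 90% threshold is an arbitrary example value, and the mount points you pass in are up to you since this guide does not list them.

```python
import shutil


def disk_usage_percent(path="/"):
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def low_space_mounts(paths, threshold=90.0):
    """Return the paths whose filesystem usage is at or above `threshold` percent."""
    return [p for p in paths if disk_usage_percent(p) >= threshold]
```

`shutil.disk_usage` raises `FileNotFoundError` for a path that does not exist, which is itself a useful signal that a mount point is missing.
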
### Diagnostic Tools
- **Portainer**: Container management and logs
- **Grafana**: Performance metrics and alerts
- **SSH access**: Direct system administration
- **Log aggregation**: Centralized logging system

## Best Practices

### Code Quality
- **Use official images** when possible
- **Pin image versions** for stability
- **Document environment variables** and volumes
- **Follow Docker best practices** for security

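To make the version-pinning guideline checkable, a script could scan compose files for image references that have no tag or use the floating `latest` tag. The sketch below uses a line-based regular expression instead of a YAML parser to stay dependency-free, so it assumes each `image:` key sits on its own line; `unpinned_images` is a hypothetical name.

```python
import re

# Matches lines such as `image: nginx:1.27` or `image: "ghcr.io/org/app"`.
IMAGE_RE = re.compile(r"^[ \t]*image:[ \t]*['\"]?([^'\"\s#]+)", re.MULTILINE)


def unpinned_images(compose_text):
    """Return image references with no tag, or with the floating `latest` tag."""
    unpinned = []
    for ref in IMAGE_RE.findall(compose_text):
        if "@" in ref:
            continue  # pinned by digest
        last_segment = ref.rsplit("/", 1)[-1]  # ignore any registry host:port
        tag = last_segment.partition(":")[2]
        if tag in ("", "latest"):
            unpinned.append(ref)
    return unpinned
```

References that take their tag from a variable, such as `app:${TAG}`, are treated as pinned here, which may or may not match your intent.
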
### Documentation
- **Keep docs current** with infrastructure changes
- **Use clear, descriptive titles** and sections
- **Include troubleshooting steps** for common issues
- **Maintain consistent formatting** across all docs

### Monitoring
- **Monitor everything** that matters to service availability
- **Set appropriate thresholds** to avoid alert fatigue
- **Document alert procedures** for quick response
- **Regular health checks** for all critical services

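One common way to reduce alert fatigue is to alert only after several consecutive failed checks, so that a single blip does not page anyone. The sketch below illustrates that idea in isolation; the class name and the default threshold of 3 are assumptions for the example, not settings taken from this homelab's monitoring stack.

```python
class FailureStreakAlert:
    """Fire once when a check has failed `threshold` times in a row."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.streak = 0

    def record(self, healthy):
        """Record one check result; return True only on the check that trips the alert."""
        if healthy:
            self.streak = 0  # any success resets the streak
            return False
        self.streak += 1
        # Equality (not >=) means the alert fires once per outage, not on every check.
        return self.streak == self.threshold
```
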
---

**Remember**: This is a production homelab with real users and services. Changes should be made thoughtfully with proper testing and documentation. When in doubt, ask questions and test thoroughly before deploying to production.

**Status**: ✅ Repository actively maintained with 65+ production services