# 🏠 Homelab Ansible Playbooks

Comprehensive automation playbooks for managing your homelab infrastructure. These playbooks provide operational automation beyond the existing health monitoring and system management.

## 📋 Quick Reference

| Category | Playbook | Purpose | Priority |
|----------|----------|---------|----------|
| **Service Management** | `service_status.yml` | Get status of all services | ⭐⭐⭐ |
| | `restart_service.yml` | Restart services with dependencies | ⭐⭐⭐ |
| | `container_logs.yml` | Collect logs for troubleshooting | ⭐⭐⭐ |
| **Backup & Recovery** | `backup_databases.yml` | Automated database backups | ⭐⭐⭐ |
| | `backup_configs.yml` | Configuration and data backups | ⭐⭐⭐ |
| | `disaster_recovery_test.yml` | Test DR procedures | ⭐⭐ |
| **Storage Management** | `disk_usage_report.yml` | Monitor storage usage | ⭐⭐⭐ |
| | `prune_containers.yml` | Clean up Docker resources | ⭐⭐ |
| | `log_rotation.yml` | Manage log files | ⭐⭐ |
| **Security** | `security_updates.yml` | Automated security patches | ⭐⭐⭐ |
| | `certificate_renewal.yml` | SSL certificate management | ⭐⭐ |
| **Monitoring** | `service_health_deep.yml` | Comprehensive health checks | ⭐⭐ |

## 🚀 Quick Start

### Prerequisites

- Ansible 2.12+
- SSH access to all hosts via Tailscale
- Existing inventory from `/home/homelab/organized/repos/homelab/ansible/automation/hosts.ini`

### Run Your First Playbook

```bash
cd /home/homelab/organized/repos/homelab/ansible/automation

# Check status of all services
ansible-playbook playbooks/service_status.yml

# Check disk usage across all hosts
ansible-playbook playbooks/disk_usage_report.yml

# Back up all databases
ansible-playbook playbooks/backup_databases.yml
```

## 📦 Service Management Playbooks

### `service_status.yml` - Service Status Check

Get comprehensive status of all services across your homelab.
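The status reports this playbook writes follow the pattern `/tmp/HOSTNAME_status_TIMESTAMP.json`. When several runs have accumulated, the newest report for a host can be picked by modification time. The helper below is only a sketch: the function name `latest_report` is hypothetical, and it assumes the reports are readable on the machine where you run it.

```bash
# Print the most recently modified status report for a host.
# Usage: latest_report HOST [DIR]   (DIR defaults to /tmp)
latest_report() {
  local host="$1" dir="${2:-/tmp}" newest="" f
  for f in "$dir"/"${host}"_status_*.json; do
    # Skip the literal pattern when nothing matches.
    [ -e "$f" ] || continue
    if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
      newest="$f"
    fi
  done
  # Non-zero exit status when no report exists for this host.
  [ -n "$newest" ] && printf '%s\n' "$newest"
}
```

For example, `latest_report atlantis` prints the path of the newest report for `atlantis`, which can then be handed to whatever JSON tooling you already use.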
```bash
# Check all hosts
ansible-playbook playbooks/service_status.yml

# Check specific host
ansible-playbook playbooks/service_status.yml --limit atlantis

# Generate JSON reports
ansible-playbook playbooks/service_status.yml
# Reports saved to: /tmp/HOSTNAME_status_TIMESTAMP.json
```

**Features:**

- System resource usage
- Container status and health
- Critical service monitoring
- Network connectivity checks
- JSON output for automation

### `restart_service.yml` - Service Restart with Dependencies

Restart services with proper dependency handling and health checks.

```bash
# Restart a service
ansible-playbook playbooks/restart_service.yml -e "service_name=plex host_target=atlantis"

# Restart with custom wait time
ansible-playbook playbooks/restart_service.yml -e "service_name=immich-server host_target=atlantis wait_time=30"

# Force restart if graceful stop fails
ansible-playbook playbooks/restart_service.yml -e "service_name=problematic-service force_restart=true"
```

**Features:**

- Dependency-aware restart order
- Health check validation
- Graceful stop with force option
- Pre/post restart logging
- Service-specific wait times

### `container_logs.yml` - Log Collection

Collect logs from multiple containers for troubleshooting.
```bash
# Collect logs for specific service
ansible-playbook playbooks/container_logs.yml -e "service_name=plex"

# Collect logs matching pattern
ansible-playbook playbooks/container_logs.yml -e "service_pattern=immich"

# Collect all container logs
ansible-playbook playbooks/container_logs.yml -e "collect_all=true"

# Custom log parameters
ansible-playbook playbooks/container_logs.yml -e "service_name=plex log_lines=500 log_since=2h"
```

**Features:**

- Pattern-based container selection
- Error analysis and counting
- Resource usage reporting
- Structured log organization
- Archive option for long-term storage

## 💾 Backup & Recovery Playbooks

### `backup_databases.yml` - Database Backup Automation

Automated backup of all PostgreSQL and MySQL databases.

```bash
# Back up all databases
ansible-playbook playbooks/backup_databases.yml

# Full backup with verification
ansible-playbook playbooks/backup_databases.yml -e "backup_type=full verify_backups=true"

# Specific host backup
ansible-playbook playbooks/backup_databases.yml --limit atlantis

# Custom retention
ansible-playbook playbooks/backup_databases.yml -e "backup_retention_days=60"
```

**Supported Databases:**

- **Atlantis**: Immich, Vaultwarden, Joplin, Firefly
- **Calypso**: Authentik, Paperless
- **Homelab VM**: Mastodon, Matrix

**Features:**

- Automatic database discovery
- Compression and verification
- Retention management
- Backup integrity testing
- Multiple storage locations

### `backup_configs.yml` - Configuration Backup

Back up docker-compose files, configs, and important data.
```bash
# Back up configurations
ansible-playbook playbooks/backup_configs.yml

# Include secrets (use with caution)
ansible-playbook playbooks/backup_configs.yml -e "include_secrets=true"

# Back up without compression
ansible-playbook playbooks/backup_configs.yml -e "compress_backups=false"
```

**Backup Includes:**

- Docker configurations
- SSH configurations
- Service-specific data
- System information snapshots
- Docker-compose files

### `disaster_recovery_test.yml` - DR Testing

Test disaster recovery procedures and validate backup integrity.

```bash
# Basic DR test (dry run)
ansible-playbook playbooks/disaster_recovery_test.yml

# Full DR test with restore validation
ansible-playbook playbooks/disaster_recovery_test.yml -e "test_type=full dry_run=false"

# Test with failover procedures
ansible-playbook playbooks/disaster_recovery_test.yml -e "test_failover=true"
```

**Test Components:**

- Backup validation and integrity
- Database restore testing
- RTO (Recovery Time Objective) analysis
- Service failover procedures
- DR readiness scoring

## 💿 Storage Management Playbooks

### `disk_usage_report.yml` - Storage Monitoring

Monitor storage usage and generate comprehensive reports.

```bash
# Basic disk usage report
ansible-playbook playbooks/disk_usage_report.yml

# Detailed analysis with performance data
ansible-playbook playbooks/disk_usage_report.yml -e "detailed_analysis=true include_performance=true"

# Set custom alert thresholds
ansible-playbook playbooks/disk_usage_report.yml -e "alert_threshold=90 warning_threshold=80"

# Send alerts for critical usage
ansible-playbook playbooks/disk_usage_report.yml -e "send_alerts=true"
```

**Features:**

- Filesystem usage monitoring
- Docker storage analysis
- Large file identification
- Temporary file analysis
- Alert thresholds and notifications
- JSON output for automation

### `prune_containers.yml` - Docker Cleanup

Clean up unused containers, images, volumes, and networks.
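Done by hand, this kind of cleanup maps onto the Docker CLI's `prune` subcommands. The sketch below gates them behind a dry-run flag in the same spirit as the playbook's `dry_run` variable. Whether the playbook runs these exact commands is an assumption, and the names `run`, `prune_docker`, `DRY_RUN`, and `PRUNE_VOLUMES` are hypothetical.

```bash
# Sketch: manual Docker cleanup gated by a dry-run flag.
# DRY_RUN defaults to true, so nothing is removed unless it is explicitly disabled.
DRY_RUN="${DRY_RUN:-true}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "true" ]; then
    printf 'DRY RUN: %s\n' "$*"
  else
    "$@"
  fi
}

prune_docker() {
  run docker container prune --force   # stopped containers
  run docker image prune --force       # dangling images
  run docker network prune --force     # unused networks
  # Volumes can hold data, so they are only pruned on request.
  if [ "${PRUNE_VOLUMES:-false}" = "true" ]; then
    run docker volume prune --force
  fi
}
```

Calling `prune_docker` with the defaults only prints the plan; setting `DRY_RUN=false` beforehand runs the commands for real. Keeping volume pruning opt-in reflects that removing unused volumes is listed as optional.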
```bash
# Basic cleanup (dry run)
ansible-playbook playbooks/prune_containers.yml

# Live cleanup
ansible-playbook playbooks/prune_containers.yml -e "dry_run=false"

# Aggressive cleanup (removes old images)
ansible-playbook playbooks/prune_containers.yml -e "aggressive_cleanup=true dry_run=false"

# Custom retention and log cleanup
ansible-playbook playbooks/prune_containers.yml -e "keep_images_days=14 cleanup_logs=true max_log_size=50m"
```

**Cleanup Actions:**

- Remove stopped containers
- Remove dangling images
- Remove unused volumes (optional)
- Remove unused networks
- Truncate large container logs
- System-wide Docker prune

### `log_rotation.yml` - Log Management

Manage log files across all services and system components.

```bash
# Basic log rotation (dry run)
ansible-playbook playbooks/log_rotation.yml

# Live log rotation with compression
ansible-playbook playbooks/log_rotation.yml -e "dry_run=false compress_old_logs=true"

# Aggressive cleanup
ansible-playbook playbooks/log_rotation.yml -e "aggressive_cleanup=true max_log_age_days=14"

# Custom log size limits
ansible-playbook playbooks/log_rotation.yml -e "max_log_size=50M"
```

**Log Management:**

- System log rotation
- Docker container log truncation
- Application log cleanup
- Log compression
- Retention policies
- Logrotate configuration

## 🔒 Security Playbooks

### `security_updates.yml` - Automated Security Updates

Apply security patches and system updates.
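On Debian and Ubuntu hosts, package upgrades that need a restart commonly leave a marker file at `/var/run/reboot-required`. The sketch below shows that check in isolation. That the playbook relies on this marker for its reboot handling is an assumption, and the function name `reboot_required` is hypothetical.

```bash
# Succeeds (exit 0) when the reboot marker file exists.
# The optional first argument overrides the marker path, mainly for testing.
reboot_required() {
  local marker="${1:-/var/run/reboot-required}"
  [ -f "$marker" ]
}
```

A typical call is `if reboot_required; then echo "reboot pending"; fi`, run on the host itself after updates have been applied.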
```bash
# Security updates only
ansible-playbook playbooks/security_updates.yml

# Security updates with reboot if needed
ansible-playbook playbooks/security_updates.yml -e "reboot_if_required=true"

# Full system update
ansible-playbook playbooks/security_updates.yml -e "security_only=false"

# Include Docker updates
ansible-playbook playbooks/security_updates.yml -e "update_docker=true"
```

**Features:**

- Security-only or full updates
- Pre-update configuration backup
- Kernel update detection
- Automatic reboot handling
- Service verification after updates
- Update reporting and logging

### `certificate_renewal.yml` - SSL Certificate Management

Manage Let's Encrypt certificates and other SSL certificates.

```bash
# Check certificate status
ansible-playbook playbooks/certificate_renewal.yml -e "check_only=true"

# Renew certificates
ansible-playbook playbooks/certificate_renewal.yml

# Force renewal
ansible-playbook playbooks/certificate_renewal.yml -e "force_renewal=true"

# Custom renewal threshold
ansible-playbook playbooks/certificate_renewal.yml -e "renewal_threshold_days=45"
```

**Certificate Support:**

- Let's Encrypt via Certbot
- Nginx Proxy Manager certificates
- Traefik certificates
- Synology DSM certificates

## 🏥 Monitoring Playbooks

### `service_health_deep.yml` - Comprehensive Health Checks

Deep health monitoring for all homelab services.
```bash
# Deep health check
ansible-playbook playbooks/service_health_deep.yml

# Include performance metrics
ansible-playbook playbooks/service_health_deep.yml -e "include_performance=true"

# Enable alerting
ansible-playbook playbooks/service_health_deep.yml -e "alert_on_issues=true"

# Custom timeout
ansible-playbook playbooks/service_health_deep.yml -e "health_check_timeout=60"
```

**Health Checks:**

- Container health status
- Service endpoint testing
- Database connectivity
- Redis connectivity
- System performance metrics
- Log error analysis
- Dependency validation

## 🔧 Advanced Usage

### Combining Playbooks

```bash
# Complete maintenance routine
ansible-playbook playbooks/service_status.yml
ansible-playbook playbooks/backup_databases.yml
ansible-playbook playbooks/security_updates.yml
ansible-playbook playbooks/disk_usage_report.yml
ansible-playbook playbooks/prune_containers.yml -e "dry_run=false"
```

### Scheduling with Cron

```bash
# Add to crontab for automated execution

# Daily backups at 2 AM
0 2 * * * cd /home/homelab/organized/repos/homelab/ansible/automation && ansible-playbook playbooks/backup_databases.yml

# Weekly cleanup on Sundays at 3 AM
0 3 * * 0 cd /home/homelab/organized/repos/homelab/ansible/automation && ansible-playbook playbooks/prune_containers.yml -e "dry_run=false"

# Monthly DR test on first Sunday at 4 AM
# Cron treats a restricted day-of-month and day-of-week as "either matches",
# so the schedule covers days 1-7 and the command itself checks for Sunday.
0 4 1-7 * * [ "$(date +\%u)" -eq 7 ] && cd /home/homelab/organized/repos/homelab/ansible/automation && ansible-playbook playbooks/disaster_recovery_test.yml
```

### Custom Variables

Create host-specific variable files:

```yaml
# host_vars/atlantis.yml
backup_retention_days: 60
max_log_size: "200M"
alert_threshold: 90

# host_vars/homelab_vm.yml
security_only: false
reboot_if_required: true
```

## 📊 Monitoring and Alerting

### Integration with Existing Monitoring

These playbooks integrate with your existing Prometheus/Grafana stack:

```bash
# Generate metrics for Prometheus
ansible-playbook playbooks/service_status.yml
ansible-playbook playbooks/disk_usage_report.yml

# JSON outputs can be parsed by monitoring systems
# Reports saved to /tmp/ directories with timestamps
```

### Alert Configuration

```bash
# Enable alerts in playbooks
ansible-playbook playbooks/disk_usage_report.yml -e "send_alerts=true alert_threshold=85"
ansible-playbook playbooks/service_health_deep.yml -e "alert_on_issues=true"
ansible-playbook playbooks/disaster_recovery_test.yml -e "send_alerts=true"
```

## 🚨 Emergency Procedures

### Service Recovery

```bash
# Quick service restart
ansible-playbook playbooks/restart_service.yml -e "service_name=SERVICE_NAME host_target=HOST"

# Collect logs for troubleshooting
ansible-playbook playbooks/container_logs.yml -e "service_name=SERVICE_NAME"

# Check service health
ansible-playbook playbooks/service_health_deep.yml --limit HOST
```

### Storage Emergency

```bash
# Check disk usage immediately
ansible-playbook playbooks/disk_usage_report.yml -e "alert_threshold=95"

# Emergency cleanup
ansible-playbook playbooks/prune_containers.yml -e "aggressive_cleanup=true dry_run=false"
ansible-playbook playbooks/log_rotation.yml -e "aggressive_cleanup=true dry_run=false"
```

### Security Incident

```bash
# Apply security updates immediately
ansible-playbook playbooks/security_updates.yml -e "reboot_if_required=true"

# Check certificate status
ansible-playbook playbooks/certificate_renewal.yml -e "check_only=true"
```

## 🔍 Troubleshooting

### Common Issues

**Playbook Fails with Permission Denied**

```bash
# Check SSH connectivity
ansible all -m ping

# Verify sudo access (each host should report "root")
ansible all -m command -a "whoami" --become
```

**Docker Commands Fail**

```bash
# Check Docker daemon status
ansible-playbook playbooks/service_status.yml --limit HOSTNAME

# Verify Docker group membership of the remote user
ansible HOST -m command -a "id -nG"
```

**Backup Failures**

```bash
# Check backup directory permissions (stat only inspects; it never creates the directory)
ansible HOST -m stat -a "path=/volume1/backups" --become

# Test database connectivity
ansible-playbook playbooks/service_health_deep.yml --limit HOST
```

### Debug Mode

```bash
# Run with verbose output
ansible-playbook playbooks/PLAYBOOK.yml -vvv

# Check specific tasks
ansible-playbook playbooks/PLAYBOOK.yml --list-tasks
ansible-playbook playbooks/PLAYBOOK.yml --start-at-task="TASK_NAME"
```

## 📚 Integration with Existing Automation

These playbooks complement your existing automation:

### With Current Health Monitoring

```bash
# Existing health checks
ansible-playbook playbooks/synology_health.yml
ansible-playbook playbooks/check_apt_proxy.yml

# New comprehensive checks
ansible-playbook playbooks/service_health_deep.yml
ansible-playbook playbooks/disk_usage_report.yml
```

### With GitOps Deployment

```bash
# After GitOps deployment
ansible-playbook playbooks/service_status.yml
ansible-playbook playbooks/backup_configs.yml
```

## 🎯 Best Practices

### Regular Maintenance Schedule

- **Daily**: `backup_databases.yml`
- **Weekly**: `security_updates.yml`, `disk_usage_report.yml`
- **Monthly**: `disaster_recovery_test.yml`, `prune_containers.yml`
- **As Needed**: `service_health_deep.yml`, `restart_service.yml`

### Safety Guidelines

- Always test with `dry_run=true` first
- Use `--limit` for single host testing
- Take backups before major changes
- Monitor service status after automation

### Performance Optimization

- Run resource-intensive playbooks during low-usage hours
- Use `--forks` to control parallelism
- Monitor system resources during execution

## 📞 Support

For issues with these playbooks:

1. Check the troubleshooting section above
2. Review playbook logs in `/tmp/` directories
3. Use debug mode (`-vvv`) for detailed output
4. Verify integration with existing automation

---

**Last Updated**: {{ ansible_date_time.date if ansible_date_time is defined else 'Manual Update Required' }}

**Total Playbooks**: 10+ comprehensive automation playbooks

**Coverage**: Complete operational automation for homelab management