Sanitized mirror from private repository - 2026-04-04 03:23:14 UTC
docs/admin/ANSIBLE_PLAYBOOK_GUIDE.md
# Ansible Playbook Guide for Homelab

Last updated: 2026-03-17 (runners: homelab, calypso, pi5)

## Overview

This guide explains how to run Ansible playbooks in the homelab infrastructure. Ansible is used for automation, configuration management, and system maintenance across all hosts in the Tailscale network.

## Directory Structure

```
/home/homelab/organized/repos/homelab/ansible/
├── inventory.yml           # Primary inventory (YAML format)
├── automation/
│   ├── playbooks/          # Automation and maintenance playbooks
│   ├── hosts.ini           # Legacy INI inventory
│   ├── host_vars/          # Per-host variables
│   └── group_vars/         # Group-level variables
├── playbooks/              # Deployment and infrastructure playbooks
│   ├── common/             # Reusable operational playbooks
│   └── deploy_*.yml        # Per-host deployment playbooks
└── homelab/
    ├── playbooks/          # Duplicate of above (legacy)
    └── roles/              # Reusable Ansible roles
```

## Prerequisites

1. **Ansible installed** on the control node (homelab machine)
2. **SSH access** to target hosts (configured via Tailscale)
3. **Primary inventory**: `ansible/inventory.yml`

## Running Playbooks

### Basic Syntax

```bash
cd /home/homelab/organized/repos/homelab/

# Using the primary YAML inventory
ansible-playbook -i ansible/inventory.yml ansible/automation/playbooks/<playbook>.yml

# Target specific hosts
ansible-playbook -i ansible/inventory.yml ansible/automation/playbooks/<playbook>.yml --limit "homelab,pi-5"

# Dry run (no changes)
ansible-playbook -i ansible/inventory.yml ansible/automation/playbooks/<playbook>.yml --check

# Verbose output
ansible-playbook -i ansible/inventory.yml ansible/automation/playbooks/<playbook>.yml -vvv
```
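
The flags above can be combined freely. As a minimal sketch, the hypothetical helper below (`playbook_cmd` is not part of the repository) prints the full command line for a playbook under `ansible/automation/playbooks/`, so it can be reviewed before it is run. It assumes the current directory is the repository root, as in the block above.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository.
# Prints the ansible-playbook command for a playbook in
# ansible/automation/playbooks/ without executing it.

INVENTORY="ansible/inventory.yml"
PLAYBOOK_DIR="ansible/automation/playbooks"

playbook_cmd() {
  name="$1"
  shift
  # Accept the playbook name with or without the .yml suffix.
  printf 'ansible-playbook -i %s %s/%s.yml' "$INVENTORY" "$PLAYBOOK_DIR" "${name%.yml}"
  # Append extra flags (--limit, --check, -vvv) unchanged.
  # The output is for review only; arguments with spaces are not re-quoted.
  for arg in "$@"; do
    printf ' %s' "$arg"
  done
  printf '\n'
}

# Example: dry run of health_check.yml against two hosts.
playbook_cmd health_check --limit "homelab,pi-5" --check
```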

---

## Complete Playbook Reference

### System Updates & Package Management

| Playbook | Targets | Sudo | Description |
|----------|---------|------|-------------|
| `update_system.yml` | all (Debian) | yes | Apt update + dist-upgrade on all Debian hosts |
| `update_ansible.yml` | debian_clients | yes | Upgrades Ansible on Linux hosts (excludes Synology) |
| `update_ansible_targeted.yml` | configurable | yes | Targeted Ansible upgrade on specific hosts |
| `security_updates.yml` | all | yes | Automated security patches with optional reboot |
| `cleanup.yml` | debian_clients | yes | Runs autoremove and cleans temp files |
| `install_tools.yml` | configurable | yes | Installs common diagnostic packages across hosts |

### APT Cache / Proxy Management

| Playbook | Targets | Sudo | Description |
|----------|---------|------|-------------|
| `check_apt_proxy.yml` | debian_clients | partial | Validates APT proxy config, connectivity, and provides recommendations |
| `configure_apt_proxy.yml` | debian_clients | yes | Sets up `/etc/apt/apt.conf.d/01proxy` pointing to calypso (100.103.48.78:3142) |

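
For orientation, an APT proxy drop-in of this kind is usually a single `Acquire::http::Proxy` directive. The sketch below only prints what such a file could contain. The exact content written by `configure_apt_proxy.yml` is an assumption, and `apt_proxy_line` is a hypothetical helper, not something from the repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints an APT proxy directive for a host and port.
# Assumes a plain HTTP proxy; the real playbook may write something different.

apt_proxy_line() {
  # $1 = proxy host, $2 = proxy port
  printf 'Acquire::http::Proxy "http://%s:%s";\n' "$1" "$2"
}

# Address and port taken from the table above (calypso).
apt_proxy_line 100.103.48.78 3142
```

On a configured client, `apt-config dump | grep -i proxy` shows whether APT has picked up the setting.
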
### Health Checks & Monitoring

| Playbook | Targets | Sudo | Description |
|----------|---------|------|-------------|
| `health_check.yml` | all | no | Comprehensive health check including critical services |
| `service_health_deep.yml` | all | no | Deep health monitoring with optional performance data |
| `service_status.yml` | all | no | Service status check across all hosts |
| `ansible_status_check.yml` | all | no | Verifies Ansible is working, optionally upgrades it |
| `tailscale_health.yml` | active | no | Checks Tailscale connectivity and status |
| `network_connectivity.yml` | all | no | Full mesh connectivity: Tailscale, ping, SSH, HTTP checks |
| `ntp_check.yml` | all | no | Audits time synchronization, alerts on clock drift |
| `alert_check.yml` | all | no | Monitors conditions and sends alerts when thresholds exceeded |
| `system_monitoring.yml` | all | no | Collects system metrics with configurable retention |
| `system_metrics.yml` | all | no | Detailed system metrics collection for analysis |
| `disk_usage_report.yml` | all | no | Storage usage report with alert thresholds |

### Container Management

| Playbook | Targets | Sudo | Description |
|----------|---------|------|-------------|
| `container_update_orchestrator.yml` | all | yes | Orchestrates container updates with rollback support |
| `container_dependency_map.yml` | all | no | Maps container dependencies for ordered restarts |
| `container_dependency_orchestrator.yml` | all | yes | Smart restart ordering with cross-host dependency management |
| `container_resource_optimizer.yml` | all | no | Analyzes and recommends container resource adjustments |
| `container_logs.yml` | configurable | no | Collects container logs for troubleshooting |
| `prune_containers.yml` | all | yes | Removes unused containers, images, volumes, networks |
| `restart_service.yml` | configurable | yes | Restarts a service with dependency-aware ordering |
| `configure_docker_logging.yml` | linux hosts | yes | Sets daemon-level log rotation (10MB x 3 files) |
| `update_portainer_agent.yml` | portainer_edge_agents | yes | Updates Portainer Edge Agent across all hosts |

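
The "10MB x 3 files" rotation maps naturally onto the options of Docker's `json-file` log driver. Below is a minimal sketch of a daemon configuration that would produce that rotation. It assumes the `json-file` driver and the usual `/etc/docker/daemon.json` location, neither of which is confirmed by this guide, and `docker_logging_json` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints a daemon.json fragment for 10 MB x 3 file rotation.
# Assumes the json-file log driver; configure_docker_logging.yml may differ.

docker_logging_json() {
  cat <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
EOF
}

# Print the fragment for review; it is not written to disk here.
docker_logging_json
```

Daemon-level log options apply to containers created after the Docker daemon is restarted; existing containers keep their previous logging settings until they are recreated.
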
### Backups & Disaster Recovery

| Playbook | Targets | Sudo | Description |
|----------|---------|------|-------------|
| `backup_configs.yml` | all | no | Backs up docker-compose files, configs, and secrets |
| `backup_databases.yml` | all | yes | Automated PostgreSQL/MySQL backup across all hosts |
| `backup_verification.yml` | all | no | Validates backup integrity and tests restore procedures |
| `synology_backup_orchestrator.yml` | synology | no | Coordinates backups across Synology devices |
| `disaster_recovery_test.yml` | all | no | Tests DR procedures and validates backup integrity |
| `disaster_recovery_orchestrator.yml` | all | yes | Full infrastructure backup and recovery procedures |

### Infrastructure & Discovery

| Playbook | Targets | Sudo | Description |
|----------|---------|------|-------------|
| `service_inventory.yml` | all | no | Inventories all services and generates documentation |
| `prometheus_target_discovery.yml` | all | no | Auto-discovers containers for Prometheus monitoring |
| `proxmox_management.yml` | pve | yes | Health check and management for VMs/LXCs on PVE |
| `cron_audit.yml` | all | yes | Inventories cron jobs and systemd timers |
| `security_audit.yml` | all | no | Audits security posture and generates reports |
| `certificate_renewal.yml` | all | yes | Manages and renews SSL/Let's Encrypt certs |
| `log_rotation.yml` | all | yes | Manages log files across services and system components |
| `setup_gitea_runner.yml` | configurable | yes | Deploys a Gitea Actions runner for CI |

### Utility

| Playbook | Targets | Sudo | Description |
|----------|---------|------|-------------|
| `system_info.yml` | all | no | Gathers and prints system details from all hosts |
| `add_ssh_keys.yml` | configurable | no | Distributes homelab SSH public key to all hosts |

---

## Infrastructure Playbooks (`ansible/playbooks/`)

### Platform Health

| Playbook | Targets | Description |
|----------|---------|-------------|
| `synology_health.yml` | synology | Health check for Synology NAS devices |
| `truenas_health.yml` | truenas-scale | Health check for TrueNAS SCALE |
| `tailscale_management.yml` | all | Manages Tailscale across hosts with reporting |
| `tailscale_mesh_management.yml` | all | Validates mesh connectivity, manages keys |
| `portainer_stack_management.yml` | localhost | Manages GitOps stacks via Portainer API |

### Deployment Playbooks (`deploy_*.yml`)

Per-host deployment playbooks that deploy Docker stacks to specific machines. All accept `--check` for a dry run.

| Playbook | Target Host |
|----------|-------------|
| `deploy_atlantis.yml` | atlantis (primary Synology NAS) |
| `deploy_calypso.yml` | calypso (secondary Synology NAS) |
| `deploy_setillo.yml` | setillo (Seattle offsite NAS) |
| `deploy_homelab_vm.yml` | homelab (primary VM) |
| `deploy_rpi5_vish.yml` | pi-5 (Raspberry Pi 5) |
| `deploy_concord_nuc.yml` | vish-concord-nuc (Intel NUC) |
| `deploy_seattle.yml` | seattle (Contabo VPS) |
| `deploy_guava.yml` | guava (TrueNAS Scale) |
| `deploy_matrix_ubuntu_vm.yml` | matrix-ubuntu (Matrix/Mattermost VM) |
| `deploy_anubis.yml` | anubis (physical host) |
| `deploy_bulgaria_vm.yml` | bulgaria-vm |
| `deploy_chicago_vm.yml` | chicago-vm |
| `deploy_contabo_vm.yml` | contabo-vm |
| `deploy_lxc.yml` | LXC container on PVE |

### Common / Reusable Playbooks (`playbooks/common/`)

| Playbook | Description |
|----------|-------------|
| `backup_configs.yml` | Back up docker-compose configs and data |
| `install_docker.yml` | Install Docker on non-Synology hosts |
| `restart_service.yml` | Restart a named Docker service |
| `setup_directories.yml` | Create base directory structure for Docker |
| `logs.yml` | Show logs for a specific container |
| `status.yml` | List running Docker containers |
| `update_containers.yml` | Pull new images and recreate containers |

---

## Host Groups Reference

From `ansible/inventory.yml`:

| Group | Hosts | Purpose |
|-------|-------|---------|
| `synology` | atlantis, calypso, setillo | Synology NAS devices |
| `rpi` | pi-5, pi-5-kevin | Raspberry Pi nodes |
| `hypervisors` | pve, truenas-scale, homeassistant | Virtualization/appliance hosts |
| `remote` | vish-concord-nuc, seattle | Remote/physical compute hosts |
| `local_vms` | homelab, matrix-ubuntu | On-site VMs |
| `debian_clients` | homelab, pi-5, pi-5-kevin, vish-concord-nuc, pve, matrix-ubuntu, seattle | Debian/Ubuntu hosts using APT cache proxy |
| `portainer_edge_agents` | homelab, vish-concord-nuc, pi-5, calypso | Hosts running Portainer Edge Agent |
| `active` | all groups | All reachable managed hosts |

---

## Important Notes & Warnings

- **TrueNAS SCALE**: Do NOT run apt update; use the web UI only. Excluded from `debian_clients`.
- **Home Assistant**: Manages its own packages. Excluded from `debian_clients`.
- **pi-5-kevin**: Frequently offline, so expect `UNREACHABLE` errors.
- **Synology**: `ansible_become: false` is set because DSM does not use standard sudo.
- **InfluxDB on pi-5**: If apt fails with GPG errors, the source file must use `signed-by=/usr/share/keyrings/influxdata-archive.gpg` (the packaged keyring), not a manually imported key.

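
The InfluxDB note can be checked mechanically. Below is a minimal sketch, assuming the repository definition lives in a file such as `/etc/apt/sources.list.d/influxdata.list`. That file name, the `INFLUX_SOURCE_FILE` variable, and the `uses_packaged_keyring` helper are assumptions for illustration, not taken from the playbooks.

```bash
#!/usr/bin/env bash
# Hypothetical check: does an APT source file reference the packaged keyring?

KEYRING="/usr/share/keyrings/influxdata-archive.gpg"

uses_packaged_keyring() {
  # $1 = path to an APT source file.
  # Succeeds when the file contains the signed-by option for the packaged keyring.
  grep -qF "signed-by=${KEYRING}" "$1"
}

# Assumed default location; override with INFLUX_SOURCE_FILE if it differs.
source_file="${INFLUX_SOURCE_FILE:-/etc/apt/sources.list.d/influxdata.list}"
if [ -f "$source_file" ]; then
  if uses_packaged_keyring "$source_file"; then
    echo "OK: $source_file uses the packaged keyring"
  else
    echo "WARN: $source_file does not reference $KEYRING"
  fi
fi
```

Run on pi-5 itself, the script prints nothing when the assumed file is absent, so a silent result means the path needs adjusting rather than that the configuration is correct.
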
## Common Workflows

### Weekly Maintenance

```bash
# 1. Check all hosts are reachable
ansible-playbook -i ansible/inventory.yml ansible/automation/playbooks/ansible_status_check.yml

# 2. Verify APT cache proxy
ansible-playbook -i ansible/inventory.yml ansible/automation/playbooks/check_apt_proxy.yml

# 3. Update all debian_clients
ansible-playbook -i ansible/inventory.yml ansible/automation/playbooks/update_system.yml --limit debian_clients

# 4. Clean up old packages
ansible-playbook -i ansible/inventory.yml ansible/automation/playbooks/cleanup.yml

# 5. Check Tailscale connectivity
ansible-playbook -i ansible/inventory.yml ansible/automation/playbooks/tailscale_health.yml
```

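
The five weekly steps can also be chained so that a failure stops the run before later steps execute. The following is a minimal sketch; the `weekly_maintenance` and `run_step` functions and the `DRY_RUN` variable are hypothetical and not part of the repository. It assumes the current directory is the repository root.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the weekly maintenance steps.

INVENTORY="ansible/inventory.yml"
PLAYBOOK_DIR="ansible/automation/playbooks"

run_step() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it.
    echo "ansible-playbook $*"
  else
    # </dev/null keeps the command from consuming the step list on stdin.
    ansible-playbook "$@" </dev/null
  fi
}

weekly_maintenance() {
  # One step per line: playbook file, then optional extra arguments.
  while read -r playbook extra; do
    # $extra is left unquoted on purpose so that
    # "--limit debian_clients" splits into two arguments.
    run_step -i "$INVENTORY" "$PLAYBOOK_DIR/$playbook" $extra || return 1
  done <<'EOF'
ansible_status_check.yml
check_apt_proxy.yml
update_system.yml --limit debian_clients
cleanup.yml
tailscale_health.yml
EOF
}

# Preview the commands without contacting any host.
DRY_RUN=1 weekly_maintenance
```

Calling `weekly_maintenance` without `DRY_RUN=1` runs the playbooks for real. Since pi-5-kevin is frequently offline, a non-zero exit caused by an unreachable host would stop the chain at that step, which may or may not be the desired behavior.
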
### Adding a New Host

```bash
# 1. Add host to ansible/inventory.yml (and to debian_clients if Debian/Ubuntu)

# 2. Test connectivity
ansible -i ansible/inventory.yml <new-host> -m ping

# 3. Add SSH keys
ansible-playbook -i ansible/inventory.yml ansible/automation/playbooks/add_ssh_keys.yml --limit <new-host>

# 4. Configure APT proxy
ansible-playbook -i ansible/inventory.yml ansible/automation/playbooks/configure_apt_proxy.yml --limit <new-host>

# 5. Install standard tools
ansible-playbook -i ansible/inventory.yml ansible/automation/playbooks/install_tools.yml --limit <new-host>

# 6. Update system
ansible-playbook -i ansible/inventory.yml ansible/automation/playbooks/update_system.yml --limit <new-host>
```

## Ad-Hoc Commands

```bash
# Ping all hosts
ansible -i ansible/inventory.yml all -m ping

# Check disk space
ansible -i ansible/inventory.yml all -m shell -a "df -h" --become

# Restart Docker on a host
ansible -i ansible/inventory.yml homelab -m systemd -a "name=docker state=restarted" --become

# Check uptime
ansible -i ansible/inventory.yml all -m command -a "uptime"
```

## Quick Reference Card

| Task | Command |
|------|---------|
| Update debian hosts | `ansible-playbook -i ansible/inventory.yml ansible/automation/playbooks/update_system.yml --limit debian_clients` |
| Check APT proxy | `ansible-playbook -i ansible/inventory.yml ansible/automation/playbooks/check_apt_proxy.yml` |
| Full health check | `ansible-playbook -i ansible/inventory.yml ansible/automation/playbooks/health_check.yml` |
| Ping all hosts | `ansible -i ansible/inventory.yml all -m ping` |
| System info | `ansible-playbook -i ansible/inventory.yml ansible/automation/playbooks/system_info.yml` |
| Clean up systems | `ansible-playbook -i ansible/inventory.yml ansible/automation/playbooks/cleanup.yml` |
| Prune containers | `ansible-playbook -i ansible/inventory.yml ansible/automation/playbooks/prune_containers.yml` |
| Synology health | `ansible-playbook -i ansible/inventory.yml ansible/playbooks/synology_health.yml` |
| Dry run | add `--check` to any command |
| Verbose output | add `-vvv` to any command |
| Target one host | add `--limit <host>` to any command |