# Ansible Playbook Guide for Homelab

Last updated: 2026-02-17

## Overview

This guide explains how to run Ansible playbooks in the homelab infrastructure. Ansible is used for automation, configuration management, and system maintenance across all hosts in the Tailscale network.

## Directory Structure

```
/home/homelab/organized/repos/homelab/ansible/
├── automation/
│   ├── playbooks/          # Automation and maintenance playbooks
│   ├── hosts.ini           # Inventory file (defines all hosts)
│   ├── host_vars/          # Per-host variables
│   └── group_vars/         # Group-level variables
└── homelab/
    ├── playbooks/          # Deployment playbooks
    ├── inventory.yml       # Alternative inventory format
    └── roles/              # Reusable Ansible roles
```

## Prerequisites

1. **Ansible installed** on the control node (homelab machine)
2. **SSH access** to target hosts (configured via Tailscale)
3. **Proper working directory**: Run playbooks from `/home/homelab/organized/repos/homelab/ansible/automation/`

## Basic Ansible Concepts

- **Inventory**: List of hosts organized into groups (defined in `hosts.ini`)
- **Playbook**: YAML file containing automation tasks
- **Host Groups**: Logical grouping of hosts (e.g., `debian_clients`, `synology`)
- **Tasks**: Individual automation steps (e.g., "update packages")
- **Become**: Privilege escalation (sudo) for administrative tasks

## Available Playbooks

### Important Notes and Limitations

⚠️ **TrueNAS SCALE**: Cannot be updated via apt! Package management is disabled on TrueNAS appliances. Updates must be performed through the TrueNAS web interface only. Attempting to update via apt can result in a nonfunctional system.

```bash
# Exclude TrueNAS from apt updates
ansible-playbook -i hosts.ini playbooks/update_system.yml --limit "all:!truenas-scale"
```

⚠️ **Raspberry Pi GPG Keys**: If pi-5 fails with GPG signature errors for the InfluxDB repository, fix with:

```bash
# --become already runs the command as root, so no sudo is needed inside it
ansible -i hosts.ini pi-5 -m shell -a "curl -sL https://repos.influxdata.com/influxdata-archive_compat.key | gpg --dearmor -o /etc/apt/trusted.gpg.d/influxdata.gpg" --become
```

⚠️ **Home Assistant**: Uses its own package management system and should be excluded from apt updates.

### System Maintenance

#### 1. `update_system.yml`

Updates apt cache and upgrades all packages on Debian-based systems.

**Hosts**: All hosts with Debian/Ubuntu (exclude TrueNAS and Home Assistant)
**Requires sudo**: Yes
**Use case**: Regular system updates

```bash
# Recommended: Exclude TrueNAS and Home Assistant
ansible-playbook -i hosts.ini playbooks/update_system.yml --limit "all:!truenas-scale:!homeassistant"

# Or update specific hosts only
ansible-playbook -i hosts.ini playbooks/update_system.yml --limit "homelab,pve,pi-5,vish-concord-nuc"
```
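
The exclusion pattern used above can also be assembled from a list of hosts, which may help if the set of appliances to skip grows. The following is a minimal sketch; `build_exclude_limit` is a hypothetical helper name and is not part of the repository.

```bash
# Hypothetical helper (not part of the repository): build an Ansible
# --limit pattern that starts from "all" and excludes each given host.
build_exclude_limit() {
  local pattern='all'
  local host
  for host in "$@"; do
    # ':!' stays in single quotes so an interactive shell does not
    # treat '!' as history expansion.
    pattern="${pattern}"':!'"${host}"
  done
  printf '%s\n' "$pattern"
}
```

With this defined, `--limit "$(build_exclude_limit truenas-scale homeassistant)"` would pass the same pattern as the recommended command.
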

#### 2. `update_ansible.yml`

Updates apt cache and specifically upgrades Ansible on Linux hosts (excludes Synology).

**Hosts**: `debian_clients` (excluding Synology and Home Assistant)
**Requires sudo**: Yes
**Use case**: Keep Ansible up-to-date on managed hosts

```bash
ansible-playbook -i hosts.ini playbooks/update_ansible.yml
```

#### 3. `update_ansible_targeted.yml`

Same as `update_ansible.yml` but allows targeting specific hosts or groups.

**Hosts**: Configurable via `--limit`
**Requires sudo**: Yes
**Use case**: Update Ansible on specific hosts only

```bash
# Update only on homelab and pi-5
ansible-playbook -i hosts.ini playbooks/update_ansible_targeted.yml --limit "homelab,pi-5"

# Update only on Raspberry Pis
ansible-playbook -i hosts.ini playbooks/update_ansible_targeted.yml --limit "rpi"
```

### APT Cache / Proxy Management

#### 4. `check_apt_proxy.yml`

Comprehensive health check for APT cache proxy configuration. Verifies that hosts are properly configured to use Calypso's apt-cacher-ng service.

**Hosts**: `debian_clients`
**Requires sudo**: Partially (for some checks)
**Use case**: Verify apt-cacher-ng is working correctly
**Expected proxy**: calypso (100.103.48.78:3142)

```bash
ansible-playbook -i hosts.ini playbooks/check_apt_proxy.yml
```

**What it checks**:
- APT proxy configuration file exists (`/etc/apt/apt.conf.d/01proxy`)
- Proxy points to correct server (Calypso)
- Network connectivity to proxy server
- APT configuration is valid
- Provides recommendations for misconfigured hosts
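
The first two checks in the list can be approximated by hand on a single host. This is a rough sketch only: the playbook's actual tasks are not shown in this guide, `apt_proxy_matches` is a hypothetical helper name, and the file is assumed to use the usual apt.conf form `Acquire::http::Proxy "http://<host>:<port>";`.

```bash
# Hypothetical helper: succeed if an APT proxy file points at the expected
# host:port. Assumes the usual apt.conf form:
#   Acquire::http::Proxy "http://<host>:<port>";
apt_proxy_matches() {
  local conf_file="$1"
  local expected="$2"
  # A missing or unreadable file counts as "not configured".
  [ -r "$conf_file" ] || return 1
  # Fixed-string matching avoids treating dots in the address as wildcards.
  grep -F 'Acquire::http::Proxy' "$conf_file" | grep -Fq "http://${expected}"
}
```

For example, `apt_proxy_matches /etc/apt/apt.conf.d/01proxy 100.103.48.78:3142 && echo "proxy OK"` would report whether the host points at Calypso.
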

#### 5. `configure_apt_proxy.yml`

Configures hosts to use Calypso's APT cache proxy.

**Hosts**: `debian_clients`
**Requires sudo**: Yes
**Use case**: Set up apt-cacher-ng on new hosts

```bash
ansible-playbook -i hosts.ini playbooks/configure_apt_proxy.yml
```

### Health Checks

#### 6. `ansible_status_check.yml`

Checks Ansible installation and connectivity across all hosts.

**Hosts**: All
**Requires sudo**: No
**Use case**: Verify Ansible can communicate with all hosts

```bash
ansible-playbook -i hosts.ini playbooks/ansible_status_check.yml
```

#### 7. `synology_health.yml`

Health check specific to Synology NAS devices.

**Hosts**: `synology` group
**Requires sudo**: No
**Use case**: Monitor Synology system health

```bash
ansible-playbook -i hosts.ini playbooks/synology_health.yml
```

#### 8. `tailscale_health.yml`

Checks Tailscale connectivity and status.

**Hosts**: All
**Requires sudo**: No
**Use case**: Verify Tailscale VPN is working

```bash
ansible-playbook -i hosts.ini playbooks/tailscale_health.yml
```

### Utility Playbooks

#### 9. `system_info.yml`

Gathers and displays system information from all hosts.

**Hosts**: All
**Requires sudo**: No
**Use case**: Quick inventory of system specs

```bash
ansible-playbook -i hosts.ini playbooks/system_info.yml
```

#### 10. `add_ssh_keys.yml`

Adds SSH keys to target hosts for passwordless authentication.

**Hosts**: Configurable
**Requires sudo**: No
**Use case**: Set up SSH access for new hosts

```bash
ansible-playbook -i hosts.ini playbooks/add_ssh_keys.yml
```

#### 11. `cleanup.yml`

Performs system cleanup tasks (apt autoclean, autoremove, etc.).

**Hosts**: `debian_clients`
**Requires sudo**: Yes
**Use case**: Free up disk space

```bash
ansible-playbook -i hosts.ini playbooks/cleanup.yml
```

#### 12. `install_tools.yml`

Installs common tools and utilities on hosts.

**Hosts**: Configurable
**Requires sudo**: Yes
**Use case**: Standardize tool installation

```bash
ansible-playbook -i hosts.ini playbooks/install_tools.yml
```

## Host Groups Reference

From `hosts.ini`:

| Group | Hosts | Purpose |
|-------|-------|---------|
| `homelab` | homelab | Main management node |
| `synology` | atlantis, calypso, setillo | Synology NAS devices |
| `rpi` | pi-5, pi-5-kevin | Raspberry Pi nodes |
| `hypervisors` | pve, truenas-scale, homeassistant | Virtualization hosts |
| `remote` | vish-concord-nuc | Remote systems |
| `debian_clients` | homelab, pi-5, pi-5-kevin, vish-concord-nuc, pve, homeassistant, truenas-scale | All Debian/Ubuntu hosts using APT cache (⚠️ exclude truenas-scale and homeassistant from apt updates) |
| `all` | All hosts | Every host in inventory |

## Running Playbooks

### Basic Syntax

```bash
cd /home/homelab/organized/repos/homelab/ansible/automation/

ansible-playbook -i hosts.ini playbooks/<playbook-name>.yml
```

### Common Options

#### Target Specific Hosts
```bash
# Single host
ansible-playbook -i hosts.ini playbooks/update_system.yml --limit homelab

# Multiple hosts
ansible-playbook -i hosts.ini playbooks/update_system.yml --limit "homelab,pi-5"

# All hosts in a group
ansible-playbook -i hosts.ini playbooks/update_system.yml --limit "rpi"

# All except specific hosts
ansible-playbook -i hosts.ini playbooks/update_system.yml --limit "all:!synology"
```

#### Check Mode (Dry Run)
Preview what would change without actually making changes:
```bash
ansible-playbook -i hosts.ini playbooks/update_system.yml --check
```
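
Check mode pairs well with `--diff`, which asks modules that support it to show what they would change. The wrapper below is a small sketch of that combination; `preview_playbook` and the `RUNNER` override are hypothetical names introduced here for illustration, not something the repository provides.

```bash
# Hypothetical wrapper: dry-run a playbook by name and show pending changes.
# Set RUNNER=echo to print the command instead of running it.
preview_playbook() {
  local playbook="$1"
  shift
  # Any remaining arguments (for example --limit <host>) are passed through.
  "${RUNNER:-ansible-playbook}" -i hosts.ini "playbooks/${playbook}.yml" --check --diff "$@"
}
```

For example, `preview_playbook configure_apt_proxy --limit homelab` would preview that playbook on one host. Tasks that run arbitrary commands are typically skipped in check mode, so a clean preview is a hint rather than a guarantee.
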

#### Verbose Output
Get more detailed information about what Ansible is doing:
```bash
# Basic verbose (task results)
ansible-playbook -i hosts.ini playbooks/check_apt_proxy.yml -v

# More verbose
ansible-playbook -i hosts.ini playbooks/check_apt_proxy.yml -vv

# Very verbose (a reasonable starting point for troubleshooting)
ansible-playbook -i hosts.ini playbooks/check_apt_proxy.yml -vvv

# Debug level (adds connection debugging)
ansible-playbook -i hosts.ini playbooks/check_apt_proxy.yml -vvvv
```

#### Ask for Sudo Password
If the SSH user doesn't have passwordless sudo:
```bash
ansible-playbook -i hosts.ini playbooks/update_system.yml --ask-become-pass
# or short form:
ansible-playbook -i hosts.ini playbooks/update_system.yml -K
```

#### Ask for SSH Password
If using password authentication instead of SSH keys:
```bash
ansible-playbook -i hosts.ini playbooks/system_info.yml --ask-pass
# or short form:
ansible-playbook -i hosts.ini playbooks/system_info.yml -k
```

## Common Workflows

### Weekly Maintenance Routine

```bash
cd /home/homelab/organized/repos/homelab/ansible/automation/

# 1. Check that all hosts are reachable
ansible-playbook -i hosts.ini playbooks/ansible_status_check.yml

# 2. Verify APT cache proxy is working
ansible-playbook -i hosts.ini playbooks/check_apt_proxy.yml

# 3. Update all apt-managed systems (skip TrueNAS and Home Assistant)
ansible-playbook -i hosts.ini playbooks/update_system.yml --limit "all:!truenas-scale:!homeassistant"

# 4. Clean up old packages
ansible-playbook -i hosts.ini playbooks/cleanup.yml

# 5. Check Tailscale connectivity
ansible-playbook -i hosts.ini playbooks/tailscale_health.yml
```
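
The routine can be wrapped in a function so that a failing step stops the sequence instead of letting later playbooks run against a half-broken fleet. This is a sketch under assumptions: `weekly_maintenance` and the `RUNNER` override are hypothetical names, and the update step applies the TrueNAS and Home Assistant exclusion recommended under System Maintenance.

```bash
# Hypothetical wrapper around the weekly routine; stops at the first failure.
# Set RUNNER=echo to print the commands instead of running them.
weekly_maintenance() {
  local runner="${RUNNER:-ansible-playbook}"
  local inventory='hosts.ini'
  # Skip appliances that must not be updated via apt.
  local apt_limit='all:!truenas-scale:!homeassistant'

  "$runner" -i "$inventory" playbooks/ansible_status_check.yml || return 1
  "$runner" -i "$inventory" playbooks/check_apt_proxy.yml || return 1
  "$runner" -i "$inventory" playbooks/update_system.yml --limit "$apt_limit" || return 1
  "$runner" -i "$inventory" playbooks/cleanup.yml || return 1
  "$runner" -i "$inventory" playbooks/tailscale_health.yml || return 1
}
```

Run it from the `automation/` directory. `RUNNER=echo weekly_maintenance` prints the five commands without contacting any host, which is a cheap way to review the sequence first.
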

### Adding a New Host

```bash
# 1. Edit hosts.ini and add the new host to appropriate groups
nano hosts.ini

# 2. Test connectivity
ansible -i hosts.ini <new-host> -m ping

# 3. Add SSH keys
ansible-playbook -i hosts.ini playbooks/add_ssh_keys.yml --limit <new-host>

# 4. Configure APT proxy
ansible-playbook -i hosts.ini playbooks/configure_apt_proxy.yml --limit <new-host>

# 5. Install standard tools
ansible-playbook -i hosts.ini playbooks/install_tools.yml --limit <new-host>

# 6. Update system
ansible-playbook -i hosts.ini playbooks/update_system.yml --limit <new-host>
```

### Troubleshooting a Host

```bash
# 1. Get system info
ansible-playbook -i hosts.ini playbooks/system_info.yml --limit <host>

# 2. Check Ansible status
ansible-playbook -i hosts.ini playbooks/ansible_status_check.yml --limit <host>

# 3. Check Tailscale connectivity
ansible-playbook -i hosts.ini playbooks/tailscale_health.yml --limit <host>

# 4. Verify APT configuration
ansible-playbook -i hosts.ini playbooks/check_apt_proxy.yml --limit <host>
```

## Ad-Hoc Commands

For quick one-off tasks, use `ansible` directly:

```bash
# Ping all hosts
ansible -i hosts.ini all -m ping

# Check disk space
ansible -i hosts.ini all -m shell -a "df -h" --become

# Restart a service
ansible -i hosts.ini homelab -m systemd -a "name=docker state=restarted" --become

# Check uptime
ansible -i hosts.ini all -m command -a "uptime"

# Get memory info
ansible -i hosts.ini all -m shell -a "free -h"
```

## Troubleshooting

### Connection Issues

**Problem**: "Connection timeout" or "Host unreachable"
```bash
# Test direct ping
ping <host-ip>

# Test SSH manually
ssh <user>@<host-ip>

# Check Tailscale status
tailscale status
```

**Problem**: "Permission denied (publickey)"
```bash
# Add your SSH key to the host
ssh-copy-id <user>@<host-ip>

# Or use password authentication
ansible-playbook -i hosts.ini playbooks/<playbook>.yml -k
```

### Privilege Escalation Issues

**Problem**: "This command has to be run under the root user"
```bash
# Use --ask-become-pass
ansible-playbook -i hosts.ini playbooks/<playbook>.yml -K

# Or configure passwordless sudo on target host:
# sudo visudo
# Add: <user> ALL=(ALL) NOPASSWD:ALL
```

### Playbook Failures

**Problem**: Task fails on some hosts
```bash
# Run in verbose mode to see detailed errors
ansible-playbook -i hosts.ini playbooks/<playbook>.yml -vvv

# Retry only the failed hosts (assumes the file lists them, one per line)
ansible-playbook -i hosts.ini playbooks/<playbook>.yml --limit @/tmp/retry_hosts.txt
```
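
One way to produce such a host list is to read the `PLAY RECAP` section of a saved run. The sketch below assumes plain, uncolored output in the standard recap layout (`<host> : ok=… unreachable=… failed=…`); `failed_hosts_from_recap` is a hypothetical helper name.

```bash
# Hypothetical helper: read ansible-playbook output on stdin and print the
# hosts whose PLAY RECAP line reports failed or unreachable tasks.
failed_hosts_from_recap() {
  awk '
    /^PLAY RECAP/ { in_recap = 1; next }
    in_recap && /(unreachable|failed)=[1-9]/ { print $1 }
  '
}
```

A run saved with `ansible-playbook ... | tee /tmp/run.log` could then be turned into a retry list with `failed_hosts_from_recap < /tmp/run.log > /tmp/retry_hosts.txt`.
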

**Problem**: "Module not found"
```bash
# Update Ansible on control node (upgrades only the ansible package)
sudo apt update && sudo apt install --only-upgrade ansible -y

# Check Ansible version
ansible --version
```

### APT Update Failures

**Problem**: "Failed to update apt cache: unknown reason" (especially on Raspberry Pi)
```bash
# Often caused by missing GPG keys. Test manually
# (--become already runs the command as root, so no sudo is needed inside it):
ansible -i hosts.ini <host> -m shell -a "apt-get update 2>&1" --become

# Fix missing GPG keys (InfluxDB example):
ansible -i hosts.ini <host> -m shell -a "curl -sL https://repos.influxdata.com/influxdata-archive_compat.key | gpg --dearmor -o /etc/apt/trusted.gpg.d/influxdata.gpg" --become

# Workaround: Use shell commands instead of apt module
ansible -i hosts.ini <host> -m shell -a "apt-get update && apt-get upgrade -y" --become
```

**Problem**: TrueNAS apt update fails with "rc: -9" or package management disabled
```bash
# This is expected behavior - TrueNAS disables apt for system stability
# Solution: Update TrueNAS only through its web interface
# Exclude from playbooks:
ansible-playbook -i hosts.ini playbooks/update_system.yml --limit "all:!truenas-scale"
```

**Problem**: "Package lock" or "Unable to acquire dpkg lock"
```bash
# Check if another process is using apt
ansible -i hosts.ini <host> -m shell -a "lsof /var/lib/dpkg/lock-frontend" --become

# Kill stuck apt processes (use with caution)
ansible -i hosts.ini <host> -m shell -a "killall apt apt-get" --become

# Remove lock files only if no process is holding them
ansible -i hosts.ini <host> -m shell -a "rm /var/lib/dpkg/lock-frontend /var/lib/dpkg/lock" --become
```
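
The lock is sometimes held only briefly by another package operation, in which case waiting and retrying can be gentler than killing processes or deleting lock files. The helper below is a generic sketch, not something from the repository; `retry` is a hypothetical name, and the attempt count and delay are arbitrary.

```bash
# Hypothetical helper: run a command up to <attempts> times, sleeping
# <delay> seconds between tries. Returns 0 on the first success and 1 if
# every attempt fails.
retry() {
  local attempts="$1"
  local delay="$2"
  shift 2
  local n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}
```

For example, `retry 5 30 ansible -i hosts.ini <host> -m apt -a "update_cache=yes" --become` would try the cache update up to five times, 30 seconds apart, before giving up.
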

### Inventory Issues

**Problem**: "Could not match supplied host pattern"
```bash
# List all hosts in inventory
ansible -i hosts.ini all --list-hosts

# List hosts in a specific group
ansible -i hosts.ini debian_clients --list-hosts

# Verify inventory file syntax
ansible-inventory -i hosts.ini --list
```

## Best Practices

1. **Always use version control**: Commit playbook changes to git
2. **Test with --check first**: Use dry-run mode for risky changes
3. **Start small**: Test on a single host before running on all hosts
4. **Document changes**: Add comments to playbooks explaining what they do
5. **Use tags**: Tag tasks for selective execution
6. **Keep playbooks idempotent**: Running multiple times should be safe
7. **Monitor logs**: Check the Ansible log on the control node (written only when `log_path` is set in `ansible.cfg`, e.g. `/var/log/ansible.log`)
8. **Backup before major changes**: Create snapshots of important systems

## Security Considerations

1. **SSH Keys**: Use SSH keys instead of passwords when possible
2. **Vault**: Use Ansible Vault for sensitive data (passwords, API keys)
3. **Least Privilege**: Don't run playbooks with more privileges than needed
4. **Audit Trail**: Keep git history of all playbook changes
5. **Network Isolation**: Use Tailscale for secure communication

## Quick Reference Card

| Task | Command |
|------|---------|
| Update all apt-managed systems | `ansible-playbook -i hosts.ini playbooks/update_system.yml --limit "all:!truenas-scale:!homeassistant"` |
| Check APT proxy | `ansible-playbook -i hosts.ini playbooks/check_apt_proxy.yml` |
| Update Ansible | `ansible-playbook -i hosts.ini playbooks/update_ansible.yml` |
| Ping all hosts | `ansible -i hosts.ini all -m ping` |
| Get system info | `ansible-playbook -i hosts.ini playbooks/system_info.yml` |
| Clean up systems | `ansible-playbook -i hosts.ini playbooks/cleanup.yml` |
| Dry run (no changes) | `ansible-playbook -i hosts.ini playbooks/<playbook>.yml --check` |
| Verbose output | `ansible-playbook -i hosts.ini playbooks/<playbook>.yml -vvv` |
| Target one host | `ansible-playbook -i hosts.ini playbooks/<playbook>.yml --limit <host>` |

## Additional Resources

- [Ansible Documentation](https://docs.ansible.com/)
- [Ansible Best Practices](https://docs.ansible.com/ansible/latest/user_guide/playbooks_best_practices.html)
- [Ansible Galaxy](https://galaxy.ansible.com/) - Community roles and playbooks
- Repository: `/home/homelab/organized/repos/homelab/ansible/`

## Related Documentation

- [Git Branches Guide](./GIT_BRANCHES_GUIDE.md) - Version control for playbook changes
- [Infrastructure Overview](../infrastructure/MONITORING_ARCHITECTURE.md) - Homelab infrastructure details
- Ansible host vars: `/home/homelab/organized/repos/homelab/ansible/automation/host_vars/`