# Ansible Playbook Guide for Homelab

Last updated: 2026-02-17

## Overview

This guide explains how to run Ansible playbooks in the homelab infrastructure. Ansible is used for automation, configuration management, and system maintenance across all hosts in the Tailscale network.

## Directory Structure

```
/home/homelab/organized/repos/homelab/ansible/
├── automation/
│   ├── playbooks/       # Automation and maintenance playbooks
│   ├── hosts.ini        # Inventory file (defines all hosts)
│   ├── host_vars/       # Per-host variables
│   └── group_vars/      # Group-level variables
└── homelab/
    ├── playbooks/       # Deployment playbooks
    ├── inventory.yml    # Alternative inventory format
    └── roles/           # Reusable Ansible roles
```

## Prerequisites

1. **Ansible installed** on the control node (homelab machine)
2. **SSH access** to target hosts (configured via Tailscale)
3. **Proper working directory**: Run playbooks from `/home/homelab/organized/repos/homelab/ansible/automation/`

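The working directory matters because every command in this guide passes `-i hosts.ini` and `playbooks/...` as relative paths. A minimal sketch of a pre-flight check, assuming a POSIX shell; the function name `in_automation_dir` is hypothetical and not part of the repository:

```bash
# Return 0 only when the given directory (default: current directory)
# contains both the inventory file and the playbooks directory.
in_automation_dir() (
  dir="${1:-.}"
  [ -f "$dir/hosts.ini" ] && [ -d "$dir/playbooks" ]
)

if in_automation_dir; then
  echo "OK: hosts.ini and playbooks/ found"
else
  echo "Run this from the ansible/automation/ directory" >&2
fi
```

The check only looks for the two paths the commands rely on; it does not verify that Ansible itself is installed (`ansible --version` covers that).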
## Basic Ansible Concepts

- **Inventory**: List of hosts organized into groups (defined in `hosts.ini`)
- **Playbook**: YAML file containing automation tasks
- **Host Groups**: Logical grouping of hosts (e.g., `debian_clients`, `synology`)
- **Tasks**: Individual automation steps (e.g., "update packages")
- **Become**: Privilege escalation (sudo) for administrative tasks

## Available Playbooks

### Important Notes and Limitations

⚠️ **TrueNAS SCALE**: Cannot be updated via apt. Package management is disabled on TrueNAS appliances, and updates must be performed through the TrueNAS web interface only. Attempting to update via apt can leave the system nonfunctional.

```bash
# Exclude TrueNAS from apt updates
# (single quotes stop an interactive shell from treating "!" as history expansion)
ansible-playbook -i hosts.ini playbooks/update_system.yml --limit 'all:!truenas-scale'
```

⚠️ **Raspberry Pi GPG Keys**: If pi-5 fails with GPG signature errors for the InfluxDB repository, fix with:
```bash
ansible -i hosts.ini pi-5 -m shell -a "curl -sL https://repos.influxdata.com/influxdata-archive_compat.key | gpg --dearmor -o /etc/apt/trusted.gpg.d/influxdata.gpg" --become
```

⚠️ **Home Assistant**: Uses its own package management system and should be excluded from apt updates.

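Exclusions like these are chained with `:!` inside a single `--limit` pattern. As an illustrative sketch (the helper `build_limit` is hypothetical, not something shipped in this repository), the pattern can be assembled from a base pattern and a list of hosts to skip:

```bash
# Build an Ansible --limit pattern: start from a base pattern ($1) and
# exclude every remaining argument with the ":!" operator.
build_limit() (
  pattern="$1"
  shift
  for host in "$@"; do
    # Single-quoted format string keeps "!" away from history expansion.
    pattern=$(printf '%s:!%s' "$pattern" "$host")
  done
  printf '%s\n' "$pattern"
)

build_limit all truenas-scale homeassistant
```

For the arguments shown, the helper prints `all:!truenas-scale:!homeassistant`, which can be passed along as `--limit "$(build_limit all truenas-scale homeassistant)"`.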
### System Maintenance

#### 1. `update_system.yml`
Updates the apt cache and upgrades all packages on Debian-based systems.

**Hosts**: All hosts with Debian/Ubuntu (exclude TrueNAS and Home Assistant)
**Requires sudo**: Yes
**Use case**: Regular system updates

```bash
# Recommended: exclude TrueNAS and Home Assistant
ansible-playbook -i hosts.ini playbooks/update_system.yml --limit 'all:!truenas-scale:!homeassistant'

# Or update specific hosts only
ansible-playbook -i hosts.ini playbooks/update_system.yml --limit "homelab,pve,pi-5,vish-concord-nuc"
```

#### 2. `update_ansible.yml`
Updates apt cache and specifically upgrades Ansible on Linux hosts (excludes Synology).

**Hosts**: `debian_clients` (excluding Synology and Home Assistant)
**Requires sudo**: Yes
**Use case**: Keep Ansible up-to-date on managed hosts

```bash
ansible-playbook -i hosts.ini playbooks/update_ansible.yml
```

#### 3. `update_ansible_targeted.yml`
Same as `update_ansible.yml` but allows targeting specific hosts or groups.

**Hosts**: Configurable via `--limit`
**Requires sudo**: Yes
**Use case**: Update Ansible on specific hosts only

```bash
# Update only on homelab and pi-5
ansible-playbook -i hosts.ini playbooks/update_ansible_targeted.yml --limit "homelab,pi-5"

# Update only on Raspberry Pis
ansible-playbook -i hosts.ini playbooks/update_ansible_targeted.yml --limit "rpi"
```

### APT Cache / Proxy Management

#### 4. `check_apt_proxy.yml`
Comprehensive health check for APT cache proxy configuration. Verifies that hosts are properly configured to use Calypso's apt-cacher-ng service.

**Hosts**: `debian_clients`
**Requires sudo**: Partially (for some checks)
**Use case**: Verify apt-cacher-ng is working correctly
**Expected proxy**: calypso (100.103.48.78:3142)

```bash
ansible-playbook -i hosts.ini playbooks/check_apt_proxy.yml
```

**What it checks**:
- APT proxy configuration file exists (`/etc/apt/apt.conf.d/01proxy`)
- Proxy points to correct server (Calypso)
- Network connectivity to proxy server
- APT configuration is valid
- Provides recommendations for misconfigured hosts

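The first two checks can be reproduced by hand on a single host. The sketch below is not the playbook's actual implementation; it assumes the proxy file uses the standard `Acquire::http::Proxy "http://host:port";` directive, and the function name `apt_proxy_matches` is made up for illustration:

```bash
# Succeed when the APT config file sets the expected HTTP proxy.
# Args: config file path, proxy host, proxy port.
# Accepts the URL with or without a trailing slash.
apt_proxy_matches() (
  file="$1"
  url="http://$2:$3"
  [ -f "$file" ] || exit 1
  grep -Fq "Acquire::http::Proxy \"${url}\";" "$file" ||
    grep -Fq "Acquire::http::Proxy \"${url}/\";" "$file"
)

if apt_proxy_matches /etc/apt/apt.conf.d/01proxy 100.103.48.78 3142; then
  echo "APT proxy points at calypso"
else
  echo "APT proxy missing or pointing elsewhere"
fi
```

Fixed-string matching (`grep -F`) is used so the dots in the IP address are compared literally rather than as regex wildcards. Connectivity to the proxy is not tested here.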
#### 5. `configure_apt_proxy.yml`
Configures hosts to use Calypso's APT cache proxy.

**Hosts**: `debian_clients`
**Requires sudo**: Yes
**Use case**: Set up apt-cacher-ng on new hosts

```bash
ansible-playbook -i hosts.ini playbooks/configure_apt_proxy.yml
```

### Health Checks

#### 6. `ansible_status_check.yml`
Checks Ansible installation and connectivity across all hosts.

**Hosts**: All
**Requires sudo**: No
**Use case**: Verify Ansible can communicate with all hosts

```bash
ansible-playbook -i hosts.ini playbooks/ansible_status_check.yml
```

#### 7. `synology_health.yml`
Health check specific to Synology NAS devices.

**Hosts**: `synology` group
**Requires sudo**: No
**Use case**: Monitor Synology system health

```bash
ansible-playbook -i hosts.ini playbooks/synology_health.yml
```

#### 8. `tailscale_health.yml`
Checks Tailscale connectivity and status.

**Hosts**: All
**Requires sudo**: No
**Use case**: Verify Tailscale VPN is working

```bash
ansible-playbook -i hosts.ini playbooks/tailscale_health.yml
```

### Utility Playbooks

#### 9. `system_info.yml`
Gathers and displays system information from all hosts.

**Hosts**: All
**Requires sudo**: No
**Use case**: Quick inventory of system specs

```bash
ansible-playbook -i hosts.ini playbooks/system_info.yml
```

#### 10. `add_ssh_keys.yml`
Adds SSH keys to target hosts for passwordless authentication.

**Hosts**: Configurable
**Requires sudo**: No
**Use case**: Set up SSH access for new hosts

```bash
ansible-playbook -i hosts.ini playbooks/add_ssh_keys.yml
```

#### 11. `cleanup.yml`
Performs system cleanup tasks (apt autoclean, autoremove, etc.).

**Hosts**: `debian_clients`
**Requires sudo**: Yes
**Use case**: Free up disk space

```bash
ansible-playbook -i hosts.ini playbooks/cleanup.yml
```

#### 12. `install_tools.yml`
Installs common tools and utilities on hosts.

**Hosts**: Configurable
**Requires sudo**: Yes
**Use case**: Standardize tool installation

```bash
ansible-playbook -i hosts.ini playbooks/install_tools.yml
```

## Host Groups Reference

From `hosts.ini`:

| Group | Hosts | Purpose |
|-------|-------|---------|
| `homelab` | homelab | Main management node |
| `synology` | atlantis, calypso, setillo | Synology NAS devices |
| `rpi` | pi-5, pi-5-kevin | Raspberry Pi nodes |
| `hypervisors` | pve, truenas-scale, homeassistant | Virtualization hosts |
| `remote` | vish-concord-nuc | Remote systems |
| `debian_clients` | homelab, pi-5, pi-5-kevin, vish-concord-nuc, pve, homeassistant, truenas-scale | All Debian/Ubuntu hosts using APT cache (⚠️ exclude truenas-scale and homeassistant from apt updates) |
| `all` | All hosts | Every host in inventory |

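To see how a table row maps onto an INI inventory, here is a rough sketch that reads group members straight from a file. It assumes plain `[group]` sections with one host per line; the excerpt is hypothetical and the real `hosts.ini` may be laid out differently. The helper name `group_hosts` is invented for this example:

```bash
# Print the host names listed under [group] in an INI inventory file.
# Blank lines and comment lines are skipped, and only the first field of
# each host line is kept, so inline variables such as ansible_host=... are
# dropped. Child groups ([name:children]) are not resolved.
group_hosts() (
  awk -v group="$2" '
    /^\[/ { in_group = ($0 == ("[" group "]")); next }
    in_group && NF && $1 !~ /^[#;]/ { print $1 }
  ' "$1"
)

# Hypothetical excerpt for illustration; the real hosts.ini may differ.
example=$(mktemp)
cat > "$example" <<'EOF'
[synology]
atlantis
calypso
setillo

[rpi]
pi-5
pi-5-kevin
EOF

group_hosts "$example" rpi
rm -f "$example"
```

This is only a reading aid. For the authoritative answer, including nested groups, ask Ansible itself with `ansible -i hosts.ini <group> --list-hosts`.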
## Running Playbooks

### Basic Syntax

```bash
cd /home/homelab/organized/repos/homelab/ansible/automation/

ansible-playbook -i hosts.ini playbooks/<playbook-name>.yml
```

### Common Options

#### Target Specific Hosts
```bash
# Single host
ansible-playbook -i hosts.ini playbooks/update_system.yml --limit homelab

# Multiple hosts
ansible-playbook -i hosts.ini playbooks/update_system.yml --limit "homelab,pi-5"

# All hosts in a group
ansible-playbook -i hosts.ini playbooks/update_system.yml --limit "rpi"

# All except specific hosts or groups (single-quote patterns that contain "!")
ansible-playbook -i hosts.ini playbooks/update_system.yml --limit 'all:!synology'
```

#### Check Mode (Dry Run)
Preview what would change without actually making changes:
```bash
ansible-playbook -i hosts.ini playbooks/update_system.yml --check
```

#### Verbose Output
Get more detailed information about what Ansible is doing:
```bash
# Basic verbose (task results)
ansible-playbook -i hosts.ini playbooks/check_apt_proxy.yml -v

# More verbose
ansible-playbook -i hosts.ini playbooks/check_apt_proxy.yml -vv

# Very verbose (a reasonable starting level for debugging)
ansible-playbook -i hosts.ini playbooks/check_apt_proxy.yml -vvv

# Adds connection debugging
ansible-playbook -i hosts.ini playbooks/check_apt_proxy.yml -vvvv
```

#### Ask for Sudo Password
If the SSH user doesn't have passwordless sudo:
```bash
ansible-playbook -i hosts.ini playbooks/update_system.yml --ask-become-pass
# or short form:
ansible-playbook -i hosts.ini playbooks/update_system.yml -K
```

#### Ask for SSH Password
If using password authentication instead of SSH keys:
```bash
ansible-playbook -i hosts.ini playbooks/system_info.yml --ask-pass
# or short form:
ansible-playbook -i hosts.ini playbooks/system_info.yml -k
```

## Common Workflows

### Weekly Maintenance Routine

```bash
cd /home/homelab/organized/repos/homelab/ansible/automation/

# 1. Check that all hosts are reachable
ansible-playbook -i hosts.ini playbooks/ansible_status_check.yml

# 2. Verify APT cache proxy is working
ansible-playbook -i hosts.ini playbooks/check_apt_proxy.yml

# 3. Update all systems (TrueNAS and Home Assistant must not be updated via apt)
ansible-playbook -i hosts.ini playbooks/update_system.yml --limit 'all:!truenas-scale:!homeassistant'

# 4. Clean up old packages
ansible-playbook -i hosts.ini playbooks/cleanup.yml

# 5. Check Tailscale connectivity
ansible-playbook -i hosts.ini playbooks/tailscale_health.yml
```

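The five steps can also be wrapped in one function that stops at the first failure, so an unreachable host or broken proxy is noticed before any upgrade runs. This is a sketch, not a script from the repository: the function name and the `ANSIBLE_PLAYBOOK` override variable are assumptions, and the update step excludes TrueNAS and Home Assistant as the notes above require.

```bash
# Run the weekly playbooks in order and stop at the first failure.
# ANSIBLE_PLAYBOOK can be overridden (for example with "echo") to preview
# the commands without running them.
weekly_maintenance() (
  runner="${ANSIBLE_PLAYBOOK:-ansible-playbook}"
  for playbook in ansible_status_check check_apt_proxy update_system cleanup tailscale_health; do
    if [ "$playbook" = update_system ]; then
      # apt upgrades must skip TrueNAS SCALE and Home Assistant
      "$runner" -i hosts.ini "playbooks/${playbook}.yml" \
        --limit 'all:!truenas-scale:!homeassistant' || exit 1
    else
      "$runner" -i hosts.ini "playbooks/${playbook}.yml" || exit 1
    fi
  done
)
```

Run `ANSIBLE_PLAYBOOK=echo weekly_maintenance` from the `automation/` directory to print the five commands, then `weekly_maintenance` to execute them.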
### Adding a New Host

```bash
# 1. Edit hosts.ini and add the new host to appropriate groups
nano hosts.ini

# 2. Test connectivity
ansible -i hosts.ini <new-host> -m ping

# 3. Add SSH keys
ansible-playbook -i hosts.ini playbooks/add_ssh_keys.yml --limit <new-host>

# 4. Configure APT proxy
ansible-playbook -i hosts.ini playbooks/configure_apt_proxy.yml --limit <new-host>

# 5. Install standard tools
ansible-playbook -i hosts.ini playbooks/install_tools.yml --limit <new-host>

# 6. Update system
ansible-playbook -i hosts.ini playbooks/update_system.yml --limit <new-host>
```

### Troubleshooting a Host

```bash
# 1. Get system info
ansible-playbook -i hosts.ini playbooks/system_info.yml --limit <host>

# 2. Check Ansible status
ansible-playbook -i hosts.ini playbooks/ansible_status_check.yml --limit <host>

# 3. Check Tailscale connectivity
ansible-playbook -i hosts.ini playbooks/tailscale_health.yml --limit <host>

# 4. Verify APT configuration
ansible-playbook -i hosts.ini playbooks/check_apt_proxy.yml --limit <host>
```

## Ad-Hoc Commands

For quick one-off tasks, use the `ansible` command directly:

```bash
# Ping all hosts
ansible -i hosts.ini all -m ping

# Check disk space
ansible -i hosts.ini all -m shell -a "df -h" --become

# Restart a service
ansible -i hosts.ini homelab -m systemd -a "name=docker state=restarted" --become

# Check uptime
ansible -i hosts.ini all -m command -a "uptime"

# Get memory info
ansible -i hosts.ini all -m shell -a "free -h"
```

## Troubleshooting

### Connection Issues

**Problem**: "Connection timeout" or "Host unreachable"
```bash
# Test direct ping
ping <host-ip>

# Test SSH manually
ssh <user>@<host-ip>

# Check Tailscale status
tailscale status
```

**Problem**: "Permission denied (publickey)"
```bash
# Add your SSH key to the host
ssh-copy-id <user>@<host-ip>

# Or use password authentication
ansible-playbook -i hosts.ini playbooks/<playbook>.yml -k
```

### Privilege Escalation Issues

**Problem**: "This command has to be run under the root user"
```bash
# Use --ask-become-pass
ansible-playbook -i hosts.ini playbooks/<playbook>.yml -K

# Or configure passwordless sudo on target host:
# sudo visudo
# Add: <user> ALL=(ALL) NOPASSWD:ALL
```

### Playbook Failures

**Problem**: Task fails on some hosts
```bash
# Run in verbose mode to see detailed errors
ansible-playbook -i hosts.ini playbooks/<playbook>.yml -vvv

# Use --limit to retry only failed hosts
# (the @ prefix reads host names from a file, one per line)
ansible-playbook -i hosts.ini playbooks/<playbook>.yml --limit @/tmp/retry_hosts.txt
```

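The retry command expects `/tmp/retry_hosts.txt` to already exist with one host name per line. A small sketch for creating it by hand from the hosts that failed in the play recap; the helper name `write_retry_file` is hypothetical and the two host names are just examples from this inventory:

```bash
# Write one host name per line to a file that --limit @<file> can read.
# Args: output file, then one or more host names.
write_retry_file() (
  file="$1"
  shift
  [ "$#" -gt 0 ] || exit 1   # refuse to write an empty host list
  printf '%s\n' "$@" > "$file"
)

# Example with two hosts from this inventory.
write_retry_file /tmp/retry_hosts.txt pi-5 pve
```

Ansible can also write such files itself when `retry_files_enabled` is turned on in `ansible.cfg`; whether that setting is enabled in this repository is not stated here.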
**Problem**: "Module not found"
```bash
# Update Ansible on control node (upgrades only the ansible package)
sudo apt update && sudo apt install --only-upgrade ansible -y

# Check Ansible version
ansible --version
```

### APT Update Failures

**Problem**: "Failed to update apt cache: unknown reason" (especially on Raspberry Pi)
```bash
# Often caused by missing GPG keys. Test manually
# (--become already runs the command as root, so no sudo is needed inside it):
ansible -i hosts.ini <host> -m shell -a "apt-get update 2>&1" --become

# Fix missing GPG keys (InfluxDB example):
ansible -i hosts.ini <host> -m shell -a "curl -sL https://repos.influxdata.com/influxdata-archive_compat.key | gpg --dearmor -o /etc/apt/trusted.gpg.d/influxdata.gpg" --become

# Workaround: Use shell commands instead of apt module
ansible -i hosts.ini <host> -m shell -a "apt-get update && apt-get upgrade -y" --become
```

**Problem**: TrueNAS apt update fails with "rc: -9" or package management disabled
```bash
# This is expected behavior - TrueNAS disables apt for system stability
# Solution: Update TrueNAS only through its web interface
# Exclude from playbooks:
ansible-playbook -i hosts.ini playbooks/update_system.yml --limit 'all:!truenas-scale'
```

**Problem**: "Package lock" or "Unable to acquire dpkg lock"
```bash
# Check if another process is using apt
ansible -i hosts.ini <host> -m shell -a "lsof /var/lib/dpkg/lock-frontend" --become

# Kill stuck apt processes (use with caution)
ansible -i hosts.ini <host> -m shell -a "killall apt apt-get" --become

# Remove lock files only if the first check shows no process holding them
ansible -i hosts.ini <host> -m shell -a "rm /var/lib/dpkg/lock-frontend /var/lib/dpkg/lock" --become
```

### Inventory Issues

**Problem**: "Could not match supplied host pattern"
```bash
# List all hosts in inventory
ansible -i hosts.ini all --list-hosts

# List hosts in a specific group
ansible -i hosts.ini debian_clients --list-hosts

# Verify inventory file syntax
ansible-inventory -i hosts.ini --list
```

## Best Practices

1. **Always use version control**: Commit playbook changes to git
2. **Test with --check first**: Use dry-run mode for risky changes
3. **Start small**: Test on a single host before running on all hosts
4. **Document changes**: Add comments to playbooks explaining what they do
5. **Use tags**: Tag tasks for selective execution
6. **Keep playbooks idempotent**: Running multiple times should be safe
7. **Monitor logs**: Check `/var/log/ansible.log` on the control node (written only when `log_path` is set in `ansible.cfg`); managed hosts record module runs in their system log
8. **Backup before major changes**: Create snapshots of important systems

## Security Considerations

1. **SSH Keys**: Use SSH keys instead of passwords when possible
2. **Vault**: Use Ansible Vault for sensitive data (passwords, API keys)
3. **Least Privilege**: Don't run playbooks with more privileges than needed
4. **Audit Trail**: Keep git history of all playbook changes
5. **Network Isolation**: Use Tailscale for secure communication

## Quick Reference Card

| Task | Command |
|------|---------|
| Update all systems (except TrueNAS and Home Assistant) | `ansible-playbook -i hosts.ini playbooks/update_system.yml --limit 'all:!truenas-scale:!homeassistant'` |
| Check APT proxy | `ansible-playbook -i hosts.ini playbooks/check_apt_proxy.yml` |
| Update Ansible | `ansible-playbook -i hosts.ini playbooks/update_ansible.yml` |
| Ping all hosts | `ansible -i hosts.ini all -m ping` |
| Get system info | `ansible-playbook -i hosts.ini playbooks/system_info.yml` |
| Clean up systems | `ansible-playbook -i hosts.ini playbooks/cleanup.yml` |
| Dry run (no changes) | `ansible-playbook -i hosts.ini playbooks/<playbook>.yml --check` |
| Verbose output | `ansible-playbook -i hosts.ini playbooks/<playbook>.yml -vvv` |
| Target one host | `ansible-playbook -i hosts.ini playbooks/<playbook>.yml --limit <host>` |

## Additional Resources

- [Ansible Documentation](https://docs.ansible.com/)
- [Ansible Best Practices](https://docs.ansible.com/ansible/latest/user_guide/playbooks_best_practices.html)
- [Ansible Galaxy](https://galaxy.ansible.com/) - Community roles and playbooks
- Repository: `/home/homelab/organized/repos/homelab/ansible/`

## Related Documentation

- [Git Branches Guide](./GIT_BRANCHES_GUIDE.md) - Version control for playbook changes
- [Infrastructure Overview](../infrastructure/MONITORING_ARCHITECTURE.md) - Homelab infrastructure details
- Ansible host vars: `/home/homelab/organized/repos/homelab/ansible/automation/host_vars/`