# New Playbooks Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

**Goal:** Add 5 new Ansible playbooks covering network connectivity health, Proxmox management, TrueNAS health, NTP sync auditing, and cron job inventory.

**Architecture:** Each playbook is standalone and follows the existing patterns: read-only shell tasks with `changed_when: false`, `failed_when: false` for non-fatal checks, ntfy alerting via the `ntfy_url` var, and JSON reports in `/tmp/<category>_reports/`. Platform detection is done inline via command-availability checks rather than Ansible facts, to keep cross-platform compatibility with Synology/TrueNAS.
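
The inline platform detection amounts to probing a list of candidate paths and using the first executable found. A minimal standalone sketch (`find_bin` is an illustrative helper name, not a function from the playbooks; the candidate paths match the Tailscale candidates used in Task 1):

```shell
# Print the first executable path from a candidate list; return 1 if none exist.
# find_bin is our illustrative name, not taken from the playbooks.
find_bin() {
  for p in "$@"; do
    [ -x "$p" ] && { echo "$p"; return 0; }
  done
  return 1
}

# Same candidate order Task 1 uses for Tailscale:
ts_bin=$(find_bin /usr/bin/tailscale /var/packages/Tailscale/target/bin/tailscale || true)
echo "tailscale binary: ${ts_bin:-not found}"
```

Probing paths this way works identically on stock Linux, Synology DSM, and TrueNAS SCALE, which is why the playbooks avoid fact-based OS detection.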

**Tech Stack:** Ansible, bash shell commands, Tailscale CLI, Proxmox `qm`/`pct`/`pvesh` CLI, ZFS `zpool`/`zfs` tools, `chronyc`/`timedatectl`, `smartctl`, standard POSIX cron paths.

---

## Conventions to Follow (read this first)

These patterns appear in every existing playbook — match them exactly:

```yaml
# Read-only tasks always have:
changed_when: false
failed_when: false  # (or ignore_errors: yes)

# Report directories:
delegate_to: localhost
run_once: true

# Variable defaults:
my_var: "{{ my_var | default('fallback') }}"

# Module names use fully-qualified form:
ansible.builtin.shell
ansible.builtin.debug
ansible.builtin.assert

# ntfy alerting (used in alert_check.yml — copy that pattern):
ntfy_url: "{{ ntfy_url | default('https://ntfy.sh/REDACTED_TOPIC') }}"
```

Reference files to read before each task:

- `playbooks/synology_health.yml` — pattern for platform-specific health checks
- `playbooks/tailscale_health.yml` — pattern for binary detection + JSON parsing
- `playbooks/disk_usage_report.yml` — pattern for threshold variables + report dirs
- `playbooks/alert_check.yml` — pattern for ntfy notifications

---

## Task 1: `network_connectivity.yml` — Full mesh connectivity check

**Files:**

- Create: `playbooks/network_connectivity.yml`

**What it does:** For every host in inventory, verifies that Tailscale is in the `Running` state, pings all other hosts by their `ansible_host` IP, tests SSH port reachability, and checks HTTP endpoints for key services. Outputs a connectivity matrix and sends an ntfy alert on failures.

**Step 1: Create the playbook file**

```yaml
---
# Network Connectivity Health Check
# Verifies Tailscale mesh connectivity between all inventory hosts
# and checks HTTP/HTTPS endpoints for key services.
#
# Usage: ansible-playbook -i hosts.ini playbooks/network_connectivity.yml
# Usage: ansible-playbook -i hosts.ini playbooks/network_connectivity.yml --limit homelab

- name: Network Connectivity Health Check
  hosts: "{{ host_target | default('active') }}"
  gather_facts: yes
  ignore_unreachable: true
  vars:
    report_dir: "/tmp/connectivity_reports"
    ts_candidates:
      - /usr/bin/tailscale
      - /var/packages/Tailscale/target/bin/tailscale
    warn_on_failure: true
    ntfy_url: "{{ ntfy_url | default('https://ntfy.sh/REDACTED_TOPIC') }}"

    # HTTP endpoints to verify — add/remove per your services
    http_endpoints:
      - name: Portainer (homelab)
        url: "http://100.67.40.126:9000"
      - name: Gitea (homelab)
        url: "http://100.67.40.126:3000"
      - name: Immich (homelab)
        url: "http://100.67.40.126:2283"
      - name: Home Assistant
        url: "http://100.112.186.90:8123"

  tasks:
    - name: Create connectivity report directory
      ansible.builtin.file:
        path: "{{ report_dir }}"
        state: directory
        mode: '0755'
      delegate_to: localhost
      run_once: true

    # ── Tailscale status ──────────────────────────────────────────────
    - name: Detect Tailscale binary
      ansible.builtin.shell: |
        for p in {{ ts_candidates | join(' ') }}; do
          [ -x "$p" ] && echo "$p" && exit 0
        done
        echo ""
      register: ts_bin
      changed_when: false
      failed_when: false

    - name: Get Tailscale status JSON
      ansible.builtin.command: "{{ ts_bin.stdout }} status --json"
      register: ts_status_raw
      changed_when: false
      failed_when: false
      when: ts_bin.stdout | length > 0

    - name: Parse Tailscale state
      ansible.builtin.set_fact:
        ts_parsed: "{{ ts_status_raw.stdout | from_json }}"
        ts_backend: "{{ (ts_status_raw.stdout | from_json).BackendState | default('unknown') }}"
        ts_ip: "{{ ((ts_status_raw.stdout | from_json).Self.TailscaleIPs | default([]) | first) | default('n/a') }}"
      when:
        - ts_bin.stdout | length > 0
        - ts_status_raw.rc | default(1) == 0
        - ts_status_raw.stdout | default('') | length > 0
        - ts_status_raw.stdout is search('{')
      failed_when: false

    # ── Peer reachability (ping each inventory host by Tailscale IP) ──
    - name: Ping all inventory hosts
      ansible.builtin.shell: |
        ping -c 2 -W 2 {{ hostvars[item]['ansible_host'] }} > /dev/null 2>&1 && echo "OK" || echo "FAIL"
      register: ping_results
      changed_when: false
      failed_when: false
      loop: "{{ groups['active'] | select('ne', inventory_hostname) | list }}"
      loop_control:
        label: "{{ item }}"

    - name: Summarise ping results
      ansible.builtin.set_fact:
        ping_summary: "{{ ping_summary | default({}) | combine({item.item: item.stdout | trim}) }}"
      loop: "{{ ping_results.results }}"
      loop_control:
        label: "{{ item.item }}"

    # ── SSH port check ────────────────────────────────────────────────
    - name: Check SSH port on all inventory hosts
      ansible.builtin.shell: |
        port="{{ hostvars[item]['ansible_port'] | default(22) }}"
        nc -zw3 {{ hostvars[item]['ansible_host'] }} "$port" > /dev/null 2>&1 && echo "OK" || echo "FAIL"
      register: ssh_port_results
      changed_when: false
      failed_when: false
      loop: "{{ groups['active'] | select('ne', inventory_hostname) | list }}"
      loop_control:
        label: "{{ item }}"

    - name: Summarise SSH port results
      ansible.builtin.set_fact:
        ssh_summary: "{{ ssh_summary | default({}) | combine({item.item: item.stdout | trim}) }}"
      loop: "{{ ssh_port_results.results }}"
      loop_control:
        label: "{{ item.item }}"

    # ── HTTP endpoint checks (run once from localhost) ────────────────
    - name: Check HTTP endpoints
      ansible.builtin.uri:
        url: "{{ item.url }}"
        method: GET
        status_code: [200, 301, 302, 401, 403]
        timeout: 5
        validate_certs: false
      register: http_results
      failed_when: false
      loop: "{{ http_endpoints }}"
      loop_control:
        label: "{{ item.name }}"
      delegate_to: localhost
      run_once: true

    # ── Connectivity summary ──────────────────────────────────────────
    - name: Display connectivity summary per host
      ansible.builtin.debug:
        msg: |
          ═══ {{ inventory_hostname }} ═══
          Tailscale: {{ ts_backend | default('not installed') }} | IP: {{ ts_ip | default('n/a') }}
          Peer ping results:
          {% for host, result in (ping_summary | default({})).items() %}
            {{ host }}: {{ result }}
          {% endfor %}
          SSH port results:
          {% for host, result in (ssh_summary | default({})).items() %}
            {{ host }}: {{ result }}
          {% endfor %}

    - name: Display HTTP endpoint results
      ansible.builtin.debug:
        msg: |
          ═══ HTTP Endpoint Health ═══
          {% for item in http_results.results | default([]) %}
          {{ item.item.name }}: {{ 'OK (' + (item.status | string) + ')' if item.status is defined and item.status > 0 else 'FAIL' }}
          {% endfor %}
      run_once: true
      delegate_to: localhost

    # ── Alert on failures ─────────────────────────────────────────────
    - name: Collect failed peers
      ansible.builtin.set_fact:
        failed_peers: >-
          {{ (ping_summary | default({})).items() | selectattr('1', 'eq', 'FAIL') | map(attribute='0') | list }}

    - name: Send ntfy alert for connectivity failures
      ansible.builtin.uri:
        url: "{{ ntfy_url }}"
        method: POST
        body: "Connectivity failures on {{ inventory_hostname }}: {{ failed_peers | join(', ') }}"
        headers:
          Title: "Homelab Network Alert"
          Priority: "high"
          Tags: "warning,network"
        body_format: raw
        status_code: [200, 204]
      delegate_to: localhost
      failed_when: false
      when:
        - warn_on_failure | bool
        - failed_peers | length > 0

    # ── Write JSON report ─────────────────────────────────────────────
    - name: Write connectivity report
      ansible.builtin.copy:
        content: "{{ {'host': inventory_hostname, 'timestamp': ansible_date_time.iso8601, 'tailscale_state': ts_backend | default('unknown'), 'tailscale_ip': ts_ip | default('n/a'), 'ping': ping_summary | default({}), 'ssh_port': ssh_summary | default({})} | to_nice_json }}"
        dest: "{{ report_dir }}/{{ inventory_hostname }}_{{ ansible_date_time.date }}.json"
      delegate_to: localhost
      changed_when: false
```

**Step 2: Validate YAML syntax**

```bash
cd /home/homelab/organized/repos/homelab/ansible/automation
ansible-playbook --syntax-check -i hosts.ini playbooks/network_connectivity.yml
```

Expected: `playbook: playbooks/network_connectivity.yml` with no errors.

**Step 3: Dry-run against one host**

```bash
ansible-playbook -i hosts.ini playbooks/network_connectivity.yml --limit homelab --check
```

Expected: Tasks run, no failures. Some tasks will report `skipped` (when conditions, etc.) — that's fine.

**Step 4: Run for real against one host**

```bash
ansible-playbook -i hosts.ini playbooks/network_connectivity.yml --limit homelab
```

Expected: Connectivity summary printed, report written to `/tmp/connectivity_reports/homelab_<date>.json`.

**Step 5: Run against all active hosts**

```bash
ansible-playbook -i hosts.ini playbooks/network_connectivity.yml
```

Expected: Summary for every host in the `[active]` group. Unreachable hosts are handled gracefully (skipped, not errored).
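
After a full run, the per-host JSON reports can be folded into a single overview table. A sketch using `python3` (already relied on elsewhere in this plan); the field names match the keys written by the report task above:

```shell
# Summarise all connectivity reports, one line per host.
# Assumes the report files produced by Steps 4-5 exist in report_dir.
report_dir=/tmp/connectivity_reports
python3 - "$report_dir" <<'EOF'
import glob, json, os, sys

for path in sorted(glob.glob(os.path.join(sys.argv[1], '*.json'))):
    with open(path) as f:
        r = json.load(f)
    # 'host', 'tailscale_state', and 'ping' are the keys the playbook writes
    fails = [h for h, v in r.get('ping', {}).items() if v == 'FAIL']
    print(f"{r.get('host', '?'):15} ts={r.get('tailscale_state', '?'):10} "
          f"ping_failures={','.join(fails) or 'none'}")
EOF
```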

**Step 6: Commit**

```bash
git add playbooks/network_connectivity.yml
git commit -m "feat: add network_connectivity playbook for full mesh health check"
```

---

## Task 2: `proxmox_management.yml` — Proxmox VM/LXC inventory and health

**Files:**

- Create: `playbooks/proxmox_management.yml`

**What it does:** Targets the `pve` host. Reports VM inventory (`qm list`), LXC inventory (`pct list`), node resource summary, storage pool status, and the last 10 task log entries. Optional snapshot action via `-e action=snapshot -e vm_id=100`.

**Note:** `pve` uses `ansible_user=root` (see `hosts.ini`), so `become: false` is correct here — root already has all access.
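
The storage formatter in Step 1 can be sanity-checked off-node by feeding it a fabricated `pvesh` JSON document (the sample values below are made up for illustration):

```shell
# Run the Step 1 storage formatter against a fabricated pvesh storage document
# so the percentage and GiB math is easy to verify before touching the node.
sample='[{"storage":"local-lvm","type":"lvmthin","used":53687091200,"total":107374182400,"avail":53687091200}]'
echo "$sample" | python3 -c "
import json, sys
for s in json.load(sys.stdin):
    used_pct = round(s.get('used', 0) / s.get('total', 1) * 100, 1) if s.get('total', 0) > 0 else 0
    print(f\"{s.get('storage','?'):20} {s.get('type','?'):10} used={used_pct}% avail={round(s.get('avail',0)/1073741824,1)}GiB\")
"
```

With these sample numbers (50 GiB used of 100 GiB) the line prints `used=50.0% avail=50.0GiB`.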

**Step 1: Create the playbook**

```yaml
---
# Proxmox VE Management Playbook
# Reports VM/LXC inventory, resource usage, storage pool status, and recent tasks.
# Optionally creates a snapshot with -e action=snapshot -e vm_id=100
#
# Usage: ansible-playbook -i hosts.ini playbooks/proxmox_management.yml
# Usage: ansible-playbook -i hosts.ini playbooks/proxmox_management.yml -e action=snapshot -e vm_id=100

- name: Proxmox VE Management
  hosts: pve
  gather_facts: yes
  become: false
  vars:
    # CLI interface stays -e action=... -e vm_id=...; the internal names
    # differ because `action` is a reserved variable name in Ansible, and a
    # vars entry must not reference itself (recursive loop)
    pve_action: "{{ action | default('status') }}"  # status | snapshot
    snap_vm_id: "{{ vm_id | default('') }}"
    report_dir: "/tmp/health_reports"

  tasks:
    - name: Create report directory
      ansible.builtin.file:
        path: "{{ report_dir }}"
        state: directory
        mode: '0755'
      delegate_to: localhost
      run_once: true

    # ── Node overview ─────────────────────────────────────────────────
    - name: Get PVE version
      ansible.builtin.command: pveversion
      register: pve_version
      changed_when: false
      failed_when: false

    - name: Get node resource summary
      ansible.builtin.shell: |
        pvesh get /nodes/$(hostname)/status --output-format json 2>/dev/null || \
          echo '{"error": "pvesh not available"}'
      register: node_status_raw
      changed_when: false
      failed_when: false

    - name: Parse node status
      ansible.builtin.set_fact:
        node_status: "{{ node_status_raw.stdout | from_json }}"
      failed_when: false
      when: node_status_raw.stdout | default('') | length > 0

    # ── VM inventory ──────────────────────────────────────────────────
    - name: List all VMs
      ansible.builtin.command: qm list
      register: vm_list
      changed_when: false
      failed_when: false

    - name: List all LXC containers
      ansible.builtin.command: pct list
      register: lxc_list
      changed_when: false
      failed_when: false

    - name: Count running VMs
      ansible.builtin.shell: |
        # grep -c prints 0 itself when nothing matches; || true only masks
        # its non-zero exit so the output stays a single number
        qm list 2>/dev/null | grep -c "running" || true
      register: vm_running_count
      changed_when: false
      failed_when: false

    - name: Count running LXCs
      ansible.builtin.shell: |
        pct list 2>/dev/null | grep -c "running" || true
      register: lxc_running_count
      changed_when: false
      failed_when: false

    # ── Storage pools ─────────────────────────────────────────────────
    - name: Get storage pool status
      ansible.builtin.shell: |
        pvesh get /nodes/$(hostname)/storage --output-format json 2>/dev/null | \
        python3 -c "
        import json,sys
        data=json.load(sys.stdin)
        for s in data:
            used_pct = round(s.get('used',0) / s.get('total',1) * 100, 1) if s.get('total',0) > 0 else 0
            print(f\"{s.get('storage','?'):20} {s.get('type','?'):10} used={used_pct}% avail={round(s.get('avail',0)/1073741824,1)}GiB\")
        " 2>/dev/null || pvesm status 2>/dev/null || echo "Storage info unavailable"
      register: storage_status
      changed_when: false
      failed_when: false

    # ── Recent task log ───────────────────────────────────────────────
    - name: Get recent PVE tasks
      ansible.builtin.shell: |
        pvesh get /nodes/$(hostname)/tasks \
          --limit 10 \
          --output-format json 2>/dev/null | \
        python3 -c "
        import json,sys,datetime
        tasks=json.load(sys.stdin)
        for t in tasks:
            ts=datetime.datetime.fromtimestamp(t.get('starttime',0)).strftime('%Y-%m-%d %H:%M')
            status=t.get('status','?')
            upid=t.get('upid','?')
            print(f'{ts} {status:12} {upid}')
        " 2>/dev/null || echo "Task log unavailable"
      register: recent_tasks
      changed_when: false
      failed_when: false

    # ── Summary output ────────────────────────────────────────────────
    - name: Display Proxmox summary
      ansible.builtin.debug:
        msg: |
          ═══ Proxmox VE — {{ inventory_hostname }} ═══
          Version: {{ pve_version.stdout | default('unknown') }}

          VMs: {{ vm_running_count.stdout | trim }} running
          {{ vm_list.stdout | default('(no VMs)') | indent(2) }}

          LXCs: {{ lxc_running_count.stdout | trim }} running
          {{ lxc_list.stdout | default('(no LXCs)') | indent(2) }}

          Storage Pools:
          {{ storage_status.stdout | default('n/a') | indent(2) }}

          Recent Tasks (last 10):
          {{ recent_tasks.stdout | default('n/a') | indent(2) }}

    # ── Optional: snapshot a VM ───────────────────────────────────────
    - name: Create VM snapshot
      ansible.builtin.shell: |
        snap_name="ansible-snap-$(date +%Y%m%d-%H%M%S)"
        qm snapshot {{ snap_vm_id }} "$snap_name" --description "Ansible automated snapshot"
        echo "Snapshot created: $snap_name for VM {{ snap_vm_id }}"
      register: snapshot_result
      when:
        - pve_action == "snapshot"
        - snap_vm_id | string | length > 0
      changed_when: true

    - name: Show snapshot result
      ansible.builtin.debug:
        msg: "{{ snapshot_result.stdout | default('No snapshot taken') }}"
      when: pve_action == "snapshot"

    # ── Write JSON report ─────────────────────────────────────────────
    - name: Write Proxmox report
      ansible.builtin.copy:
        content: "{{ {'host': inventory_hostname, 'timestamp': ansible_date_time.iso8601, 'version': pve_version.stdout | default('unknown'), 'vms_running': vm_running_count.stdout | trim, 'lxcs_running': lxc_running_count.stdout | trim, 'storage': storage_status.stdout | default(''), 'tasks': recent_tasks.stdout | default('')} | to_nice_json }}"
        dest: "{{ report_dir }}/proxmox_{{ ansible_date_time.date }}.json"
      delegate_to: localhost
      changed_when: false
```

**Step 2: Validate syntax**

```bash
ansible-playbook --syntax-check -i hosts.ini playbooks/proxmox_management.yml
```

Expected: no errors.

**Step 3: Run against pve**

```bash
ansible-playbook -i hosts.ini playbooks/proxmox_management.yml
```

Expected: Proxmox summary table printed. JSON report written to `/tmp/health_reports/proxmox_<date>.json`.

**Step 4: Test snapshot action (optional — only if you have a test VM)**

```bash
# Replace 100 with a real VM ID from the qm list output above
ansible-playbook -i hosts.ini playbooks/proxmox_management.yml -e action=snapshot -e vm_id=100
```

Expected: `Snapshot created: ansible-snap-<timestamp> for VM 100`

**Step 5: Commit**

```bash
git add playbooks/proxmox_management.yml
git commit -m "feat: add proxmox_management playbook for PVE VM/LXC inventory and health"
```

---

## Task 3: `truenas_health.yml` — TrueNAS SCALE ZFS and app health

**Files:**

- Create: `playbooks/truenas_health.yml`

**What it does:** Targets `truenas-scale`. Checks ZFS pool health, scrub status, dataset usage, SMART disk status, and running TrueNAS apps (k3s-based). Flags degraded/faulted pools. Mirrors the `synology_health.yml` structure.

**Note:** TrueNAS SCALE runs on Debian. The `vish` user needs sudo for `smartctl` and `zpool`. Check `host_vars/truenas-scale.yml` — `ansible_become: true` is set in `group_vars/homelab_linux.yml`, which covers all hosts.
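
The degraded-pool grep used below can be verified against sample `zpool status` text before running it on the NAS (`count_bad_pools` is our illustrative name, not a function from the playbook):

```shell
# Count pools whose state is not ONLINE — the same pattern the playbook
# asserts on. grep -c prints the count; || true masks its exit code when
# the count is zero.
count_bad_pools() {
  grep -cE 'state:[[:space:]]*(DEGRADED|FAULTED|OFFLINE|REMOVED)' || true
}

printf 'state: ONLINE\nstate: DEGRADED\n' | count_bad_pools   # prints 1
```

On the host, pipe real output instead: `zpool status | count_bad_pools`.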

**Step 1: Create the playbook**

```yaml
---
# TrueNAS SCALE Health Check
# Checks ZFS pool status, scrub health, dataset usage, SMART disk status, and app state.
# Mirrors synology_health.yml but for TrueNAS SCALE (Debian-based with ZFS).
#
# Usage: ansible-playbook -i hosts.ini playbooks/truenas_health.yml

- name: TrueNAS SCALE Health Check
  hosts: truenas-scale
  gather_facts: yes
  become: true
  vars:
    disk_warn_pct: 80
    disk_critical_pct: 90
    report_dir: "/tmp/health_reports"

  tasks:
    - name: Create report directory
      ansible.builtin.file:
        path: "{{ report_dir }}"
        state: directory
        mode: '0755'
      delegate_to: localhost
      run_once: true

    # ── System overview ───────────────────────────────────────────────
    - name: Get system uptime
      ansible.builtin.command: uptime -p
      register: uptime_out
      changed_when: false
      failed_when: false

    - name: Get TrueNAS version
      ansible.builtin.shell: |
        cat /etc/version 2>/dev/null || \
        midclt call system.version 2>/dev/null || \
        echo "version unavailable"
      register: truenas_version
      changed_when: false
      failed_when: false

    # ── ZFS pool health ───────────────────────────────────────────────
    - name: Get ZFS pool status
      ansible.builtin.command: zpool status -v
      register: zpool_status
      changed_when: false
      failed_when: false

    - name: Get ZFS pool list (usage)
      ansible.builtin.command: zpool list -H
      register: zpool_list
      changed_when: false
      failed_when: false

    - name: Check for degraded or faulted pools
      ansible.builtin.shell: |
        # [[:space:]] instead of \s — portable across grep implementations
        zpool status 2>/dev/null | grep -cE "state:[[:space:]]*(DEGRADED|FAULTED|OFFLINE|REMOVED)" || true
      register: pool_errors
      changed_when: false
      failed_when: false

    - name: Assert no degraded pools
      ansible.builtin.assert:
        that:
          - (pool_errors.stdout | trim | int) == 0
        success_msg: "All ZFS pools ONLINE"
        fail_msg: "DEGRADED or FAULTED pool detected — run: zpool status"
      changed_when: false
      ignore_errors: yes

    # ── ZFS scrub status ──────────────────────────────────────────────
    - name: Get last scrub info per pool
      ansible.builtin.shell: |
        for pool in $(zpool list -H -o name 2>/dev/null); do
          echo "Pool: $pool"
          zpool status "$pool" 2>/dev/null | grep -E "scrub|scan" | head -3
          echo "---"
        done
      register: scrub_status
      changed_when: false
      failed_when: false

    # ── Dataset usage ─────────────────────────────────────────────────
    - name: Get dataset usage (top-level datasets)
      ansible.builtin.shell: |
        zfs list -H -o name,used,avail,refer,mountpoint -d 1 2>/dev/null | head -20
      register: dataset_usage
      changed_when: false
      failed_when: false

    # ── SMART disk status ─────────────────────────────────────────────
    - name: List physical disks
      ansible.builtin.shell: |
        lsblk -d -o NAME,SIZE,MODEL,SERIAL 2>/dev/null | grep -v "loop\|sr" || \
        ls /dev/sd? /dev/nvme?n? 2>/dev/null
      register: disk_list
      changed_when: false
      failed_when: false

    - name: Check SMART health for each disk
      ansible.builtin.shell: |
        failed=0
        for disk in $(lsblk -d -n -o NAME 2>/dev/null | grep -v "loop\|sr"); do
          result=$(smartctl -H /dev/$disk 2>/dev/null | grep -E "SMART overall-health|PASSED|FAILED" || echo "n/a")
          echo "$disk: $result"
          echo "$result" | grep -q "FAILED" && failed=$((failed+1))
        done
        exit $failed
      register: smart_results
      changed_when: false
      failed_when: false

    # ── TrueNAS apps (k3s) ────────────────────────────────────────────
    - name: Get TrueNAS app status
      ansible.builtin.shell: |
        if command -v k3s >/dev/null 2>&1; then
          k3s kubectl get pods -A --no-headers 2>/dev/null | \
            awk '{print $4}' | sort | uniq -c | sort -rn
        elif command -v midclt >/dev/null 2>&1; then
          midclt call chart.release.query 2>/dev/null | \
          python3 -c "
          import json,sys
          try:
              apps=json.load(sys.stdin)
              for a in apps:
                  print(f\"{a.get('id','?'):30} {a.get('status','?')}\")
          except:
              print('App status unavailable')
          " 2>/dev/null
        else
          echo "App runtime not detected (k3s/midclt not found)"
        fi
      register: app_status
      changed_when: false
      failed_when: false

    # ── Summary output ────────────────────────────────────────────────
    - name: Display TrueNAS health summary
      ansible.builtin.debug:
        msg: |
          ═══ TrueNAS SCALE — {{ inventory_hostname }} ═══
          Version    : {{ truenas_version.stdout | default('unknown') | trim }}
          Uptime     : {{ uptime_out.stdout | default('n/a') }}
          Pool errors: {{ pool_errors.stdout | trim | default('0') }}

          ZFS Pool List:
          {{ zpool_list.stdout | default('(none)') | indent(2) }}

          ZFS Pool Status (degraded/faulted check):
            Degraded pools found: {{ pool_errors.stdout | trim }}

          Scrub Status:
          {{ scrub_status.stdout | default('n/a') | indent(2) }}

          Dataset Usage (top-level):
          {{ dataset_usage.stdout | default('n/a') | indent(2) }}

          SMART Disk Status:
          {{ smart_results.stdout | default('n/a') | indent(2) }}

          TrueNAS Apps:
          {{ app_status.stdout | default('n/a') | indent(2) }}

    # ── Write JSON report ─────────────────────────────────────────────
    - name: Write TrueNAS health report
      ansible.builtin.copy:
        content: "{{ {'host': inventory_hostname, 'timestamp': ansible_date_time.iso8601, 'version': truenas_version.stdout | default('unknown') | trim, 'pool_errors': pool_errors.stdout | trim, 'zpool_list': zpool_list.stdout | default(''), 'scrub': scrub_status.stdout | default(''), 'smart': smart_results.stdout | default(''), 'apps': app_status.stdout | default('')} | to_nice_json }}"
        dest: "{{ report_dir }}/truenas_{{ ansible_date_time.date }}.json"
      delegate_to: localhost
      changed_when: false
```

**Step 2: Validate syntax**

```bash
ansible-playbook --syntax-check -i hosts.ini playbooks/truenas_health.yml
```

Expected: no errors.

**Step 3: Run against truenas-scale**

```bash
ansible-playbook -i hosts.ini playbooks/truenas_health.yml
```

Expected: Health summary printed, pool status shown, SMART results visible. JSON report at `/tmp/health_reports/truenas_<date>.json`.

**Step 4: Commit**

```bash
git add playbooks/truenas_health.yml
git commit -m "feat: add truenas_health playbook for ZFS pool, scrub, SMART, and app status"
```

---

## Task 4: `ntp_check.yml` — Time sync health audit

**Files:**

- Create: `playbooks/ntp_check.yml`

**What it does:** Checks time sync status across all hosts. Detects which NTP daemon is running, extracts the current offset in milliseconds, warns at >500ms, critical at >1000ms. Sends an ntfy alert for hosts exceeding the warn threshold. Read-only — no config changes.

**Platform notes:**

- Ubuntu/Debian: `systemd-timesyncd` → use `timedatectl show-timesync` or `chronyc tracking`
- Synology: Uses its own NTP, check via `/proc/driver/rtc` or `synoinfo.conf` + `ntpq -p`
- TrueNAS: Debian-based, likely `chrony` or `systemd-timesyncd`
- Proxmox: Debian-based
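
The chrony offset extraction in Step 1 keys on field 4 of the `System time` line. It can be checked against a sample line without chrony installed (the sample line mirrors typical `chronyc tracking` output):

```shell
# "chronyc tracking" prints a line like the sample below; field 4 is the
# offset in seconds, which the awk converts to milliseconds.
sample='System time     : 0.000123456 seconds fast of NTP time'
echo "$sample" | awk '{printf "%.3f", $4 * 1000}'   # prints 0.123
```

Note the sign of the offset is carried in the words "fast of"/"slow of" rather than the number, so the playbook's thresholds compare the absolute value.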

**Step 1: Create the playbook**

```yaml
---
# NTP Time Sync Health Check
# Audits time synchronization across all hosts. Read-only — no config changes.
# Warns when offset > 500ms, critical > 1000ms.
#
# Usage: ansible-playbook -i hosts.ini playbooks/ntp_check.yml
# Usage: ansible-playbook -i hosts.ini playbooks/ntp_check.yml --limit synology

- name: NTP Time Sync Health Check
  hosts: "{{ host_target | default('active') }}"
  gather_facts: yes
  ignore_unreachable: true
  vars:
    warn_offset_ms: 500
    critical_offset_ms: 1000
    ntfy_url: "{{ ntfy_url | default('https://ntfy.sh/REDACTED_TOPIC') }}"
    report_dir: "/tmp/ntp_reports"

  tasks:
    - name: Create report directory
      ansible.builtin.file:
        path: "{{ report_dir }}"
        state: directory
        mode: '0755'
      delegate_to: localhost
      run_once: true

    # ── Detect NTP daemon ─────────────────────────────────────────────
    - name: Detect active NTP implementation
      ansible.builtin.shell: |
        if command -v chronyc >/dev/null 2>&1 && chronyc tracking >/dev/null 2>&1; then
          echo "chrony"
        elif timedatectl show-timesync 2>/dev/null | grep -q ServerName; then
          echo "timesyncd"
        elif timedatectl 2>/dev/null | grep -q "NTP service: active"; then
          echo "timesyncd"
        elif command -v ntpq >/dev/null 2>&1 && ntpq -p >/dev/null 2>&1; then
          echo "ntpd"
        else
          echo "unknown"
        fi
      register: ntp_impl
      changed_when: false
      failed_when: false

    # ── Get offset (chrony) ───────────────────────────────────────────
    - name: Get chrony tracking info
      ansible.builtin.shell: chronyc tracking 2>/dev/null
      register: chrony_tracking
      changed_when: false
      failed_when: false
      when: ntp_impl.stdout | trim == "chrony"

    - name: Parse chrony offset (ms)
      ansible.builtin.shell: |
        chronyc tracking 2>/dev/null | \
          grep "System time" | \
          awk '{printf "%.3f", $4 * 1000}'
      register: chrony_offset_ms
      changed_when: false
      failed_when: false
      when: ntp_impl.stdout | trim == "chrony"

    - name: Get chrony sync source
      ansible.builtin.shell: |
        chronyc sources -v 2>/dev/null | grep "^\^" | head -3
      register: chrony_sources
      changed_when: false
      failed_when: false
      when: ntp_impl.stdout | trim == "chrony"

    # ── Get offset (systemd-timesyncd) ────────────────────────────────
    - name: Get timesyncd status
      ansible.builtin.shell: timedatectl show-timesync 2>/dev/null || timedatectl 2>/dev/null
      register: timesyncd_info
      changed_when: false
      failed_when: false
      when: ntp_impl.stdout | trim == "timesyncd"

    - name: Parse timesyncd offset (ms)
      ansible.builtin.shell: |
        # timesyncd doesn't expose offset cleanly — grep the journal instead.
        # The matched string fuses number and unit (e.g. "offset +0.003ms"),
        # so the unit must be split back out of field 2.
        journalctl -u systemd-timesyncd --since "1 hour ago" --no-pager 2>/dev/null | \
          grep -oE "offset [+-]?[0-9]+(\.[0-9]+)?(ms|us|s)" | tail -1 | \
          awk '{
            val=$2
            unit=val; gsub(/[^a-z]/, "", unit)
            gsub(/[^0-9.+-]/, "", val)
            if (unit=="us")     printf "%.3f", val/1000
            else if (unit=="s") printf "%.3f", val*1000
            else                printf "%.3f", val
          }'
      register: timesyncd_offset_ms
      changed_when: false
      failed_when: false
      when: ntp_impl.stdout | trim == "timesyncd"

    # ── Get offset (ntpd) ─────────────────────────────────────────────
    - name: Get ntpq peers
      ansible.builtin.shell: ntpq -pn 2>/dev/null | head -10
      register: ntpq_peers
      changed_when: false
      failed_when: false
      when: ntp_impl.stdout | trim == "ntpd"

    - name: Parse ntpq offset (ms)
      ansible.builtin.shell: |
        # offset is column 9 in ntpq -p output (milliseconds); the selected
        # peer is marked with a leading *
        ntpq -p 2>/dev/null | awk 'NR>2 && /^\*/ {printf "%.3f", $9; exit}' || echo "0"
      register: ntpq_offset_ms
      changed_when: false
      failed_when: false
      when: ntp_impl.stdout | trim == "ntpd"

    # ── Consolidate offset ────────────────────────────────────────────
    - name: Set unified offset fact
      ansible.builtin.set_fact:
        ntp_offset_ms: >-
          {{
            (chrony_offset_ms.stdout | default('0')) | float
            if ntp_impl.stdout | trim == 'chrony'
            else (timesyncd_offset_ms.stdout | default('0')) | float
            if ntp_impl.stdout | trim == 'timesyncd'
            else (ntpq_offset_ms.stdout | default('0')) | float
          }}
        ntp_raw_info: >-
          {{
            chrony_tracking.stdout | default('')
            if ntp_impl.stdout | trim == 'chrony'
            else timesyncd_info.stdout | default('')
            if ntp_impl.stdout | trim == 'timesyncd'
            else ntpq_peers.stdout | default('')
          }}

    - name: Determine sync status
      ansible.builtin.set_fact:
        # the fact above comes back as a string, so cast before abs/compare
        ntp_status: >-
          {{
            'CRITICAL' if (ntp_offset_ms | float | abs) >= critical_offset_ms
            else 'WARN' if (ntp_offset_ms | float | abs) >= warn_offset_ms
            else 'OK'
          }}

    # ── Per-host summary ──────────────────────────────────────────────
    - name: Display NTP summary
      ansible.builtin.debug:
        msg: |
          ═══ {{ inventory_hostname }} ═══
          NTP daemon : {{ ntp_impl.stdout | trim | default('unknown') }}
          Offset     : {{ ntp_offset_ms }} ms
          Status     : {{ ntp_status }}
          Details    :
          {{ ntp_raw_info | indent(2) }}

    # ── Alert on warn/critical ────────────────────────────────────────
    - name: Send ntfy alert for NTP issues
      ansible.builtin.uri:
        url: "{{ ntfy_url }}"
        method: POST
        body: "NTP {{ ntp_status }} on {{ inventory_hostname }}: offset={{ ntp_offset_ms }}ms (threshold={{ warn_offset_ms }}ms)"
        headers:
          Title: "Homelab NTP Alert"
          Priority: "{{ 'urgent' if ntp_status == 'CRITICAL' else 'high' }}"
          Tags: "warning,clock"
        body_format: raw
        status_code: [200, 204]
      delegate_to: localhost
      failed_when: false
      when: ntp_status in ['WARN', 'CRITICAL']

    # ── Write JSON report ─────────────────────────────────────────────
    - name: Write NTP report
      ansible.builtin.copy:
        content: "{{ {'host': inventory_hostname, 'timestamp': ansible_date_time.iso8601, 'ntp_daemon': ntp_impl.stdout | trim, 'offset_ms': ntp_offset_ms, 'status': ntp_status} | to_nice_json }}"
        dest: "{{ report_dir }}/{{ inventory_hostname }}_{{ ansible_date_time.date }}.json"
      delegate_to: localhost
      changed_when: false
```

**Step 2: Validate syntax**

```bash
ansible-playbook --syntax-check -i hosts.ini playbooks/ntp_check.yml
```

Expected: no errors.

**Step 3: Run against one host**

```bash
ansible-playbook -i hosts.ini playbooks/ntp_check.yml --limit homelab
```

Expected: NTP daemon detected, offset printed, status OK/WARN/CRITICAL.

**Step 4: Run across all hosts**

```bash
ansible-playbook -i hosts.ini playbooks/ntp_check.yml
```

Expected: Summary for every active host. Synology hosts may report `unknown` for the daemon — that's acceptable (they have NTP but expose it differently).

**Step 5: Commit**

```bash
git add playbooks/ntp_check.yml
git commit -m "feat: add ntp_check playbook for time sync drift auditing across all hosts"
```

---
|
|
|
|
## Task 5: `cron_audit.yml` — Scheduled task inventory

**Files:**
- Create: `playbooks/cron_audit.yml`

**What it does:** Inventories all scheduled tasks across every host: system crontabs, user crontabs, and systemd timer units. Flags potential security issues (root cron jobs that reference world-writable script paths). Outputs per-host JSON.

**Step 1: Create the playbook**

```yaml
---
# Cron and Scheduled Task Audit
# Inventories crontabs and systemd timers across all hosts.
# Flags security concerns: root crons with world-writable path references.
#
# Usage: ansible-playbook -i hosts.ini playbooks/cron_audit.yml
# Usage: ansible-playbook -i hosts.ini playbooks/cron_audit.yml --limit homelab

- name: Cron and Scheduled Task Audit
  hosts: "{{ host_target | default('active') }}"
  gather_facts: yes
  ignore_unreachable: true
  vars:
    report_dir: "/tmp/cron_audit"

  tasks:
    - name: Create audit report directory
      ansible.builtin.file:
        path: "{{ report_dir }}"
        state: directory
        mode: '0755'
      delegate_to: localhost
      run_once: true

    # ── System crontabs ───────────────────────────────────────────────
    - name: Read /etc/crontab
      ansible.builtin.shell: cat /etc/crontab 2>/dev/null || echo "(not present)"
      register: etc_crontab
      changed_when: false
      failed_when: false

    - name: Read /etc/cron.d/ entries
      ansible.builtin.shell: |
        for f in /etc/cron.d/*; do
          [ -f "$f" ] || continue
          echo "=== $f ==="
          cat "$f"
          echo ""
        done
      register: cron_d_entries
      changed_when: false
      failed_when: false

    - name: Read /etc/cron.{hourly,daily,weekly,monthly} scripts
      ansible.builtin.shell: |
        for dir in hourly daily weekly monthly; do
          path="/etc/cron.$dir"
          [ -d "$path" ] || continue
          scripts=$(ls "$path" 2>/dev/null)
          if [ -n "$scripts" ]; then
            echo "=== /etc/cron.$dir ==="
            echo "$scripts"
          fi
        done
      register: cron_dirs
      changed_when: false
      failed_when: false

    # ── User crontabs ─────────────────────────────────────────────────
    - name: List users with crontabs
      ansible.builtin.shell: |
        if [ -d /var/spool/cron/crontabs ]; then
          ls /var/spool/cron/crontabs/ 2>/dev/null
        elif [ -d /var/spool/cron ]; then
          ls /var/spool/cron/ 2>/dev/null | grep -v atjobs
        else
          echo "(crontab spool not found)"
        fi
      register: users_with_crontabs
      changed_when: false
      failed_when: false

    - name: Dump user crontabs
      ansible.builtin.shell: |
        spool_dir=""
        [ -d /var/spool/cron/crontabs ] && spool_dir=/var/spool/cron/crontabs
        [ -d /var/spool/cron ] && [ -z "$spool_dir" ] && spool_dir=/var/spool/cron

        if [ -z "$spool_dir" ]; then
          echo "(no spool directory found)"
          exit 0
        fi

        for user_file in "$spool_dir"/*; do
          [ -f "$user_file" ] || continue
          user=$(basename "$user_file")
          echo "=== crontab for: $user ==="
          cat "$user_file" 2>/dev/null
          echo ""
        done
      register: user_crontabs
      changed_when: false
      failed_when: false

    # ── Systemd timers ────────────────────────────────────────────────
    - name: List systemd timers
      ansible.builtin.shell: |
        if command -v systemctl >/dev/null 2>&1; then
          systemctl list-timers --all --no-pager 2>/dev/null || echo "(systemd not available)"
        else
          echo "(not a systemd host)"
        fi
      register: systemd_timers
      changed_when: false
      failed_when: false

    # ── Security flags ────────────────────────────────────────────────
    - name: Flag root cron jobs referencing world-writable paths
      ansible.builtin.shell: |
        # Gather all root cron entries
        {
          cat /etc/crontab 2>/dev/null
          cat /etc/cron.d/* 2>/dev/null
          spool=""
          [ -d /var/spool/cron/crontabs ] && spool=/var/spool/cron/crontabs
          [ -d /var/spool/cron ] && [ -z "$spool" ] && spool=/var/spool/cron
          [ -n "$spool" ] && cat "$spool/root" 2>/dev/null
        } | grep -v "^#" | grep -v "^$" > /tmp/_cron_lines.txt

        found=0
        while IFS= read -r line; do
          # Heuristic: assume the command starts at field 6 (user-crontab layout;
          # system crontab lines have a user column, so this may pick up the username)
          cmd=$(echo "$line" | awk '{print $6}')
          if [ -n "$cmd" ] && [ -f "$cmd" ]; then
            perms=$(stat -c "%a" "$cmd" 2>/dev/null || echo "")
            # World-writable: the "others" octal digit has the write bit (2, 3, 6, 7)
            if echo "$perms" | grep -qE "^[0-9][0-9][2367]$"; then
              echo "FLAGGED: $cmd is world-writable — used in cron: $line"
              found=$((found+1))
            fi
          fi
        done < /tmp/_cron_lines.txt
        rm -f /tmp/_cron_lines.txt

        [ "$found" -eq 0 ] && echo "No world-writable cron script paths found"
        exit 0
      register: security_flags
      changed_when: false
      failed_when: false

    # ── Summary ───────────────────────────────────────────────────────
    - name: Display cron audit summary
      ansible.builtin.debug:
        msg: |
          ═══ Cron Audit — {{ inventory_hostname }} ═══

          /etc/crontab:
          {{ etc_crontab.stdout | default('(empty)') | indent(2) }}

          /etc/cron.d/:
          {{ cron_d_entries.stdout | default('(empty)') | indent(2) }}

          Cron directories (/etc/cron.{hourly,daily,weekly,monthly}):
          {{ cron_dirs.stdout | default('(empty)') | indent(2) }}

          Users with crontabs: {{ users_with_crontabs.stdout | default('(none)') | trim }}

          User crontab contents:
          {{ user_crontabs.stdout | default('(none)') | indent(2) }}

          Systemd timers:
          {{ systemd_timers.stdout | default('(none)') | indent(2) }}

          Security flags:
          {{ security_flags.stdout | default('(none)') | indent(2) }}

    # ── Write JSON report ─────────────────────────────────────────────
    - name: Write cron audit report
      ansible.builtin.copy:
        content: "{{ {'host': inventory_hostname, 'timestamp': ansible_date_time.iso8601, 'etc_crontab': etc_crontab.stdout | default(''), 'cron_d': cron_d_entries.stdout | default(''), 'cron_dirs': cron_dirs.stdout | default(''), 'users_with_crontabs': users_with_crontabs.stdout | default(''), 'user_crontabs': user_crontabs.stdout | default(''), 'systemd_timers': systemd_timers.stdout | default(''), 'security_flags': security_flags.stdout | default('')} | to_nice_json }}"
        dest: "{{ report_dir }}/{{ inventory_hostname }}_{{ ansible_date_time.date }}.json"
      delegate_to: localhost
      changed_when: false
```
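The world-writable test in the security-flags task boils down to checking the "others" octal digit of the mode string. The same check as a standalone sketch (pure POSIX `case` matching, no `stat` needed):

```shell
# A three-digit octal mode is world-writable when its last digit carries the
# write bit: 2 (-w-), 3 (-wx), 6 (rw-), or 7 (rwx).
is_world_writable() {
  case "$1" in
    [0-7][0-7][2367]) echo "yes" ;;
    *)                echo "no" ;;
  esac
}
is_world_writable 755   # no
is_world_writable 646   # yes
is_world_writable 777   # yes
```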

**Step 2: Validate syntax**

```bash
ansible-playbook --syntax-check -i hosts.ini playbooks/cron_audit.yml
```

Expected: no errors.

**Step 3: Run against one host**

```bash
ansible-playbook -i hosts.ini playbooks/cron_audit.yml --limit homelab
```

Expected: Cron entries and systemd timers displayed. Security flags report shown.

**Step 4: Run across all hosts**

```bash
ansible-playbook -i hosts.ini playbooks/cron_audit.yml
```

Expected: Summary per host. Reports written to `/tmp/cron_audit/`.
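Once reports exist, hosts with flagged cron entries can be picked out with a quick grep over the JSON files. This sketch builds a throwaway sample directory with hypothetical hosts (`alpha`, `beta`) so it runs anywhere; point the grep at `/tmp/cron_audit/` for the real reports:

```shell
# Find report files whose security_flags field mentions FLAGGED entries.
dir=$(mktemp -d)
printf '{"host": "alpha", "security_flags": "No world-writable cron script paths found"}\n' > "$dir/alpha.json"
printf '{"host": "beta", "security_flags": "FLAGGED: /opt/job.sh is world-writable"}\n' > "$dir/beta.json"
flagged=$(grep -l 'FLAGGED' "$dir"/*.json)
basename "$flagged"   # only beta's report matches
rm -rf "$dir"
```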

**Step 5: Commit**

```bash
git add playbooks/cron_audit.yml
git commit -m "feat: add cron_audit playbook for scheduled task inventory across all hosts"
```

---

## Task 6: Update README.md

**Files:**
- Modify: `README.md`

**Step 1: Add the 5 new playbooks to the relevant tables in README.md**

Add to the Health & Monitoring table:

```markdown
| **`network_connectivity.yml`** | Full mesh Tailscale + SSH + HTTP endpoint health | Daily | ✅ |
| **`ntp_check.yml`** | Time sync drift audit with ntfy alerts | Daily | ✅ |
```

Add a new "Platform Management" section (after Advanced Container Management):

```markdown
### 🖥️ Platform Management (3 playbooks)

| Playbook | Purpose | Usage | Multi-System |
|----------|---------|-------|--------------|
| `synology_health.yml` | Synology NAS health (DSM, RAID, Tailscale) | Monthly | Synology only |
| **`proxmox_management.yml`** | 🆕 PVE VM/LXC inventory, storage pools, snapshots | Weekly | PVE only |
| **`truenas_health.yml`** | 🆕 ZFS pool health, scrub, SMART, app status | Weekly | TrueNAS only |
```

Add to the Security & Maintenance table:

```markdown
| **`cron_audit.yml`** | 🆕 Scheduled task inventory + security flags | Monthly | ✅ |
```

**Step 2: Update the total playbook count at the bottom**

Change: `33 playbooks` → `38 playbooks`

**Step 3: Commit**

```bash
git add README.md
git commit -m "docs: update README with 5 new playbooks"
```