# NTFY Notification System Documentation

## Overview

The homelab's notification system is built around NTFY (a simple HTTP-based pub-sub notification service), with bridges that connect it to Alertmanager, Signal messenger, and Gitea.

## Architecture

### Core Components

1. **NTFY Server** - Main notification hub
2. **NTFY Bridge** - Connects Alertmanager to NTFY
3. **Signal Bridge** - Forwards NTFY notifications to Signal messenger
4. **Gitea NTFY Bridge** - Sends Git repository events to NTFY

### Container Stack

All notification components are deployed via Docker Compose in the alerting stack:

```yaml
# Location: /home/homelab/docker/monitoring/homelab_vm/alerting.yaml
services:
  ntfy:
    image: binwiederhier/ntfy:latest
    container_name: ntfy
    command: serve
    volumes:
      - /home/homelab/docker/monitoring/homelab_vm/ntfy:/var/lib/ntfy
    ports:
      - "8080:80"
    environment:
      - NTFY_BASE_URL=http://homelab.vish.local:8080
      - NTFY_CACHE_FILE=/var/lib/ntfy/cache.db
      - NTFY_AUTH_FILE=/var/lib/ntfy/auth.db
      - NTFY_ATTACHMENT_CACHE_DIR=/var/lib/ntfy/attachments
    restart: unless-stopped
    networks:
      - alerting

  ntfy-bridge:
    image: xenrox/ntfy-alertmanager:latest
    container_name: ntfy-bridge
    environment:
      - NTFY_TOPIC=REDACTED_NTFY_TOPIC
      - NTFY_URL=http://ntfy:80
      - NTFY_USER=
      - NTFY_PASSWORD=REDACTED_PASSWORD
    ports:
      - "8081:8080"
    restart: unless-stopped
    networks:
      - alerting

  signal-bridge:
    image: bbernhard/signal-cli-rest-api:latest
    container_name: signal-bridge
    ports:
      - "8082:8080"
    environment:
      - MODE=json-rpc
    volumes:
      - /home/homelab/docker/monitoring/homelab_vm/signal-data:/home/.local/share/signal-cli
    restart: unless-stopped
    networks:
      - alerting
```

## Configuration Files

### NTFY Server Configuration

**Location**: `/home/homelab/docker/monitoring/homelab_vm/ntfy/server.yml`

```yaml
# Basic server configuration
base-url: "http://homelab.vish.local:8080"
listen-http: ":80"
cache-file: "/var/lib/ntfy/cache.db"
auth-file: "/var/lib/ntfy/auth.db"
attachment-cache-dir: "/var/lib/ntfy/attachments"

# Authentication and access control
auth-default-access: "deny-all"
enable-signup: false
enable-login: true

# Rate limiting
visitor-request-limit-burst: 60
visitor-request-limit-replenish: "5s"

# Message limits
message-limit: 4096
attachment-file-size-limit: "15M"
attachment-total-size-limit: "100M"

# Retention
cache-duration: "12h"
keepalive-interval: "45s"
manager-interval: "1m"

# Topics and subscriptions
topics:
  - name: "alerts"
    description: "System alerts from Prometheus/Alertmanager"
  - name: "gitea"
    description: "Git repository notifications"
  - name: "monitoring"
    description: "Infrastructure monitoring alerts"
```

### Alertmanager Integration

**Location**: `/home/homelab/docker/monitoring/alerting/alertmanager/alertmanager.yml`

```yaml
global:
  smtp_smarthost: 'localhost:587'
  smtp_from: 'alertmanager@homelab.local'

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'web.hook'

receivers:
  - name: 'web.hook'
    webhook_configs:
      - url: 'http://ntfy-bridge:8080/alerts'
        send_resolved: true
        http_config:
          basic_auth:
            username: ''
            password: ''

inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'dev', 'instance']
```

### Prometheus Alert Rules

**Location**: `/home/homelab/docker/monitoring/alerting/alert-rules.yml`

Key alert rules that trigger NTFY notifications:

```yaml
groups:
  - name: system.rules
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute."

      - alert: HighCPUUsage
        expr: 100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage on {{ $labels.instance }}"
          description: "CPU usage is above 80% for more than 2 minutes."

      - alert: HighMemoryUsage
        expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > 90
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "High memory usage on {{ $labels.instance }}"
          description: "Memory usage is above 90% for more than 2 minutes."

      - alert: DiskSpaceLow
        expr: (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100 < 10
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Low disk space on {{ $labels.instance }}"
          description: "Disk space is below 10% on root filesystem."
```

## Notification Channels

### 1. NTFY Web Interface

- **URL**: http://homelab.vish.local:8080
- **Topics**:
  - `alerts` - System monitoring alerts
  - `gitea` - Git repository events
  - `monitoring` - Infrastructure status

### 2. Signal Messenger Integration

- **Bridge Container**: signal-bridge
- **Port**: 8082
- **Configuration**: `/home/homelab/docker/monitoring/homelab_vm/signal-data/`

### 3. Gitea Integration

- **Bridge Container**: gitea-ntfy-bridge
- **Configuration**: `/home/homelab/docker/monitoring/homelab_vm/gitea-ntfy-bridge/`

## Current Monitoring Targets

The Prometheus instance monitors the following nodes:

```yaml
# From /home/homelab/docker/monitoring/prometheus/prometheus.yml
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "alertmanager"
    static_configs:
      - targets: ["alertmanager:9093"]

  - job_name: "node-exporter"
    static_configs:
      - targets: ["localhost:9100"]

  - job_name: "calypso-node"
    static_configs:
      - targets: ["100.75.252.64:9100"]

  - job_name: "seattle-node"
    static_configs:
      - targets: ["100.82.197.124:9100"]

  - job_name: "proxmox-node"
    static_configs:
      - targets: ["100.87.12.28:9100"]
```

## How to Modify Notifications

### 1. Adding New Alert Rules

Edit the alert rules file:

```bash
sudo nano /home/homelab/docker/monitoring/alerting/alert-rules.yml
```

Example new rule:

```yaml
- alert: ServiceDown
  expr: up{job="my-service"} == 0
  for: 30s
  labels:
    severity: warning
  annotations:
    summary: "Service {{ $labels.job }} is down"
    description: "The service {{ $labels.job }} on {{ $labels.instance }} has been down for more than 30 seconds."
```

### 2. Modifying Notification Routing

Edit the Alertmanager configuration:

```bash
sudo nano /home/homelab/docker/monitoring/alerting/alertmanager/alertmanager.yml
```

### 3. Adding New NTFY Topics

Edit the NTFY server configuration:

```bash
sudo nano /home/homelab/docker/monitoring/homelab_vm/ntfy/server.yml
```

### 4. Changing Notification Thresholds

Modify the alert expressions in `alert-rules.yml`. Common patterns:

- **CPU Usage**: `expr: 100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > THRESHOLD`
- **Memory Usage**: `expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100 > THRESHOLD`
- **Disk Space** (fires when free space drops below the threshold): `expr: (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100 < THRESHOLD`

### 5. Reloading Configuration

After making changes:

```bash
# Reload Prometheus configuration
# (the reload endpoint only works if Prometheus runs with --web.enable-lifecycle)
curl -X POST http://localhost:9090/-/reload

# Reload Alertmanager configuration
curl -X POST http://localhost:9093/-/reload

# Restart NTFY if server config changed
cd /home/homelab/docker/monitoring
docker compose -f homelab_vm/alerting.yaml restart ntfy
```

## Testing Notifications

### Manual Test via NTFY API

```bash
# auth-default-access is deny-all, so add -u <user>:<password>
# to these commands if anonymous publishing is rejected.

# Send test notification
curl -d "Test notification from homelab" http://homelab.vish.local:8080/alerts

# Send with priority and tags
curl -H "Priority: urgent" -H "Tags: warning,test" -d "High priority test" http://homelab.vish.local:8080/alerts
```

### Test Alert Rules

```bash
# Trigger a test alert by stopping a service temporarily
sudo systemctl stop node_exporter

# Wait for the alert to fire (InstanceDown uses "for: 1m"), then restart
sudo systemctl start node_exporter
```

### Verify Alert Flow

1. **Prometheus** scrapes metrics and evaluates rules
2. **Alertmanager** receives alerts and routes them
3. **NTFY Bridge** converts alerts to NTFY messages
4. **NTFY Server** publishes to subscribed topics
5. **Signal Bridge** forwards to Signal messenger (if configured)

## Troubleshooting

### Common Issues

1. **Alerts not firing**: Check that the Prometheus targets are up
2. **Notifications not received**: Verify that the NTFY bridge can reach both Alertmanager and the NTFY server
3. **Signal not working**: Check the Signal bridge registration

### Useful Commands

```bash
# Check container status
docker ps | grep -E "(ntfy|alert|signal)"

# View logs
docker logs ntfy
docker logs ntfy-bridge
docker logs alertmanager

# Test connectivity
curl http://homelab.vish.local:8080/v1/health
curl http://localhost:9093/-/healthy
curl http://localhost:9090/-/healthy
```

### Log Locations

- **NTFY**: `docker logs ntfy`
- **Alertmanager**: `docker logs alertmanager`
- **Prometheus**: `docker logs prometheus`
- **NTFY Bridge**: `docker logs ntfy-bridge`

## Security Considerations

1. **Authentication**: The NTFY server has login enabled and self-signup disabled (`enable-login: true`, `enable-signup: false`)
2. **Network**: All services share the `alerting` Docker network; ports 8080, 8081, and 8082 are published on the host
3. **Access Control**: Default access is `deny-all` (`auth-default-access`)
4. **Rate Limiting**: Each visitor gets a burst of 60 requests, replenished at one request every 5 seconds

## Backup and Recovery

### Important Files to Backup

- `/home/homelab/docker/monitoring/homelab_vm/ntfy/` - NTFY data
- `/home/homelab/docker/monitoring/alerting/` - Alert configurations
- `/home/homelab/docker/monitoring/prometheus/` - Prometheus config

### Recovery Process

1. Restore configuration files
2. Restart containers from `/home/homelab/docker/monitoring`: `docker compose -f homelab_vm/alerting.yaml up -d`
3. Verify all services are healthy
4. Test notification flow

## Maintenance

### Regular Tasks

1. **Weekly**: Check alert rule effectiveness
2. **Monthly**: Review notification volumes
3. **Quarterly**: Update container images
4. **Annually**: Review and update alert thresholds

### Monitoring the Monitoring

- Monitor NTFY server uptime
- Track alert volume and patterns
- Verify notification delivery
- Check for false positives/negatives

---

**Last Updated**: February 15, 2026

**Maintainer**: Homelab Administrator

**Version**: 1.0
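
The manual curl test from "Testing Notifications" can be wrapped in a small shell helper so the same call is reusable from scripts or cron jobs. This is a minimal sketch, not part of the documented setup: the function names `ntfy_topic_url` and `ntfy_send` and the `NTFY_AUTH` variable are hypothetical, and the default base URL is simply the one used throughout this document.

```bash
#!/bin/sh
# Hypothetical helper around the manual curl test shown in this document.
# NTFY_URL defaults to the base URL documented above; NTFY_AUTH is an assumed
# variable holding "user:password" for servers that deny anonymous access.
NTFY_URL="${NTFY_URL:-http://homelab.vish.local:8080}"

# Print the publish URL for a topic, tolerating a trailing slash in NTFY_URL.
ntfy_topic_url() {
  printf '%s/%s\n' "${NTFY_URL%/}" "$1"
}

# Usage: ntfy_send <topic> <message> [priority] [tags]
ntfy_send() {
  topic="$1"
  message="$2"
  priority="${3:-default}"
  tags="${4:-}"

  # Build the curl argument list in the positional parameters.
  set -- --silent --show-error --fail -H "Priority: ${priority}"
  if [ -n "$tags" ]; then
    set -- "$@" -H "Tags: ${tags}"
  fi
  if [ -n "${NTFY_AUTH:-}" ]; then
    set -- "$@" -u "$NTFY_AUTH"
  fi

  curl "$@" -d "$message" "$(ntfy_topic_url "$topic")"
}
```

After sourcing the file, `ntfy_send alerts "High priority test" urgent "warning,test"` sends the same request as the second manual example. Because of `--fail`, curl exits non-zero on an HTTP error such as a rejected login, which makes the helper usable in scripts that need to detect delivery problems.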