# Tailscale Host Monitoring Status Report

> **⚠️ Historical Snapshot**: This document was generated on Feb 15, 2026. The alerts and offline status listed here are no longer current. For live node status, run `tailscale status` on the homelab VM or check Grafana at `http://100.67.40.126:3000`.

## 📊 Status Snapshot

**Generated:** February 15, 2026

### Monitored Tailscale Hosts (13 total)

#### ✅ Online Hosts (10)

- **atlantis-node** (100.83.230.112:9100) - Synology NAS
- **atlantis-snmp** (100.83.230.112) - SNMP monitoring
- **calypso-node** (100.103.48.78:9100) - Node exporter
- **calypso-snmp** (100.103.48.78) - SNMP monitoring
- **concord-nuc-node** (100.72.55.21:9100) - Intel NUC
- **proxmox-node** (100.87.12.28:9100) - Proxmox server
- **raspberry-pis** (100.77.151.40:9100) - Pi cluster node
- **setillo-node** (100.125.0.20:9100) - Node exporter
- **setillo-snmp** (100.125.0.20) - SNMP monitoring
- **truenas-node** (100.75.252.64:9100) - TrueNAS server

#### ❌ Offline Hosts (3)

- **homelab-node** (100.67.40.126:9100) - Main homelab VM
- **raspberry-pis** (100.123.246.75:9100) - Pi cluster node
- **vmi2076105-node** (100.99.156.20:9100) - VPS instance

## 🚨 Active Alerts

### Critical HostDown Alerts (2 firing)

1. **vmi2076105-node** (100.99.156.20:9100)
   - Status: Firing since Feb 14, 07:57 UTC
   - Duration: ~24 hours
   - Notifications: Sent to ntfy + Signal
2. **homelab-node** (100.67.40.126:9100)
   - Status: Firing since Feb 14, 09:23 UTC
   - Duration: ~22 hours
   - Notifications: Sent to ntfy + Signal

## 📬 Notification System Status

### ✅ Working Notification Channels

- **ntfy**: http://192.168.0.210:8081/homelab-alerts ✅
- **Signal**: Via signal-bridge (critical alerts) ✅
- **Alertmanager**: http://100.67.40.126:9093 ✅

### Test Results

- ntfy notification test: **PASSED** ✅
- Message delivery: **CONFIRMED** ✅
- Alert routing: **WORKING** ✅

## ⚙️ Monitoring Configuration

### Alert Rules

- **Trigger**: Host unreachable for 2+ minutes
- **Severity**: Critical (dual-channel notifications)
- **Query**: `up{job=~".*-node"} == 0`
- **Evaluation**: Every 30 seconds

### Notification Routing

- **Warning alerts** → ntfy only
- **Critical alerts** → ntfy + Signal
- **Resolved alerts** → Both channels

## 🔧 Infrastructure Details

### Monitoring Stack

- **Prometheus**: http://100.67.40.126:9090
- **Grafana**: http://100.67.40.126:3000
- **Alertmanager**: http://100.67.40.126:9093
- **Bridge Services**: ntfy-bridge (5001), signal-bridge (5000)

### Data Collection

- **Node Exporter**: System metrics on port 9100
- **SNMP Exporter**: Network device metrics on port 9116
- **Scrape Interval**: 15 seconds
- **Retention**: Default Prometheus retention

## 📋 Recommendations

### Immediate Actions

1. **Investigate offline hosts**:
   - Check homelab-node (100.67.40.126) - main VM down
   - Verify vmi2076105-node (100.99.156.20) - VPS status
   - Check raspberry-pis node (100.123.246.75)
2. **Verify notifications**:
   - Confirm you're receiving ntfy alerts on mobile
   - Test Signal notifications for critical alerts

### Maintenance

- Monitor disk space on active hosts
- Review alert thresholds if needed
- Consider adding more monitoring targets

## 🧪 Testing

Use the test script to verify monitoring:

```bash
./scripts/test-tailscale-monitoring.sh
```

For manual testing:

1. Stop node_exporter on any host: `sudo systemctl stop node_exporter`
2. Wait 2+ minutes for alert to fire
3. Check ntfy app and Signal for notifications
4. Restart: `sudo systemctl start node_exporter`

---

## 🟢 Verified Online Nodes (March 2026)

As of March 11, 2026, all 16 active nodes verified reachable via ping:

| Node | Tailscale IP | Role |
|------|-------------|------|
| atlantis | 100.83.230.112 | Primary NAS, exit node |
| calypso | 100.103.48.78 | Secondary NAS, Headscale host |
| setillo | 100.125.0.20 | Remote NAS, Tucson |
| homelab | 100.67.40.126 | Main VM (this host) |
| pve | 100.87.12.28 | Proxmox hypervisor |
| vish-concord-nuc | 100.72.55.21 | Intel NUC, exit node |
| pi-5 | 100.77.151.40 | Raspberry Pi 5 |
| matrix-ubuntu | 100.85.21.51 | Atlantis VM |
| guava | 100.75.252.64 | TrueNAS Scale |
| jellyfish | 100.69.121.120 | Pi 5 media/NAS |
| gl-mt3000 | 100.126.243.15 | GL.iNet router (remote), SSH alias `gl-mt3000` |
| gl-be3600 | 100.105.59.123 | GL.iNet router (Concord), exit node |
| homeassistant | 100.112.186.90 | HA Green (via GL-MT3000 subnet) |
| seattle | 100.82.197.124 | Contabo VPS, exit node |
| shinku-ryuu | 100.98.93.15 | Desktop workstation (Windows) |
| moon | 100.64.0.6 | Debian x86_64, GL-MT3000 subnet (`192.168.12.223`) |
| headscale-test | 100.64.0.1 | Headscale test node |

### Notes

- **moon** was migrated from public Tailscale (`dvish92@`) to Headscale on 2026-03-14. It is on the `192.168.12.0/24` subnet behind the GL-MT3000 router. `accept_routes=true` is enabled so it can reach `192.168.0.0/24` (home LAN) via Calypso's subnet advertisement.
- **guava** has `accept_routes=false` to prevent Calypso's `192.168.0.0/24` route from overriding its own LAN replies. See `docs/troubleshooting/guava-smb-incident-2026-03-14.md`.
- **shinku-ryuu** also has `accept_routes=false` for the same reason.

---

**Last Updated:** March 2026

**Note:** The Feb 2026 alerts (homelab-node and vmi2076105-node offline) were resolved. Both nodes are now online.
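
Because this report is a snapshot, a quick way to see which node exporters are down right now is to run the alert rule's own expression against the Prometheus HTTP API. The following is a minimal sketch, not part of the existing tooling: the function names `parse_down_instances` and `list_down_instances` are hypothetical, the Prometheus address is the one listed under Monitoring Stack, and the `grep`-based parsing assumes the compact JSON that Prometheus normally returns (a real JSON parser would be more robust).

```bash
#!/usr/bin/env bash
# Sketch: list node-exporter targets that Prometheus currently reports as down.
# Function names are illustrative and do not exist in the repository.

# Prometheus address from the Monitoring Stack section; override via environment.
PROM_URL="${PROM_URL:-http://100.67.40.126:9090}"

# Same expression the HostDown alert rule uses.
QUERY='up{job=~".*-node"} == 0'

# Read an instant-query JSON response on stdin and print one instance per line.
# Assumption: the response is compact JSON, so each label looks like "instance":"host:port".
parse_down_instances() {
  { grep -o '"instance":"[^"]*"' || true; } | cut -d'"' -f4
}

# Query /api/v1/query; --data-urlencode escapes the braces and quotes in the expression.
list_down_instances() {
  curl -fsS --get "${PROM_URL}/api/v1/query" \
    --data-urlencode "query=${QUERY}" | parse_down_instances
}
```

After sourcing the file on a host that can reach Prometheus, `list_down_instances` prints one `ip:port` per down target and nothing when every target is up. Note that Prometheus itself runs on the homelab VM, so the query fails outright if that VM is the host that is down.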