Tailscale Host Monitoring Status Report
⚠️ Historical Snapshot: This document was generated on Feb 15, 2026. The alerts and offline status listed here are no longer current. For live node status, run tailscale status on the homelab VM or check Grafana at http://100.67.40.126:3000.
📊 Status Snapshot
Generated: February 15, 2026
Monitored Tailscale Hosts (13 total)
✅ Online Hosts (10)
- atlantis-node (100.83.230.112:9100) - Synology NAS
- atlantis-snmp (100.83.230.112) - SNMP monitoring
- calypso-node (100.103.48.78:9100) - Node exporter
- calypso-snmp (100.103.48.78) - SNMP monitoring
- concord-nuc-node (100.72.55.21:9100) - Intel NUC
- proxmox-node (100.87.12.28:9100) - Proxmox server
- raspberry-pis (100.77.151.40:9100) - Pi cluster node
- setillo-node (100.125.0.20:9100) - Node exporter
- setillo-snmp (100.125.0.20) - SNMP monitoring
- truenas-node (100.75.252.64:9100) - TrueNAS server
❌ Offline Hosts (3)
- homelab-node (100.67.40.126:9100) - Main homelab VM
- raspberry-pis (100.123.246.75:9100) - Pi cluster node
- vmi2076105-node (100.99.156.20:9100) - VPS instance
🚨 Active Alerts
Critical HostDown Alerts (2 firing)
- vmi2076105-node (100.99.156.20:9100)
  - Status: Firing since Feb 14, 07:57 UTC
  - Duration: ~24 hours
  - Notifications: Sent to ntfy + Signal
- homelab-node (100.67.40.126:9100)
  - Status: Firing since Feb 14, 09:23 UTC
  - Duration: ~22 hours
  - Notifications: Sent to ntfy + Signal
📬 Notification System Status
✅ Working Notification Channels
- ntfy: http://192.168.0.210:8081/homelab-alerts ✅
- Signal: Via signal-bridge (critical alerts) ✅
- Alertmanager: http://100.67.40.126:9093 ✅
Test Results
- ntfy notification test: PASSED ✅
- Message delivery: CONFIRMED ✅
- Alert routing: WORKING ✅
⚙️ Monitoring Configuration
Alert Rules
- Trigger: Host unreachable for 2+ minutes
- Severity: Critical (dual-channel notifications)
- Query: up{job=~".*-node"} == 0
- Evaluation: Every 30 seconds
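The timing implied by these rules can be sketched as a small simulation. This is not how Prometheus is implemented; it assumes the rule holds the alert in a pending state for 2 minutes (a `for: 2m`-style clause, which this report does not show) and that one `up` sample is seen at each 30-second evaluation. The function name `first_firing_time` is hypothetical.

```python
from typing import Optional, Sequence

EVAL_INTERVAL_S = 30   # "Evaluation: Every 30 seconds"
HOLD_S = 120           # assumed hold period for "unreachable for 2+ minutes"


def first_firing_time(up_samples: Sequence[int],
                      eval_interval_s: int = EVAL_INTERVAL_S,
                      hold_s: int = HOLD_S) -> Optional[int]:
    """Return the time in seconds (from the first evaluation) at which the
    alert would start firing, or None if it never does.

    up_samples[i] is the value of `up` seen at evaluation i
    (1 = reachable, 0 = unreachable).
    """
    pending_since: Optional[int] = None
    for i, up in enumerate(up_samples):
        now = i * eval_interval_s
        if up == 1:
            pending_since = None      # recovery resets the pending timer
            continue
        if pending_since is None:
            pending_since = now       # first failed evaluation
        if now - pending_since >= hold_s:
            return now                # down for the whole hold period
    return None
```

For example, `first_firing_time([1, 1, 0, 0, 0, 0, 0])` returns 180: the first failed evaluation is at 60 seconds and the hold period is satisfied four evaluations later. A brief recovery in between resets the timer, which is why short blips would not page anyone under these assumptions.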
Notification Routing
- Warning alerts → ntfy only
- Critical alerts → ntfy + Signal
- Resolved alerts → Both channels
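The routing table above can be written as a small lookup. This is only an illustration of the three rules as listed; the real routing lives in the Alertmanager configuration and the bridge services, which this report does not include. The function name `channels_for` and the behaviour for severities other than warning and critical are assumptions.

```python
from typing import List


def channels_for(severity: str, status: str = "firing") -> List[str]:
    """Return the notification channels for an alert, per the list above."""
    if status == "resolved":
        return ["ntfy", "signal"]     # resolved alerts go to both channels
    if severity == "critical":
        return ["ntfy", "signal"]     # dual-channel for critical alerts
    if severity == "warning":
        return ["ntfy"]               # warnings stay on ntfy only
    return []                         # assumption: other severities are not routed
```

Under this reading, a firing HostDown alert (critical) reaches both ntfy and Signal, which matches the notifications recorded for the two February alerts.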
🔧 Infrastructure Details
Monitoring Stack
- Prometheus: http://100.67.40.126:9090
- Grafana: http://100.67.40.126:3000
- Alertmanager: http://100.67.40.126:9093
- Bridge Services: ntfy-bridge (5001), signal-bridge (5000)
Data Collection
- Node Exporter: System metrics on port 9100
- SNMP Exporter: Network device metrics on port 9116
- Scrape Interval: 15 seconds
- Retention: Default Prometheus retention
📋 Recommendations
Immediate Actions
- Investigate offline hosts:
  - Check homelab-node (100.67.40.126) - main VM down
  - Verify vmi2076105-node (100.99.156.20) - VPS status
  - Check raspberry-pis node (100.123.246.75)
- Verify notifications:
  - Confirm you're receiving ntfy alerts on mobile
  - Test Signal notifications for critical alerts
Maintenance
- Monitor disk space on active hosts
- Review alert thresholds if needed
- Consider adding more monitoring targets
🧪 Testing
Use the test script to verify monitoring:
./scripts/test-tailscale-monitoring.sh
For manual testing:
- Stop node_exporter on any host: sudo systemctl stop node_exporter
- Wait 2+ minutes for alert to fire
- Check ntfy app and Signal for notifications
- Restart: sudo systemctl start node_exporter
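While waiting for the alert during a manual test, the alert rule's query can also be run directly against the Prometheus HTTP API (`/api/v1/query`) to confirm the stopped target is reported down. The sketch below is one possible way to do that, not part of the test script. The Prometheus URL and query are taken from this report; the function names are hypothetical, and the `job` and `instance` label values are assumed to look like the host list above.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, List

PROMETHEUS_URL = "http://100.67.40.126:9090"   # from the Monitoring Stack list
DOWN_QUERY = 'up{job=~".*-node"} == 0'         # the alert rule's query


def down_targets(payload: Dict[str, Any]) -> List[str]:
    """Extract "job (instance)" strings from an instant-query response."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    results = payload.get("data", {}).get("result", [])
    return sorted(
        f"{r['metric'].get('job', '?')} ({r['metric'].get('instance', '?')})"
        for r in results
    )


def fetch_down_targets(base_url: str = PROMETHEUS_URL,
                       timeout: float = 5.0) -> List[str]:
    """Run the query against Prometheus and return the down targets."""
    params = urllib.parse.urlencode({"query": DOWN_QUERY})
    url = f"{base_url}/api/v1/query?{params}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return down_targets(json.load(resp))
```

Calling `fetch_down_targets()` from a machine on the tailnet should list the stopped host shortly after the next scrape, well before the alert itself fires. An empty list after restarting node_exporter indicates the target has recovered.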
🟢 Verified Online Nodes (March 2026)
As of March 11, 2026, all 16 active nodes were verified reachable via ping:
| Node | Tailscale IP | Role |
|---|---|---|
| atlantis | 100.83.230.112 | Primary NAS, exit node |
| calypso | 100.103.48.78 | Secondary NAS, Headscale host |
| setillo | 100.125.0.20 | Remote NAS, Tucson |
| homelab | 100.67.40.126 | Main VM (this host) |
| pve | 100.87.12.28 | Proxmox hypervisor |
| vish-concord-nuc | 100.72.55.21 | Intel NUC, exit node |
| pi-5 | 100.77.151.40 | Raspberry Pi 5 |
| matrix-ubuntu | 100.85.21.51 | Atlantis VM |
| guava | 100.75.252.64 | TrueNAS Scale |
| jellyfish | 100.69.121.120 | Pi 5 media/NAS |
| gl-mt3000 | 100.126.243.15 | GL.iNet router (Concord) |
| gl-be3600 | 100.105.59.123 | GL.iNet router (Concord), exit node |
| homeassistant | 100.112.186.90 | HA Green (via GL-MT3000 subnet) |
| seattle | 100.82.197.124 | Contabo VPS, exit node |
| shinku-ryuu | 100.98.93.15 | Desktop workstation |
| headscale-test | 100.64.0.1 | Headscale test node |
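To refresh a table like the one above without pinging each node by hand, one option is to parse the output of `tailscale status --json`. The sketch below is a rough outline rather than the method used for the March check. The field names `Peer`, `HostName`, `TailscaleIPs` and `Online` are assumed from the CLI's JSON output and should be verified against the installed Tailscale version; the function names are hypothetical.

```python
import json
import subprocess
from typing import Any, Dict, List, Tuple


def online_peers(status: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return sorted (hostname, first Tailscale IP) pairs for online peers."""
    peers = status.get("Peer") or {}
    rows = []
    for peer in peers.values():
        if peer.get("Online"):
            ips = peer.get("TailscaleIPs") or [""]
            rows.append((peer.get("HostName", "?"), ips[0]))
    return sorted(rows)


def read_status() -> Dict[str, Any]:
    """Run the Tailscale CLI and return its status as a dictionary."""
    result = subprocess.run(
        ["tailscale", "status", "--json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

Running `online_peers(read_status())` on the homelab VM would list the peers it currently sees as online. The local node is reported separately from its peers in this output, so the homelab row would not appear in the result.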
Last Updated: March 2026
Note: The Feb 2026 alerts (homelab-node and vmi2076105-node offline) were resolved. Both nodes are now online.