Tailscale Host Monitoring Status Report
⚠️ Historical Snapshot: This document was generated on Feb 15, 2026. The alerts and offline status listed here are no longer current. For live node status, run `tailscale status` on the homelab VM or check Grafana at http://100.67.40.126:3000.
📊 Status Snapshot
Generated: February 15, 2026
Monitored Tailscale Hosts (13 total)
✅ Online Hosts (10)
- atlantis-node (100.83.230.112:9100) - Synology NAS
- atlantis-snmp (100.83.230.112) - SNMP monitoring
- calypso-node (100.103.48.78:9100) - Node exporter
- calypso-snmp (100.103.48.78) - SNMP monitoring
- concord-nuc-node (100.72.55.21:9100) - Intel NUC
- proxmox-node (100.87.12.28:9100) - Proxmox server
- raspberry-pis (100.77.151.40:9100) - Pi cluster node
- setillo-node (100.125.0.20:9100) - Node exporter
- setillo-snmp (100.125.0.20) - SNMP monitoring
- truenas-node (100.75.252.64:9100) - TrueNAS server
❌ Offline Hosts (3)
- homelab-node (100.67.40.126:9100) - Main homelab VM
- raspberry-pis (100.123.246.75:9100) - Pi cluster node
- vmi2076105-node (100.99.156.20:9100) - VPS instance
🚨 Active Alerts
Critical HostDown Alerts (2 firing)
- vmi2076105-node (100.99.156.20:9100)
  - Status: Firing since Feb 14, 07:57 UTC
  - Duration: ~24 hours
  - Notifications: Sent to ntfy + Signal
- homelab-node (100.67.40.126:9100)
  - Status: Firing since Feb 14, 09:23 UTC
  - Duration: ~22 hours
  - Notifications: Sent to ntfy + Signal
📬 Notification System Status
✅ Working Notification Channels
- ntfy: http://192.168.0.210:8081/homelab-alerts ✅
- Signal: Via signal-bridge (critical alerts) ✅
- Alertmanager: http://100.67.40.126:9093 ✅
Test Results
- ntfy notification test: PASSED ✅
- Message delivery: CONFIRMED ✅
- Alert routing: WORKING ✅
⚙️ Monitoring Configuration
Alert Rules
- Trigger: Host unreachable for 2+ minutes
- Severity: Critical (dual-channel notifications)
- Query: `up{job=~".*-node"} == 0`
- Evaluation: Every 30 seconds
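The 2-minute trigger and the 30-second evaluation interval interact in a way that is easy to misjudge. The sketch below models it, assuming the rule behaves like a typical Prometheus `for:` clause: the alert is pending from the first failing evaluation and fires once the condition has held for the full hold duration. `AlertTracker` and `states_for_outage` are hypothetical names used only for illustration; they are not part of Prometheus or of this setup.

```python
from dataclasses import dataclass
from typing import List, Optional

HOLD_SECONDS = 120   # "Host unreachable for 2+ minutes"
EVAL_INTERVAL = 30   # "Evaluation: Every 30 seconds"


@dataclass
class AlertTracker:
    """Hypothetical model of one alert's inactive/pending/firing state."""
    active_since: Optional[float] = None
    state: str = "inactive"

    def evaluate(self, now: float, up: int) -> str:
        """Apply one rule evaluation at time `now` (seconds) for an `up` sample."""
        if up != 0:
            # Target answered its scrape: the condition no longer holds.
            self.active_since = None
            self.state = "inactive"
        else:
            if self.active_since is None:
                self.active_since = now
            held = now - self.active_since
            self.state = "firing" if held >= HOLD_SECONDS else "pending"
        return self.state


def states_for_outage(evaluations: int) -> List[str]:
    """States seen over consecutive failing evaluations, starting at t=0."""
    tracker = AlertTracker()
    return [tracker.evaluate(i * EVAL_INTERVAL, up=0) for i in range(evaluations)]
```

Under these assumptions a host that stays down is pending for four evaluations and fires on the fifth, 120 seconds after the first failing evaluation. A single successful scrape in between resets the timer.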
Notification Routing
- Warning alerts → ntfy only
- Critical alerts → ntfy + Signal
- Resolved alerts → Both channels
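The routing table above can be expressed as a small lookup. This is only an illustrative sketch: the function name `channels_for` is hypothetical, the real routing lives in the Alertmanager configuration, and returning no channels for an unknown severity is an assumption, since the report does not say how such alerts are handled.

```python
from typing import List

# Severity -> channels, as listed in the routing table.
ROUTES = {
    "warning": ["ntfy"],
    "critical": ["ntfy", "signal"],
}


def channels_for(severity: str, resolved: bool = False) -> List[str]:
    """Return the channels an alert would be sent to (hypothetical helper)."""
    if resolved:
        # "Resolved alerts -> Both channels", read here as applying to any severity.
        return ["ntfy", "signal"]
    # Assumption: severities not listed in the report are not routed anywhere.
    return list(ROUTES.get(severity, []))
```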
🔧 Infrastructure Details
Monitoring Stack
- Prometheus: http://100.67.40.126:9090
- Grafana: http://100.67.40.126:3000
- Alertmanager: http://100.67.40.126:9093
- Bridge Services: ntfy-bridge (5001), signal-bridge (5000)
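For a quick check outside Grafana, the Prometheus endpoint listed above can be queried through its HTTP API. The following is a minimal sketch, assuming the instance is reachable from where the script runs and that targets carry the usual `job` and `instance` labels. The helper names `down_targets` and `fetch_down_targets` are made up for this example.

```python
import json
import urllib.parse
import urllib.request
from typing import List, Tuple

PROMETHEUS_URL = "http://100.67.40.126:9090"   # from the Monitoring Stack list
QUERY = 'up{job=~".*-node"} == 0'              # the alert rule's expression


def down_targets(payload: dict) -> List[Tuple[str, str]]:
    """Extract (job, instance) pairs from an instant-query response body."""
    if payload.get("status") != "success":
        raise ValueError("Prometheus query failed: %r" % payload.get("error"))
    pairs = [
        (s["metric"].get("job", ""), s["metric"].get("instance", ""))
        for s in payload["data"]["result"]
    ]
    return sorted(pairs)


def fetch_down_targets(base_url: str = PROMETHEUS_URL,
                       timeout: float = 5.0) -> List[Tuple[str, str]]:
    """Run the query against /api/v1/query and return the down targets."""
    url = base_url + "/api/v1/query?" + urllib.parse.urlencode({"query": QUERY})
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return down_targets(json.load(resp))
```

Calling `fetch_down_targets()` returns an empty list when every matching target is up. Parsing is kept separate from the network call so it can be exercised against a saved response.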
Data Collection
- Node Exporter: System metrics on port 9100
- SNMP Exporter: Network device metrics on port 9116
- Scrape Interval: 15 seconds
- Retention: Prometheus default (15 days unless overridden)
📋 Recommendations
Immediate Actions
- Investigate offline hosts:
  - Check homelab-node (100.67.40.126) - main VM down
  - Verify vmi2076105-node (100.99.156.20) - VPS status
  - Check raspberry-pis node (100.123.246.75)
- Verify notifications:
  - Confirm you're receiving ntfy alerts on mobile
  - Test Signal notifications for critical alerts
Maintenance
- Monitor disk space on active hosts
- Review alert thresholds if needed
- Consider adding more monitoring targets
🧪 Testing
Use the test script to verify monitoring:
./scripts/test-tailscale-monitoring.sh
For manual testing:
- Stop node_exporter on any host: `sudo systemctl stop node_exporter`
- Wait 2+ minutes for alert to fire
- Check ntfy app and Signal for notifications
- Restart: `sudo systemctl start node_exporter`
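During the manual test it can help to confirm that the alert reached Alertmanager before looking at the phone. The sketch below filters a response from Alertmanager's `/api/v2/alerts` endpoint. It assumes the alert is named `HostDown` (as in the Active Alerts section) and that its `instance` label matches the scrape target address. Both helper names are hypothetical.

```python
import json
import urllib.request
from typing import List

ALERTMANAGER_URL = "http://100.67.40.126:9093"   # from the Monitoring Stack list


def hostdown_is_active(alerts: List[dict], instance: str) -> bool:
    """True if an active HostDown alert exists for the given instance label."""
    for alert in alerts:
        labels = alert.get("labels", {})
        state = alert.get("status", {}).get("state")
        if (labels.get("alertname") == "HostDown"      # assumed alert name
                and labels.get("instance") == instance  # assumed label value
                and state == "active"):
            return True
    return False


def fetch_alerts(base_url: str = ALERTMANAGER_URL, timeout: float = 5.0) -> List[dict]:
    """Fetch the current alert list from Alertmanager's v2 API."""
    with urllib.request.urlopen(base_url + "/api/v2/alerts", timeout=timeout) as resp:
        return json.load(resp)
```

For example, after stopping node_exporter on the VPS, `hostdown_is_active(fetch_alerts(), "100.99.156.20:9100")` should become true once the 2-minute hold has elapsed, provided the label assumptions hold.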
🟢 Verified Online Nodes (March 2026)
As of March 11, 2026, all 16 active nodes verified reachable via ping:
| Node | Tailscale IP | Role |
|---|---|---|
| atlantis | 100.83.230.112 | Primary NAS, exit node |
| calypso | 100.103.48.78 | Secondary NAS, Headscale host |
| setillo | 100.125.0.20 | Remote NAS, Tucson |
| homelab | 100.67.40.126 | Main VM (this host) |
| pve | 100.87.12.28 | Proxmox hypervisor |
| vish-concord-nuc | 100.72.55.21 | Intel NUC, exit node |
| pi-5 | 100.77.151.40 | Raspberry Pi 5 |
| matrix-ubuntu | 100.85.21.51 | Atlantis VM |
| guava | 100.75.252.64 | TrueNAS Scale |
| jellyfish | 100.69.121.120 | Pi 5 media/NAS |
| gl-mt3000 | 100.126.243.15 | GL.iNet router (remote), SSH alias gl-mt3000 |
| gl-be3600 | 100.105.59.123 | GL.iNet router (Concord), exit node |
| homeassistant | 100.112.186.90 | HA Green (via GL-MT3000 subnet) |
| seattle | 100.82.197.124 | Contabo VPS, exit node |
| shinku-ryuu | 100.98.93.15 | Desktop workstation (Windows) |
| moon | 100.64.0.6 | Debian x86_64, GL-MT3000 subnet (192.168.12.223) |
| headscale-test | 100.64.0.1 | Headscale test node |
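The ping verification behind this table could be repeated with a short script. This is a sketch only: it assumes Linux iputils `ping` syntax (`-c` for count, `-W` for the reply timeout in seconds), and the names `ping_command` and `check_nodes` are invented for the example. The `run` parameter lets the command runner be swapped out, which keeps the logic testable without network access.

```python
import subprocess
from typing import Callable, Dict, List, Optional


def ping_command(ip: str) -> List[str]:
    """Build a single-probe ping command (assumes Linux iputils flags)."""
    return ["ping", "-c", "1", "-W", "2", ip]


def _run_quietly(cmd: List[str]) -> int:
    """Run a command, discard its output, and return the exit code."""
    return subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    ).returncode


def check_nodes(nodes: Dict[str, str],
                run: Optional[Callable[[List[str]], int]] = None) -> Dict[str, bool]:
    """Map each node name to True if one ping probe got a reply."""
    runner = run if run is not None else _run_quietly
    return {name: runner(ping_command(ip)) == 0 for name, ip in nodes.items()}
```

A call such as `check_nodes({"atlantis": "100.83.230.112", "guava": "100.75.252.64"})` returns a dictionary of reachability flags keyed by node name.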
Notes
- moon was migrated from public Tailscale (`dvish92@`) to Headscale on 2026-03-14. It is on the `192.168.12.0/24` subnet behind the GL-MT3000 router. `accept_routes=true` is enabled so it can reach `192.168.0.0/24` (home LAN) via Calypso's subnet advertisement.
- guava has `accept_routes=false` to prevent Calypso's `192.168.0.0/24` route from overriding its own LAN replies. See `docs/troubleshooting/guava-smb-incident-2026-03-14.md`.
- shinku-ryuu also has `accept_routes=false` for the same reason.
Last Updated: March 2026
Note: The Feb 2026 alerts (homelab-node and vmi2076105-node offline) were resolved. Both nodes are now online.