# Seattle Machine Monitoring Update

## Summary

Updated the homelab Prometheus monitoring to scrape the reprovisioned Seattle machine (100.82.197.124) in place of the decommissioned VMI (100.99.156.20).

## Changes Made

### 1. Prometheus Configuration Update

**File**: `/home/homelab/docker/monitoring/prometheus/prometheus.yml`

**Before**:
```yaml
- job_name: "vmi2076105-node"
  static_configs:
    - targets: ["100.99.156.20:9100"]
```

**After**:
```yaml
- job_name: "seattle-node"
  static_configs:
    - targets: ["100.82.197.124:9100"]
```
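
A quick text check can confirm the swap before Prometheus is reloaded. The sketch below is not part of the recorded change: `check_target_swap` is a hypothetical helper name, and it only greps the file for the two addresses from the Before/After snippets rather than parsing the YAML.

```bash
# Hypothetical helper: verify that a Prometheus config references the new
# Seattle target and no longer references the decommissioned VMI target.
# This is a plain-text check; it does not validate YAML syntax.
check_target_swap() {
  local config_file old_target new_target
  config_file="$1"
  old_target="100.99.156.20:9100"
  new_target="100.82.197.124:9100"

  # Fail if the old address is still listed anywhere in the file.
  if grep -qF "$old_target" "$config_file"; then
    echo "stale target still present: $old_target"
    return 1
  fi
  # Fail if the new address was never added.
  if ! grep -qF "$new_target" "$config_file"; then
    echo "new target missing: $new_target"
    return 1
  fi
  echo "ok: $new_target present, $old_target absent"
}
```

It could be run as `check_target_swap /home/homelab/docker/monitoring/prometheus/prometheus.yml`. Where `promtool` is installed, `promtool check config <file>` additionally validates the syntax.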
### 2. Seattle Machine Configuration

#### Node Exporter Installation
- Node exporter was already running on the Seattle machine
- Service status: `active (running)` on port 9100
- Binary location: `/usr/local/bin/node_exporter`

#### Firewall Configuration
Added a UFW rule so that hosts on the Tailscale network can reach node_exporter:
```bash
sudo ufw allow from 100.64.0.0/10 to any port 9100 comment 'Allow Tailscale to node_exporter'
```
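
The rule only helps if the scraping host's address actually falls inside `100.64.0.0/10`. As a rough illustration of that membership test, the sketch below does the bit arithmetic in plain shell; `ip_to_int` and `ip_in_cidr` are hypothetical helper names and are not part of the deployed setup.

```bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The body runs in a subshell so the caller's IFS and arguments are untouched.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Return 0 if address $1 falls inside CIDR block $2 (e.g. 100.64.0.0/10).
ip_in_cidr() {
  local ip_int net_int prefix mask
  ip_int=$(ip_to_int "$1")
  net_int=$(ip_to_int "${2%/*}")
  prefix=${2#*/}
  # Build the netmask: the top $prefix bits set, the rest cleared.
  mask=$(( (0xFFFFFFFF << (32 - $prefix)) & 0xFFFFFFFF ))
  [ $(( $ip_int & $mask )) -eq $(( $net_int & $mask )) ]
}
```

For example, `ip_in_cidr 100.82.197.124 100.64.0.0/10 && echo "covered by the UFW rule"` checks the Seattle address against the range used above.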
#### SSH Access
- Accessible via `ssh seattle-tailscale` (configured in SSH config)
- Tailscale IP: 100.82.197.124
- Standard SSH key authentication

### 3. Monitoring Verification

#### Prometheus Targets Status
All monitoring targets are now healthy:
- **prometheus**: localhost:9090 ✅ UP
- **alertmanager**: alertmanager:9093 ✅ UP
- **node-exporter**: localhost:9100 ✅ UP
- **calypso-node**: 100.75.252.64:9100 ✅ UP
- **seattle-node**: 100.82.197.124:9100 ✅ UP
- **proxmox-node**: 100.87.12.28:9100 ✅ UP

#### Metrics Collection
- Seattle machine metrics are being successfully scraped
- CPU, memory, disk, and network metrics available
- Historical data collection started immediately after configuration
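
One way to spot-check the metric families listed above is to look for a representative series from each in the exporter's text output. The sketch below assumes the standard node_exporter metric names shown in the loop; `check_metric_families` is a hypothetical helper that reads the exposition text from stdin, and this document does not record it as having been run.

```bash
# Hypothetical helper: read node_exporter exposition text on stdin and report
# whether one representative metric per family (CPU, memory, disk, network)
# is present. Returns non-zero if any of them is missing.
check_metric_families() {
  local metrics name missing
  metrics=$(cat)
  missing=0
  for name in \
    node_cpu_seconds_total \
    node_memory_MemAvailable_bytes \
    node_filesystem_avail_bytes \
    node_network_receive_bytes_total
  do
    # Sample lines start with the metric name followed by "{" or a space;
    # "# HELP" and "# TYPE" comment lines are ignored by the anchor.
    if printf '%s\n' "$metrics" | grep -Eq "^${name}[{ ]"; then
      echo "found:   $name"
    else
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}
```

A typical invocation from the monitoring host would be `curl -s http://100.82.197.124:9100/metrics | check_metric_families`.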
## Technical Details

### Network Configuration
- **Tailscale Network**: 100.64.0.0/10
- **Seattle IP**: 100.82.197.124
- **Monitoring Port**: 9100 (node_exporter)
- **Protocol**: HTTP (internal network)

### Service Architecture
```
Prometheus (homelab) → Tailscale Network → Seattle Machine:9100 (node_exporter)
```

### Configuration Files Updated
1. `/home/homelab/docker/monitoring/prometheus/prometheus.yml` - Production config
2. `/home/homelab/organized/repos/homelab/prometheus/prometheus.yml` - Repository config

The update also fixed YAML indentation issues for the alertmanager targets.

## Verification Steps Completed

1. ✅ SSH connectivity to Seattle machine
2. ✅ Node exporter service running and accessible
3. ✅ Firewall rules configured for Tailscale access
4. ✅ Prometheus configuration updated and reloaded
5. ✅ Target health verification (UP status)
6. ✅ Metrics scraping confirmed
7. ✅ Repository configuration synchronized
8. ✅ Git commit with detailed change log

## Monitoring Capabilities

The Seattle machine now provides the following metrics:
- **System**: CPU usage, load average, uptime
- **Memory**: Total, available, used, cached
- **Disk**: Usage, I/O statistics, filesystem metrics
- **Network**: Interface statistics, traffic counters
- **Process**: Running processes, file descriptors

## Alert Coverage

The Seattle machine is now covered by all existing alert rules:
- **InstanceDown**: Triggers if node_exporter becomes unavailable
- **HighCPUUsage**: Alerts when CPU usage > 80% for 2+ minutes
- **HighMemoryUsage**: Alerts when memory usage > 90% for 2+ minutes
- **DiskSpaceLow**: Alerts when root filesystem < 10% free space
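
The rule expressions themselves are not shown here, so the following is only a sketch of the arithmetic the memory and disk thresholds imply. It assumes memory usage is derived from node_exporter's `node_memory_MemTotal_bytes` and `node_memory_MemAvailable_bytes`, and free disk space from `node_filesystem_avail_bytes` over `node_filesystem_size_bytes`. All function names are hypothetical, the math is integer-only (fractions are truncated), and the "for 2+ minutes" duration is not modelled.

```bash
# Percentage of memory in use: mem_used_pct TOTAL_BYTES AVAILABLE_BYTES
mem_used_pct() {
  echo $(( ($1 - $2) * 100 / $1 ))
}

# Percentage of filesystem space still free: disk_free_pct AVAIL_BYTES SIZE_BYTES
disk_free_pct() {
  echo $(( $1 * 100 / $2 ))
}

# Return 0 (condition met) when a single sample crosses the documented threshold.
memory_would_alert() { [ "$(mem_used_pct "$1" "$2")" -gt 90 ]; }
disk_would_alert()   { [ "$(disk_free_pct "$1" "$2")" -lt 10 ]; }
```

For instance, `memory_would_alert 8000000000 400000000` succeeds because only 5% of memory is available, which can help when deciding whether a threshold suits the Seattle machine.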
## Next Steps

1. **Monitor Performance**: Watch Seattle machine metrics for baseline establishment
2. **Alert Tuning**: Adjust thresholds if needed based on Seattle machine characteristics
3. **Documentation**: This update is documented in the homelab repository
4. **Backup Verification**: Ensure Seattle machine is included in backup monitoring

## Rollback Plan

If issues arise, the configuration can be quickly reverted:

```bash
# Revert Prometheus config (HEAD~1 assumes the monitoring update is the latest commit)
cd /home/homelab/docker/monitoring
git checkout HEAD~1 -- prometheus/prometheus.yml
# Restart Prometheus so it loads the reverted file
docker compose restart prometheus
```

## Contact Information

- **Updated By**: OpenHands Agent
- **Date**: February 15, 2026
- **Commit**: fee90008 - "Update monitoring: Replace VMI with Seattle machine"
- **Repository**: homelab.git

---

**Status**: ✅ COMPLETED SUCCESSFULLY
**Monitoring**: ✅ ACTIVE AND HEALTHY
**Documentation**: ✅ UPDATED