# 📊 Operational Status

*Current operational status of all homelab services and infrastructure*

## Infrastructure Overview

### Host Status

| Host | Status | Uptime | CPU | Memory | Storage |
|------|--------|--------|-----|--------|---------|
| **Atlantis** (DS1821+) | ✅ Online | 99.8% | 15% | 45% | 78% |
| **Calypso** (Custom NAS) | ✅ Online | 99.5% | 12% | 38% | 65% |
| **homelab_vm** (Main VM) | ✅ Online | 99.9% | 25% | 55% | 42% |
| **concord_nuc** (Intel NUC) | ✅ Online | 99.7% | 18% | 48% | 35% |
| **raspberry-pi-5-vish** | ✅ Online | 99.6% | 8% | 32% | 28% |

### Network Status

- **Internet Connectivity**: ✅ Stable (1Gbps/50Mbps)
- **Internal Network**: ✅ 10GbE backbone operational
- **VPN Access**: ✅ WireGuard and Tailscale active
- **DNS Resolution**: ✅ Pi-hole and AdGuard operational
- **SSL Certificates**: ✅ All certificates valid

## Service Categories

### Media & Entertainment

#### Streaming Services

- **Plex Media Server** - ✅ Active (concord_nuc)
  - Hardware transcoding: ✅ Intel Quick Sync enabled
  - Remote access: ✅ Direct connection available
  - Library size: 2.1TB movies, 850GB TV shows
  - Active streams: 2/4 concurrent

- **Jellyfin** - ✅ Active (Atlantis)
  - Alternative streaming platform
  - 4K HDR support enabled
  - Mobile apps configured

- **Navidrome** - ✅ Active (Calypso)
  - Music streaming: 45GB library
  - Subsonic API enabled
  - Mobile sync active

#### Media Management (Arr Suite)

- **Sonarr** - ✅ Active (Atlantis)
  - TV series monitoring: 127 series
  - Quality profiles: 1080p/4K configured
  - Indexers: 8 active

- **Radarr** - ✅ Active (Atlantis)
  - Movie monitoring: 342 movies
  - Quality profiles: 1080p/4K configured
  - Custom formats enabled

- **Lidarr** - ✅ Active (Calypso)
  - Music monitoring: 89 artists
  - Quality profiles: FLAC/MP3 configured
  - Metadata enhancement active

- **Prowlarr** - ✅ Active (Atlantis)
  - Indexer management: 12 indexers
  - API sync with all *arr services
  - Health checks passing

### Gaming Services

#### Game Servers

- **Minecraft Server** - ✅ Active (homelab_vm)
  - Version: 1.20.4 Paper
  - Players: 0/20 online
  - Plugins: 15 installed
  - Backup: Daily automated

- **Satisfactory Server** - ✅ Active (homelab_vm)
  - Version: Update 8
  - Players: 0/4 online
  - Save backup: Every 6 hours
  - Mods: Vanilla

- **Left 4 Dead 2 Server** - ⚠️ Maintenance (homelab_vm)
  - Status: Updating game files
  - Expected online: 2 hours
  - Custom campaigns installed

- **Garry's Mod PropHunt** - ✅ Active (homelab_vm)
  - Players: 0/16 online
  - Maps: 25 PropHunt maps
  - Addons: 12 workshop items

#### Game Management

- **PufferPanel** - ✅ Active (homelab_vm)
  - Managing: 4 game servers
  - Web interface: https://games.vish.gg
  - Automated backups enabled

### Development & DevOps

#### Version Control

- **Gitea** - ✅ Active (Calypso)
  - Repositories: 23 active
  - Users: 3 registered
  - CI/CD: Gitea Runner operational
  - OAuth: Authentik integration

#### Container Management

- **Portainer** - ✅ Active (All hosts)
  - Stacks: 81 total (79 running, 2 stopped intentionally)
  - Containers: 157+ total
  - GitOps: 80/81 stacks automated (100% of managed stacks; gitea excluded as bootstrap)
  - Health: 97.5% success rate

- **Watchtower** - ✅ Active (All hosts)
  - Auto-updates: Enabled
  - Schedule: Daily at 3 AM
  - Notifications: NTFY integration
  - Success rate: 98.2%

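
The Portainer stack figures above can be cross-checked with a little arithmetic. This is a minimal sketch, assuming that the 97.5% health figure is the share of running stacks (79 of 81) and that "managed" means every stack except the gitea bootstrap stack. The function name `stack_ratios` is hypothetical and not part of Portainer.

```python
def stack_ratios(total, running, automated, excluded):
    """Return (health %, GitOps coverage %, managed coverage %), each rounded to 1 decimal.

    Assumption: health is running/total, and "managed" stacks are total minus excluded.
    """
    managed = total - excluded
    health = round(100 * running / total, 1)
    coverage = round(100 * automated / total, 1)
    managed_coverage = round(100 * automated / managed, 1)
    return health, coverage, managed_coverage


# Figures taken from the Portainer entry: 81 stacks, 79 running,
# 80 automated via GitOps, 1 (gitea) excluded as bootstrap.
health, coverage, managed_coverage = stack_ratios(
    total=81, running=79, automated=80, excluded=1
)
print(health, coverage, managed_coverage)
```

Under these assumptions the 80/81 ratio is about 98.8% of all stacks and 100% of managed stacks, which matches the wording of the GitOps line.
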
#### Development Tools

- **OpenHands** - ✅ Active (homelab_vm)
  - AI development assistant
  - GPU acceleration: Available
  - Model: GPT-4 integration

- **Code Server** - ✅ Active (Calypso)
  - VS Code in browser
  - Extensions: 25 installed
  - Git integration: Active

### Infrastructure & Networking

#### Network Services

- **Nginx Proxy Manager** - ✅ Active (Calypso)
  - Proxy hosts: 45 configured
  - SSL certificates: 42 active
  - Access lists: 8 configured
  - Uptime: 99.9%

- **Pi-hole** - ✅ Active (concord_nuc)
  - Queries blocked: 23.4% (24h)
  - Blocklists: 15 active
  - Clients: 28 devices
  - Upstream DNS: Cloudflare

- **AdGuard Home** - ✅ Active (Calypso)
  - Secondary DNS filtering
  - Queries blocked: 21.8% (24h)
  - Parental controls: Enabled
  - Safe browsing: Active

#### VPN Services

- **WireGuard** - ✅ Active (Multiple hosts)
  - Peers: 8 configured
  - Traffic: 2.3GB (7 days)
  - Handshakes: All successful
  - Mobile clients: 4 active

- **Tailscale** - ✅ Active (All hosts)
  - Mesh network: 12 nodes
  - Exit nodes: 2 configured
  - MagicDNS: Enabled
  - Subnet routing: Active

### Monitoring & Observability

#### Metrics & Monitoring

- **Prometheus** - ✅ Active (homelab_vm)
  - Targets: 45 monitored
  - Metrics retention: 15 days
  - Storage: 2.1GB used
  - Scrape success: 99.1%

- **Grafana** - ✅ Active (homelab_vm)
  - Version: 12.4.0 (pinned, `grafana/grafana-oss:12.4.0`)
  - URL: `https://gf.vish.gg` (Authentik SSO) / `http://192.168.0.210:3300`
  - Dashboards: 4 (Infrastructure Overview, Node Details, Synology NAS, Node Exporter Full)
  - Default home: Node Details - Full Metrics (`node-details-v2`)
  - Auth: Authentik OAuth2 SSO + local admin account
  - Stack: `monitoring-stack` (GitOps, `hosts/vms/homelab-vm/monitoring.yaml`)

- **Alertmanager** - ✅ Active (homelab_vm)
  - Alert rules: 28 configured
  - Notifications: NTFY, Email
  - Silences: 2 active
  - Firing alerts: 0 current

#### Uptime Monitoring

- **Uptime Kuma** - ✅ Active (raspberry-pi-5-vish)
  - Monitors: 67 services
  - Uptime average: 99.4%
  - Notifications: NTFY integration
  - Status page: Public

### Security & Authentication

#### Identity Management

- **Authentik** - ✅ Active (Calypso)
  - Users: 5 registered
  - Applications: 12 integrated
  - OAuth providers: 3 configured
  - MFA: TOTP enabled

- **Vaultwarden** - ✅ Active (Calypso)
  - Vault items: 247 stored
  - Organizations: 2 configured
  - Emergency access: Configured
  - Backup: Daily encrypted

#### Security Tools

- **Fail2ban** - ✅ Active (All hosts)
  - Jails: 8 configured
  - Banned IPs: 23 (7 days)
  - SSH protection: Active
  - Log monitoring: Enabled

### Communication & Collaboration

#### Chat & Messaging

- **Matrix Synapse** - ✅ Active (homelab_vm)
  - Users: 4 registered
  - Rooms: 12 active
  - Federation: Enabled
  - E2E encryption: Active

- **Element Web** - ✅ Active (homelab_vm)
  - Matrix client interface
  - Voice/video calls: Enabled
  - File sharing: Active
  - Themes: Custom configured

- **NTFY** - ✅ Active (homelab_vm)
  - Topics: 15 configured
  - Messages: 1,247 (30 days)
  - Subscribers: 8 active
  - Delivery rate: 99.8%

### Productivity & Office

#### Document Management

- **Paperless-ngx** - ✅ Active (Calypso)
  - Documents: 1,456 stored
  - OCR processing: Active
  - Tags: 89 configured
  - Storage: 2.8GB used

- **Stirling PDF** - ✅ Active (homelab_vm)
  - PDF manipulation tools
  - Processing: 156 files (30 days)
  - Features: All modules active
  - Performance: Excellent

#### File Management

- **Syncthing** - ✅ Active (Multiple hosts)
  - Folders: 8 synchronized
  - Devices: 6 connected
  - Sync status: Up to date
  - Conflicts: 0 current

- **Seafile** - ✅ Active (Calypso)
  - Libraries: 5 configured
  - Users: 3 active
  - Storage: 45GB used
  - Sync clients: 4 active

## Performance Metrics

### Resource Utilization (24h Average)

- **CPU Usage**: 18.5% across all hosts
- **Memory Usage**: 42.3% across all hosts
- **Storage Usage**: 51.2% across all hosts
- **Network Traffic**: 2.1TB ingress, 850GB egress

### Service Response Times

- **Web Services**: 145ms average
- **API Endpoints**: 89ms average
- **Database Queries**: 23ms average
- **File Operations**: 67ms average

### Backup Status

- **Daily Backups**: ✅ 23/23 successful
- **Weekly Backups**: ✅ 8/8 successful
- **Monthly Backups**: ✅ 3/3 successful
- **Offsite Backups**: ✅ Cloud sync active

## Recent Changes

### Last 7 Days

- **2026-03-08**: Fixed Grafana default home dashboard (set to `node-details-v2` via org preferences API)
- **2026-03-08**: Pinned Grafana image to `12.4.0`, disabled `kubernetesDashboards` feature toggle
- **2026-03-08**: Completed full GitOps migration; all 81 stacks are now on canonical `hosts/` paths
- **2026-03-08**: SABnzbd disk-full recovery on Atlantis: freed 185GB, resumed downloads
- **2026-03-08**: Added immich-stack to Calypso

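
The Grafana home-dashboard fix above is described only as a change made "via org preferences API". The sketch below shows one way such a request could be built, assuming Grafana's `PUT /api/org/preferences` endpoint with a `homeDashboardUID` field, a service-account token, and that `node-details-v2` is the dashboard UID. The helper name `build_home_dashboard_request` and the placeholder token are hypothetical; the exact call used for the fix is not recorded here.

```python
import json
import urllib.request


def build_home_dashboard_request(base_url, token, dashboard_uid):
    """Build (but do not send) a request that sets the org home dashboard.

    Assumption: the org preferences endpoint accepts a JSON body with
    `homeDashboardUID`. A PUT may reset other preference fields (theme,
    timezone) that are omitted, so include them if they must be kept.
    """
    body = json.dumps({"homeDashboardUID": dashboard_uid}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/org/preferences",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Base URL and UID come from the Grafana entry in this document;
# the token is a placeholder, not a real credential.
req = build_home_dashboard_request(
    "http://192.168.0.210:3300", "PLACEHOLDER_TOKEN", "node-details-v2"
)
print(req.get_method(), req.full_url)
```

The request is only constructed here; passing `req` to `urllib.request.urlopen` would actually send it, which should be done only with a valid token.
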
### Planned Maintenance

- Monitor Grafana `node-details-v2` and `Node Exporter Full` dashboards for export/backup into `monitoring.yaml`

## Alert Summary

### Active Alerts

- **None** - All systems operational

### Recent Alerts (Resolved)

- **2024-02-23 14:32**: High memory usage on homelab_vm (resolved)
- **2024-02-22 09:15**: SSL certificate near expiry (renewed)
- **2024-02-21 22:45**: Backup job delayed (completed)

### Alert Trends

- **Critical alerts**: 0 (7 days)
- **Warning alerts**: 3 (7 days)
- **Info alerts**: 12 (7 days)
- **MTTR**: 15 minutes average

## Capacity Planning

### Storage Growth

- **Current usage**: 51.2% (15.8TB used / 30.9TB total)
- **Monthly growth**: 2.3% average
- **Projected full**: 18 months
- **Next expansion**: Q4 2024

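
The storage projection above can be explored with a small calculation. This is a minimal sketch, assuming linear growth where the 2.3% monthly figure means percentage points of total capacity; the function name `months_until` and the 90% threshold are hypothetical choices for illustration, not the method used to derive the 18-month figure.

```python
def months_until(threshold_percent, current_percent, monthly_growth_points):
    """Months until usage reaches threshold_percent, assuming linear growth.

    Assumption: monthly_growth_points is in percentage points of total
    capacity per month (not compound growth of the used space).
    """
    if monthly_growth_points <= 0:
        raise ValueError("monthly growth must be positive")
    remaining = threshold_percent - current_percent
    return round(max(remaining, 0.0) / monthly_growth_points, 1)


# Figures from the Storage Growth list: 51.2% used, 2.3% monthly growth.
months_to_full = months_until(100.0, 51.2, 2.3)
months_to_90 = months_until(90.0, 51.2, 2.3)
print(months_to_full, months_to_90)
```

Under this linear assumption the result is roughly 21 months to 100% and roughly 17 months to a 90% threshold, so the 18-month projection presumably rests on a different growth model or fill threshold.
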
### Compute Resources

- **CPU headroom**: 81.5% available
- **Memory headroom**: 57.7% available
- **Network utilization**: 12% peak
- **Scaling needed**: None immediate

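
The CPU and memory headroom figures above appear to be the complement of the 24h average utilization (18.5% CPU, 42.3% memory). A minimal sketch of that relationship, with the hypothetical helper name `headroom`:

```python
def headroom(usage_percent):
    """Return available capacity in percent, rounded to 1 decimal.

    Assumption: headroom is simply 100% minus the average utilization.
    """
    return round(100.0 - usage_percent, 1)


# Averages taken from the Resource Utilization (24h Average) list.
cpu_headroom = headroom(18.5)
memory_headroom = headroom(42.3)
print(cpu_headroom, memory_headroom)
```

Because the inputs are 24h averages, headroom computed this way says nothing about peak load.
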
### Service Scaling

- **Container density**: 156 containers across 5 hosts
- **Resource efficiency**: 89% optimal
- **Bottlenecks**: None identified
- **Optimization opportunities**: 3 identified

---

**Last Updated**: 2026-03-08 | **Next Review**: As needed