Sanitized mirror from private repository - 2026-04-05 10:50:43 UTC
333
docs/OPERATIONAL_STATUS.md
Normal file
@@ -0,0 +1,333 @@
# 📊 Operational Status

*Current operational status of all homelab services and infrastructure*

## Infrastructure Overview

### Host Status

| Host | Status | Uptime | CPU | Memory | Storage |
|------|--------|--------|-----|--------|---------|
| **Atlantis** (DS1821+) | ✅ Online | 99.8% | 15% | 45% | 78% |
| **Calypso** (Custom NAS) | ✅ Online | 99.5% | 12% | 38% | 65% |
| **homelab_vm** (Main VM) | ✅ Online | 99.9% | 25% | 55% | 42% |
| **concord_nuc** (Intel NUC) | ✅ Online | 99.7% | 18% | 48% | 35% |
| **raspberry-pi-5-vish** | ✅ Online | 99.6% | 8% | 32% | 28% |

### Network Status

- **Internet Connectivity**: ✅ Stable (1Gbps/50Mbps)
- **Internal Network**: ✅ 10GbE backbone operational
- **VPN Access**: ✅ WireGuard and Tailscale active
- **DNS Resolution**: ✅ Pi-hole and AdGuard operational
- **SSL Certificates**: ✅ All certificates valid

## Service Categories

### Media & Entertainment

#### Streaming Services

- **Plex Media Server** - ✅ Active (concord_nuc)
  - Hardware transcoding: ✅ Intel Quick Sync enabled
  - Remote access: ✅ Direct connection available
  - Library size: 2.1TB movies, 850GB TV shows
  - Active streams: 2/4 concurrent

- **Jellyfin** - ✅ Active (Atlantis)
  - Alternative streaming platform
  - 4K HDR support enabled
  - Mobile apps configured

- **Navidrome** - ✅ Active (Calypso)
  - Music streaming: 45GB library
  - Subsonic API enabled
  - Mobile sync active

#### Media Management (Arr Suite)

- **Sonarr** - ✅ Active (Atlantis)
  - TV series monitoring: 127 series
  - Quality profiles: 1080p/4K configured
  - Indexers: 8 active

- **Radarr** - ✅ Active (Atlantis)
  - Movie monitoring: 342 movies
  - Quality profiles: 1080p/4K configured
  - Custom formats enabled

- **Lidarr** - ✅ Active (Calypso)
  - Music monitoring: 89 artists
  - Quality profiles: FLAC/MP3 configured
  - Metadata enhancement active

- **Prowlarr** - ✅ Active (Atlantis)
  - Indexer management: 12 indexers
  - API sync with all \*arr services
  - Health checks passing

### Gaming Services

#### Game Servers

- **Minecraft Server** - ✅ Active (homelab_vm)
  - Version: 1.20.4 Paper
  - Players: 0/20 online
  - Plugins: 15 installed
  - Backup: Daily automated

- **Satisfactory Server** - ✅ Active (homelab_vm)
  - Version: Update 8
  - Players: 0/4 online
  - Save backup: Every 6 hours
  - Mods: Vanilla

- **Left 4 Dead 2 Server** - ⚠️ Maintenance (homelab_vm)
  - Status: Updating game files
  - Expected online: 2 hours
  - Custom campaigns installed

- **Garry's Mod PropHunt** - ✅ Active (homelab_vm)
  - Players: 0/16 online
  - Maps: 25 PropHunt maps
  - Addons: 12 workshop items

#### Game Management

- **PufferPanel** - ✅ Active (homelab_vm)
  - Managing: 4 game servers
  - Web interface: https://games.vish.gg
  - Automated backups enabled

### Development & DevOps

#### Version Control

- **Gitea** - ✅ Active (Calypso)
  - Repositories: 23 active
  - Users: 3 registered
  - CI/CD: Gitea Runner operational
  - OAuth: Authentik integration

#### Container Management

- **Portainer** - ✅ Active (All hosts)
  - Stacks: 81 total (79 running, 2 stopped intentionally)
  - Containers: 157+ total
  - GitOps: 80/81 stacks automated (100% of managed stacks; gitea excluded as bootstrap)
  - Health: 97.5% success rate

- **Watchtower** - ✅ Active (All hosts)
  - Auto-updates: Enabled
  - Schedule: Daily at 3 AM
  - Notifications: NTFY integration
  - Success rate: 98.2%
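
The Portainer stack counts above (81 total, 79 running, 80 Git-managed) could be recomputed from a stack listing. What follows is a minimal sketch, not the tooling used in this homelab. It assumes each stack record is a dict with a `Status` field where `1` means running and a `GitConfig` field that is set for Git-backed stacks, which is how the Portainer `GET /api/stacks` response is commonly shaped; check the field names against your Portainer version. The function `summarize_stacks` and the sample records are hypothetical.

```python
ACTIVE = 1  # assumed Portainer status code for a running stack


def summarize_stacks(stacks):
    """Count total, running, stopped, and Git-backed stacks in a listing."""
    stacks = list(stacks)
    total = len(stacks)
    running = sum(1 for s in stacks if s.get("Status") == ACTIVE)
    gitops = sum(1 for s in stacks if s.get("GitConfig"))
    return {
        "total": total,
        "running": running,
        "stopped": total - running,
        "gitops": gitops,
        "gitops_pct": round(100 * gitops / total, 1) if total else 0.0,
    }


# Hypothetical records that mirror the figures quoted above:
# 78 running Git-backed stacks, 2 stopped Git-backed stacks,
# and one running stack (gitea) that is not Git-managed.
git = {"URL": "https://example.invalid/homelab.git"}
sample = (
    [{"Name": f"stack-{i}", "Status": 1, "GitConfig": git} for i in range(78)]
    + [{"Name": f"stopped-{i}", "Status": 2, "GitConfig": git} for i in range(2)]
    + [{"Name": "gitea", "Status": 1, "GitConfig": None}]
)

print(summarize_stacks(sample))
```

Keeping the summary a pure function over plain dicts means it can be fed from a saved JSON export as easily as from a live API call.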

#### Development Tools

- **OpenHands** - ✅ Active (homelab_vm)
  - AI development assistant
  - GPU acceleration: Available
  - Model: GPT-4 integration

- **Code Server** - ✅ Active (Calypso)
  - VS Code in browser
  - Extensions: 25 installed
  - Git integration: Active

### Infrastructure & Networking

#### Network Services

- **Nginx Proxy Manager** - ✅ Active (Calypso)
  - Proxy hosts: 45 configured
  - SSL certificates: 42 active
  - Access lists: 8 configured
  - Uptime: 99.9%

- **Pi-hole** - ✅ Active (concord_nuc)
  - Queries blocked: 23.4% (24h)
  - Blocklists: 15 active
  - Clients: 28 devices
  - Upstream DNS: Cloudflare

- **AdGuard Home** - ✅ Active (Calypso)
  - Secondary DNS filtering
  - Queries blocked: 21.8% (24h)
  - Parental controls: Enabled
  - Safe browsing: Active

#### VPN Services

- **WireGuard** - ✅ Active (Multiple hosts)
  - Peers: 8 configured
  - Traffic: 2.3GB (7 days)
  - Handshakes: All successful
  - Mobile clients: 4 active

- **Tailscale** - ✅ Active (All hosts)
  - Mesh network: 12 nodes
  - Exit nodes: 2 configured
  - MagicDNS: Enabled
  - Subnet routing: Active

### Monitoring & Observability

#### Metrics & Monitoring

- **Prometheus** - ✅ Active (homelab_vm)
  - Targets: 45 monitored
  - Metrics retention: 15 days
  - Storage: 2.1GB used
  - Scrape success: 99.1%

- **Grafana** - ✅ Active (homelab_vm)
  - Version: 12.4.0 (pinned, `grafana/grafana-oss:12.4.0`)
  - URL: `https://gf.vish.gg` (Authentik SSO) / `http://192.168.0.210:3300`
  - Dashboards: 4 (Infrastructure Overview, Node Details, Synology NAS, Node Exporter Full)
  - Default home: Node Details - Full Metrics (`node-details-v2`)
  - Auth: Authentik OAuth2 SSO + local admin account
  - Stack: `monitoring-stack` (GitOps, `hosts/vms/homelab-vm/monitoring.yaml`)

- **Alertmanager** - ✅ Active (homelab_vm)
  - Alert rules: 28 configured
  - Notifications: NTFY, Email
  - Silences: 2 active
  - Firing alerts: 0 current
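
A scrape-success percentage like the Prometheus figure above can be derived from the built-in `up` metric through the Prometheus HTTP query API (`/api/v1/query`). The sketch below shows one plausible definition, the 24-hour average of `up` across all targets; the status page does not say how its number is actually computed. The base URL `http://prometheus.example:9090` and both function names are hypothetical.

```python
from urllib.parse import urlencode

# One possible definition of "scrape success": mean of `up` over 24h, as a percentage.
SCRAPE_SUCCESS_QUERY = "avg(avg_over_time(up[24h])) * 100"


def build_query_url(base_url, query):
    """Return the instant-query URL for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': query})}"


def first_value(payload):
    """Extract the first sample value from a decoded instant-query response."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    result = payload["data"]["result"]
    if not result:
        raise ValueError("query returned no samples")
    # Each vector sample is [timestamp, "value-as-string"].
    return float(result[0]["value"][1])


url = build_query_url("http://prometheus.example:9090", SCRAPE_SUCCESS_QUERY)

# Canned response in the documented instant-query shape (values are made up).
canned = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [{"metric": {}, "value": [1700000000, "99.1"]}],
    },
}
print(url)
print(first_value(canned))
```

Fetching `url` with any HTTP client and passing the decoded JSON to `first_value` would complete the loop; the canned payload only stands in for that response.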

#### Uptime Monitoring

- **Uptime Kuma** - ✅ Active (raspberry-pi-5-vish)
  - Monitors: 67 services
  - Uptime average: 99.4%
  - Notifications: NTFY integration
  - Status page: Public

### Security & Authentication

#### Identity Management

- **Authentik** - ✅ Active (Calypso)
  - Users: 5 registered
  - Applications: 12 integrated
  - OAuth providers: 3 configured
  - MFA: TOTP enabled

- **Vaultwarden** - ✅ Active (Calypso)
  - Vault items: 247 stored
  - Organizations: 2 configured
  - Emergency access: Configured
  - Backup: Daily encrypted

#### Security Tools

- **Fail2ban** - ✅ Active (All hosts)
  - Jails: 8 configured
  - Banned IPs: 23 (7 days)
  - SSH protection: Active
  - Log monitoring: Enabled

### Communication & Collaboration

#### Chat & Messaging

- **Matrix Synapse** - ✅ Active (homelab_vm)
  - Users: 4 registered
  - Rooms: 12 active
  - Federation: Enabled
  - E2E encryption: Active

- **Element Web** - ✅ Active (homelab_vm)
  - Matrix client interface
  - Voice/video calls: Enabled
  - File sharing: Active
  - Themes: Custom configured

- **NTFY** - ✅ Active (homelab_vm)
  - Topics: 15 configured
  - Messages: 1,247 (30 days)
  - Subscribers: 8 active
  - Delivery rate: 99.8%

### Productivity & Office

#### Document Management

- **Paperless-ngx** - ✅ Active (Calypso)
  - Documents: 1,456 stored
  - OCR processing: Active
  - Tags: 89 configured
  - Storage: 2.8GB used

- **Stirling PDF** - ✅ Active (homelab_vm)
  - PDF manipulation tools
  - Processing: 156 files (30 days)
  - Features: All modules active
  - Performance: Excellent

#### File Management

- **Syncthing** - ✅ Active (Multiple hosts)
  - Folders: 8 synchronized
  - Devices: 6 connected
  - Sync status: Up to date
  - Conflicts: 0 current

- **Seafile** - ✅ Active (Calypso)
  - Libraries: 5 configured
  - Users: 3 active
  - Storage: 45GB used
  - Sync clients: 4 active

## Performance Metrics

### Resource Utilization (24h Average)

- **CPU Usage**: 18.5% across all hosts
- **Memory Usage**: 42.3% across all hosts
- **Storage Usage**: 51.2% across all hosts
- **Network Traffic**: 2.1TB ingress, 850GB egress

### Service Response Times

- **Web Services**: 145ms average
- **API Endpoints**: 89ms average
- **Database Queries**: 23ms average
- **File Operations**: 67ms average

### Backup Status

- **Daily Backups**: ✅ 23/23 successful
- **Weekly Backups**: ✅ 8/8 successful
- **Monthly Backups**: ✅ 3/3 successful
- **Offsite Backups**: ✅ Cloud sync active

## Recent Changes

### Last 7 Days

- **2026-03-08**: Fixed Grafana default home dashboard (set to `node-details-v2` via org preferences API)
- **2026-03-08**: Pinned Grafana image to `12.4.0`, disabled `kubernetesDashboards` feature toggle
- **2026-03-08**: Completed full GitOps migration; all 81 stacks now on canonical `hosts/` paths
- **2026-03-08**: SABnzbd disk-full recovery on Atlantis: freed 185GB, resumed downloads
- **2026-03-08**: Added immich-stack to Calypso
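
The Grafana home-dashboard fix above went through the org preferences API. As a rough illustration of what such a call could look like (not a record of the exact command that was run), the sketch below builds a `PUT /api/org/preferences` request carrying `homeDashboardUID`. The function name and the token placeholder are hypothetical; the host and dashboard UID are the ones listed in this document.

```python
import json
from urllib.request import Request


def build_home_dashboard_request(base_url, token, dashboard_uid):
    """Build (but do not send) a request that sets the org home dashboard."""
    body = json.dumps({"homeDashboardUID": dashboard_uid}).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/api/org/preferences",
        data=body,
        method="PUT",
        headers={
            # Assumes a service account token with org admin rights.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = build_home_dashboard_request(
    "https://gf.vish.gg", "REPLACE_WITH_TOKEN", "node-details-v2"
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would send it. Because this is a `PUT`, preference fields omitted from the body (theme, timezone) may fall back to their defaults, so it is worth reading the current preferences first.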

### Planned Maintenance

- Monitor Grafana `node-details-v2` and `Node Exporter Full` dashboards for export/backup into `monitoring.yaml`

## Alert Summary

### Active Alerts

- **None** - All systems operational

### Recent Alerts (Resolved)

- **2024-02-23 14:32**: High memory usage on homelab_vm (resolved)
- **2024-02-22 09:15**: SSL certificate near expiry (renewed)
- **2024-02-21 22:45**: Backup job delayed (completed)

### Alert Trends

- **Critical alerts**: 0 (7 days)
- **Warning alerts**: 3 (7 days)
- **Info alerts**: 12 (7 days)
- **MTTR**: 15 minutes average

## Capacity Planning

### Storage Growth

- **Current usage**: 51.2% (15.8TB used / 30.9TB total)
- **Monthly growth**: 2.3% average
- **Projected full**: 18 months
- **Next expansion**: Q4 2024
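
A time-to-full projection depends on how "monthly growth" is read. The sketch below works through two common readings using the usage and growth figures above: growth as percentage points of total capacity per month (linear), and growth as a percentage of the space already used (compound). Both function names and the `full_at_pct` threshold parameter are hypothetical, and nothing here claims to reproduce the method behind the projection in the list.

```python
import math


def months_until_full_linear(used_pct, points_per_month, full_at_pct=100.0):
    """Months left if usage rises by a fixed number of percentage points per month."""
    return (full_at_pct - used_pct) / points_per_month


def months_until_full_compound(used_pct, rate_pct_per_month, full_at_pct=100.0):
    """Months left if used space grows by a fixed percentage of itself per month."""
    return math.log(full_at_pct / used_pct) / math.log(1.0 + rate_pct_per_month / 100.0)


linear = months_until_full_linear(51.2, 2.3)
compound = months_until_full_compound(51.2, 2.3)
print(f"linear: {linear:.1f} months, compound: {compound:.1f} months")
```

With these inputs the linear reading gives roughly 21 months and the compound reading roughly 29, so the 18-month projection may rest on a different growth model or on treating the pool as "full" below 100% (which `full_at_pct` can express).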

### Compute Resources

- **CPU headroom**: 81.5% available
- **Memory headroom**: 57.7% available
- **Network utilization**: 12% peak
- **Scaling needed**: None immediate

### Service Scaling

- **Container density**: 156 containers across 5 hosts
- **Resource efficiency**: 89% optimal
- **Bottlenecks**: None identified
- **Optimization opportunities**: 3 identified

---

**Last Updated**: 2026-03-08 | **Next Review**: As needed