Sanitized mirror from private repository - 2026-03-10 09:13:26 UTC
# 📊 Operational Status
*Current operational status of all homelab services and infrastructure*
## Infrastructure Overview
### Host Status
| Host | Status | Uptime | CPU | Memory | Storage |
|------|--------|--------|-----|--------|---------|
| **Atlantis** (DS1821+) | ✅ Online | 99.8% | 15% | 45% | 78% |
| **Calypso** (Custom NAS) | ✅ Online | 99.5% | 12% | 38% | 65% |
| **homelab_vm** (Main VM) | ✅ Online | 99.9% | 25% | 55% | 42% |
| **concord_nuc** (Intel NUC) | ✅ Online | 99.7% | 18% | 48% | 35% |
| **raspberry-pi-5-vish** | ✅ Online | 99.6% | 8% | 32% | 28% |
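
The table above can also be checked programmatically. The following is a minimal sketch, not part of the homelab tooling: it parses the rows exactly as shown and lists hosts whose storage usage meets a warning threshold. The function names and the 75% threshold are assumptions chosen for illustration.

```python
# Rows copied verbatim from the Host Status table.
HOST_TABLE = """\
| **Atlantis** (DS1821+) | ✅ Online | 99.8% | 15% | 45% | 78% |
| **Calypso** (Custom NAS) | ✅ Online | 99.5% | 12% | 38% | 65% |
| **homelab_vm** (Main VM) | ✅ Online | 99.9% | 25% | 55% | 42% |
| **concord_nuc** (Intel NUC) | ✅ Online | 99.7% | 18% | 48% | 35% |
| **raspberry-pi-5-vish** | ✅ Online | 99.6% | 8% | 32% | 28% |
"""

# Hypothetical threshold; the document does not define one.
STORAGE_WARN_PCT = 75.0


def parse_hosts(table: str) -> list[dict]:
    """Turn Markdown table rows into dictionaries with numeric percentages."""
    hosts = []
    for line in table.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        name, status, uptime, cpu, memory, storage = cells
        hosts.append({
            "host": name.replace("**", ""),
            "online": "Online" in status,
            "uptime": float(uptime.rstrip("%")),
            "cpu": float(cpu.rstrip("%")),
            "memory": float(memory.rstrip("%")),
            "storage": float(storage.rstrip("%")),
        })
    return hosts


def storage_warnings(hosts: list[dict], threshold: float = STORAGE_WARN_PCT) -> list[str]:
    """Return the names of hosts at or above the storage threshold."""
    return [h["host"] for h in hosts if h["storage"] >= threshold]


hosts = parse_hosts(HOST_TABLE)
print(storage_warnings(hosts))
```

With the assumed 75% threshold, only Atlantis (78% storage) is reported.
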
### Network Status
- **Internet Connectivity**: ✅ Stable (1Gbps/50Mbps)
- **Internal Network**: ✅ 10GbE backbone operational
- **VPN Access**: ✅ WireGuard and Tailscale active
- **DNS Resolution**: ✅ Pi-hole and AdGuard operational
- **SSL Certificates**: ✅ All certificates valid
## Service Categories
### Media & Entertainment
#### Streaming Services
- **Plex Media Server** - ✅ Active (concord_nuc)
  - Hardware transcoding: ✅ Intel Quick Sync enabled
  - Remote access: ✅ Direct connection available
  - Library size: 2.1TB movies, 850GB TV shows
  - Active streams: 2/4 concurrent
- **Jellyfin** - ✅ Active (Atlantis)
  - Alternative streaming platform
  - 4K HDR support enabled
  - Mobile apps configured
- **Navidrome** - ✅ Active (Calypso)
  - Music streaming: 45GB library
  - Subsonic API enabled
  - Mobile sync active
#### Media Management (Arr Suite)
- **Sonarr** - ✅ Active (Atlantis)
  - TV series monitoring: 127 series
  - Quality profiles: 1080p/4K configured
  - Indexers: 8 active
- **Radarr** - ✅ Active (Atlantis)
  - Movie monitoring: 342 movies
  - Quality profiles: 1080p/4K configured
  - Custom formats enabled
- **Lidarr** - ✅ Active (Calypso)
  - Music monitoring: 89 artists
  - Quality profiles: FLAC/MP3 configured
  - Metadata enhancement active
- **Prowlarr** - ✅ Active (Atlantis)
  - Indexer management: 12 indexers
  - API sync with all \*arr services
  - Health checks passing
### Gaming Services
#### Game Servers
- **Minecraft Server** - ✅ Active (homelab_vm)
  - Version: 1.20.4 Paper
  - Players: 0/20 online
  - Plugins: 15 installed
  - Backup: Daily automated
- **Satisfactory Server** - ✅ Active (homelab_vm)
  - Version: Update 8
  - Players: 0/4 online
  - Save backup: Every 6 hours
  - Mods: Vanilla
- **Left 4 Dead 2 Server** - ⚠️ Maintenance (homelab_vm)
  - Status: Updating game files
  - Expected online: 2 hours
  - Custom campaigns installed
- **Garry's Mod PropHunt** - ✅ Active (homelab_vm)
  - Players: 0/16 online
  - Maps: 25 PropHunt maps
  - Addons: 12 workshop items
#### Game Management
- **PufferPanel** - ✅ Active (homelab_vm)
  - Managing: 4 game servers
  - Web interface: https://games.vish.gg
  - Automated backups enabled
### Development & DevOps
#### Version Control
- **Gitea** - ✅ Active (Calypso)
  - Repositories: 23 active
  - Users: 3 registered
  - CI/CD: Gitea Runner operational
  - OAuth: Authentik integration
#### Container Management
- **Portainer** - ✅ Active (All hosts)
  - Stacks: 81 total (79 running, 2 stopped intentionally)
  - Containers: 157+ total
  - GitOps: 80/81 stacks automated (100% of managed stacks; gitea excluded as bootstrap)
  - Health: 97.5% success rate
- **Watchtower** - ✅ Active (All hosts)
  - Auto-updates: Enabled
  - Schedule: Daily at 3 AM
  - Notifications: NTFY integration
  - Success rate: 98.2%
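
The Portainer percentages can be reproduced from the stack counts listed above. This is a small sketch of that arithmetic. That the 97.5% health figure is derived as running stacks over total stacks is an assumption; the numbers happen to agree. The helper name `percent` is hypothetical.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole, 1)


TOTAL_STACKS = 81      # from the Portainer entry
RUNNING_STACKS = 79    # 2 stacks are stopped intentionally
GITOPS_STACKS = 80     # gitea is excluded as the bootstrap stack
MANAGED_STACKS = TOTAL_STACKS - 1  # every stack except gitea

running_pct = percent(RUNNING_STACKS, TOTAL_STACKS)
gitops_of_total_pct = percent(GITOPS_STACKS, TOTAL_STACKS)
gitops_of_managed_pct = percent(GITOPS_STACKS, MANAGED_STACKS)

print(running_pct, gitops_of_total_pct, gitops_of_managed_pct)
```

Counting against managed stacks only is what makes the GitOps coverage 100% even though one of the 81 stacks is deployed by hand.
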
#### Development Tools
- **OpenHands** - ✅ Active (homelab_vm)
  - AI development assistant
  - GPU acceleration: Available
  - Model: GPT-4 integration
- **Code Server** - ✅ Active (Calypso)
  - VS Code in browser
  - Extensions: 25 installed
  - Git integration: Active
### Infrastructure & Networking
#### Network Services
- **Nginx Proxy Manager** - ✅ Active (Calypso)
  - Proxy hosts: 45 configured
  - SSL certificates: 42 active
  - Access lists: 8 configured
  - Uptime: 99.9%
- **Pi-hole** - ✅ Active (concord_nuc)
  - Queries blocked: 23.4% (24h)
  - Blocklists: 15 active
  - Clients: 28 devices
  - Upstream DNS: Cloudflare
- **AdGuard Home** - ✅ Active (Calypso)
  - Secondary DNS filtering
  - Queries blocked: 21.8% (24h)
  - Parental controls: Enabled
  - Safe browsing: Active
#### VPN Services
- **WireGuard** - ✅ Active (Multiple hosts)
  - Peers: 8 configured
  - Traffic: 2.3GB (7 days)
  - Handshakes: All successful
  - Mobile clients: 4 active
- **Tailscale** - ✅ Active (All hosts)
  - Mesh network: 12 nodes
  - Exit nodes: 2 configured
  - Magic DNS: Enabled
  - Subnet routing: Active
### Monitoring & Observability
#### Metrics & Monitoring
- **Prometheus** - ✅ Active (homelab_vm)
  - Targets: 45 monitored
  - Metrics retention: 15 days
  - Storage: 2.1GB used
  - Scrape success: 99.1%
- **Grafana** - ✅ Active (homelab_vm)
  - Version: 12.4.0 (pinned, `grafana/grafana-oss:12.4.0`)
  - URL: `https://gf.vish.gg` (Authentik SSO) / `http://192.168.0.210:3300`
  - Dashboards: 4 (Infrastructure Overview, Node Details, Synology NAS, Node Exporter Full)
  - Default home: Node Details - Full Metrics (`node-details-v2`)
  - Auth: Authentik OAuth2 SSO + local admin account
  - Stack: `monitoring-stack` (GitOps, `hosts/vms/homelab-vm/monitoring.yaml`)
- **AlertManager** - ✅ Active (homelab_vm)
  - Alert rules: 28 configured
  - Notifications: NTFY, Email
  - Silences: 2 active
  - Firing alerts: 0 current
#### Uptime Monitoring
- **Uptime Kuma** - ✅ Active (raspberry-pi-5-vish)
  - Monitors: 67 services
  - Uptime average: 99.4%
  - Notifications: NTFY integration
  - Status page: Public
### Security & Authentication
#### Identity Management
- **Authentik** - ✅ Active (Calypso)
  - Users: 5 registered
  - Applications: 12 integrated
  - OAuth providers: 3 configured
  - MFA: TOTP enabled
- **Vaultwarden** - ✅ Active (Calypso)
  - Vault items: 247 stored
  - Organizations: 2 configured
  - Emergency access: Configured
  - Backup: Daily encrypted
#### Security Tools
- **Fail2ban** - ✅ Active (All hosts)
  - Jails: 8 configured
  - Banned IPs: 23 (7 days)
  - SSH protection: Active
  - Log monitoring: Enabled
### Communication & Collaboration
#### Chat & Messaging
- **Matrix Synapse** - ✅ Active (homelab_vm)
  - Users: 4 registered
  - Rooms: 12 active
  - Federation: Enabled
  - E2E encryption: Active
- **Element Web** - ✅ Active (homelab_vm)
  - Matrix client interface
  - Voice/video calls: Enabled
  - File sharing: Active
  - Themes: Custom configured
- **NTFY** - ✅ Active (homelab_vm)
  - Topics: 15 configured
  - Messages: 1,247 (30 days)
  - Subscribers: 8 active
  - Delivery rate: 99.8%
### Productivity & Office
#### Document Management
- **Paperless-ngx** - ✅ Active (Calypso)
  - Documents: 1,456 stored
  - OCR processing: Active
  - Tags: 89 configured
  - Storage: 2.8GB used
- **Stirling PDF** - ✅ Active (homelab_vm)
  - PDF manipulation tools
  - Processing: 156 files (30 days)
  - Features: All modules active
  - Performance: Excellent
#### File Management
- **Syncthing** - ✅ Active (Multiple hosts)
  - Folders: 8 synchronized
  - Devices: 6 connected
  - Sync status: Up to date
  - Conflicts: 0 current
- **Seafile** - ✅ Active (Calypso)
  - Libraries: 5 configured
  - Users: 3 active
  - Storage: 45GB used
  - Sync clients: 4 active
## Performance Metrics
### Resource Utilization (24h Average)
- **CPU Usage**: 18.5% across all hosts
- **Memory Usage**: 42.3% across all hosts
- **Storage Usage**: 51.2% across all hosts
- **Network Traffic**: 2.1TB ingress, 850GB egress
### Service Response Times
- **Web Services**: 145ms average
- **API Endpoints**: 89ms average
- **Database Queries**: 23ms average
- **File Operations**: 67ms average
### Backup Status
- **Daily Backups**: ✅ 23/23 successful
- **Weekly Backups**: ✅ 8/8 successful
- **Monthly Backups**: ✅ 3/3 successful
- **Offsite Backups**: ✅ Cloud sync active
## Recent Changes
### Last 7 Days
- **2026-03-08**: Fixed Grafana default home dashboard (set to `node-details-v2` via org preferences API)
- **2026-03-08**: Pinned Grafana image to `12.4.0`, disabled `kubernetesDashboards` feature toggle
- **2026-03-08**: Completed full GitOps migration — all 81 stacks now on canonical `hosts/` paths
- **2026-03-08**: SABnzbd disk-full recovery on Atlantis — freed 185GB, resumed downloads
- **2026-03-08**: Added immich-stack to Calypso
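
For reference, the Grafana home dashboard change noted above could be expressed roughly as follows. This is a sketch under assumptions, not the command that was actually run: it builds (but does not send) a `PUT /api/org/preferences` request, assumes the `homeDashboardUID` field accepted by recent Grafana releases, and uses a placeholder service account token. Verify the field name against the API documentation for the pinned 12.4.0 release before relying on it.

```python
import json
import urllib.request

GRAFANA_URL = "http://192.168.0.210:3300"  # internal URL from the Grafana entry
API_TOKEN = "REPLACE_ME"                   # placeholder; a real token is required
HOME_DASHBOARD_UID = "node-details-v2"     # UID named in this document


def build_home_dashboard_request(base_url: str, token: str, uid: str) -> urllib.request.Request:
    """Build a request that sets the organization's home dashboard by UID.

    Assumption: the org preferences endpoint accepts `homeDashboardUID`.
    """
    body = json.dumps({"homeDashboardUID": uid}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/org/preferences",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_home_dashboard_request(GRAFANA_URL, API_TOKEN, HOME_DASHBOARD_UID)
print(request.get_method(), request.full_url)
# To apply the change, pass the request to urllib.request.urlopen(request).
```

The request is only constructed here so the payload can be inspected before anything is changed on the live instance.
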
### Planned Maintenance
- Monitor Grafana `node-details-v2` and `Node Exporter Full` dashboards for export/backup into monitoring.yaml
## Alert Summary
### Active Alerts
- **None** - All systems operational
### Recent Alerts (Resolved)
- **2024-02-23 14:32**: High memory usage on homelab_vm (resolved)
- **2024-02-22 09:15**: SSL certificate near expiry (renewed)
- **2024-02-21 22:45**: Backup job delayed (completed)
### Alert Trends
- **Critical alerts**: 0 (7 days)
- **Warning alerts**: 3 (7 days)
- **Info alerts**: 12 (7 days)
- **MTTR**: 15 minutes average
## Capacity Planning
### Storage Growth
- **Current usage**: 51.2% (15.8TB used / 30.9TB total)
- **Monthly growth**: 2.3% average
- **Projected full**: 18 months
- **Next expansion**: Q4 2024
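
The time-to-full projection depends on how the 2.3% monthly growth is interpreted, which the document does not state. The sketch below shows two common readings: linear growth in percentage points of total capacity, and compound growth of the used amount. The function names and the `full_at_pct` parameter are assumptions for illustration. Neither reading at a 100% fill level yields exactly 18 months, so that figure presumably rests on other assumptions, such as treating the array as full below 100%.

```python
import math

USED_PCT = 51.2        # current usage from Storage Growth
MONTHLY_GROWTH = 2.3   # monthly growth figure from Storage Growth


def months_until_full_linear(used_pct: float, points_per_month: float,
                             full_at_pct: float = 100.0) -> float:
    """Growth adds a fixed number of percentage points of capacity each month."""
    return (full_at_pct - used_pct) / points_per_month


def months_until_full_compound(used_pct: float, rate_pct: float,
                               full_at_pct: float = 100.0) -> float:
    """Used space grows by rate_pct percent of itself each month."""
    return math.log(full_at_pct / used_pct) / math.log(1.0 + rate_pct / 100.0)


linear = months_until_full_linear(USED_PCT, MONTHLY_GROWTH)
compound = months_until_full_compound(USED_PCT, MONTHLY_GROWTH)
linear_90 = months_until_full_linear(USED_PCT, MONTHLY_GROWTH, full_at_pct=90.0)

print(f"linear: {linear:.1f} months")
print(f"compound: {compound:.1f} months")
print(f"linear to 90%: {linear_90:.1f} months")
```

The linear reading gives a little over 21 months and the compound reading a little over 29, while planning against a 90% fill level shortens the linear estimate to just under 17 months.
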
### Compute Resources
- **CPU headroom**: 81.5% available
- **Memory headroom**: 57.7% available
- **Network utilization**: 12% peak
- **Scaling needed**: None immediate
### Service Scaling
- **Container density**: 156 containers across 5 hosts
- **Resource efficiency**: 89% optimal
- **Bottlenecks**: None identified
- **Optimization opportunities**: 3 identified
---
**Last Updated**: 2026-03-08 | **Next Review**: As needed