Operational Notes & Known Issues
Last Updated: 2026-01-26
This document contains important operational notes, known issues, and fixes for the homelab infrastructure.
Server-Specific Notes
Concord NUC (100.72.55.21)
Node Exporter
- Runs on bare metal (not containerized)
- Port: 9100
- Prometheus scrapes successfully from 100.72.55.21:9100
- Do NOT deploy a containerized node_exporter; it will conflict with the host service
Watchtower
- Requires the DOCKER_API_VERSION=1.44 environment variable
- This is because the Portainer Edge Agent uses an older Docker API version
- Without this env var, watchtower fails with:
client version 1.25 is too old
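A minimal sketch of how a log check could flag this failure: the snippet scans watchtower log text for the error quoted above and returns the environment variable from these notes. The function name and any surrounding log format are hypothetical; only the error string and DOCKER_API_VERSION=1.44 come from this document.

```python
from typing import Optional

# Error text quoted in these notes; the rest of a real log line may differ.
API_ERROR = "client version 1.25 is too old"
# Fix documented in these notes.
FIX = "DOCKER_API_VERSION=1.44"


def suggest_fix(log_text: str) -> Optional[str]:
    """Return the documented env var if the API version error appears in the log."""
    if API_ERROR in log_text:
        return FIX
    return None
```

Feeding it the output of the container's logs gives either the variable to add or None if the error is absent.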
Invidious
- Health check reports "unhealthy" but the application works fine
- The health check calls /api/v1/trending, which returns HTTP 500
- This is a known upstream issue with YouTube's API changes
- Workaround: Ignore the unhealthy status or modify the health check endpoint
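If the health check is modified, a probe along these lines could stand in for the trending call. This is a sketch under assumptions: the function name is hypothetical, and the replacement endpoint (for example /api/v1/stats) is not named in these notes and should be verified against the running instance first.

```python
import urllib.error
import urllib.request


def probe(url: str, opener=urllib.request.urlopen, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with an HTTP 2xx status."""
    try:
        with opener(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # An HTTP 500 (as /api/v1/trending returns) raises HTTPError and lands
        # here, as do connection failures and timeouts.
        return False
```

Calling probe with the Invidious base URL plus the chosen path returns False for the current trending endpoint and True for any endpoint that answers normally. The opener parameter exists only so the function can be exercised without a network.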
Prometheus Monitoring
Active Targets (as of 2026-01-26)
| Job | Target | Status |
|---|---|---|
| prometheus | prometheus:9090 | 🟢 UP |
| homelab-node | 100.67.40.126:9100 | 🟢 UP |
| atlantis-node | 100.83.230.112:9100 | 🟢 UP |
| atlantis-snmp | 100.83.230.112:9116 | 🟢 UP |
| calypso-node | 100.103.48.78:9100 | 🟢 UP |
| calypso-snmp | 100.103.48.78:9116 | 🟢 UP |
| concord-nuc-node | 100.72.55.21:9100 | 🟢 UP |
| setillo-node | 100.125.0.20:9100 | 🟢 UP |
| setillo-snmp | 100.125.0.20:9116 | 🟢 UP |
| truenas-node | 100.75.252.64:9100 | 🟢 UP |
| proxmox-node | 100.87.12.28:9100 | 🟢 UP |
| raspberry-pis (pi-5) | 100.77.151.40:9100 | 🟢 UP |
Intentionally Offline Targets
| Job | Target | Reason |
|---|---|---|
| raspberry-pis (pi-5-kevin) | 100.123.246.75:9100 | Intentionally offline |
| vmi2076105-node | 100.99.156.20:9100 | Intentionally offline |
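The two tables above can be checked against the Prometheus HTTP API, which lists scrape targets at /api/v1/targets. The sketch below reports targets that are not up, skipping the ones listed as intentionally offline. It assumes the instance label still holds the host:port shown in the tables (relabeling would change that); the function names and the base URL passed to fetch_targets are hypothetical.

```python
import json
import urllib.request

# Targets listed as intentionally offline in these notes.
INTENTIONALLY_OFFLINE = {"100.123.246.75:9100", "100.99.156.20:9100"}


def unexpected_down(payload: dict, ignore=INTENTIONALLY_OFFLINE) -> list:
    """Return sorted (job, instance) pairs that are not 'up' and not ignored."""
    down = []
    for target in payload["data"]["activeTargets"]:
        labels = target.get("labels", {})
        instance = labels.get("instance", "")
        if target.get("health") != "up" and instance not in ignore:
            down.append((labels.get("job", ""), instance))
    return sorted(down)


def fetch_targets(base_url: str) -> dict:
    """Fetch the target listing from a Prometheus server (base_url is an assumption)."""
    with urllib.request.urlopen(f"{base_url}/api/v1/targets", timeout=10) as resp:
        return json.load(resp)
```

An empty list from unexpected_down(fetch_targets(...)) means every target that should be online is being scraped.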
Deployment Architecture
Git-Linked Stacks
- Most stacks are deployed from Gitea (git.vish.gg/Vish/homelab)
- Branch: wip
- Portainer pulls configs directly from the repo
- Changes to repo configs will affect deployed stacks on next redeploy/update
Standalone Containers
The following containers are managed directly in Portainer (NOT Git-linked):
- portainer/portainer_edge_agent - Infrastructure
- watchtower - Auto-updates (on some servers)
- node-exporter containers (where not bare metal)
- Various testing/temporary containers
Bare Metal Services
Some services run directly on hosts, not in containers:
- Concord NUC: node_exporter (port 9100)
Common Issues & Solutions
Issue: Watchtower restart loop on Edge Agent hosts
Symptom: Watchtower continuously restarts with API version error
Cause: Portainer Edge Agent uses older Docker API
Solution: Add DOCKER_API_VERSION=1.44 to watchtower container environment
Issue: Port 9100 already in use for node_exporter container
Symptom: Container fails to start, "address already in use"
Cause: node_exporter running on bare metal
Solution: Don't run containerized node_exporter; use the bare metal instance
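Before deploying a node_exporter container, a quick TCP check can show whether something already listens on 9100. A minimal sketch; the function name is hypothetical, and a successful connection only shows that the port is taken, not which process holds it.

```python
import socket


def port_in_use(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0
```

For the Concord NUC, port_in_use("100.72.55.21", 9100) should return True while the bare metal service is running.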
Issue: Invidious health check failing
Symptom: Container shows "unhealthy" but works fine
Cause: YouTube API changes causing /api/v1/trending to return 500
Solution: This is cosmetic; the app works. Consider updating the health check endpoint.
Maintenance Checklist
- Check Prometheus targets regularly for DOWN status
- Monitor watchtower logs for update failures
- Review Portainer for containers in restart loops
- Keep Git repo configs in sync with running stacks
- Document any manual container changes in this file
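The restart-loop review can also be done from the command line. The sketch below parses the JSON array printed by docker inspect and lists containers that are currently restarting or have a high restart count. The function name and the threshold of 5 are hypothetical choices, not values from these notes.

```python
import json


def restart_suspects(inspect_output: str, threshold: int = 5) -> list:
    """Return names of containers that are restarting or have restarted often."""
    suspects = []
    for container in json.loads(inspect_output):
        state = container.get("State", {})
        count = container.get("RestartCount", 0)
        if state.get("Restarting") or count >= threshold:
            # docker inspect reports names with a leading slash.
            suspects.append(container.get("Name", "").lstrip("/"))
    return suspects
```

Pass it the text produced by running docker inspect on the container IDs of interest.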