homelab-optimized/docs/infrastructure/hosts/calypso-runbook.md
Sanitized mirror from private repository - 2026-03-31 10:10:42 UTC

Calypso Runbook

Synology DS723+ - Secondary NAS and Infrastructure

Endpoint ID: 443397
Status: 🟢 Online
Hardware: AMD Ryzen R1600, 32GB RAM, 2 bays + expansion
Access: calypso.vish.local


Overview

Calypso is the secondary Synology NAS handling critical infrastructure services including authentication, reverse proxy, and monitoring.

Hardware Specs

| Component | Specification |
| --- | --- |
| Model | Synology DS723+ |
| CPU | AMD Ryzen R1600 (2-core/4-thread) |
| RAM | 32GB |
| Storage | 2-bay SHR + eSATA expansion |
| Network | 2x 1GbE |

Services

Critical Infrastructure

| Service | Port | Purpose | Status |
| --- | --- | --- | --- |
| Nginx Proxy Manager | 80/443 | SSL termination & routing | Required |
| Authentik | 9000 | SSO authentication | Required |
| Prometheus | 9090 | Metrics collection | Required |
| Grafana | 3000 | Dashboards | Required |
| Alertmanager | 9093 | Alert routing | Required |

Additional Services

| Service | Port | Purpose |
| --- | --- | --- |
| AdGuard | 3053 | DNS filtering (backup) |
| Paperless-NGX | 8000 | Document management |
| Reactive Resume | 3001 | Resume builder |
| Gitea | 3000/22 | Git hosting |
| Gitea Runner | 3008 | CI/CD |
| Headscale | 8080 | WireGuard VPN controller |
| Seafile | 8082 | File sync & share |
| Syncthing | 8384 | File sync |
| WireGuard | 51820 | VPN server |
| Portainer Agent | 9001 | Container management |

Media (ARR Stack)

  • Sonarr, Radarr, Lidarr
  • Prowlarr (indexers)
  • Bazarr (subtitles)

Storage Layout

/volume1/
├── docker/
│   ├── compose/          # Compose projects
│   └── appdata/          # Application data
│       ├── authentik/
│       ├── npm/
│       ├── prometheus/
│       └── grafana/
├── documents/            # Paperless
├── seafile/              # Seafile data
└── backups/              # Backup destination

Daily Operations

Check Service Health

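# Via Portainer
# Calypso runs only the Portainer Agent (port 9001), which is an API and has no web UI.
# Open the Calypso endpoint (ID 443397) from the Portainer server UI instead.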

# Via SSH
ssh admin@calypso.vish.local
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
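
The `docker ps` listing above can be narrowed to just the containers that need attention. This is a minimal sketch, assuming Docker's usual status strings (`Up ...`, `Exited ...`, `(unhealthy)`); the function name `needs_attention` is made up for this example.

```shell
# Read "name<TAB>status" lines on stdin and print the names of containers
# that are not running or that report a failing health check.
needs_attention() {
  awk -F '\t' '$2 !~ /^Up/ || $2 ~ /unhealthy/ { print $1 }'
}

# Intended use on the NAS (-a includes stopped containers):
#   docker ps -a --format '{{.Names}}\t{{.Status}}' | needs_attention
```

Empty output means every container is up and none reports `unhealthy`.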

Monitor Critical Services

# Check NPM
curl -I http://localhost:80

# Check Authentik
curl -I http://localhost:9000

# Check Prometheus
curl -I http://localhost:9090
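
The individual `curl` checks can be wrapped in one loop that prints a status line per service. The sketch below assumes the ports from the Critical Infrastructure table are published on localhost, and it uses the health paths that the upstream projects document (`/-/health/live/` for Authentik, `/-/healthy` for Prometheus and Alertmanager, `/api/health` for Grafana). The function names are illustrative, not part of any existing script.

```shell
# Classify an HTTP status code. No connection (000) or a 5xx reply counts
# as "down"; any other reply shows that the service is answering.
classify_status() {
  case "$1" in
    000|5??) echo "down" ;;
    [1-4]??) echo "up" ;;
    *)       echo "down" ;;
  esac
}

# Print "<name> <up|down> <http_code>" for one URL (5 second timeout).
probe() {
  local name="$1" url="$2" code
  code="$(curl -s -o /dev/null -m 5 -w '%{http_code}' "$url")" || code="000"
  echo "$name $(classify_status "$code") $code"
}

# Probe every service marked Required in the table.
check_critical() {
  probe npm          http://localhost:80
  probe authentik    http://localhost:9000/-/health/live/
  probe prometheus   http://localhost:9090/-/healthy
  probe grafana      http://localhost:3000/api/health
  probe alertmanager http://localhost:9093/-/healthy
}
```

Running `check_critical` over SSH prints one line per service. A 404 from NPM is treated as up because it only shows that no proxy host matched `localhost`.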

Common Issues

NPM Not Routing

  1. Check if NPM is running: docker ps | grep nginx-proxy-manager
  2. Verify proxy hosts configured: Access NPM UI → Proxy Hosts
  3. Check SSL certificates
  4. Review NPM logs: docker logs nginx-proxy-manager

Authentik SSO Broken

  1. Check that Authentik is running: docker ps | grep authentik
  2. Verify PostgreSQL: docker logs authentik-postgresql
  3. Check Redis: docker logs authentik-redis
  4. Review OIDC configurations in services

Prometheus Down

  1. Check storage: docker system df
  2. Verify volume: docker volume ls | grep prometheus
  3. Check retention settings (the --storage.tsdb.retention.time and --storage.tsdb.retention.size startup flags)
  4. Review logs: docker logs prometheus

Maintenance

Weekly

  • Verify Authentik users can log in
  • Check Prometheus metrics collection
  • Review Alertmanager notifications
  • Verify NPM certificates

Monthly

  • Clean unused Docker images
  • Review Prometheus retention
  • Update applications
  • Check disk usage
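
The image cleanup and disk check from the monthly list can be scripted. This is a sketch only: the 85% threshold is an arbitrary value chosen for the example, and `/volume1` is assumed to be the volume that holds the Docker data.

```shell
# Print the Use% (digits only) of the filesystem that holds a path.
fs_use_percent() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Report usage for a path; return 1 when it is at or above the limit.
check_disk() {
  local path="$1" limit="$2" used
  used="$(fs_use_percent "$path")"
  if [ "$used" -ge "$limit" ]; then
    echo "WARN $path at ${used}% (limit ${limit}%)"
    return 1
  fi
  echo "OK $path at ${used}%"
}

monthly_cleanup() {
  docker image prune -f    # removes dangling images only
  docker system df         # show what Docker still uses
  check_disk /volume1 85
}
```

`docker image prune -a` would also remove tagged images that no container uses, so it is left out of the sketch on purpose.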

Quarterly

  • Test OAuth flows
  • Verify backup restoration
  • Review monitoring thresholds
  • Update SSL certificates

SSL Certificate Management

NPM handles all SSL certificates:

  1. Automatic Renewal: Let's Encrypt (default)
  2. Manual: Access NPM → SSL Certificates → Add
  3. Check Status: NPM Dashboard → SSL
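
Certificate expiry can also be checked from the command line, without opening the NPM dashboard. The sketch below assumes `openssl` is available on the machine running it; `cert_valid_for` is a name invented for this example, and the hostname in the usage note is a placeholder.

```shell
# Succeed if the certificate served for HOST on port 443 remains valid
# for at least DAYS more days. Fails if the host cannot be reached.
cert_valid_for() {
  local host="$1" days="$2"
  openssl s_client -connect "$host:443" -servername "$host" </dev/null 2>/dev/null \
    | openssl x509 -noout -checkend "$(( days * 86400 ))" >/dev/null
}
```

For example, `cert_valid_for service.example.com 14 || echo "renew soon"` would flag a certificate with less than two weeks left.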

Common Certificate Issues

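  • Rate limits: Let's Encrypt allows 5 failed validations per hostname per account per hour; after repeated failures, wait an hour before retrying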
  • DNS challenge: Verify external DNS
  • Self-signed: Use for internal services

Backup Procedures

Configuration Backup

# Via Ansible
ansible-playbook ansible/automation/playbooks/backup_configs.yml --tags calypso

Key Data to Back Up

  • NPM configurations: /volume1/docker/compose/nginx_proxy_manager/
  • Authentik: /volume1/docker/appdata/authentik/
  • Prometheus: /volume1/docker/appdata/prometheus/
  • Grafana: /volume1/docker/appdata/grafana/
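
If the Ansible playbook is unavailable, the same directories can be archived by hand. This is a manual alternative, not a description of what the playbook does. It assumes `/volume1/backups` (from the storage layout) as the destination, and the archive name is made up for the example.

```shell
# Archive the given directories into DEST/calypso-config-YYYYmmdd.tar.gz
# and print the archive path on success.
backup_dirs() {
  local dest="$1" archive
  shift
  archive="$dest/calypso-config-$(date +%Y%m%d).tar.gz"
  tar -czf "$archive" "$@" && echo "$archive"
}

# Intended use on the NAS:
#   backup_dirs /volume1/backups \
#     /volume1/docker/compose/nginx_proxy_manager \
#     /volume1/docker/appdata/authentik \
#     /volume1/docker/appdata/prometheus \
#     /volume1/docker/appdata/grafana
```

A file-level copy of a running database directory may be inconsistent, so stopping the Authentik stack first (or using a database dump) is the safer choice.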

Emergency Procedures

Authentik Down

Impact: SSO broken for all services

  1. Verify containers running
  2. Check PostgreSQL: docker logs authentik-postgresql
  3. Check Redis: docker logs authentik-redis
  4. Restart Authentik: cd /volume1/docker/compose/authentik && docker-compose restart
  5. If needed, restore from backup
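
After a restart, Authentik can take a while to accept logins, so it helps to wait for its readiness endpoint before restoring from backup. The sketch below assumes the compose directory used elsewhere in this runbook and Authentik's documented `/-/health/ready/` path. The `retry` helper and the 30 x 5 second budget are illustrative choices.

```shell
# Run a command until it succeeds: at most TRIES attempts, DELAY seconds apart.
retry() {
  local tries="$1" delay="$2" i=1
  shift 2
  while ! "$@"; do
    [ "$i" -ge "$tries" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Restart the Authentik stack, then wait up to about 2.5 minutes for readiness.
restart_authentik() {
  ( cd /volume1/docker/compose/authentik && docker-compose restart ) || return 1
  retry 30 5 curl -fsS -o /dev/null http://localhost:9000/-/health/ready/
}
```

If `restart_authentik` returns non-zero, the PostgreSQL and Redis logs from steps 2 and 3 are the next place to look.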

NPM Down

Impact: No external access

  1. Verify container: docker ps | grep nginx-proxy-manager
  2. Check ports 80/443: netstat -tulpn | grep -E ':(80|443) '
  3. Restart: cd /volume1/docker/compose/nginx_proxy_manager && docker-compose restart
  4. Check DNS resolution
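
The port check in step 2 is easy to get wrong, because a loose pattern such as `80` also matches 8080 or 8000. This sketch matches the local-address column exactly; it assumes the standard `netstat -tln` column layout, and the function names are invented for the example.

```shell
# Succeed if netstat output on stdin shows a listener on exactly PORT.
# Column 4 of `netstat -tln` is the local address, e.g. 0.0.0.0:80 or :::443.
has_listener() {
  awk -v port="$1" '$4 ~ (":" port "$") { found = 1 } END { exit !found }'
}

# Report whether the two NPM ports have a listener.
check_npm_ports() {
  local p
  for p in 80 443; do
    if netstat -tln | has_listener "$p"; then
      echo "port $p: listening"
    else
      echo "port $p: NOT listening"
    fi
  done
}
```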

Prometheus Full

Impact: No metrics

  1. Check storage: docker system df
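  2. Reduce retention: lower --storage.tsdb.retention.time or set --storage.tsdb.retention.size in the Prometheus container's command arguments (retention is normally a startup flag, not a prometheus.yml setting)
  3. Clean old data: blocks older than the new retention are removed automatically once Prometheus runs with the lower setting. promtool has no delete subcommand; to remove specific series, use the TSDB admin API (requires --web.enable-admin-api), for example curl -g -X POST 'http://localhost:9090/api/v1/admin/tsdb/delete_series?match[]={job="example"}' followed by curl -X POST http://localhost:9090/api/v1/admin/tsdb/clean_tombstones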
  4. Restart container
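
Before changing retention, it is worth confirming what the running instance actually uses. Prometheus reports its startup flags at `/api/v1/status/flags`. The sketch below reads the two retention flags from that response, assuming the compact JSON that Prometheus normally returns; `flag_value` and `show_retention` are hypothetical helper names.

```shell
# Extract one flag's value from the JSON returned by /api/v1/status/flags.
# Plain sed is used because jq may not be installed on the NAS.
flag_value() {
  tr -d '\n' | sed -n "s/.*\"$1\":\"\([^\"]*\)\".*/\1/p"
}

# Print the retention flags of the local Prometheus instance.
show_retention() {
  local json
  json="$(curl -fsS http://localhost:9090/api/v1/status/flags)" || return 1
  echo "time: $(printf '%s' "$json" | flag_value storage.tsdb.retention.time)"
  echo "size: $(printf '%s' "$json" | flag_value storage.tsdb.retention.size)"
}
```

Running `show_retention` again after the restart confirms that the new values were picked up.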

Useful Commands

# SSH access
ssh admin@calypso.vish.local

# Check critical services
docker ps --filter "name=nginx" --filter "name=authentik" --filter "name=prometheus"

# Restart infrastructure
cd /volume1/docker/compose/nginx_proxy_manager && docker-compose restart
cd /volume1/docker/compose/authentik && docker-compose restart

# View logs
docker logs -f nginx-proxy-manager
docker logs -f authentik-server
docker logs -f prometheus