homelab-optimized/docs/infrastructure/hosts/calypso-runbook.md
Sanitized mirror from private repository - 2026-03-31 10:10:42 UTC

# Calypso Runbook
*Synology DS723+ - Secondary NAS and Infrastructure*
**Endpoint ID:** 443397
**Status:** 🟢 Online
**Hardware:** AMD Ryzen R1600, 32GB RAM, 2 bays + expansion
**Access:** `calypso.vish.local`

---
## Overview
Calypso is the secondary Synology NAS handling critical infrastructure services including authentication, reverse proxy, and monitoring.
## Hardware Specs
| Component | Specification |
|----------|---------------|
| Model | Synology DS723+ |
| CPU | AMD Ryzen R1600 (2-core/4-thread) |
| RAM | 32GB |
| Storage | 2-bay SHR + eSATA expansion |
| Network | 2x 1GbE |
## Services
### Critical Infrastructure
| Service | Port | Purpose | Status |
|---------|------|---------|--------|
| **Nginx Proxy Manager** | 80/443 | SSL termination & routing | Required |
| **Authentik** | 9000 | SSO authentication | Required |
| **Prometheus** | 9090 | Metrics collection | Required |
| **Grafana** | 3000 | Dashboards | Required |
| **Alertmanager** | 9093 | Alert routing | Required |
### Additional Services
| Service | Port | Purpose |
|---------|------|---------|
| AdGuard | 3053 | DNS filtering (backup) |
| Paperless-NGX | 8000 | Document management |
| Reactive Resume | 3001 | Resume builder |
| Gitea | 3000/22 | Git hosting |
| Gitea Runner | 3008 | CI/CD |
| Headscale | 8080 | Tailscale-compatible control server (WireGuard-based VPN) |
| Seafile | 8082 | File sync & share |
| Syncthing | 8384 | File sync |
| WireGuard | 51820 | VPN server |
| Portainer Agent | 9001 | Container management |
### Media (ARR Stack)
- Sonarr, Radarr, Lidarr
- Prowlarr (indexers)
- Bazarr (subtitles)
---
## Storage Layout
```
/volume1/
├── docker/
│   ├── compose/          # Compose projects
│   └── appdata/          # Application data
│       ├── authentik/
│       ├── npm/
│       ├── prometheus/
│       └── grafana/
├── documents/ # Paperless
├── seafile/ # Seafile data
└── backups/ # Backup destination
```
---
## Daily Operations
### Check Service Health
```bash
# Via Portainer: port 9001 is the Portainer Agent API, not a web UI.
# Open the Calypso environment in the Portainer server that manages this agent.
# Via SSH
ssh admin@calypso.vish.local
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
```
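
The `docker ps` listing above can be filtered automatically. The following is a minimal sketch, not part of the original runbook: the function names `not_running` and `unhealthy` are hypothetical, and it assumes container status strings of the usual form (`Up 3 hours (healthy)`, `Exited (1) 5 minutes ago`).

```bash
#!/usr/bin/env bash
# Both helpers read lines of "<name><TAB><status>" on stdin, as produced by:
#   docker ps -a --format '{{.Names}}\t{{.Status}}'

# Print the names of containers whose status does not start with "Up".
not_running() {
  awk -F '\t' '$2 !~ /^Up/ { print $1 }'
}

# Print the names of containers that are up but report a failing health check.
unhealthy() {
  awk -F '\t' '$2 ~ /^Up/ && $2 ~ /\(unhealthy\)/ { print $1 }'
}
```

For example, `docker ps -a --format '{{.Names}}\t{{.Status}}' | not_running` lists stopped or restarting containers; an empty result means everything is up.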
### Monitor Critical Services
```bash
# Check NPM
curl -I http://localhost:80
# Check Authentik
curl -I http://localhost:9000
# Check Prometheus
curl -I http://localhost:9090
```
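
The three checks above can be folded into one probe that prints a verdict per service. This is a sketch under stated assumptions: `classify_status` and `probe` are hypothetical names, the 5-second timeout is arbitrary, and redirects are treated as healthy because a reverse proxy or SSO front end often answers `/` with a 3xx.

```bash
#!/usr/bin/env bash
# Map an HTTP status code to a verdict.
# curl reports 000 when it could not connect at all.
classify_status() {
  case "$1" in
    2??|3??) echo "up" ;;
    000)     echo "unreachable" ;;
    *)       echo "degraded" ;;
  esac
}

# Probe one URL and print "<name> <code> <verdict>".
probe() {
  local name="$1" url="$2" code
  code="$(curl -s -o /dev/null -m 5 -w '%{http_code}' "$url")" || true
  printf '%s %s %s\n' "$name" "${code:-000}" "$(classify_status "${code:-000}")"
}

# Example (run on Calypso):
#   probe npm        http://localhost:80
#   probe authentik  http://localhost:9000
#   probe prometheus http://localhost:9090/-/healthy
```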
---
## Common Issues
### NPM Not Routing
1. Check if NPM is running: `docker ps | grep nginx-proxy-manager`
2. Verify proxy hosts configured: Access NPM UI → Proxy Hosts
3. Check SSL certificates
4. Review NPM logs: `docker logs nginx-proxy-manager`
### Authentik SSO Broken
1. Check Authentik running: `docker ps | grep authentik`
2. Verify PostgreSQL: `docker logs authentik-postgresql`
3. Check Redis: `docker logs authentik-redis`
4. Review OIDC configurations in services
### Prometheus Down
1. Check storage: `docker system df`
2. Verify volume: `docker volume ls | grep prometheus`
3. Check retention settings
4. Review logs: `docker logs prometheus`
---
## Maintenance
### Weekly
- [ ] Verify Authentik users can log in
- [ ] Check Prometheus metrics collection
- [ ] Review Alertmanager notifications
- [ ] Verify NPM certificates
### Monthly
- [ ] Clean unused Docker images
- [ ] Review Prometheus retention
- [ ] Update applications
- [ ] Check disk usage
### Quarterly
- [ ] Test OAuth flows
- [ ] Verify backup restoration
- [ ] Review monitoring thresholds
- [ ] Update SSL certificates
---
## SSL Certificate Management
NPM handles all SSL certificates:
1. **Automatic Renewal**: Let's Encrypt (default)
2. **Manual**: Access NPM → SSL Certificates → Add
3. **Check Status**: NPM Dashboard → SSL
### Common Certificate Issues
- Rate limits: After repeated failed Let's Encrypt validations, wait about 1 hour before requesting again
- DNS challenge: Verify external DNS
- Self-signed: Use for internal services
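
Certificate status can also be checked from a shell rather than the NPM dashboard. The sketch below is illustrative only: `days_left` and `cert_not_after` are hypothetical helper names, `openssl` is assumed to be installed where the check runs, and the commented example assumes GNU `date` for the `-d` option.

```bash
#!/usr/bin/env bash
# Whole days between two Unix timestamps, truncated toward zero.
days_left() {
  local now="$1" not_after="$2"
  echo $(( (not_after - now) / 86400 ))
}

# Print the notAfter date of the certificate served by host:port (default 443).
cert_not_after() {
  local host="$1" port="${2:-443}"
  echo | openssl s_client -connect "${host}:${port}" -servername "$host" 2>/dev/null \
    | openssl x509 -noout -enddate | cut -d= -f2
}

# Example (replace the host with one of your proxied domains):
#   end="$(date -d "$(cert_not_after example.com)" +%s)"
#   days_left "$(date +%s)" "$end"
```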
---
## Backup Procedures
### Configuration Backup
```bash
# Via Ansible
ansible-playbook ansible/automation/playbooks/backup_configs.yml --tags calypso
```
### Key Data to Back Up
- NPM configurations: `/volume1/docker/compose/nginx_proxy_manager/`
- Authentik: `/volume1/docker/appdata/authentik/`
- Prometheus: `/volume1/docker/appdata/prometheus/`
- Grafana: `/volume1/docker/appdata/grafana/`
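
If the Ansible playbook is unavailable, the directories listed above can be archived by hand. This is a minimal sketch, not the playbook's actual behavior: `archive_name` and `backup_configs` are hypothetical names, and the archive naming scheme is an assumption. It copies files as they are on disk, so database directories captured while their containers are running may not be consistent.

```bash
#!/usr/bin/env bash
# Build an archive name such as calypso-config-20260331.tar.gz from a date stamp.
archive_name() {
  printf 'calypso-config-%s.tar.gz\n' "$1"
}

# Archive the given directories into <dest_dir>/<archive_name>.
# Usage: backup_configs <dest_dir> <stamp> <dir>...
backup_configs() {
  local dest="$1" stamp="$2"
  shift 2
  mkdir -p "$dest"
  tar -czf "${dest}/$(archive_name "$stamp")" "$@"
}

# Example:
#   backup_configs /volume1/backups "$(date +%Y%m%d)" \
#     /volume1/docker/compose/nginx_proxy_manager \
#     /volume1/docker/appdata/authentik \
#     /volume1/docker/appdata/prometheus \
#     /volume1/docker/appdata/grafana
```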
---
## Emergency Procedures
### Authentik Down
**Impact**: SSO broken for all services
1. Verify containers running
2. Check PostgreSQL: `docker logs authentik-postgresql`
3. Check Redis: `docker logs authentik-redis`
4. Restart Authentik: `cd /volume1/docker/compose/authentik && docker-compose restart`
5. If needed, restore from backup
### NPM Down
**Impact**: No external access
1. Verify container: `docker ps | grep nginx-proxy-manager`
2. Check ports 80/443: `netstat -tulpn | grep -E ':(80|443) '`
3. Restart: `cd /volume1/docker/compose/nginx_proxy_manager && docker-compose restart`
4. Check DNS resolution
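
When checking listeners, a substring match on `80` also matches other ports on this host, such as 8080 (Headscale) or 8082 (Seafile), so an exact port match is safer. A small sketch, with `listening_on` as a hypothetical helper that reads `netstat -tln` style output on stdin:

```bash
#!/usr/bin/env bash
# Succeed if any line's local address (4th column) ends in exactly ":<port>".
listening_on() {
  local port="$1"
  awk -v p="$port" '$4 ~ (":" p "$") { found = 1 } END { exit !found }'
}
```

For example, `netstat -tln | listening_on 443 || echo "nothing listening on 443"` reports a missing HTTPS listener without being fooled by longer port numbers.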
### Prometheus Full
**Impact**: No metrics
1. Check storage: `docker system df`
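2. Reduce retention: lower the `--storage.tsdb.retention.time` flag (or set `--storage.tsdb.retention.size`) in the container's command; retention is not configured in prometheus.yml
3. Clean old data: blocks older than the retention window are deleted automatically once the new setting is active; to drop specific series sooner, enable `--web.enable-admin-api` and POST to `/api/v1/admin/tsdb/delete_series`, then `/api/v1/admin/tsdb/clean_tombstones`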
4. Restart container
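
To catch a filling volume before Prometheus stops ingesting, the disk check can be scripted. This is a sketch under assumptions: `parse_use_pct` and `over_threshold` are hypothetical names, the 90% limit in the example is arbitrary, and the parser expects POSIX `df -P` output with the capacity in the fifth column.

```bash
#!/usr/bin/env bash
# Read `df -P <path>` output on stdin and print the use percentage as an integer.
parse_use_pct() {
  awk 'NR == 2 { gsub(/%/, "", $5); print $5 }'
}

# Succeed when usage is at or above the limit (both given as integers).
over_threshold() {
  local pct="$1" limit="$2"
  [ "$pct" -ge "$limit" ]
}

# Example:
#   pct="$(df -P /volume1 | parse_use_pct)"
#   over_threshold "$pct" 90 && echo "volume1 is ${pct}% full"
```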
---
## Useful Commands
```bash
# SSH access
ssh admin@calypso.vish.local
# Check critical services
docker ps --filter "name=nginx" --filter "name=authentik" --filter "name=prometheus"
# Restart infrastructure
cd /volume1/docker/compose/nginx_proxy_manager && docker-compose restart
cd /volume1/docker/compose/authentik && docker-compose restart
# View logs
docker logs -f nginx-proxy-manager
docker logs -f authentik-server
docker logs -f prometheus
```
---
## Links
- [Synology DSM](https://calypso.vish.local:5001)
- [Nginx Proxy Manager](http://calypso.vish.local:81)
- [Authentik](http://calypso.vish.local:9000)
- [Prometheus](http://calypso.vish.local:9090)
- [Grafana](http://calypso.vish.local:3000)
- [Alertmanager](http://calypso.vish.local:9093)