# Calypso Runbook

*Synology DS723+ - Secondary NAS and Infrastructure*

**Endpoint ID:** 443397
**Status:** 🟢 Online
**Hardware:** AMD Ryzen R1600, 32GB RAM, 2 bays + expansion
**Access:** `calypso.vish.local`

---

## Overview

Calypso is the secondary Synology NAS. It hosts critical infrastructure services: authentication, reverse proxy, and monitoring.

## Hardware Specs

| Component | Specification |
|----------|---------------|
| Model | Synology DS723+ |
| CPU | AMD Ryzen R1600 (2-core/4-thread) |
| RAM | 32GB |
| Storage | 2-bay SHR + eSATA expansion |
| Network | 2x 1GbE |

## Services

### Critical Infrastructure

| Service | Port | Purpose | Status |
|---------|------|---------|--------|
| **Nginx Proxy Manager** | 80/443 | SSL termination & routing | Required |
| **Authentik** | 9000 | SSO authentication | Required |
| **Prometheus** | 9090 | Metrics collection | Required |
| **Grafana** | 3000 | Dashboards | Required |
| **Alertmanager** | 9093 | Alert routing | Required |

### Additional Services

| Service | Port | Purpose |
|---------|------|---------|
| AdGuard | 3053 | DNS filtering (backup) |
| Paperless-NGX | 8000 | Document management |
| Reactive Resume | 3001 | Resume builder |
| Gitea | 3000/22 | Git hosting |
| Gitea Runner | 3008 | CI/CD |
| Headscale | 8080 | Tailscale-compatible control server (WireGuard-based VPN) |
| Seafile | 8082 | File sync & share |
| Syncthing | 8384 | File sync |
| WireGuard | 51820 | VPN server |
| Portainer Agent | 9001 | Container management |

### Media (ARR Stack)

- Sonarr, Radarr, Lidarr
- Prowlarr (indexers)
- Bazarr (subtitles)

---

## Storage Layout

```
/volume1/
├── docker/
├── docker/compose/
├── appdata/           # Application data
│   ├── authentik/
│   ├── npm/
│   ├── prometheus/
│   └── grafana/
├── documents/         # Paperless
├── seafile/           # Seafile data
└── backups/           # Backup destination
```

---

## Daily Operations

### Check Service Health
```bash
# Via Portainer
# (9001 is the Portainer Agent port; if no UI loads here, use the
# Portainer server that manages this endpoint)
open http://calypso.vish.local:9001

# Via SSH
ssh admin@calypso.vish.local
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
```

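
For a quick pass over many containers, the status column can be filtered so that only problem containers are shown. This is a minimal sketch, assuming bash and awk are available on the NAS; the function name `unhealthy_containers` is hypothetical and not part of any existing tooling.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the names of containers that are not "Up"
# or that report "(unhealthy)". Expects "name|status" lines on stdin.
unhealthy_containers() {
  awk -F '|' '$2 !~ /^Up/ || $2 ~ /\(unhealthy\)/ { print $1 }'
}

# Usage on Calypso (-a includes stopped containers):
#   docker ps -a --format '{{.Names}}|{{.Status}}' | unhealthy_containers
```

Empty output means every container is up and none reports an unhealthy state.
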
### Monitor Critical Services
```bash
# Run these on Calypso itself (after SSH), since they target localhost

# Check NPM
curl -I http://localhost:80

# Check Authentik
curl -I http://localhost:9000

# Check Prometheus
curl -I http://localhost:9090
```

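
The same probes can be wrapped in a small loop-friendly helper that prints one line per service. This is a sketch under assumptions: the function names `status_ok` and `check_service` are hypothetical, and counting any 2xx or 3xx response as healthy is a judgment call (a default site that answers 404 would be reported as failing).

```bash
#!/usr/bin/env bash
# Hypothetical helper: treat 2xx and 3xx responses as healthy; anything else
# (including curl's "000" when the connection fails) counts as down.
status_ok() {
  case "$1" in
    2[0-9][0-9]|3[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Probe one URL with a 5-second timeout and print a one-line result.
check_service() {
  local name="$1" url="$2" code
  code="$(curl -s -o /dev/null -m 5 -w '%{http_code}' "$url")"
  if status_ok "$code"; then
    echo "OK   $name ($code)"
  else
    echo "FAIL $name ($code)"
  fi
}

# Usage on Calypso, with the ports from the Services table:
#   check_service npm        http://localhost:80
#   check_service authentik  http://localhost:9000
#   check_service prometheus http://localhost:9090
```
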
---

## Common Issues

### NPM Not Routing
1. Check if NPM is running: `docker ps | grep nginx-proxy-manager`
2. Verify proxy hosts are configured: NPM UI → Proxy Hosts
3. Check SSL certificates: NPM UI → SSL Certificates
4. Review NPM logs: `docker logs nginx-proxy-manager`

### Authentik SSO Broken
1. Check that Authentik is running: `docker ps | grep authentik`
2. Verify PostgreSQL: `docker logs authentik-postgresql`
3. Check Redis: `docker logs authentik-redis`
4. Review the OIDC configuration of the affected services

### Prometheus Down
1. Check storage: `docker system df`
2. Verify volume: `docker volume ls | grep prometheus`
3. Check retention settings (the `--storage.tsdb.retention.*` startup flags)
4. Review logs: `docker logs prometheus`

---

## Maintenance

### Weekly
- [ ] Verify Authentik users can log in
- [ ] Check Prometheus metrics collection
- [ ] Review Alertmanager notifications
- [ ] Verify NPM certificates

### Monthly
- [ ] Clean unused Docker images
- [ ] Review Prometheus retention
- [ ] Update applications
- [ ] Check disk usage

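
The disk usage check can be scripted so that it returns a clear pass or fail. This is a minimal sketch, assuming a POSIX `df`; the function names and the 85% threshold in the usage comment are hypothetical and should be adjusted to the volume's real headroom.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the used-space percentage (digits only) of the
# filesystem that holds the given path, taken from `df -P` column 5.
usage_percent() {
  df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Succeeds (exit 0) when usage is at or above the given threshold.
disk_alert() {
  local pct
  pct="$(usage_percent "$1")"
  [ "$pct" -ge "$2" ]
}

# Usage on Calypso (85 is an assumed threshold):
#   disk_alert /volume1 85 && echo "volume1 is above 85% used"
#   docker image prune -a    # removes all images not used by a container
```
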
### Quarterly
- [ ] Test OAuth flows
- [ ] Verify backup restoration
- [ ] Review monitoring thresholds
- [ ] Update SSL certificates

---

## SSL Certificate Management

NPM handles all SSL certificates:

1. **Automatic Renewal**: Let's Encrypt (default)
2. **Manual**: Access NPM → SSL Certificates → Add
3. **Check Status**: NPM Dashboard → SSL

### Common Certificate Issues
- Rate limits: Let's Encrypt throttles repeated requests; after failed attempts, wait at least 1 hour before retrying
- DNS challenge: Verify the external DNS records
- Self-signed: Use for internal-only services

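
To see how long a certificate has left before it expires, the expiry date can be read with `openssl` and converted to days. This is a sketch under assumptions: `days_left` is a hypothetical helper, `example.vish.local` is a placeholder hostname, and the usage lines rely on GNU `date -d`, which may not be available in every shell on the NAS.

```bash
#!/usr/bin/env bash
# Hypothetical helper: whole days between two Unix timestamps.
# $1 = now, $2 = certificate expiry. Negative means already expired.
days_left() {
  echo $(( ($2 - $1) / 86400 ))
}

# Usage (placeholder hostname; needs openssl and GNU date):
#   end="$(echo | openssl s_client -connect example.vish.local:443 \
#            -servername example.vish.local 2>/dev/null \
#          | openssl x509 -noout -enddate | cut -d= -f2)"
#   days_left "$(date +%s)" "$(date -d "$end" +%s)"
```
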
---

## Backup Procedures

### Configuration Backup
```bash
# Via Ansible
ansible-playbook ansible/automation/playbooks/backup_configs.yml --tags calypso
```

### Key Data to Backup
- NPM configurations: `/volume1/docker/compose/nginx_proxy_manager/`
- Authentik: `/volume1/docker/appdata/authentik/`
- Prometheus: `/volume1/docker/appdata/prometheus/`
- Grafana: `/volume1/docker/appdata/grafana/`

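
If the Ansible playbook is unavailable, the directories listed above can be archived by hand. This is a minimal sketch and not the playbook's actual logic: `backup_dirs` and the `calypso-config-` archive name are hypothetical, and a file-level copy of a running database can be inconsistent, so stopping the affected containers first is the safer assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: archive the given directories into one dated tarball
# and print the archive path.
# Usage: backup_dirs <dest_dir> <path>...
backup_dirs() {
  local dest="$1"; shift
  local archive
  archive="$dest/calypso-config-$(date +%Y%m%d).tar.gz"
  mkdir -p "$dest" || return 1
  tar -czf "$archive" "$@" || return 1
  echo "$archive"
}

# Usage on Calypso, with the paths from the list above:
#   backup_dirs /volume1/backups \
#     /volume1/docker/compose/nginx_proxy_manager \
#     /volume1/docker/appdata/authentik \
#     /volume1/docker/appdata/prometheus \
#     /volume1/docker/appdata/grafana
```
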
---

## Emergency Procedures

### Authentik Down
**Impact**: SSO broken for all services

1. Verify containers are running: `docker ps | grep authentik`
2. Check PostgreSQL: `docker logs authentik-postgresql`
3. Check Redis: `docker logs authentik-redis`
4. Restart Authentik: `cd /volume1/docker/compose/authentik && docker-compose restart`
5. If needed, restore from backup

### NPM Down
**Impact**: No external access

1. Verify container: `docker ps | grep nginx-proxy-manager`
2. Check ports 80/443: `netstat -tulpn | grep -E ':(80|443)[[:space:]]'`
3. Restart: `cd /volume1/docker/compose/nginx_proxy_manager && docker-compose restart`
4. Check DNS resolution

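
When checking the listeners, note that a loose pattern such as `80|443` also matches ports like 8080 or 8443, which other services on this host use. A stricter filter anchors on the port at the end of the local address column. The sketch below assumes `netstat -tln`-style output with the local address in column 4; `web_listeners` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: keep only listeners bound to exactly port 80 or 443.
# Reads netstat-style lines on stdin; the local address is column 4.
web_listeners() {
  awk '$4 ~ /:(80|443)$/'
}

# Usage on Calypso:
#   netstat -tln | web_listeners
```

If this prints nothing, no process is listening on 80 or 443 and NPM is not bound to its ports.
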
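### Prometheus Full
**Impact**: No metrics

1. Check storage: `docker system df` and `df -h /volume1`
2. Reduce retention: lower `--storage.tsdb.retention.time` (or set `--storage.tsdb.retention.size`) in the Prometheus container command; retention is a startup flag, not a `prometheus.yml` setting
3. Clean old data: `promtool` has no delete subcommand; with `--web.enable-admin-api` enabled, delete series through the TSDB admin API, then run `curl -X POST http://localhost:9090/api/v1/admin/tsdb/clean_tombstones`
4. Restart container
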
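
Old series can be removed through the Prometheus TSDB admin API, which is only reachable when Prometheus runs with `--web.enable-admin-api`. This is a sketch under that assumption; `delete_series_url` is a hypothetical helper, and the selector `up` and the timestamp are example values.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the delete_series URL of the TSDB admin API.
# $1 = Prometheus base URL, $2 = series selector, $3 = end time (Unix seconds)
delete_series_url() {
  printf '%s/api/v1/admin/tsdb/delete_series?match[]=%s&end=%s\n' "$1" "$2" "$3"
}

# Usage on Calypso (-g stops curl from interpreting the [] characters):
#   curl -X POST -g "$(delete_series_url http://localhost:9090 up 1700000000)"
#   curl -X POST http://localhost:9090/api/v1/admin/tsdb/clean_tombstones
```

Deleting series only marks the data with tombstones; disk space is released after `clean_tombstones` runs or during a later compaction.
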
---

## Useful Commands

```bash
# SSH access
ssh admin@calypso.vish.local

# Check critical services
docker ps --filter "name=nginx" --filter "name=authentik" --filter "name=prometheus"

# Restart infrastructure
cd /volume1/docker/compose/nginx_proxy_manager && docker-compose restart
cd /volume1/docker/compose/authentik && docker-compose restart

# View logs
docker logs -f nginx-proxy-manager
docker logs -f authentik-server
docker logs -f prometheus
```

---

## Links

- [Synology DSM](https://calypso.vish.local:5001)
- [Nginx Proxy Manager](http://calypso.vish.local:81)
- [Authentik](http://calypso.vish.local:9000)
- [Prometheus](http://calypso.vish.local:9090)
- [Grafana](http://calypso.vish.local:3000)
- [Alertmanager](http://calypso.vish.local:9093)