Sanitized mirror from private repository - 2026-03-29 13:33:25 UTC
# Atlantis Runbook
*Synology DS1821+ - Primary NAS and Media Server*
**Endpoint ID:** 2
**Status:** 🟢 Online
**Hardware:** AMD Ryzen V1500B, 32GB RAM, 8 bays
**Access:** `atlantis.vish.local`
---
## Overview
Atlantis is the primary Synology NAS serving as the homelab's central storage and media infrastructure.
## Hardware Specs
| Component | Specification |
|----------|---------------|
| Model | Synology DS1821+ |
| CPU | AMD Ryzen V1500B (4-core) |
| RAM | 32GB |
| Storage | 8-bay RAID6 + SSD cache |
| Network | 4x 1GbE (Link aggregated) |
## Services
### Critical Services
| Service | Port | Purpose | Docker Image |
|---------|------|---------|--------------|
| **Vaultwarden** | 8080 | Password manager | vaultwarden/server |
| **Immich** | 2283 | Photo backup | immich-app/immich |
| **Plex** | 32400 | Media server | plexinc/pms-docker |
| **Ollama** | 11434 | AI/ML | ollama/ollama |
### Media Stack
| Service | Port | Purpose |
|---------|------|---------|
| arr-suite | Various | Sonarr, Radarr, Lidarr, Prowlarr |
| qBittorrent | 8080 | Download client |
| Jellyseerr | 5055 | Media requests |
### Infrastructure
| Service | Port | Purpose |
|---------|------|---------|
| Portainer | 9000 | Container management |
| Watchtower | 9001 | Auto-updates |
| Dozzle | 8081 | Log viewer |
| Nginx Proxy Manager | 81/444 | Legacy proxy |
### Additional Services
- Jitsi (Video conferencing)
- Matrix/Synapse (Chat)
- Mastodon (Social)
- Paperless-NGX (Documents)
- Syncthing (File sync)
- Grafana + Prometheus (Monitoring)
---
## Storage Layout
```
/volume1/
├── docker/ # Docker volumes
├── docker/compose/ # Service configurations
├── media/ # Media files
│ ├── movies/
│ ├── tv/
│ ├── music/
│ └── books/
├── photos/ # Immich storage
├── backups/ # Backup destination
└── shared/ # Shared folders
```
---
## Daily Operations
### Check Service Health
```bash
# Via Portainer
open http://atlantis.vish.local:9000
# Via SSH
ssh admin@atlantis.vish.local
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
```
### Check Disk Usage
```bash
# SSH to Atlantis
ssh admin@atlantis.vish.local
# Synology storage manager
sudo syno-storage-usage -a
# Or via Docker
docker system df
```
### View Logs
```bash
# Specific service
docker logs vaultwarden
# Follow logs
docker logs -f vaultwarden
```
---
## Common Issues
### Service Won't Start
1. Check if port is already in use: `netstat -tulpn | grep <port>`
2. Check logs: `docker logs <container>`
3. Verify volume paths exist
4. Restart Docker: `sudo systemctl restart docker`
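The checklist above can be run as one diagnostic pass from an SSH session (a sketch; `vaultwarden`/`8080` are placeholders for whichever service is failing):

```shell
# Placeholders — substitute the failing service's container name and port.
CONTAINER=vaultwarden
PORT=8080

# 1. Is something else already bound to the port?
netstat -tulpn 2>/dev/null | grep ":${PORT} " || echo "port ${PORT} looks free"

# 2. Why did the container exit?
docker logs --tail 50 "${CONTAINER}"
docker inspect --format '{{.State.Status}} (exit code {{.State.ExitCode}})' "${CONTAINER}"

# 3. Do the bind-mount source paths exist on the host?
docker inspect --format '{{range .Mounts}}{{.Source}}{{"\n"}}{{end}}' "${CONTAINER}" | xargs -r ls -ld
```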
### Storage Full
1. Identify large files: `docker system df -v`
2. Clean Docker: `docker system prune -a`
3. Check Synology Storage Analyzer
4. Archive old media files
### Performance Issues
1. Check resource usage: `docker stats`
2. Review Plex transcode logs
3. Check RAID health: `sudo mdadm --detail /dev/md0`
---
## Maintenance
### Weekly
- [ ] Verify backup completion
- [ ] Check disk health (S.M.A.R.T.)
- [ ] Review Watchtower updates
- [ ] Check Plex library integrity
### Monthly
- [ ] Run Docker cleanup
- [ ] Update Docker Compose files
- [ ] Review storage usage trends
- [ ] Check security updates
### Quarterly
- [ ] Deep clean unused images/containers
- [ ] Review service dependencies
- [ ] Test disaster recovery
- [ ] Update documentation
---
## Backup Procedures
### Configuration Backup
```bash
# Via Ansible
ansible-playbook ansible/automation/playbooks/backup_configs.yml --tags atlantis
```
### Data Backup
- Synology Hyper Backup to external drive
- Cloud sync to Backblaze B2
- Critical configs to Git repository
### Verification
```bash
ansible-playbook ansible/automation/playbooks/backup_verification.yml
```
---
## Emergency Procedures
### Complete Outage
1. Verify Synology is powered on
2. Check network connectivity
3. Access via DSM: `https://atlantis.vish.local:5001`
4. Check Storage Manager for RAID status
5. Connect via serial console if the network is unreachable
### RAID Degraded
1. Identify failed drive via Storage Manager
2. Power down and replace drive
3. Rebuild will start automatically
4. Monitor rebuild progress
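The rebuild can also be watched from an SSH session rather than Storage Manager (a sketch; the md device names vary, so check `/proc/mdstat` rather than assuming `/dev/md0`):

```shell
# One-off look at array state and any active resync/recovery line
cat /proc/mdstat
# Refresh every 30 seconds until the recovery line disappears
watch -n 30 cat /proc/mdstat
```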
### Data Recovery
See [Disaster Recovery Guide](../troubleshooting/disaster-recovery.md)
---
## Useful Commands
```bash
# SSH access
ssh admin@atlantis.vish.local
# Container management
cd /volume1/docker/compose/<service>
docker-compose restart <service>
# View all containers
docker ps -a --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
# Logs for critical services
docker logs vaultwarden
docker logs plex
docker logs immich
```
---
## Links
- [Synology DSM](https://atlantis.vish.local:5001)
- [Portainer](http://atlantis.vish.local:9000)
- [Vaultwarden](http://atlantis.vish.local:8080)
- [Plex](http://atlantis.vish.local:32400)
- [Immich](http://atlantis.vish.local:2283)

# Calypso Runbook
*Synology DS723+ - Secondary NAS and Infrastructure*
**Endpoint ID:** 443397
**Status:** 🟢 Online
**Hardware:** AMD Ryzen R1600, 32GB RAM, 2 bays + expansion
**Access:** `calypso.vish.local`
---
## Overview
Calypso is the secondary Synology NAS handling critical infrastructure services including authentication, reverse proxy, and monitoring.
## Hardware Specs
| Component | Specification |
|----------|---------------|
| Model | Synology DS723+ |
| CPU | AMD Ryzen R1600 (2-core/4-thread) |
| RAM | 32GB |
| Storage | 2-bay SHR + eSATA expansion |
| Network | 2x 1GbE |
## Services
### Critical Infrastructure
| Service | Port | Purpose | Status |
|---------|------|---------|--------|
| **Nginx Proxy Manager** | 80/443 | SSL termination & routing | Required |
| **Authentik** | 9000 | SSO authentication | Required |
| **Prometheus** | 9090 | Metrics collection | Required |
| **Grafana** | 3000 | Dashboards | Required |
| **Alertmanager** | 9093 | Alert routing | Required |
### Additional Services
| Service | Port | Purpose |
|---------|------|---------|
| AdGuard | 3053 | DNS filtering (backup) |
| Paperless-NGX | 8000 | Document management |
| Reactive Resume | 3001 | Resume builder |
| Gitea | 3000/22 | Git hosting |
| Gitea Runner | 3008 | CI/CD |
| Headscale | 8080 | WireGuard VPN controller |
| Seafile | 8082 | File sync & share |
| Syncthing | 8384 | File sync |
| WireGuard | 51820 | VPN server |
| Portainer Agent | 9001 | Container management |
### Media (ARR Stack)
- Sonarr, Radarr, Lidarr
- Prowlarr (indexers)
- Bazarr (subtitles)
---
## Storage Layout
```
/volume1/
├── docker/
├── docker/compose/
├── appdata/ # Application data
│ ├── authentik/
│ ├── npm/
│ ├── prometheus/
│ └── grafana/
├── documents/ # Paperless
├── seafile/ # Seafile data
└── backups/ # Backup destination
```
---
## Daily Operations
### Check Service Health
```bash
# Via Portainer
open http://calypso.vish.local:9001
# Via SSH
ssh admin@calypso.vish.local
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
```
### Monitor Critical Services
```bash
# Check NPM
curl -I http://localhost:80
# Check Authentik
curl -I http://localhost:9000
# Check Prometheus
curl -I http://localhost:9090
```
---
## Common Issues
### NPM Not Routing
1. Check if NPM is running: `docker ps | grep npm`
2. Verify proxy hosts configured: Access NPM UI → Proxy Hosts
3. Check SSL certificates
4. Review NPM logs: `docker logs nginx-proxy-manager`
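A quick way to tell whether NPM itself is answering, versus a missing proxy host, is to hit it directly with a `Host` header (hostnames here are examples — substitute your own proxy hosts):

```shell
# A 404 from openresty means NPM answered but has no matching proxy host;
# no response at all means NPM (or the port mapping) is down.
curl -sI -H "Host: plex.vish.local" http://localhost:80 | head -n 1
# Compare against a host that should exist — a 2xx/3xx here means routing works.
curl -sI -H "Host: grafana.vish.local" http://localhost:80 | head -n 1
```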
### Authentik SSO Broken
1. Check Authentik running: `docker ps | grep authentik`
2. Verify PostgreSQL: `docker logs authentik-postgresql`
3. Check Redis: `docker logs authentik-redis`
4. Review OIDC configurations in services
### Prometheus Down
1. Check storage: `docker system df`
2. Verify volume: `docker volume ls | grep prometheus`
3. Check retention settings
4. Review logs: `docker logs prometheus`
---
## Maintenance
### Weekly
- [ ] Verify Authentik users can login
- [ ] Check Prometheus metrics collection
- [ ] Review Alertmanager notifications
- [ ] Verify NPM certificates
### Monthly
- [ ] Clean unused Docker images
- [ ] Review Prometheus retention
- [ ] Update applications
- [ ] Check disk usage
### Quarterly
- [ ] Test OAuth flows
- [ ] Verify backup restoration
- [ ] Review monitoring thresholds
- [ ] Update SSL certificates
---
## SSL Certificate Management
NPM handles all SSL certificates:
1. **Automatic Renewal**: Let's Encrypt (default)
2. **Manual**: Access NPM → SSL Certificates → Add
3. **Check Status**: NPM Dashboard → SSL
### Common Certificate Issues
- Rate limits: Wait 1 hour between requests
- DNS challenge: Verify external DNS
- Self-signed: Use for internal services
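Certificate expiry can also be checked from the shell without opening the NPM UI (a sketch; `plex.vish.local` is an example hostname, and `openssl` must be available):

```shell
# Print the subject and expiry date of the certificate NPM serves for a site.
echo | openssl s_client -connect plex.vish.local:443 -servername plex.vish.local 2>/dev/null \
  | openssl x509 -noout -subject -enddate
```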
---
## Backup Procedures
### Configuration Backup
```bash
# Via Ansible
ansible-playbook ansible/automation/playbooks/backup_configs.yml --tags calypso
```
### Key Data to Backup
- NPM configurations: `/volume1/docker/compose/nginx_proxy_manager/`
- Authentik: `/volume1/docker/appdata/authentik/`
- Prometheus: `/volume1/docker/appdata/prometheus/`
- Grafana: `/volume1/docker/appdata/grafana/`
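The four directories above can be archived in one pass before risky changes (a sketch; the `/volume1/backups/` destination is an assumption — point it anywhere safe):

```shell
# One-off archive of Calypso's critical app data.
# For a consistent copy of Authentik's PostgreSQL data, stop that stack first.
STAMP=$(date +%Y%m%d)
tar -czf "/volume1/backups/calypso-appdata-${STAMP}.tar.gz" \
  /volume1/docker/compose/nginx_proxy_manager/ \
  /volume1/docker/appdata/authentik/ \
  /volume1/docker/appdata/prometheus/ \
  /volume1/docker/appdata/grafana/
```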
---
## Emergency Procedures
### Authentik Down
**Impact**: SSO broken for all services
1. Verify containers running
2. Check PostgreSQL: `docker logs authentik-postgresql`
3. Check Redis: `docker logs authentik-redis`
4. Restart Authentik: `docker-compose restart`
5. If needed, restore from backup
### NPM Down
**Impact**: No external access
1. Verify container: `docker ps | grep npm`
2. Check ports 80/443: `netstat -tulpn | grep -E '80|443'`
3. Restart: `docker-compose restart`
4. Check DNS resolution
### Prometheus Full
**Impact**: No metrics
1. Check storage: `docker system df`
2. Reduce retention: Edit prometheus.yml
3. Clean old data via the TSDB admin API `delete_series` endpoint (requires `--web.enable-admin-api`)
4. Restart container
---
## Useful Commands
```bash
# SSH access
ssh admin@calypso.vish.local
# Check critical services
docker ps --filter "name=nginx" --filter "name=authentik" --filter "name=prometheus"
# Restart infrastructure
cd /volume1/docker/compose/nginx_proxy_manager && docker-compose restart
cd /volume1/docker/compose/authentik && docker-compose restart
# View logs
docker logs -f nginx-proxy-manager
docker logs -f authentik-server
docker logs -f prometheus
```
---
## Links
- [Synology DSM](https://calypso.vish.local:5001)
- [Nginx Proxy Manager](http://calypso.vish.local:81)
- [Authentik](http://calypso.vish.local:9000)
- [Prometheus](http://calypso.vish.local:9090)
- [Grafana](http://calypso.vish.local:3000)
- [Alertmanager](http://calypso.vish.local:9093)

# Concord NUC Runbook
*Intel NUC6i3SYB - Home Automation & DNS*
**Endpoint ID:** 443398
**Status:** 🟢 Online
**Hardware:** Intel Core i3-6100U, 16GB RAM, 256GB SSD
**Access:** `concordnuc.vish.local`
---
## Overview
Concord NUC runs lightweight services focused on home automation, DNS filtering, and local network services.
## Hardware Specs
| Component | Specification |
|----------|---------------|
| Model | Intel NUC6i3SYB |
| CPU | Intel Core i3-6100U (2-core) |
| RAM | 16GB |
| Storage | 256GB SSD |
| Network | 1x 1GbE |
## Services
### Critical Services
| Service | Port | Purpose | Docker Image |
|---------|------|---------|---------------|
| **AdGuard Home** | 3053/53 | DNS filtering | adguard/adguardhome |
| **Home Assistant** | 8123 | Home automation | homeassistant/home-assistant |
| **Matter Server** | 5580 | Matter protocol | matter-server/matter-server |
### Additional Services
| Service | Port | Purpose |
|---------|------|---------|
| Plex | 32400 | Media server |
| Invidious | 2999 | YouTube frontend |
| Piped | 1234 | YouTube music |
| Syncthing | 8384 | File sync |
| WireGuard | 51820 | VPN server |
| Portainer Agent | 9001 | Container management |
| Node Exporter | 9100 | Metrics |
---
## Network Position
```
Internet
    │
[Home Router] ──WAN──► (Public IP)
    │
    ├─► [Pi-hole Primary]
    └─► [AdGuard Home] ──► Local DNS

[Home Assistant] ──► Zigbee/Z-Wave
```
---
## Daily Operations
### Check Service Health
```bash
# Via Portainer
open http://concordnuc.vish.local:9001
# Via SSH
ssh homelab@concordnuc.vish.local
docker ps
```
### Home Assistant
```bash
# Access UI
open http://concordnuc.vish.local:8123
# Check logs
docker logs homeassistant
```
### AdGuard Home
```bash
# Access UI
open http://concordnuc.vish.local:3053
# Check DNS filtering
# Admin → Dashboard → DNS Queries
```
---
## Common Issues
### Home Assistant Won't Start
1. Check logs: `docker logs homeassistant`
2. Verify config: `config/configuration.yaml`
3. Check Zigbee/Z-Wave stick
4. Restore from backup if needed
### AdGuard Not Filtering
1. Check service: `docker ps | grep adguard`
2. Verify DNS settings on router
3. Check filter lists: Admin → Filters
4. Review query log
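Filtering can be verified from any client by resolving a domain that should be blocked (a sketch; `doubleclick.net` is just an example — use any entry from your enabled blocklists):

```shell
# Query AdGuard directly; a blocked domain typically resolves to 0.0.0.0.
dig +short doubleclick.net @concordnuc.vish.local
# Compare with an allowed domain — this should return a real address.
dig +short google.com @concordnuc.vish.local
```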
### No Network Connectivity
1. Check Docker: `systemctl status docker`
2. Verify network: `ip addr`
3. Check firewall: `sudo ufw status`
---
## Home Assistant Configuration
### Add-ons Running
- Zigbee2MQTT
- Z-Wave JS UI
- File editor
- Terminal
### Backup
```bash
# Manual backup via UI:
#   Configuration → Backups → Create backup
# Automated to Synology:
#   Syncthing → Backups/homeassistant/
```
### Restoration
1. Access HA in safe mode
2. Configuration → Backups
3. Select backup → Restore
---
## AdGuard Home Configuration
### DNS Providers
- Cloudflare: 1.1.1.1
- Google: 8.8.8.8
### Blocklists Enabled
- AdGuard Default
- AdAway
- Malware domains
### Query Log
Access: Admin → Logs
- Useful for debugging DNS issues
- Check for blocked domains
---
## Maintenance
### Weekly
- [ ] Check HA logs for errors
- [ ] Review AdGuard query log
- [ ] Verify backups completed
### Monthly
- [ ] Update Home Assistant
- [ ] Review AdGuard filters
- [ ] Clean unused Docker images
### Quarterly
- [ ] Test automation reliability
- [ ] Review device states
- [ ] Check Zigbee network health
---
## Emergency Procedures
### Home Assistant Down
**Impact**: Smart home controls unavailable
1. Check container: `docker ps | grep homeassistant`
2. Restart: `docker-compose restart`
3. Check logs: `docker logs homeassistant`
4. If corrupted, restore from backup
### AdGuard Down
**Impact**: DNS issues on network
1. Verify: `dig google.com @localhost`
2. Restart: `docker-compose restart`
3. Check config in UI
4. Fallback to Pi-hole
### Complete Hardware Failure
1. Replace NUC hardware
2. Reinstall Ubuntu/Debian
3. Run deploy playbook:
```bash
ansible-playbook ansible/homelab/playbooks/deploy_concord_nuc.yml
```
---
## Useful Commands
```bash
# SSH access
ssh homelab@concordnuc.vish.local
# Restart services
docker-compose -f /opt/docker/compose/homeassistant.yaml restart
docker-compose -f /opt/docker/compose/adguard.yaml restart
# View logs
docker logs -f homeassistant
docker logs -f adguard
# Check resource usage
docker stats
```
---
## Device Access
| Device | Protocol | Address |
|--------|----------|---------|
| Zigbee Coordinator | USB | /dev/serial/by-id/* |
| Z-Wave Controller | USB | /dev/serial/by-id/* |
---
## Links
- [Home Assistant](http://concordnuc.vish.local:8123)
- [AdGuard Home](http://concordnuc.vish.local:3053)
- [Plex](http://concordnuc.vish.local:32400)
- [Invidious](http://concordnuc.vish.local:2999)

# Homelab VM Runbook
*Proxmox VM - Monitoring & DevOps*
**Endpoint ID:** 443399
**Status:** 🟢 Online
**Hardware:** 4 vCPU, 28GB RAM
**Access:** `192.168.0.210`
---
## Overview
Homelab VM runs monitoring, alerting, and development services on Proxmox.
## Hardware Specs
| Component | Specification |
|----------|---------------|
| Platform | Proxmox VE |
| vCPU | 4 cores |
| RAM | 28GB |
| Storage | 100GB SSD |
| Network | 1x 1GbE |
## Services
### Monitoring Stack
| Service | Port | Purpose |
|---------|------|---------|
| **Prometheus** | 9090 | Metrics collection |
| **Grafana** | 3000 | Dashboards |
| **Alertmanager** | 9093 | Alert routing |
| **Node Exporter** | 9100 | System metrics |
| **cAdvisor** | 8080 | Container metrics |
| **Uptime Kuma** | 3001 | Uptime monitoring |
### Development
| Service | Port | Purpose |
|---------|------|---------|
| Gitea | 3000 | Git hosting |
| Gitea Runner | 3008 | CI/CD runner |
| OpenHands | 8000 | AI developer |
### Database
| Service | Port | Purpose |
|---------|------|---------|
| PostgreSQL | 5432 | Database |
| Redis | 6379 | Caching |
---
## Daily Operations
### Check Monitoring
```bash
# Prometheus targets
curl http://192.168.0.210:9090/api/v1/targets | jq
# Grafana dashboards
open http://192.168.0.210:3000
```
### Alert Status
```bash
# Alertmanager
open http://192.168.0.210:9093
# Check ntfy for alerts
curl -s "ntfy.vish.local/homelab-alerts/json?poll=1" | head -20
```
---
## Prometheus Configuration
### Scraping Targets
- Node exporters (all hosts)
- cAdvisor (all hosts)
- Prometheus self-monitoring
- Application-specific metrics
### Retention
- Time: 30 days
- Storage: 20GB
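The retention values above map onto two Prometheus startup flags; a docker-compose fragment as a sketch (service layout is an assumption — set the flags wherever the container's command line is defined):

```yaml
services:
  prometheus:
    image: prom/prometheus
    command:
      - --config.file=/etc/prometheus/prometheus.yml
      # Whichever limit is hit first wins.
      - --storage.tsdb.retention.time=30d
      - --storage.tsdb.retention.size=20GB
```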
### Maintenance
```bash
# Check TSDB size
du -sh /var/lib/prometheus/
# Compaction runs automatically; trigger tombstone cleanup via the
# admin API (requires --web.enable-admin-api)
curl -X POST http://localhost:9090/api/v1/admin/tsdb/clean_tombstones
```
---
## Grafana Dashboards
### Key Dashboards
- Infrastructure Overview
- Container Health
- Network Traffic
- Service-specific metrics
### Alert Rules
- CPU > 80% for 5 minutes
- Memory > 90% for 5 minutes
- Disk > 85%
- Service down > 2 minutes
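The thresholds above correspond to Prometheus alerting rules along these lines (a sketch of the CPU rule only; metric names assume node_exporter defaults, and the other thresholds follow the same pattern):

```yaml
groups:
  - name: homelab
    rules:
      - alert: HighCPU
        # CPU usage = 100% minus time spent idle, averaged per instance
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "CPU above 80% on {{ $labels.instance }}"
```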
---
## Common Issues
### Prometheus Not Scraping
1. Check targets: Prometheus UI → Status → Targets
2. Verify network connectivity
3. Check firewall rules
4. Review scrape errors in logs
### Grafana Dashboards Slow
1. Check Prometheus query performance
2. Reduce time range
3. Optimize queries
4. Check resource usage
### Alerts Not Firing
1. Verify Alertmanager config
2. Check ntfy integration
3. Review alert rules syntax
4. Test with artificial alert
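Step 4 can be done by posting a synthetic alert straight to Alertmanager's v2 API (the alert name is arbitrary; it should show up in the UI and be routed to ntfy after the configured `group_wait`):

```shell
# Fire a throwaway alert at Alertmanager for end-to-end routing tests.
curl -s -X POST http://localhost:9093/api/v2/alerts \
  -H 'Content-Type: application/json' \
  -d '[{"labels":{"alertname":"TestAlert","severity":"warning"},"annotations":{"summary":"manual test"}}]'
```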
---
## Maintenance
### Weekly
- [ ] Review alert history
- [ ] Check disk space
- [ ] Verify backups
### Monthly
- [ ] Clean old metrics
- [ ] Update dashboards
- [ ] Review alert thresholds
### Quarterly
- [ ] Test alert notifications
- [ ] Review retention policy
- [ ] Optimize queries
---
## Backup Procedures
### Configuration
```bash
# Grafana dashboards
cp -r /opt/grafana/dashboards /backup/
# Prometheus rules
cp -r /opt/prometheus/rules /backup/
```
### Ansible
```bash
ansible-playbook ansible/automation/playbooks/backup_configs.yml --tags homelab_vm
```
---
## Emergency Procedures
### Prometheus Full
1. Check storage: `docker system df`
2. Reduce retention in prometheus.yml
3. Clean old data via the TSDB admin API (`clean_tombstones`/`delete_series`); avoid deleting `/prometheus/wal` by hand, which discards recent unflushed samples
4. Restart container
### VM Down
1. Check Proxmox: `qm list`
2. Start VM: `qm start <vmid>`
3. Check console: `qm terminal <vmid>`
4. Review logs in Proxmox UI
---
## Useful Commands
```bash
# SSH access
ssh homelab@192.168.0.210
# Restart monitoring
cd /opt/docker/prometheus && docker-compose restart
cd /opt/docker/grafana && docker-compose restart
# Check targets
curl http://localhost:9090/api/v1/targets | jq '.data.activeTargets[] | select(.health=="down")'
# View logs
docker logs prometheus
docker logs grafana
docker logs alertmanager
```
---
## Links
- [Prometheus](http://192.168.0.210:9090)
- [Grafana](http://192.168.0.210:3000)
- [Alertmanager](http://192.168.0.210:9093)
- [Uptime Kuma](http://192.168.0.210:3001)

# RPi5 Runbook
*Raspberry Pi 5 - Edge Services*
**Endpoint ID:** 443395
**Status:** 🟢 Online
**Hardware:** ARM Cortex-A76, 16GB RAM, 512GB USB SSD
**Access:** `rpi5-vish.local`
---
## Overview
Raspberry Pi 5 runs edge services including Immich backup and lightweight applications.
## Hardware Specs
| Component | Specification |
|----------|---------------|
| Model | Raspberry Pi 5 |
| CPU | ARM Cortex-A76 (4-core) |
| RAM | 16GB |
| Storage | 512GB USB-C SSD |
| Network | 1x 1GbE (Pi 4 adapter) |
## Services
### Primary Services
| Service | Port | Purpose |
|---------|------|---------|
| **Immich** | 2283 | Photo backup (edge) |
| Portainer Agent | 9001 | Container management |
| Node Exporter | 9100 | Metrics |
### Services (if enabled)
| Service | Port | Purpose |
|---------|------|---------|
| Plex | 32400 | Media server |
| WireGuard | 51820 | VPN |
## Secondary Pi Nodes
### Pi-5-Kevin
This is a secondary Raspberry Pi 5 node that is not typically online.
- **CPU**: Broadcom BCM2712 (4-core, 2.4GHz)
- **RAM**: 8GB LPDDR4X
- **Storage**: 64GB microSD
- **Network**: Gigabit Ethernet + WiFi 6
---
## Daily Operations
### Check Service Health
```bash
# Via Portainer
open http://rpi5-vish.local:9001
# Via SSH
ssh pi@rpi5-vish.local
docker ps
```
### Immich Status
```bash
# Access UI
open http://rpi5-vish.local:2283
# Check sync status
docker logs immich-server | grep -i sync
```
---
## Common Issues
### Container Won't Start (ARM compatibility)
1. Verify image supports ARM64: `docker pull --platform linux/arm64 <image>`
2. Check container logs
3. Verify Raspberry Pi OS 64-bit
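Both checks above can be done before pulling anything (a sketch; `alpine` is just an example image — substitute the one that fails to start):

```shell
# Should print aarch64 on 64-bit Raspberry Pi OS.
uname -m
# Does the image publish a linux/arm64 build? Look for it in the manifest list.
docker manifest inspect alpine | grep '"architecture"'
```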
### Storage Slow
1. Check USB drive: `lsusb`
2. Verify SSD: `sudo hdparm -t /dev/sda`
3. Use fast USB port (USB-C)
### Network Issues
1. Check adapter compatibility
2. Verify driver loaded: `lsmod | grep smsc95xx`
3. Update firmware: `sudo rpi-eeprom-update`
---
## Storage
### Layout
```
/home/pi/
├── docker/ # Docker data
├── immich/ # Photo storage
└── backups/ # Local backups
```
### Performance Tips
- Use USB 3.0 SSD
- Use a quality power supply (5V 5A)
- Enable USB max_current in config.txt
---
## Maintenance
### Weekly
- [ ] Check Docker disk usage
- [ ] Verify Immich backup
- [ ] Check container health
### Monthly
- [ ] Update Raspberry Pi OS
- [ ] Clean unused images
- [ ] Review resource usage
### Quarterly
- [ ] Test backup restoration
- [ ] Verify ARM image compatibility
- [ ] Check firmware updates
---
## Emergency Procedures
### SD Card/Storage Failure
1. Replace storage drive
2. Reinstall Raspberry Pi OS
3. Run deploy playbook:
```bash
ansible-playbook ansible/homelab/playbooks/deploy_rpi5_vish.yml
```
### Overheating
1. Add heatsinks
2. Enable fan
3. Reduce CPU frequency: `echo "arm_freq=1800" | sudo tee -a /boot/config.txt` (`sudo` does not apply to shell redirection, so `>>` alone fails)
## Notes
This Raspberry Pi 5 system is the primary node that runs Immich and other services, with the secondary node **pi-5-kevin** intentionally kept offline for backup purposes when needed.
---
## Useful Commands
```bash
# SSH access
ssh pi@rpi5-vish.local
# Check temperature
vcgencmd measure_temp
# Check throttling
vcgencmd get_throttled
# Update firmware
sudo rpi-eeprom-update
sudo rpi-eeprom-update -a
# View Immich logs
docker logs -f immich-server
```
---
## Links
- [Immich](http://rpi5-vish.local:2283)
- [Portainer](http://rpi5-vish.local:9001)

# Host Runbooks
This directory contains operational runbooks for each host in the homelab infrastructure.
## Available Runbooks
- [Atlantis Runbook](./atlantis-runbook.md) - Synology DS1821+ (Primary NAS)
- [Calypso Runbook](./calypso-runbook.md) - Synology DS723+ (Secondary NAS)
- [Concord NUC Runbook](./concord-nuc-runbook.md) - Intel NUC (Home Automation & DNS)
- [Homelab VM Runbook](./homelab-vm-runbook.md) - Proxmox VM (Monitoring & DevOps)
- [RPi5 Runbook](./rpi5-runbook.md) - Raspberry Pi 5 (Edge Services)
---
## Common Tasks
All hosts share common operational procedures:
### Viewing Logs
```bash
# Via SSH to host
docker logs <container_name>
# Via Portainer
#   Portainer → Containers → <container> → Logs
```
### Restarting Services
```bash
# Via docker-compose
cd hosts/<host>/<service>
docker-compose restart <service>
# Via Portainer
#   Portainer → Stacks → <stack> → Restart
```
### Checking Resource Usage
```bash
# Via Portainer
#   Portainer → Containers → Sort by CPU/Memory
# Via CLI
docker stats
```
---
## Emergency Contacts
| Role | Contact | When to Contact |
|------|---------|------------------|
| Primary Admin | User | All critical issues |
| Emergency | NTFY | Critical alerts only |
---
## Quick Reference
| Host | Primary Role | Critical Services | SSH Access |
|------|--------------|-------------------|------------|
| Atlantis | Media, Vault | Vaultwarden, Plex, Immich | atlantis.vish.local |
| Calypso | Infrastructure | NPM, Authentik, Prometheus | calypso.vish.local |
| Concord NUC | DNS, HA | AdGuard, Home Assistant | concordnuc.vish.local |
| Homelab VM | Monitoring | Prometheus, Grafana | 192.168.0.210 |
| RPi5 | Edge | Immich (backup) | rpi5-vish.local |