# Add New Service Runbook

> **Looking for the step-by-step deployment guide?**
> See [Deploy a New Service: End-to-End](../guides/deploy-new-service-gitops.md) for a
> complete walkthrough from compose file to live container, including CI pipeline verification.
> This runbook covers the extended checklist (monitoring, backups, SSO) for production readiness.

## Overview

This runbook guides you through deploying a new containerized service to the homelab using GitOps with Portainer. Following it leaves the service configured, monitored, and documented.

## Prerequisites

- [ ] Git access to the homelab repository
- [ ] Portainer access (https://192.168.0.200:9443) - Portainer EE v2.33.7
- [ ] Target host selected and available
- [ ] Service Docker Compose file prepared
- [ ] Required environment variables identified
- [ ] Network requirements understood (ports, domains, etc.)

## Current GitOps Status

- **Active Deployments**: 18 compose stacks on Atlantis (verified Feb 14, 2026)
- **Total Containers**: 50+ containers across infrastructure
- **GitOps Method**: Automatic sync from Git repository via Portainer EE

## Metadata

- **Estimated Time**: 30-60 minutes
- **Risk Level**: Low (provided the service is tested first)
- **Requires Downtime**: No (for new services)
- **Reversible**: Yes (can remove stack)
- **Tested On**: 2026-02-14

## Decision: Which Host?

Choose the appropriate host based on service requirements:

| Host | Best For | Available Resources | GitOps Status |
|------|----------|---------------------|---------------|
| **Atlantis** (DS1823xs+) | Media services, high I/O, primary storage | 8 CPU, 31GB RAM, 50+ containers | ✅ 18 Active Stacks |
| **Calypso** (DS723+) | Secondary media, backup services | 4 CPU, 31GB RAM, 46 containers | ✅ GitOps Ready |
| **Concord NUC** | Network services, DNS, VPN | 4 CPU, 15.5GB RAM, 17 containers | ✅ GitOps Ready |
| **Homelab VM** | Development, monitoring, testing | 4 CPU, 28.7GB RAM, 23 containers | ✅ GitOps Ready |
| **Raspberry Pi 5** | IoT, edge computing, lightweight services | 4 CPU, 15.8GB RAM, 4 containers | ✅ GitOps Ready |

## Procedure

### Step 1: Create Docker Compose Configuration

Create a new compose file in the appropriate host directory:

```bash
cd ~/Documents/repos/homelab
# Choose the appropriate path:
# - hosts/synology/atlantis/
# - hosts/synology/calypso/
# - hosts/physical/concord-nuc/
# - hosts/vms/homelab-vm/
# - hosts/edge/raspberry-pi-5/

# Create new service file
nano hosts/[host]/[service-name].yaml
```
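
If you create service files often, the host-to-directory choice above can be wrapped in a small helper. The sketch below is only an illustration: the function name `compose_path` and the short host aliases (`atlantis`, `rpi5`, and so on) are assumptions, while the directories are the ones listed above.

```bash
#!/usr/bin/env bash
# Print the repo-relative compose path for a host alias and a service name.
# The aliases on the left are hypothetical; the directories come from this runbook.
compose_path() {
  local host="$1" service="$2" dir
  case "$host" in
    atlantis)    dir="hosts/synology/atlantis" ;;
    calypso)     dir="hosts/synology/calypso" ;;
    concord-nuc) dir="hosts/physical/concord-nuc" ;;
    homelab-vm)  dir="hosts/vms/homelab-vm" ;;
    rpi5)        dir="hosts/edge/raspberry-pi-5" ;;
    *) echo "unknown host: $host" >&2; return 1 ;;
  esac
  printf '%s/%s.yaml\n' "$dir" "$service"
}
```

For example, `nano "$(compose_path atlantis uptime-kuma)"` would open `hosts/synology/atlantis/uptime-kuma.yaml`. An unknown alias returns a non-zero status instead of guessing a directory.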

Example Docker Compose structure:

```yaml
version: '3.8'

services:
  service-name:
    image: organization/image:tag
    container_name: service-name
    restart: unless-stopped

    environment:
      - PUID=1000
      - PGID=1000
      - TZ=America/Los_Angeles
      # Add service-specific variables

    volumes:
      - /path/to/config:/config
      - /path/to/data:/data

    ports:
      - "8080:8080"  # external:internal

    networks:
      - service-network

    # Optional: health check
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

    # Optional: resource limits
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 4G
        reservations:
          cpus: '1.0'
          memory: 2G

networks:
  service-network:
    driver: bridge

# Optional: named volumes
volumes:
  service-data:
    driver: local
```

### Step 2: Configure Environment Variables

If your service requires sensitive data, create a `.env` file and make sure it is listed in `.gitignore`. Commit only a `.env.example` template that holds placeholder values:

```bash
# Create the real .env file (DO NOT commit to git)
nano .env

# Create .env.example as a template for others (placeholders only, safe to commit)
nano .env.example

# Example .env content:
# SERVICE_API_KEY=REDACTED_API_KEY
# SERVICE_SECRET=your_secret_here
# DATABASE_PASSWORD="REDACTED_PASSWORD"
```
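
A template and a real env file tend to drift apart over time. As a rough sketch (the function names `env_keys` and `missing_env_keys` are made up for this example), the following compares the variable names in `.env.example` against `.env` and prints the ones that are still missing. It assumes simple `KEY=value` lines and ignores comments and blank lines.

```bash
#!/usr/bin/env bash
# List the variable names defined in an env-style file, sorted and de-duplicated.
env_keys() {
  grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$1" | cut -d= -f1 | sort -u
}

# Print names that appear in the template but not in the real env file.
missing_env_keys() {
  local template="$1" actual="$2"
  comm -23 <(env_keys "$template") <(env_keys "$actual")
}
```

Running `missing_env_keys .env.example .env` before deploying prints nothing when every templated variable has a value in `.env`.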

### Step 3: Validate Configuration Locally (Optional but Recommended)

Test the compose file syntax:

```bash
# Validate syntax
docker-compose -f hosts/[host]/[service-name].yaml config

# Expected output: the resolved configuration printed as YAML, with no errors
```

### Step 4: Commit and Push to Git Repository

```bash
# Add the new service file
git add hosts/[host]/[service-name].yaml

# If adding .env.example template
git add .env.example

# Commit with descriptive message
git commit -m "Add [service-name] deployment for [host]

- Add Docker Compose configuration
- Configure environment variables
- Set resource limits and health checks
- Documentation: [purpose of service]

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>"

# Push to remote
git push origin main
```

Expected output:
```
[main abc1234] Add service-name deployment for host
 1 file changed, 45 insertions(+)
 create mode 100644 hosts/host/service-name.yaml
```
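
Because the real `.env` file must never reach the repository, a quick check of the staged file list before committing can help. The sketch below is one possible guard, not part of the existing workflow: `find_env_files` is a hypothetical helper that reads paths on stdin, allows `.env.example`, and flags anything else named `.env` or `.env.*`.

```bash
#!/usr/bin/env bash
# Read file paths on stdin and print any that look like real env files.
# Returns non-zero if at least one such file was found.
find_env_files() {
  local path name status=0
  while IFS= read -r path; do
    name="${path##*/}"            # strip the directory part
    case "$name" in
      .env.example) ;;            # template, safe to commit
      .env|.env.*) printf '%s\n' "$path"; status=1 ;;
    esac
  done
  return "$status"
}
```

A typical use would be `git diff --cached --name-only | find_env_files || echo "unstage the files listed above"`, run after `git add` and before `git commit`.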

### Step 5: Deploy via Portainer GitOps

#### Option A: CI Auto-Deploy (for existing stacks)
Once a stack is registered in Portainer, every future `git push` to `main` triggers
`portainer-deploy.yml` in Gitea CI, which redeploys matching stacks automatically within
~30 seconds. Watch the run at `https://git.vish.gg/Vish/homelab/actions`.

For a **new** stack (first deploy), you must register it in Portainer manually via Option B below,
because the CI can only redeploy stacks that already exist in Portainer.

#### Option B: Manual Deployment via Portainer UI
1. Open Portainer: http://vishinator.synology.me:10000
2. Navigate to the target endpoint (e.g., "Atlantis", "Calypso")
3. Click **Stacks** → **Add stack**
4. Configure stack:
   - **Name**: `service-name`
   - **Build method**: **Git Repository**
   - **Repository URL**: Your git repository URL
   - **Repository reference**: `refs/heads/main`
   - **Compose path**: `hosts/[host]/[service-name].yaml`
   - **GitOps updates**: ✅ Enable (optional)
5. Add environment variables if needed
6. Click **Deploy the stack**

### Step 6: Verify Deployment

Check container status:

```bash
# SSH to the host
ssh atlantis  # or appropriate host

# Check container is running
docker ps | grep service-name

# Check logs for errors
docker logs service-name --tail 50

# Check resource usage
docker stats service-name --no-stream
```

Expected output:
```
CONTAINER ID   IMAGE           STATUS         PORTS
abc123def456   org/image:tag   Up 2 minutes   0.0.0.0:8080->8080/tcp
```
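
A container that has just started may still be inside its health check `start_period`, so a single `docker ps` can be misleading. The following is a minimal polling sketch. `wait_until` and `is_healthy` are names invented for this example, and `is_healthy` only gives a useful answer when the compose file defines a `healthcheck`.

```bash
#!/usr/bin/env bash
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_until <attempts> <delay-seconds> <command> [args...]
wait_until() {
  local attempts="$1" delay="$2" i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    # Skip the sleep after the final attempt.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}

# True when Docker reports the container's health status as "healthy".
is_healthy() {
  [ "$(docker inspect --format '{{.State.Health.Status}}' "$1" 2>/dev/null)" = "healthy" ]
}
```

On the host, `wait_until 10 5 is_healthy service-name` would poll up to ten times, five seconds apart, and return non-zero if the container never reports healthy.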

### Step 7: Configure Networking (If External Access Needed)

#### For Services Behind Reverse Proxy:
1. Add a proxy host in Nginx Proxy Manager, or configure another reverse proxy
2. Create DNS record (Cloudflare or local DNS)
3. Configure SSL certificate (Let's Encrypt)

#### For Services Using Authentik SSO:
1. Add application in Authentik
2. Configure OAuth2/SAML provider
3. Update service with Authentik integration

See [Authentik SSO Guide](../infrastructure/authentik-sso.md) for details.

### Step 8: Add to Monitoring (Optional but Recommended)

Add service to monitoring stack:

```yaml
# Add to prometheus/prometheus.yml
scrape_configs:
  - job_name: 'service-name'
    static_configs:
      - targets: ['service-name:8080']
```

Update Grafana dashboards if needed.

### Step 9: Document the Service

Update service inventory:

```bash
# Edit service documentation
nano docs/services/VERIFIED_SERVICE_INVENTORY.md

# Add entry:
# | Service Name | Host | Port | URL | Status |
# |--------------|------|------|-----|--------|
# | Service Name | Atlantis | 8080 | https://service.vish.gg | ✅ Active |
```

### Step 10: Configure Backups (If Storing Important Data)

Add service to backup scripts:

```bash
# Edit backup configuration
nano backup.sh

# Add service data directory to backup list
BACKUP_DIRS=(
  # ... existing dirs ...
  "/path/to/service-name/data"
)
```

Test backup:
```bash
./backup.sh --test
```
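
A mistyped entry in `BACKUP_DIRS` is easy to miss. As a sketch under the assumption that `backup.sh` keeps its paths in a bash array as shown above, the hypothetical helper below reports every listed path that is not an existing directory.

```bash
#!/usr/bin/env bash
# Print each argument that is not an existing directory.
# Returns non-zero if any path is missing.
missing_backup_dirs() {
  local dir status=0
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      printf 'missing: %s\n' "$dir"
      status=1
    fi
  done
  return "$status"
}
```

Inside the backup script it could be called as `missing_backup_dirs "${BACKUP_DIRS[@]}" || exit 1`, so a bad path stops the run before anything is copied.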

## Verification Checklist

After deployment, verify the following:

- [ ] Container is running: `docker ps | grep service-name`
- [ ] Logs show no critical errors: `docker logs service-name`
- [ ] Service responds on expected port: `curl http://localhost:8080`
- [ ] Health check passes (if configured): `docker inspect service-name | grep Health`
- [ ] Resource usage is reasonable: `docker stats service-name --no-stream`
- [ ] External access works (if configured): `curl https://service.vish.gg`
- [ ] SSO authentication works (if using Authentik)
- [ ] Service is added to documentation
- [ ] Monitoring is configured (if applicable)
- [ ] Backups include service data (if applicable)

## Rollback Procedure

If the deployment fails or causes issues:

### Via Portainer UI:
1. Go to **Stacks** → Select the problematic stack
2. Click **Stop** to stop the stack
3. Click **Remove** to delete the stack
4. Delete associated volumes if needed

### Via Command Line:
```bash
# SSH to host
ssh [host]

# Stop and remove container
docker stop service-name
docker rm service-name

# Remove associated volumes (if needed)
docker volume ls | grep service-name
docker volume rm [volume-name]

# Remove from git
cd ~/Documents/repos/homelab
git rm hosts/[host]/[service-name].yaml
git commit -m "Rollback: Remove service-name deployment"
git push origin main
```

## Troubleshooting

### Issue: Container Fails to Start

**Symptoms**: Container status shows "Exited" or "Restarting"

**Solution**:
```bash
# Check logs for error messages
docker logs service-name --tail 100

# Common issues:
# - Port already in use: Change port mapping
# - Permission denied: Check PUID/PGID
# - Missing env variables: Add to compose file
# - Volume mount issues: Verify paths exist
```

### Issue: Container Starts But Service Unreachable

**Symptoms**: Container running but can't access service

**Solution**:
```bash
# Check if service is listening on correct port
docker exec service-name netstat -tlnp

# Check container network
docker network inspect [network-name]

# Test from within container
docker exec service-name curl localhost:8080

# Check firewall rules on host
sudo ufw status
```

### Issue: GitOps Auto-Deploy Not Working

**Symptoms**: Pushed changes but Portainer doesn't update

**Solution**:
1. Check `https://git.vish.gg/Vish/homelab/actions`: did the `portainer-deploy.yml` run trigger?
2. If it ran but shows "No stacks matched": the **Compose path** in Portainer doesn't exactly match the repo file path. Check Stacks → your stack → Editor tab
3. If the CI run didn't trigger at all: the changed file path isn't in the workflow's `paths:` filter (only `hosts/**`, `common/**`, `Calypso/**`, `Atlantis/**` trigger it)
4. Manual fallback: Portainer → Stacks → your stack → Pull and redeploy
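
To rule out the `paths:` filter quickly, the four patterns from item 3 can be approximated locally. This is only a sketch: `triggers_deploy` is a hypothetical helper, it mirrors the patterns as quoted in this runbook rather than reading the workflow file, and it assumes the match is case-sensitive.

```bash
#!/usr/bin/env bash
# Return 0 if a repo-relative path falls under one of the filter prefixes
# quoted above (hosts/**, common/**, Calypso/**, Atlantis/**).
triggers_deploy() {
  case "$1" in
    hosts/*|common/*|Calypso/*|Atlantis/*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `triggers_deploy hosts/synology/atlantis/app.yaml && echo "CI should run"` succeeds, while a change under `docs/` does not. If the workflow's filter is ever edited, this list needs the same change.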

### Issue: High Resource Usage

**Symptoms**: Container using too much CPU/RAM

**Solution**:
```bash
# Add resource limits to the service in the compose file (YAML):
#   deploy:
#     resources:
#       limits:
#         cpus: '1.0'
#         memory: 2G

# Redeploy with limits
docker-compose -f hosts/[host]/[service-name].yaml up -d
```

## Post-Deployment Tasks

After successful deployment:

1. **Test the service thoroughly** - Ensure all features work as expected
2. **Set up monitoring alerts** - Configure Grafana alerts for the service
3. **Document usage** - Add user guide if others will use the service
4. **Schedule maintenance** - Add to maintenance calendar for updates
5. **Test backups** - Verify backup includes service data
6. **Update runbook** - Note any deviations or improvements

## Related Documentation

- [Deploy a New Service: End-to-End](../guides/deploy-new-service-gitops.md) ⭐ Complete step-by-step guide
- [GitOps Deployment Guide](../GITOPS_DEPLOYMENT_GUIDE.md)
- [Infrastructure Overview](../infrastructure/INFRASTRUCTURE_OVERVIEW.md)
- [Service Inventory](../services/VERIFIED_SERVICE_INVENTORY.md)
- [Monitoring Setup](../admin/monitoring-setup.md)

## Examples

### Example 1: Adding Uptime Kuma
```yaml
version: '3.8'

services:
  uptime-kuma:
    image: louislam/uptime-kuma:1
    container_name: uptime-kuma
    restart: unless-stopped

    volumes:
      - /volume1/docker/uptime-kuma:/app/data

    ports:
      - "3001:3001"

    networks:
      - monitoring

networks:
  monitoring:
    external: true
```

### Example 2: Adding a Service with Database
```yaml
version: '3.8'

services:
  app:
    image: myapp:latest
    depends_on:
      - postgres
    environment:
      - DATABASE_URL=postgresql://user:REDACTED_PASSWORD@postgres:5432/dbname
    ports:
      - "8080:8080"
    networks:
      - app-network

  postgres:
    image: postgres:15
    environment:
      - POSTGRES_USER=user
      # No quotes here: in list form they would become part of the password
      - POSTGRES_PASSWORD=REDACTED_PASSWORD
      - POSTGRES_DB=dbname
    volumes:
      - postgres-data:/var/lib/postgresql/data
    networks:
      - app-network

networks:
  app-network:
    driver: bridge

volumes:
  postgres-data:
```

## Change Log

- 2026-02-14 - Initial creation with GitOps workflow
- 2026-02-14 - Added examples and troubleshooting section