# 🏗️ Architecture Overview

**🟡 Intermediate Guide**

## 🎯 High-Level Architecture

This homelab follows a **distributed microservices architecture**: services run as Docker containers spread across multiple physical and virtual hosts. Each service runs in its own isolated container, is defined with Docker Compose, and is deployed through Ansible automation.

```
┌─────────────────────────────────────────────────────────────┐
│                       HOMELAB NETWORK                       │
│                       (Tailscale VPN)                       │
├─────────────────┬─────────────────┬─────────────────────────┤
│  SYNOLOGY NAS   │  COMPUTE NODES  │      EDGE DEVICES       │
│                 │                 │                         │
│ ┌─────────────┐ │ ┌─────────────┐ │ ┌─────────────────────┐ │
│ │  Atlantis   │ │ │ Homelab VM  │ │ │     Concord NUC     │ │
│ │  (55 svcs)  │ │ │  (36 svcs)  │ │ │      (9 svcs)       │ │
│ └─────────────┘ │ └─────────────┘ │ └─────────────────────┘ │
│                 │                 │                         │
│ ┌─────────────┐ │ ┌─────────────┐ │ ┌─────────────────────┐ │
│ │   Calypso   │ │ │ Chicago VM  │ │ │    Raspberry Pi     │ │
│ │  (17 svcs)  │ │ │  (8 svcs)   │ │ │      (2 nodes)      │ │
│ └─────────────┘ │ └─────────────┘ │ └─────────────────────┘ │
│                 │                 │                         │
│ ┌─────────────┐ │ ┌─────────────┐ │ ┌─────────────────────┐ │
│ │   Setillo   │ │ │ Bulgaria VM │ │ │     Remote VMs      │ │
│ │  (4 svcs)   │ │ │  (12 svcs)  │ │ │   (Contabo, etc.)   │ │
│ └─────────────┘ │ └─────────────┘ │ └─────────────────────┘ │
└─────────────────┴─────────────────┴─────────────────────────┘
```

## 🏠 Host Categories

### 📦 **Synology NAS Cluster** (Primary Storage & Core Services)

**Purpose**: Centralized storage, media services, and always-on applications

| Host | Model | Services | Primary Role |
|------|-------|----------|--------------|
| **Atlantis** | Synology NAS | 55 services | Media hub, monitoring, core infrastructure |
| **Calypso** | Synology NAS | 17 services | Development, backup, secondary services |
| **Setillo** | Synology NAS | 4 services | Monitoring, network services |

**Key Characteristics**:
- **Always-on**: 24/7 operation with UPS backup
- **High storage capacity**: Multiple TB of redundant storage
- **Low power consumption**: Efficient ARM/x86 processors
- **Built-in RAID**: Data protection and redundancy

### 💻 **Compute Nodes** (Processing & Workloads)

**Purpose**: CPU/RAM intensive applications, isolated workloads, testing

| Host | Type | Services | Primary Role |
|------|------|----------|--------------|
| **Homelab VM** | Proxmox VM | 36 services | General purpose, experimentation |
| **Chicago VM** | Proxmox VM | 8 services | Gaming servers, entertainment |
| **Bulgaria VM** | Proxmox VM | 12 services | Communication, productivity |
| **Anubis** | Physical | 8 services | High-performance computing |
| **Guava** | Physical | 6 services | AI/ML workloads, development |

**Key Characteristics**:
- **Scalable resources**: Can allocate CPU/RAM as needed
- **Isolation**: VMs provide security boundaries
- **Flexibility**: Easy to create/destroy for testing
- **Performance**: Dedicated resources for demanding applications

### 🌐 **Edge Devices** (IoT, Networking, Remote Access)

**Purpose**: Network services, IoT hub, remote connectivity

| Host | Type | Services | Primary Role |
|------|------|----------|--------------|
| **Concord NUC** | Intel NUC | 9 services | Home automation, edge computing |
| **Pi-5** | Raspberry Pi 5 | 1 service | Lightweight services, sensors |
| **Pi-5-Kevin** | Raspberry Pi 5 | 1 service | Secondary Pi node |
| **Contabo VM** | Remote VPS | 1 service | External services, backup |

**Key Characteristics**:
- **Low power**: Efficient ARM (Raspberry Pi) and small-form-factor x86 (NUC) hardware
- **Always accessible**: External connectivity
- **IoT integration**: GPIO pins, sensors, automation
- **Redundancy**: Multiple edge nodes for reliability

## 🌐 Network Architecture

### 🔗 **Connectivity Layer**

```
Internet
    │
    ├── Tailscale VPN (Overlay Network)
    │   ├── 100.x.x.x addresses for all nodes
    │   └── Secure mesh networking
    │
    └── Local Network (10.0.0.0/24)
        ├── Core Infrastructure
        ├── IoT Devices
        └── User Devices
```

**Key Features**:
- **Tailscale VPN**: Secure mesh network connecting all nodes
- **Zero-trust networking**: Each connection is authenticated
- **Remote access**: Access homelab from anywhere securely
- **Automatic failover**: Multiple connection paths
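
One way to lean on the overlay network, sketched here as an assumption rather than this homelab's actual configuration, is to publish a container port only on the host's Tailscale address. The service is then reachable over the mesh but not from the other network interfaces. The address `100.64.0.10`, the service name, and the image are placeholders; substitute the value reported by `tailscale ip -4` on the host.

```yaml
services:
  example-app:                # hypothetical service name
    image: nginx:alpine       # placeholder image
    ports:
      # Format is HOST_IP:HOST_PORT:CONTAINER_PORT. Binding to the Tailscale
      # address (placeholder below) keeps the port off the other interfaces.
      - "100.64.0.10:8080:80"
```

Keep in mind that Docker cannot bind to an address that does not exist yet, so the container may fail to start if it comes up before Tailscale has configured its interface.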

### 🚦 **Service Discovery & Load Balancing**

```
External Request
    │
    ├── Nginx Proxy Manager (Atlantis)
    │   ├── SSL Termination
    │   ├── Domain routing
    │   └── Access control
    │
    └── Internal Services
        ├── Docker networks
        ├── Service mesh
        └── Health checks
```
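
A minimal sketch of how a service might be made reachable by the reverse proxy, assuming Nginx Proxy Manager and the service run on the same host and share a user-defined Docker network. The network name `proxy-network` and the service name `whoami` are hypothetical and do not come from this repository.

```yaml
services:
  whoami:                       # hypothetical backend service
    image: traefik/whoami       # small demo HTTP server listening on port 80
    networks:
      - proxy-network           # shared with the Nginx Proxy Manager container
    # No "ports:" entry: only the proxy reaches the service, over the shared network.

networks:
  proxy-network:
    external: true              # assumes the network was created beforehand
```

Under these assumptions, a proxy host in Nginx Proxy Manager would forward to `whoami` on port 80, because Docker's embedded DNS resolves container and service names on user-defined networks. Services on other hosts would instead be addressed by their LAN or Tailscale IP.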

## 🐳 Container Architecture

### 📦 **Docker Compose Patterns**

Each service follows consistent patterns:

```yaml
version: '3.9'
services:
  service-name:
    image: official/image:tag
    container_name: Service-Name
    hostname: service-hostname

    # Security
    security_opt:
      - no-new-privileges:true
    user: 1026:100  # Synology user mapping

    # Health & Reliability
    healthcheck:
      test: ["CMD", "health-check-command"]
      interval: 30s
      timeout: 10s
      retries: 3
    restart: on-failure:5

    # Resources
    deploy:
      resources:
        limits:
          memory: 2G
          cpus: '1.0'

    # Networking
    networks:
      - service-network
    ports:
      - "8080:80"

    # Storage
    volumes:
      - /volume1/docker/service:/data:rw
      - /etc/localtime:/etc/localtime:ro

    # Configuration
    environment:
      - TZ=America/Los_Angeles
      - CUSTOM_VAR=value
    env_file:
      - .env

networks:
  service-network:
    name: service-network
    ipam:
      config:
        - subnet: 192.168.x.0/24
```

### 🔧 **Common Patterns**

1. **Security Hardening**:
   - Non-root users where possible
   - Read-only containers for stateless services
   - No new privileges flag
   - Minimal base images

2. **Resource Management**:
   - Memory and CPU limits
   - Health checks for reliability
   - Restart policies for resilience

3. **Data Management**:
   - Persistent volumes for data
   - Backup-friendly mount points
   - Timezone synchronization

4. **Networking**:
   - Custom networks for isolation
   - Consistent port mapping
   - Service discovery via hostnames
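
The security hardening items above could look roughly like this for a stateless service. This is a sketch under assumptions, not a file from this repository: the service name is hypothetical, the image is the same placeholder used in the template, and whether a given application tolerates a read-only root filesystem has to be checked per service.

```yaml
services:
  stateless-app:                 # hypothetical service name
    image: official/image:tag    # placeholder, as in the template above
    user: "1026:100"             # non-root UID:GID (Synology mapping from the template)
    read_only: true              # root filesystem mounted read-only
    tmpfs:
      - /tmp                     # writable scratch space kept in memory
    cap_drop:
      - ALL                      # drop all Linux capabilities
    security_opt:
      - no-new-privileges:true   # block privilege escalation via setuid binaries
```

If the application needs a specific capability, for example to bind a port below 1024, it can be added back selectively with `cap_add` instead of removing `cap_drop`.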

## 📊 Data Flow Architecture

### 🔄 **Monitoring & Observability**

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Prometheus    │◄───│ Node Exporters  │◄───│    Services     │
│    (Metrics)    │    │  (Collectors)   │    │  (Health Data)  │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │
         ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│     Grafana     │◄───│  AlertManager   │◄───│     Uptime      │
│  (Dashboards)   │    │ (Notifications) │    │      Kuma       │
└─────────────────┘    └─────────────────┘    └─────────────────┘
```
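
The Prometheus side of this flow can be sketched as a scrape configuration that pulls metrics from a node exporter on each host. The hostnames below are hypothetical aliases based on the host names in this document, and the scrape interval is an assumed value; port 9100 is the node exporter's default.

```yaml
# prometheus.yml (fragment)
global:
  scrape_interval: 30s          # assumed interval, not taken from this repo

scrape_configs:
  - job_name: node
    static_configs:
      - targets:                # hypothetical hostnames, node_exporter default port
          - atlantis:9100
          - calypso:9100
          - concord-nuc:9100
```

Grafana would then use this Prometheus instance as a data source for its dashboards.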

### 💾 **Data Storage Strategy**

```
┌─────────────────┐
│   Application   │
│      Data       │
├─────────────────┤
│ /volume1/docker │ ◄── Primary storage (Synology)
│ /volume2/backup │ ◄── Backup storage (Synology)
│ /mnt/external   │ ◄── External backup (USB/Cloud)
└─────────────────┘
```

**Storage Tiers**:
1. **Hot Storage**: Frequently accessed data on SSDs
2. **Warm Storage**: Regular data on fast HDDs
3. **Cold Storage**: Backups on slower HDDs
4. **Archive Storage**: Long-term backups off-site

## 🔐 Security Architecture

### 🛡️ **Defense in Depth**

```
Internet
    │
    ├── Firewall (Router level)
    │   └── Port restrictions, DDoS protection
    │
    ├── VPN (Tailscale)
    │   └── Encrypted mesh network
    │
    ├── Reverse Proxy (Nginx)
    │   └── SSL termination, access control
    │
    ├── Container Security
    │   └── User namespaces, capabilities
    │
    └── Application Security
        └── Authentication, authorization
```

### 🔑 **Authentication & Authorization**

- **Single Sign-On**: Integrated authentication where services support it
- **Strong passwords**: Generated and stored in Vaultwarden
- **2FA**: Multi-factor authentication for critical services
- **API keys**: Secure service-to-service communication
- **Certificate management**: Automated SSL/TLS certificates

## 🚀 Deployment Architecture

### 🤖 **Infrastructure as Code**

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│ Git Repository  │───►│ Ansible Control │───►│  Target Hosts   │
│   (This repo)   │    │      Node       │    │  (All systems)  │
└─────────────────┘    └─────────────────┘    └─────────────────┘
```

**Ansible Automation**:
- **Inventory management**: All hosts and their roles
- **Playbook execution**: Automated deployment
- **Configuration management**: Consistent settings
- **Health monitoring**: Automated checks
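
As an illustration of inventory management, the host categories from this document could be expressed as groups in a YAML inventory. The file name and the lowercase host aliases are hypothetical; the repository's real inventory may be organized differently.

```yaml
# inventory.yml (hypothetical file name and host aliases)
all:
  children:
    synology:
      hosts:
        atlantis:
        calypso:
        setillo:
    compute:
      hosts:
        homelab-vm:
        chicago-vm:
        bulgaria-vm:
        anubis:
        guava:
    edge:
      hosts:
        concord-nuc:
        pi-5:
        pi-5-kevin:
        contabo-vm:
```

With a layout like this, `ansible-inventory -i inventory.yml --graph` prints the group tree, and a playbook run can be restricted to one category with `--limit edge`.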

### 📈 **Scaling Strategy**

1. **Horizontal Scaling**: Add more hosts as needed
2. **Vertical Scaling**: Upgrade existing hardware
3. **Service Distribution**: Spread load across hosts
4. **Resource Optimization**: Monitor and adjust allocations

## 🔄 Backup & Recovery Architecture

### 💾 **Backup Strategy**

```
Production Data
    │
    ├── Local Snapshots (Hourly)
    │   └── Synology snapshot replication
    │
    ├── Cross-site Backup (Daily)
    │   └── Synology to Synology replication
    │
    └── Off-site Backup (Weekly)
        └── Cloud storage (encrypted)
```

**Recovery Objectives**:
- **RTO** (Recovery Time Objective): < 4 hours for critical services
- **RPO** (Recovery Point Objective): < 1 hour of data loss at most
- **Testing**: Monthly recovery drills

## 📋 Next Steps

Now that you understand the architecture:

1. **[Prerequisites](prerequisites.md)**: What you need to get started
2. **[Quick Start Guide](quick-start.md)**: Deploy your first service
3. **[Service Categories](../services/categories.md)**: Explore available services
4. **[Infrastructure Details](../infrastructure/hosts.md)**: Details on each host

---

*This architecture has evolved over time and continues to grow. Start simple and expand based on your needs!*