Sanitized mirror from private repository - 2026-04-16 07:18:01 UTC
docs/getting-started/architecture.md
# 🏗️ Architecture Overview

**🟡 Intermediate Guide**

## 🎯 High-Level Architecture

This homelab follows a **distributed microservices architecture** using Docker containers across multiple physical and virtual hosts. Each service runs in isolation while being orchestrated through a combination of Docker Compose and Ansible automation.

```
┌─────────────────────────────────────────────────────────────┐
│                       HOMELAB NETWORK                       │
│                       (Tailscale VPN)                       │
├─────────────────┬─────────────────┬─────────────────────────┤
│  SYNOLOGY NAS   │  COMPUTE NODES  │      EDGE DEVICES       │
│                 │                 │                         │
│ ┌─────────────┐ │ ┌─────────────┐ │ ┌─────────────────────┐ │
│ │  Atlantis   │ │ │ Homelab VM  │ │ │     Concord NUC     │ │
│ │  (55 svcs)  │ │ │  (36 svcs)  │ │ │      (9 svcs)       │ │
│ └─────────────┘ │ └─────────────┘ │ └─────────────────────┘ │
│                 │                 │                         │
│ ┌─────────────┐ │ ┌─────────────┐ │ ┌─────────────────────┐ │
│ │   Calypso   │ │ │ Chicago VM  │ │ │    Raspberry Pi     │ │
│ │  (17 svcs)  │ │ │  (8 svcs)   │ │ │      (2 nodes)      │ │
│ └─────────────┘ │ └─────────────┘ │ └─────────────────────┘ │
│                 │                 │                         │
│ ┌─────────────┐ │ ┌─────────────┐ │ ┌─────────────────────┐ │
│ │   Setillo   │ │ │ Bulgaria VM │ │ │     Remote VMs      │ │
│ │  (4 svcs)   │ │ │  (12 svcs)  │ │ │   (Contabo, etc.)   │ │
│ └─────────────┘ │ └─────────────┘ │ └─────────────────────┘ │
└─────────────────┴─────────────────┴─────────────────────────┘
```

## 🏠 Host Categories

### 📦 **Synology NAS Cluster** (Primary Storage & Core Services)

**Purpose**: Centralized storage, media services, and always-on applications

| Host | Model | Services | Primary Role |
|------|-------|----------|--------------|
| **Atlantis** | Synology NAS | 55 services | Media hub, monitoring, core infrastructure |
| **Calypso** | Synology NAS | 17 services | Development, backup, secondary services |
| **Setillo** | Synology NAS | 4 services | Monitoring, network services |

**Key Characteristics**:
- **Always-on**: 24/7 operation with UPS backup
- **High storage capacity**: Multiple TB of redundant storage
- **Low power consumption**: Efficient ARM/x86 processors
- **Built-in RAID**: Data protection and redundancy

### 💻 **Compute Nodes** (Processing & Workloads)

**Purpose**: CPU/RAM intensive applications, isolated workloads, testing

| Host | Type | Services | Primary Role |
|------|------|----------|--------------|
| **Homelab VM** | Proxmox VM | 36 services | General purpose, experimentation |
| **Chicago VM** | Proxmox VM | 8 services | Gaming servers, entertainment |
| **Bulgaria VM** | Proxmox VM | 12 services | Communication, productivity |
| **Anubis** | Physical | 8 services | High-performance computing |
| **Guava** | Physical | 6 services | AI/ML workloads, development |

**Key Characteristics**:
- **Scalable resources**: Can allocate CPU/RAM as needed
- **Isolation**: VMs provide security boundaries
- **Flexibility**: Easy to create/destroy for testing
- **Performance**: Dedicated resources for demanding applications

### 🌐 **Edge Devices** (IoT, Networking, Remote Access)

**Purpose**: Network services, IoT hub, remote connectivity

| Host | Type | Services | Primary Role |
|------|------|----------|--------------|
| **Concord NUC** | Intel NUC | 9 services | Home automation, edge computing |
| **Pi-5** | Raspberry Pi 5 | 1 service | Lightweight services, sensors |
| **Pi-5-Kevin** | Raspberry Pi 5 | 1 service | Secondary Pi node |
| **Contabo VM** | Remote VPS | 1 service | External services, backup |

**Key Characteristics**:
- **Low power**: Efficient ARM processors (Raspberry Pi) and a low-power Intel NUC
- **Always accessible**: External connectivity
- **IoT integration**: GPIO pins, sensors, automation
- **Redundancy**: Multiple edge nodes for reliability

## 🌐 Network Architecture

### 🔗 **Connectivity Layer**

```
Internet
    │
    ├── Tailscale VPN (Overlay Network)
    │   ├── 100.x.x.x addresses for all nodes
    │   └── Secure mesh networking
    │
    └── Local Network (10.0.0.0/24)
        ├── Core Infrastructure
        ├── IoT Devices
        └── User Devices
```

**Key Features**:
- **Tailscale VPN**: Secure mesh network connecting all nodes
- **Zero-trust networking**: Each connection is authenticated
- **Remote access**: Access homelab from anywhere securely
- **Automatic failover**: Multiple connection paths

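One way to keep a service reachable only over the tailnet is to publish its port on the host's Tailscale address rather than on every interface. The fragment below is a minimal sketch and is not taken from this repository: the service name, the image, and the `100.64.0.10` address are placeholders, and the real address is whatever `tailscale ip -4` reports on the host.

```yaml
services:
  example-app:                  # hypothetical service name
    image: official/image:tag   # placeholder image
    ports:
      # Format is HOST_IP:HOST_PORT:CONTAINER_PORT.
      # 100.64.0.10 is an assumed Tailscale address; substitute the host's own.
      - "100.64.0.10:8080:80"
```

With this binding the port is not exposed on the `10.0.0.0/24` interface. The container can only bind the address once the Tailscale interface is up, so it may fail to start if Docker comes up first after a reboot.
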
### 🚦 **Service Discovery & Load Balancing**

```
External Request
    │
    ├── Nginx Proxy Manager (Atlantis)
    │   ├── SSL Termination
    │   ├── Domain routing
    │   └── Access control
    │
    └── Internal Services
        ├── Docker networks
        ├── Service mesh
        └── Health checks
```

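For illustration, a Compose definition for the Nginx Proxy Manager entry point might look like the sketch below. It follows the image name, ports, and container paths from the project's published quick-start; the `/volume1/docker/npm` host paths and the container name are assumptions, not this repository's actual configuration.

```yaml
services:
  nginx-proxy-manager:
    image: jc21/nginx-proxy-manager:latest
    container_name: Nginx-Proxy-Manager      # assumed name, following the naming pattern in this guide
    restart: on-failure:5
    ports:
      - "80:80"     # HTTP
      - "443:443"   # HTTPS (SSL termination happens here)
      - "81:81"     # admin web UI
    volumes:
      # Host paths are assumptions; container paths are the ones the image expects
      - /volume1/docker/npm/data:/data
      - /volume1/docker/npm/letsencrypt:/etc/letsencrypt
```

On Synology DSM, host ports 80 and 443 are often already taken by the system web server, so the left-hand side of those mappings may need to change.
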
## 🐳 Container Architecture

### 📦 **Docker Compose Patterns**

Each service follows consistent patterns:

```yaml
version: '3.9'
services:
  service-name:
    image: official/image:tag
    container_name: Service-Name
    hostname: service-hostname

    # Security
    security_opt:
      - no-new-privileges:true
    user: 1026:100  # Synology user mapping

    # Health & Reliability
    healthcheck:
      test: ["CMD", "health-check-command"]
      interval: 30s
      timeout: 10s
      retries: 3
    restart: on-failure:5

    # Resources
    deploy:
      resources:
        limits:
          memory: 2G
          cpus: '1.0'

    # Networking
    networks:
      - service-network
    ports:
      - "8080:80"

    # Storage
    volumes:
      - /volume1/docker/service:/data:rw
      - /etc/localtime:/etc/localtime:ro

    # Configuration
    environment:
      - TZ=America/Los_Angeles
      - CUSTOM_VAR=value
    env_file:
      - .env

networks:
  service-network:
    name: service-network
    ipam:
      config:
        - subnet: 192.168.x.0/24
```

### 🔧 **Common Patterns**

1. **Security Hardening**:
   - Non-root users where possible
   - Read-only containers for stateless services
   - No new privileges flag
   - Minimal base images

2. **Resource Management**:
   - Memory and CPU limits
   - Health checks for reliability
   - Restart policies for resilience

3. **Data Management**:
   - Persistent volumes for data
   - Backup-friendly mount points
   - Timezone synchronization

4. **Networking**:
   - Custom networks for isolation
   - Consistent port mapping
   - Service discovery via hostnames

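The hardening and hostname-discovery patterns above can be sketched together in one Compose file. This is an illustrative example only: `app`, `cache`, and `CACHE_HOST` are hypothetical names, and `official/image:tag` is the same placeholder used in the template. Whether a given image tolerates a read-only root filesystem and dropped capabilities has to be checked per service.

```yaml
services:
  app:                               # hypothetical stateless service
    image: official/image:tag        # placeholder image
    read_only: true                  # root filesystem is mounted read-only
    tmpfs:
      - /tmp                         # writable scratch space held in memory
    cap_drop:
      - ALL                          # drop all Linux capabilities
    security_opt:
      - no-new-privileges:true
    environment:
      - CACHE_HOST=cache             # hypothetical variable; "cache" resolves through Docker's DNS
    networks:
      - service-network

  cache:
    image: redis:7-alpine
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 30s
      timeout: 10s
      retries: 3
    networks:
      - service-network

networks:
  service-network:
    name: service-network
```

Because both services join the same user-defined network, `app` can reach Redis at `cache:6379` without publishing any port on the host.
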
## 📊 Data Flow Architecture

### 🔄 **Monitoring & Observability**

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Prometheus    │◄───│ Node Exporters  │◄───│    Services     │
│    (Metrics)    │    │  (Collectors)   │    │  (Health Data)  │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │
         ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│     Grafana     │◄───│  AlertManager   │◄───│     Uptime      │
│  (Dashboards)   │    │ (Notifications) │    │      Kuma       │
└─────────────────┘    └─────────────────┘    └─────────────────┘
```

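The Prometheus side of this flow could be expressed roughly as the `prometheus.yml` sketch below. The target hostnames are assumptions (they presume each host runs a node exporter on its default port 9100 and that the names resolve, for example through Tailscale MagicDNS); the real scrape targets are not shown in this guide.

```yaml
global:
  scrape_interval: 30s               # assumed interval

scrape_configs:
  - job_name: node                   # host-level metrics from node exporters
    static_configs:
      - targets:
          - atlantis:9100            # hostnames are illustrative
          - calypso:9100
          - homelab-vm:9100

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093      # default Alertmanager port; hostname assumed
```

Grafana would then use Prometheus as a data source, while Alertmanager handles routing of the alerts that Prometheus fires.
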
### 💾 **Data Storage Strategy**

```
┌─────────────────┐
│   Application   │
│      Data       │
├─────────────────┤
│ /volume1/docker │ ◄── Primary storage (Synology)
│ /volume2/backup │ ◄── Backup storage (Synology)
│ /mnt/external   │ ◄── External backup (USB/Cloud)
└─────────────────┘
```

**Storage Tiers**:
1. **Hot Storage**: Frequently accessed data on SSDs
2. **Warm Storage**: Regular data on fast HDDs
3. **Cold Storage**: Backups on slower HDDs
4. **Archive Storage**: Long-term backups off-site

## 🔐 Security Architecture

### 🛡️ **Defense in Depth**

```
Internet
    │
    ├── Firewall (Router level)
    │   └── Port restrictions, DDoS protection
    │
    ├── VPN (Tailscale)
    │   └── Encrypted mesh network
    │
    ├── Reverse Proxy (Nginx)
    │   └── SSL termination, access control
    │
    ├── Container Security
    │   └── User namespaces, capabilities
    │
    └── Application Security
        └── Authentication, authorization
```

### 🔑 **Authentication & Authorization**

- **Single Sign-On**: Where possible, integrated auth
- **Strong passwords**: Generated and stored in Vaultwarden
- **2FA**: Multi-factor authentication for critical services
- **API keys**: Secure service-to-service communication
- **Certificate management**: Automated SSL/TLS certificates

## 🚀 Deployment Architecture

### 🤖 **Infrastructure as Code**

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  Git Repository │───►│ Ansible Control │───►│  Target Hosts   │
│   (This repo)   │    │      Node       │    │  (All systems)  │
└─────────────────┘    └─────────────────┘    └─────────────────┘
```

**Ansible Automation**:
- **Inventory management**: All hosts and their roles
- **Playbook execution**: Automated deployment
- **Configuration management**: Consistent settings
- **Health monitoring**: Automated checks

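As a rough illustration of inventory management, the three host categories from this guide map naturally onto Ansible groups. The YAML inventory below is a sketch: the group names and the lowercase, hyphenated host aliases are assumptions, and the repository's actual inventory may be organized differently.

```yaml
all:
  children:
    synology:            # Synology NAS cluster
      hosts:
        atlantis:
        calypso:
        setillo:
    compute:             # Proxmox VMs and physical compute nodes
      hosts:
        homelab-vm:
        chicago-vm:
        bulgaria-vm:
        anubis:
        guava:
    edge:                # NUC, Raspberry Pis, remote VPS
      hosts:
        concord-nuc:
        pi-5:
        pi-5-kevin:
        contabo-vm:
```

Saved as `inventory.yml` (a hypothetical filename), the grouping can be inspected with `ansible-inventory -i inventory.yml --graph`, and a playbook run can be restricted to one category with `--limit synology`.
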
### 📈 **Scaling Strategy**

1. **Horizontal Scaling**: Add more hosts as needed
2. **Vertical Scaling**: Upgrade existing hardware
3. **Service Distribution**: Spread load across hosts
4. **Resource Optimization**: Monitor and adjust allocations

## 🔄 Backup & Recovery Architecture

### 💾 **Backup Strategy**

```
Production Data
    │
    ├── Local Snapshots (Hourly)
    │   └── Synology snapshot replication
    │
    ├── Cross-site Backup (Daily)
    │   └── Synology to Synology replication
    │
    └── Off-site Backup (Weekly)
        └── Cloud storage (encrypted)
```

**Recovery Objectives**:
- **RTO** (Recovery Time Objective): < 4 hours for critical services
- **RPO** (Recovery Point Objective): < 1 hour of data loss, in line with the hourly local snapshots
- **Testing**: Monthly recovery drills

## 📋 Next Steps

Now that you understand the architecture:

1. **[Prerequisites](prerequisites.md)**: What you need to get started
2. **[Quick Start Guide](quick-start.md)**: Deploy your first service
3. **[Service Categories](../services/categories.md)**: Explore available services
4. **[Infrastructure Details](../infrastructure/hosts.md)**: Details for each host

---

*This architecture has evolved over time and continues to grow. Start simple and expand based on your needs!*
