💾 Storage Systems
🟡 Intermediate Guide
This document covers the storage architecture, RAID configurations, backup strategies, and data management practices for the homelab infrastructure.
🏗️ Storage Architecture Overview
┌─────────────────────────────────────────────────────────────────────────────┐
│                           STORAGE INFRASTRUCTURE                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│       PRIMARY STORAGE                     BACKUP TARGETS                    │
│   ┌─────────────────────┐             ┌─────────────────────┐               │
│   │      ATLANTIS       │             │       CALYPSO       │               │
│   │    Synology NAS     │   ──────►   │    Synology NAS     │               │
│   │                     │    Hyper    │                     │               │
│   │   8x 16TB RAID 6    │   Backup    │   2x 12TB RAID 1    │               │
│   │    ≈96TB usable     │             │    ≈12TB usable     │               │
│   │                     │             │                     │               │
│   │   + 2x 480GB NVMe   │             │   + 2x 480GB NVMe   │               │
│   │     (SSD Cache)     │             │     (SSD Cache)     │               │
│   └─────────────────────┘             └─────────────────────┘               │
│              │                                   │                          │
│              │                                   │                          │
│              ▼                                   ▼                          │
│   ┌─────────────────────────────────────────────────────────┐               │
│   │                      BACKBLAZE B2                       │               │
│   │                  Cloud Offsite Backup                   │               │
│   │              Encrypted, Versioned Storage               │               │
│   └─────────────────────────────────────────────────────────┘               │
│                                                                             │
│   SECONDARY STORAGE                                                         │
│   ┌───────────────┐    ┌───────────────┐    ┌───────────────┐               │
│   │     GUAVA     │    │    SETILLO    │    │    PROXMOX    │               │
│   │  RAID 1 HDD   │    │  Single 1TB   │    │   Local SSD   │               │
│   │  + NVMe SSD   │    │               │    │               │               │
│   └───────────────┘    └───────────────┘    └───────────────┘               │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
📊 Storage Summary
| Host | Total Raw | Usable | RAID Level | Purpose |
|---|---|---|---|---|
| Atlantis | 128TB (8x16TB) | ~96TB | RAID 6 | Primary storage, media |
| Calypso | 24TB (2x12TB) | ~12TB | RAID 1 | Backup, development |
| Guava | 6TB+ | ~3TB | RAID 1 | AI/ML, compute |
| Setillo | 1TB | 1TB | Single | Monitoring |
| Proxmox | ~500GB | 500GB | Local SSD | VM storage |
🏛️ Atlantis - Primary Storage
Hardware Configuration
| Component | Specification |
|---|---|
| NAS Model | Synology DS1823xs+ |
| Drive Bays | 8x 3.5" hot-swap |
| Drives | 8x Seagate IronWolf Pro 16TB (ST16000NT001) |
| Cache | 2x WD Black SN750 480GB NVMe |
| RAID Level | RAID 6 (dual parity) |
| Raw Capacity | 128TB |
| Usable Capacity | ~96TB |
| Fault Tolerance | 2 drive failures |
RAID 6 Benefits
RAID 6 Configuration:
┌────┬────┬────┬────┬────┬────┬────┬────┐
│ D1 │ D2 │ D3 │ D4 │ D5 │ D6 │ P1 │ P2 │ ← Data + Dual Parity
├────┼────┼────┼────┼────┼────┼────┼────┤
│ D1 │ D2 │ D3 │ D4 │ D5 │ P1 │ P2 │ D6 │ ← Parity distributed
├────┼────┼────┼────┼────┼────┼────┼────┤
│ D1 │ D2 │ D3 │ D4 │ P1 │ P2 │ D5 │ D6 │
└────┴────┴────┴────┴────┴────┴────┴────┘
✅ Survives 2 simultaneous drive failures
✅ Good read performance
✅ 6 drives worth of usable space (75% efficiency)
⚠️ Slower writes due to parity calculation
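The capacity figures above follow from simple arithmetic. The following is a minimal sketch, assuming equally sized drives and ignoring filesystem and DSM overhead; the function name `raid_usable_tb` is illustrative and not part of any Synology tooling.

```python
def raid_usable_tb(drive_count: int, drive_tb: float, level: str) -> float:
    """Return usable capacity in TB for equally sized drives (no filesystem overhead)."""
    level = level.lower()
    if level == "raid1":
        # A mirror stores one drive's worth of data, however many copies exist.
        if drive_count < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return drive_tb
    # Parity levels give up a fixed number of drives' worth of space.
    parity_drives = {"raid5": 1, "raid6": 2}
    if level not in parity_drives:
        raise ValueError(f"unsupported RAID level: {level}")
    parity = parity_drives[level]
    if drive_count < parity + 2:
        raise ValueError(f"{level} needs at least {parity + 2} drives")
    return (drive_count - parity) * drive_tb


atlantis = raid_usable_tb(8, 16, "raid6")
calypso = raid_usable_tb(2, 12, "raid1")
print(f"Atlantis: {atlantis:.0f} TB usable ({atlantis / (8 * 16):.0%} efficiency)")
print(f"Calypso:  {calypso:.0f} TB usable ({calypso / (2 * 12):.0%} efficiency)")
```

The same function can be used to compare expansion options, for example `raid_usable_tb(8, 24, "raid6")` for a hypothetical move to 24TB drives.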
Volume Layout
/volume1/ (Atlantis - ~96TB usable)
│
├── /docker/              # Container persistent data
│   ├── plex/
│   ├── immich/
│   ├── grafana/
│   └── ... (all stack data)
│
├── /media/               # Media library
│   ├── movies/           # 4K + 1080p movies
│   ├── tv/               # TV series
│   ├── music/            # Music library
│   └── audiobooks/       # Audiobook collection
│
├── /photos/              # Immich photo library
│   ├── library/          # Organized photos
│   └── upload/           # Incoming uploads
│
├── /documents/           # Paperless-NGX
│   ├── consume/          # Incoming documents
│   └── archive/          # Processed documents
│
├── /backups/             # Local backup storage
│   ├── calypso/          # Cross-NAS backups
│   └── vm-snapshots/     # VM backup images
│
└── /archive/             # Long-term cold storage
    └── old-projects/
NVMe SSD Cache
- Type: Read-write cache
- Drives: 2x WD Black SN750 480GB
- Configuration: RAID 1 (mirrored for safety)
- Purpose: Accelerate frequently accessed data
🏢 Calypso - Secondary Storage
Hardware Configuration
| Component | Specification |
|---|---|
| NAS Model | Synology DS723+ |
| Drive Bays | 2x 3.5" hot-swap |
| Drives | 2x Seagate IronWolf Pro 12TB (ST12000NT001) |
| Cache | 2x WD Black SN750 480GB NVMe |
| RAID Level | RAID 1 (mirrored) |
| Raw Capacity | 24TB |
| Usable Capacity | ~12TB |
| Fault Tolerance | 1 drive failure |
RAID 1 Benefits
RAID 1 Configuration:
┌────────────────┐   ┌────────────────┐
│    Drive 1     │   │    Drive 2     │
│    (12TB)      │◄─►│    (12TB)      │  ← Mirror
│                │   │                │
│  All data is   │   │  Exact copy    │
│  written to    │   │  of Drive 1    │
│  both drives   │   │                │
└────────────────┘   └────────────────┘
✅ Survives 1 drive failure
✅ Fast read performance (can read from either)
✅ Simple recovery (just replace failed drive)
⚠️ 50% storage efficiency
Volume Layout
/volume1/ (Calypso - ~12TB usable)
│
├── /docker/              # Container persistent data
│   ├── gitea/
│   ├── firefly/
│   ├── arr-suite/
│   └── ... (dev stacks)
│
├── /apt-cache/           # APT-Cacher-NG
│   └── cache/            # Debian package cache
│
├── /backups/             # Backup destination
│   ├── atlantis/         # Hyper Backup from Atlantis
│   └── databases/        # Database dumps
│
└── /development/         # Development data
    ├── repos/            # Git repositories
    └── projects/         # Project files
🖥️ Other Storage Systems
Guava - AI/ML Workstation
| Component | Specification |
|---|---|
| Primary | 1TB NVMe SSD (OS + fast storage) |
| Secondary | 2x HDD in RAID 1 (~3TB usable) |
| Purpose | AI model storage, datasets, compute scratch |
Setillo - Monitoring
| Component | Specification |
|---|---|
| Storage | 1TB single drive |
| Purpose | Prometheus metrics, AdGuard data |
| Note | Non-critical data, can be rebuilt |
Proxmox - VM Host
| Component | Specification |
|---|---|
| Storage | ~500GB local SSD |
| Purpose | VM disk images |
| Backup | VMs backed up to Atlantis |
📦 Backup Strategy
3-2-1 Rule Implementation
| Rule | Implementation | Status |
|---|---|---|
| 3 Copies | Original + Calypso + Backblaze | ✅ |
| 2 Media Types | NAS HDDs + Cloud | ✅ |
| 1 Offsite | Backblaze B2 | ✅ |
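The three checks in the table can be expressed as a small validation routine. This is a sketch only: the `BackupCopy` class and the media labels `"hdd"` and `"cloud"` are assumptions made for illustration, not part of the homelab tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str
    media: str      # illustrative labels such as "hdd" or "cloud"
    offsite: bool


def check_3_2_1(copies: list[BackupCopy]) -> dict[str, bool]:
    """Evaluate each part of the 3-2-1 rule for a set of copies."""
    return {
        "3_copies": len(copies) >= 3,
        "2_media_types": len({c.media for c in copies}) >= 2,
        "1_offsite": any(c.offsite for c in copies),
    }


copies = [
    BackupCopy("Atlantis (original)", media="hdd", offsite=False),
    BackupCopy("Calypso (Hyper Backup)", media="hdd", offsite=False),
    BackupCopy("Backblaze B2", media="cloud", offsite=True),
]
for rule, ok in check_3_2_1(copies).items():
    print(f"{rule}: {'OK' if ok else 'MISSING'}")
```

Dropping the Backblaze entry from `copies` makes all three checks fail, which shows how much of the rule depends on the single cloud copy.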
Backup Flow
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  ATLANTIS   │────►│   CALYPSO   │────►│  BACKBLAZE  │
│  (Primary)  │     │   (Local)   │     │     B2      │
│             │     │             │     │  (Offsite)  │
│  Original   │     │    Hyper    │     │    Cloud    │
│    Data     │     │   Backup    │     │   Backup    │
└─────────────┘     └─────────────┘     └─────────────┘
       │                   │                   │
       │                   │                   │
       ▼                   ▼                   ▼
   Immediate          < 24 hours          < 24 hours
    Access             Recovery            Recovery
Backup Software
| Tool | Source | Destination | Schedule |
|---|---|---|---|
| Synology Hyper Backup | Atlantis | Calypso | Daily |
| Synology Cloud Sync | Atlantis | Backblaze B2 | Daily |
| Synology Hyper Backup | Calypso | Backblaze B2 | Weekly |
What Gets Backed Up
| Data Type | Priority | Frequency | Retention |
|---|---|---|---|
| Docker configs | Critical | Daily | 30 days |
| Databases | Critical | Daily | 30 days |
| Photos (Immich) | High | Daily | Forever |
| Documents | High | Daily | 1 year |
| Media library | Medium | Weekly | Latest only |
| VM snapshots | Medium | Weekly | 4 versions |
| Logs | Low | Not backed up | N/A |
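The table mixes two kinds of retention: age-based ("30 days", "1 year") and count-based ("4 versions"). Hyper Backup applies its own rotation settings, so the sketch below only illustrates the two policies; the function names and the sample dates are made up for the example.

```python
from datetime import datetime, timedelta


def prune_by_age(backups: list[datetime], keep_days: int, now: datetime) -> list[datetime]:
    """Return backups older than keep_days, i.e. candidates for deletion."""
    cutoff = now - timedelta(days=keep_days)
    return sorted(b for b in backups if b < cutoff)


def prune_by_count(backups: list[datetime], keep_versions: int) -> list[datetime]:
    """Return all but the newest keep_versions backups."""
    ordered = sorted(backups)
    return ordered[:-keep_versions] if keep_versions > 0 else ordered


now = datetime(2024, 6, 30)                                # hypothetical "today"
daily = [now - timedelta(days=d) for d in range(45)]       # 45 daily backups
weekly = [now - timedelta(weeks=w) for w in range(6)]      # 6 weekly snapshots

# Docker configs and databases: 30-day retention
print(len(prune_by_age(daily, keep_days=30, now=now)), "daily backups to delete")
# VM snapshots: keep the 4 newest versions
print(len(prune_by_count(weekly, keep_versions=4)), "weekly snapshots to delete")
```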
Recovery Time Objectives
| Scenario | RTO Target | Recovery Method |
|---|---|---|
| Single file recovery | < 1 hour | Hyper Backup restore |
| Service recovery | < 4 hours | Docker volume restore |
| Full NAS recovery | < 24 hours | Bare metal + B2 restore |
| Disaster recovery | < 48 hours | New hardware + B2 restore |
📂 Shared Storage (NFS/SMB)
Network Shares
| Share | Protocol | Host | Access | Purpose |
|---|---|---|---|---|
| /media | SMB | Atlantis | Read-only (most), RW (arr) | Media streaming |
| /photos | SMB | Atlantis | RW (Immich user) | Photo backup |
| /docker | NFS | Atlantis | RW (Docker hosts) | Container data |
| /backups | SMB | Calypso | RW (backup service) | Backup destination |
Docker Volume Mounts
Containers access NAS storage via NFS mounts:
# Example: Plex accessing media
volumes:
- /volume1/docker/plex:/config
- /volume1/media:/media:ro
Permission Model
NAS User: docker (UID 1000)
├── Owns /volume1/docker/
├── Read access to /volume1/media/
└── Write access to specific paths
NAS User: media (UID 1001)
├── Write access to /volume1/media/
└── Used by *arr suite for downloads
📈 Storage Monitoring
Metrics Collected
| Metric | Tool | Alert Threshold |
|---|---|---|
| Disk usage | Prometheus + Node Exporter | > 85% |
| RAID health | Synology DSM | Degraded |
| Drive SMART | Synology DSM | Warning/Critical |
| I/O latency | Prometheus | > 100ms |
| Backup status | Hyper Backup | Failed |
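The thresholds in the table translate directly into a check like the one below. In this setup the actual alerting is done by Prometheus and DSM; this is only a sketch of the logic, and the status strings passed in (such as `"Healthy"` or `"Successful"`) are assumed values rather than DSM's exact wording.

```python
def storage_alerts(disk_used_pct: float, io_latency_ms: float, raid_status: str,
                   smart_status: str, backup_status: str) -> list[str]:
    """Return alert messages for every metric past its documented threshold."""
    alerts = []
    if disk_used_pct > 85:
        alerts.append(f"Disk usage {disk_used_pct:.0f}% exceeds 85%")
    if io_latency_ms > 100:
        alerts.append(f"I/O latency {io_latency_ms:.0f} ms exceeds 100 ms")
    if raid_status.lower() == "degraded":
        alerts.append("RAID array is degraded")
    if smart_status.lower() in ("warning", "critical"):
        alerts.append(f"Drive SMART status is {smart_status}")
    if backup_status.lower() == "failed":
        alerts.append("Last backup failed")
    return alerts


# Utilization figures taken from this document; the other inputs are placeholders.
print("Atlantis:", storage_alerts(62, 12, "Healthy", "Healthy", "Successful"))
print("Calypso: ", storage_alerts(100, 12, "Healthy", "Healthy", "Successful"))
```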
Grafana Dashboard
Storage dashboard shows:
- Volume utilization trends
- I/O throughput
- RAID rebuild status
- Drive temperatures
- Backup completion status
🔮 Storage Expansion Plan
Current Utilization
| Host | Used | Total | % Used |
|---|---|---|---|
| Atlantis | ~60TB | 96TB | 62% |
| Calypso | ~12TB | 12TB | ~100% |
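To turn these figures into a planning horizon, a rough projection helps. The sketch below assumes linear growth; the growth rate of 1.5 TB per month is a placeholder, not a measured value, and should be replaced with the trend from the Grafana dashboard.

```python
def months_until(used_tb: float, total_tb: float, growth_tb_per_month: float,
                 threshold_pct: float = 100.0) -> float:
    """Months until usage reaches threshold_pct of capacity at constant growth."""
    if growth_tb_per_month <= 0:
        raise ValueError("growth rate must be positive")
    limit_tb = total_tb * threshold_pct / 100
    return max(limit_tb - used_tb, 0.0) / growth_tb_per_month


GROWTH = 1.5  # TB per month, hypothetical

print(f"Atlantis reaches 85% in {months_until(60, 96, GROWTH, 85):.1f} months")
print(f"Atlantis is full in {months_until(60, 96, GROWTH):.1f} months")
print(f"Calypso is full in {months_until(12, 12, GROWTH):.1f} months")
```

A result of zero, as for Calypso, means the volume has already reached the limit and expansion should not wait for the next quarterly review.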
Future Expansion Options
- Atlantis: All 8 bays are populated
  - Replace 16TB drives with larger (24TB+) drives when available
  - Add expansion unit (DX517)
- Calypso: At capacity
  - Replace 12TB drives with 20TB+ drives
  - Consider migration to larger NAS
- New NAS: For cold/archive storage
  - Lower-powered unit for infrequent access
  - RAID 5 acceptable for archive data
🛠️ Maintenance Tasks
Regular Maintenance
| Task | Frequency | Procedure |
|---|---|---|
| SMART check | Weekly | Review DSM health |
| Scrub | Monthly | Synology scheduled task |
| Backup verification | Monthly | Test restore of random files |
| Capacity review | Quarterly | Plan for growth |
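The monthly "test restore of random files" can be partly automated by comparing checksums between the live data and a restored copy. This is a minimal sketch, assuming the restore has already been written to a separate directory; `verify_restore` and its parameters are illustrative names.

```python
import hashlib
import random
from pathlib import Path
from typing import Optional


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_root: Path, restored_root: Path, sample_size: int = 5,
                   seed: Optional[int] = None) -> list[Path]:
    """Compare a random sample of source files with their restored copies.

    Returns the relative paths that are missing or differ in the restore.
    """
    files = sorted(p.relative_to(source_root)
                   for p in source_root.rglob("*") if p.is_file())
    sample = random.Random(seed).sample(files, min(sample_size, len(files)))
    mismatches = []
    for rel in sample:
        restored = restored_root / rel
        if not restored.is_file() or sha256_of(source_root / rel) != sha256_of(restored):
            mismatches.append(rel)
    return mismatches
```

A call such as `verify_restore(Path("/volume1/documents"), Path("/volume1/restore-test/documents"))` returns an empty list when every sampled file matches; the `restore-test` path is a hypothetical scratch location. Files modified since the backup ran will show up as mismatches, so the check works best on data that rarely changes.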
Drive Replacement Procedure
1. Identify failed drive via DSM notification
2. Order replacement (same or larger capacity)
3. Hot-swap failed drive
4. Monitor rebuild (can take 24-48 hours for large arrays)
5. Verify RAID health after rebuild completes
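The 24-48 hour rebuild estimate can be sanity-checked with back-of-the-envelope arithmetic: a rebuild has to rewrite the entire replacement drive. The sketch below assumes a constant sustained rate; the rates shown are illustrative guesses, and real rebuilds slow down when the array is serving other I/O.

```python
def rebuild_hours(drive_tb: float, rebuild_mb_per_s: float) -> float:
    """Hours to rewrite one full drive at a sustained rebuild rate."""
    if rebuild_mb_per_s <= 0:
        raise ValueError("rebuild rate must be positive")
    drive_bytes = drive_tb * 1e12                # drive vendors use decimal terabytes
    seconds = drive_bytes / (rebuild_mb_per_s * 1e6)
    return seconds / 3600


for rate in (100, 150, 200):                     # MB/s, assumed sustained rates
    print(f"16 TB at {rate} MB/s: {rebuild_hours(16, rate):.1f} h")
```

At these assumed rates a 16TB drive takes roughly one to two days, which is in line with the range quoted in the procedure.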
📚 Related Documentation
- Host Infrastructure: Server specifications
- Security Model: Backup encryption details
- Network Architecture: NFS/SMB networking
Storage infrastructure is critical. Regular monitoring and proactive maintenance prevent data loss.