# 💾 Storage Systems

**🟡 Intermediate Guide**

This document covers the storage architecture, RAID configurations, backup strategies, and data management practices for the homelab infrastructure.

---

## 🏗️ Storage Architecture Overview

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                           STORAGE INFRASTRUCTURE                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   PRIMARY STORAGE                       BACKUP TARGETS                      │
│   ┌─────────────────────┐               ┌─────────────────────┐             │
│   │      ATLANTIS       │               │       CALYPSO       │             │
│   │   Synology NAS      │    ──────►    │   Synology NAS      │             │
│   │                     │    Hyper      │                     │             │
│   │   8x 16TB RAID 6    │    Backup     │   2x 12TB RAID 1    │             │
│   │   ≈96TB usable      │               │   ≈12TB usable      │             │
│   │                     │               │                     │             │
│   │   + 2x 480GB NVMe   │               │   + 2x 480GB NVMe   │             │
│   │   (SSD Cache)       │               │   (SSD Cache)       │             │
│   └─────────────────────┘               └─────────────────────┘             │
│              │                                     │                        │
│              │                                     │                        │
│              ▼                                     ▼                        │
│   ┌─────────────────────────────────────────────────────────┐               │
│   │                      BACKBLAZE B2                       │               │
│   │                  Cloud Offsite Backup                   │               │
│   │              Encrypted, Versioned Storage               │               │
│   └─────────────────────────────────────────────────────────┘               │
│                                                                             │
│   SECONDARY STORAGE                                                         │
│   ┌───────────────┐  ┌───────────────┐  ┌───────────────┐                   │
│   │     GUAVA     │  │    SETILLO    │  │    PROXMOX    │                   │
│   │  RAID 1 HDD   │  │  Single 1TB   │  │   Local SSD   │                   │
│   │  + NVMe SSD   │  │               │  │               │                   │
│   └───────────────┘  └───────────────┘  └───────────────┘                   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

---

## 📊 Storage Summary

| Host | Total Raw | Usable | RAID Level | Purpose |
|------|-----------|--------|------------|---------|
| **Atlantis** | 128TB (8x16TB) | ~96TB | RAID 6 | Primary storage, media |
| **Calypso** | 24TB (2x12TB) | ~12TB | RAID 1 | Backup, development |
| **Guava** | 6TB+ | ~3TB | RAID 1 | AI/ML, compute |
| **Setillo** | 1TB | 1TB | Single | Monitoring |
| **Proxmox** | ~500GB | 500GB | Local SSD | VM storage |

---

## 🏛️ Atlantis - Primary Storage

### **Hardware Configuration**

| Component | Specification |
|-----------|--------------|
| **NAS Model** | Synology DS1823xs+ |
| **Drive Bays** | 8x 3.5" hot-swap |
| **Drives** | 8x Seagate IronWolf Pro 16TB (ST16000NT001) |
| **Cache** | 2x WD Black SN750 480GB NVMe |
| **RAID Level** | RAID 6 (dual parity) |
| **Raw Capacity** | 128TB |
| **Usable Capacity** | ~96TB |
| **Fault Tolerance** | 2 drive failures |

### **RAID 6 Benefits**

```
RAID 6 Configuration:
┌────┬────┬────┬────┬────┬────┬────┬────┐
│ D1 │ D2 │ D3 │ D4 │ D5 │ D6 │ P1 │ P2 │  ← Data + Dual Parity
├────┼────┼────┼────┼────┼────┼────┼────┤
│ D1 │ D2 │ D3 │ D4 │ D5 │ P1 │ P2 │ D6 │  ← Parity distributed
├────┼────┼────┼────┼────┼────┼────┼────┤
│ D1 │ D2 │ D3 │ D4 │ P1 │ P2 │ D5 │ D6 │
└────┴────┴────┴────┴────┴────┴────┴────┘

✅ Survives 2 simultaneous drive failures
✅ Good read performance
✅ 6 drives worth of usable space (75% efficiency)
⚠️ Slower writes due to parity calculation
```

### **Volume Layout**

```
/volume1/ (Atlantis - ~96TB usable)
│
├── /docker/              # Container persistent data
│   ├── plex/
│   ├── immich/
│   ├── grafana/
│   └── ... (all stack data)
│
├── /media/               # Media library
│   ├── movies/           # 4K + 1080p movies
│   ├── tv/               # TV series
│   ├── music/            # Music library
│   └── audiobooks/       # Audiobook collection
│
├── /photos/              # Immich photo library
│   ├── library/          # Organized photos
│   └── upload/           # Incoming uploads
│
├── /documents/           # Paperless-NGX
│   ├── consume/          # Incoming documents
│   └── archive/          # Processed documents
│
├── /backups/             # Local backup storage
│   ├── calypso/          # Cross-NAS backups
│   └── vm-snapshots/     # VM backup images
│
└── /archive/             # Long-term cold storage
    └── old-projects/
```

### **NVMe SSD Cache**

- **Type**: Read-write cache
- **Drives**: 2x WD Black SN750 480GB
- **Configuration**: RAID 1 (mirrored for safety)
- **Purpose**: Accelerate frequently accessed data

---

## 🏢 Calypso - Secondary Storage

### **Hardware Configuration**

| Component | Specification |
|-----------|--------------|
| **NAS Model** | Synology DS723+ |
| **Drive Bays** | 2x 3.5" hot-swap |
| **Drives** | 2x Seagate IronWolf Pro 12TB (ST12000NT001) |
| **Cache** | 2x WD Black SN750 480GB NVMe |
| **RAID Level** | RAID 1 (mirrored) |
| **Raw Capacity** | 24TB |
| **Usable Capacity** | ~12TB |
| **Fault Tolerance** | 1 drive failure |

### **RAID 1 Benefits**

```
RAID 1 Configuration:
┌────────────────┐   ┌────────────────┐
│    Drive 1     │   │    Drive 2     │
│    (12TB)      │◄─►│    (12TB)      │  ← Mirror
│                │   │                │
│  All data is   │   │  Exact copy    │
│  written to    │   │  of Drive 1    │
│  both drives   │   │                │
└────────────────┘   └────────────────┘

✅ Survives 1 drive failure
✅ Fast read performance (can read from either)
✅ Simple recovery (just replace failed drive)
⚠️ 50% storage efficiency
```

### **Volume Layout**

```
/volume1/ (Calypso - ~12TB usable)
│
├── /docker/              # Container persistent data
│   ├── gitea/
│   ├── firefly/
│   ├── arr-suite/
│   └── ... (dev stacks)
│
├── /apt-cache/           # APT-Cacher-NG
│   └── cache/            # Debian package cache
│
├── /backups/             # Backup destination
│   ├── atlantis/         # Hyper Backup from Atlantis
│   └── databases/        # Database dumps
│
└── /development/         # Development data
    ├── repos/            # Git repositories
    └── projects/         # Project files
```

---

## 🖥️ Other Storage Systems

### **Guava - AI/ML Workstation**

| Component | Specification |
|-----------|--------------|
| **Primary** | 1TB NVMe SSD (OS + fast storage) |
| **Secondary** | 2x HDD in RAID 1 (~3TB usable) |
| **Purpose** | AI model storage, datasets, compute scratch |

### **Setillo - Monitoring**

| Component | Specification |
|-----------|--------------|
| **Storage** | 1TB single drive |
| **Purpose** | Prometheus metrics, AdGuard data |
| **Note** | Non-critical data, can be rebuilt |

### **Proxmox - VM Host**

| Component | Specification |
|-----------|--------------|
| **Storage** | ~500GB local SSD |
| **Purpose** | VM disk images |
| **Backup** | VMs backed up to Atlantis |

---

## 📦 Backup Strategy

### **3-2-1 Rule Implementation**

| Rule | Implementation | Status |
|------|----------------|--------|
| **3 Copies** | Original + Calypso + Backblaze | ✅ |
| **2 Media Types** | NAS HDDs + Cloud | ✅ |
| **1 Offsite** | Backblaze B2 | ✅ |

### **Backup Flow**

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  ATLANTIS   │────►│   CALYPSO   │────►│  BACKBLAZE  │
│  (Primary)  │     │   (Local)   │     │     B2      │
│             │     │             │     │  (Offsite)  │
│  Original   │     │   Hyper     │     │   Cloud     │
│    Data     │     │   Backup    │     │   Backup    │
└─────────────┘     └─────────────┘     └─────────────┘
       │                   │                   │
       │                   │                   │
       ▼                   ▼                   ▼
   Immediate          < 24 hours          < 24 hours
    Access             Recovery            Recovery
```

### **Backup Software**

| Tool | Source | Destination | Schedule |
|------|--------|-------------|----------|
| **Synology Hyper Backup** | Atlantis | Calypso | Daily |
| **Synology Cloud Sync** | Atlantis | Backblaze B2 | Daily |
| **Synology Hyper Backup** | Calypso | Backblaze B2 | Weekly |

### **What Gets Backed Up**

| Data Type | Priority | Frequency | Retention |
|-----------|----------|-----------|-----------|
| **Docker configs** | Critical | Daily | 30 days |
| **Databases** | Critical | Daily | 30 days |
| **Photos (Immich)** | High | Daily | Forever |
| **Documents** | High | Daily | 1 year |
| **Media library** | Medium | Weekly | Latest only |
| **VM snapshots** | Medium | Weekly | 4 versions |
| **Logs** | Low | Not backed up | N/A |

### **Recovery Time Objectives**

| Scenario | RTO Target | Recovery Method |
|----------|------------|-----------------|
| Single file recovery | < 1 hour | Hyper Backup restore |
| Service recovery | < 4 hours | Docker volume restore |
| Full NAS recovery | < 24 hours | Bare metal + B2 restore |
| Disaster recovery | < 48 hours | New hardware + B2 restore |

---

## 📂 Shared Storage (NFS/SMB)

### **Network Shares**

| Share | Protocol | Host | Access | Purpose |
|-------|----------|------|--------|---------|
| `/media` | SMB | Atlantis | Read-only (most), RW (arr) | Media streaming |
| `/photos` | SMB | Atlantis | RW (Immich user) | Photo backup |
| `/docker` | NFS | Atlantis | RW (Docker hosts) | Container data |
| `/backups` | SMB | Calypso | RW (backup service) | Backup destination |

### **Docker Volume Mounts**

Containers access NAS storage via NFS mounts:

```yaml
# Example: Plex accessing media
volumes:
  - /volume1/docker/plex:/config
  - /volume1/media:/media:ro
```

### **Permission Model**

```
NAS User: docker (UID 1000)
├── Owns /volume1/docker/
├── Read access to /volume1/media/
└── Write access to specific paths

NAS User: media (UID 1001)
├── Write access to /volume1/media/
└── Used by *arr suite for downloads
```

---

## 📈 Storage Monitoring

### **Metrics Collected**

| Metric | Tool | Alert Threshold |
|--------|------|-----------------|
| Disk usage | Prometheus + Node Exporter | > 85% |
| RAID health | Synology DSM | Degraded |
| Drive SMART | Synology DSM | Warning/Critical |
| I/O latency | Prometheus | > 100ms |
| Backup status | Hyper Backup | Failed |

### **Grafana Dashboard**

Storage dashboard shows:

- Volume utilization trends
- I/O throughput
- RAID rebuild status
- Drive temperatures
- Backup completion status

---

## 🔮 Storage Expansion Plan

### **Current Utilization**

| Host | Used | Total | % Used |
|------|------|-------|--------|
| Atlantis | ~60TB | 96TB | 62% |
| Calypso | ~12TB | 12TB | ~100% |

### **Future Expansion Options**

1. **Atlantis**: Already at max capacity (8 bays)
   - Replace 16TB drives with larger (24TB+) when available
   - Add expansion unit (DX517)

2. **Calypso**: At capacity
   - Replace 12TB drives with 20TB+ drives
   - Consider migration to larger NAS

3. **New NAS**: For cold/archive storage
   - Lower-powered unit for infrequent access
   - RAID 5 acceptable for archive data

---

## 🛠️ Maintenance Tasks

### **Regular Maintenance**

| Task | Frequency | Procedure |
|------|-----------|-----------|
| SMART check | Weekly | Review DSM health |
| Scrub | Monthly | Synology scheduled task |
| Backup verification | Monthly | Test restore of random files |
| Capacity review | Quarterly | Plan for growth |

### **Drive Replacement Procedure**

1. **Identify failed drive** via DSM notification
2. **Order replacement** (same or larger capacity)
3. **Hot-swap** failed drive
4. **Monitor rebuild** (can take 24-48 hours for large arrays)
5. **Verify RAID health** after rebuild completes

---

## 📚 Related Documentation

- **[Host Infrastructure](hosts.md)**: Server specifications
- **[Security Model](security.md)**: Backup encryption details
- **[Network Architecture](networking.md)**: NFS/SMB networking

---

*Storage infrastructure is critical. Regular monitoring and proactive maintenance prevent data loss.*
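
## 🧮 Worked Example: Capacity Arithmetic

The usable-capacity and utilization figures quoted in this guide follow from simple arithmetic: RAID 6 gives up two drives' worth of space to parity, RAID 1 keeps one drive's worth, and the disk-usage alert fires above 85%. The Python sketch below reproduces those numbers as a sanity check. It is only an illustration: the names `usable_tb`, `percent_used`, and `ALERT_THRESHOLD_PERCENT` are hypothetical and not part of any homelab tooling, the capacities are nominal TB that ignore filesystem and DSM overhead, and the "used" values are the approximate figures from the utilization table.

```python
"""Sanity-check the capacity figures quoted in this guide (illustrative only)."""

# Disk-usage alert threshold taken from the monitoring table (> 85%).
ALERT_THRESHOLD_PERCENT = 85.0

# Drives' worth of space given up to parity, and minimum drive counts.
_PARITY_DRIVES = {"RAID 5": 1, "RAID 6": 2}
_MIN_DRIVES = {"RAID 1": 2, "RAID 5": 3, "RAID 6": 4}


def usable_tb(drive_count, drive_tb, raid_level):
    """Return nominal usable capacity in TB for equal-sized drives."""
    if raid_level not in _MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {raid_level}")
    if drive_count < _MIN_DRIVES[raid_level]:
        raise ValueError(f"{raid_level} needs at least {_MIN_DRIVES[raid_level]} drives")
    if raid_level == "RAID 1":
        # Every drive holds the same data, so capacity equals one drive.
        return drive_tb
    return (drive_count - _PARITY_DRIVES[raid_level]) * drive_tb


def percent_used(used_tb, total_tb):
    """Return utilization as a percentage of usable capacity."""
    if total_tb <= 0:
        raise ValueError("total_tb must be positive")
    return 100.0 * used_tb / total_tb


def main():
    # (host, drive count, drive size in TB, RAID level, approx. TB used)
    hosts = [
        ("Atlantis", 8, 16, "RAID 6", 60),
        ("Calypso", 2, 12, "RAID 1", 12),
    ]
    for name, count, size_tb, level, used in hosts:
        total = usable_tb(count, size_tb, level)
        pct = percent_used(used, total)
        flag = "ALERT" if pct > ALERT_THRESHOLD_PERCENT else "ok"
        print(f"{name}: {total:.0f}TB usable, {pct:.1f}% used [{flag}]")


if __name__ == "__main__":
    main()
```

With the figures from this guide, Atlantis comes out at 96TB usable and 62.5% used (listed as 62% in the table), while Calypso at roughly 100% is well past the 85% alert threshold, which matches its "At capacity" status in the expansion plan.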