Files
homelab-optimized/docs/runbooks/synology-dsm-upgrade.md
Gitea Mirror Bot 6bc8605e2f
Some checks failed
Documentation / Build Docusaurus (push) Failing after 9s
Documentation / Deploy to GitHub Pages (push) Has been skipped
Sanitized mirror from private repository - 2026-03-11 09:26:00 UTC
2026-03-11 09:26:01 +00:00

623 lines
15 KiB
Markdown

# Synology DSM Upgrade Runbook
## Overview
This runbook provides a safe procedure for upgrading DiskStation Manager (DSM) on Synology NAS devices (Atlantis DS1823xs+ and Calypso DS723+). The procedure minimizes downtime and ensures data integrity during major and minor DSM upgrades.
## Prerequisites
- [ ] DSM admin credentials
- [ ] Complete backup of NAS (HyperBackup or external)
- [ ] Backup verification completed
- [ ] List of installed packages and their versions
- [ ] SSH access to NAS (for troubleshooting)
- [ ] Maintenance window scheduled (1-3 hours)
- [ ] All Docker containers documented and backed up
- [ ] Tailscale or alternative remote access configured
## Metadata
- **Estimated Time**: 1-3 hours (including backups and verification)
- **Risk Level**: Medium-High (system-level upgrade)
- **Requires Downtime**: Yes (30-60 minutes for upgrade itself)
- **Reversible**: Limited (can rollback but complicated)
- **Tested On**: 2026-02-14
## Upgrade Types
| Type | Example | Risk | Downtime | Reversibility |
|------|---------|------|----------|---------------|
| **Patch Update** | 7.2.1 → 7.2.2 | Low | 15-30 min | Easy |
| **Minor Update** | 7.2 → 7.3 | Medium | 30-60 min | Moderate |
| **Major Update** | 7.x → 8.0 | High | 60-120 min | Difficult |
## Pre-Upgrade Planning
### Step 1: Check Compatibility
Before upgrading, verify compatibility:
```bash
# SSH to NAS
ssh admin@atlantis # or calypso
# Check current DSM version
cat /etc.defaults/VERSION
# Check hardware compatibility
# Visit: https://www.synology.com/en-us/dsm
# Verify your model supports the target DSM version
# Check RAM requirements (DSM 7.2+ needs at least 1GB)
free -h
# Check disk space (need at least 5GB free in system partition)
df -h
```
### Step 2: Document Current State
Create a pre-upgrade snapshot of your configuration:
```bash
# Document installed packages
# DSM UI → Package Center → Installed
# Take screenshot or note down:
# - Package names and versions
# - Custom configurations
# Export Docker Compose files (already in git)
cd ~/Documents/repos/homelab
git status # Ensure all configs are committed
# Document running containers
ssh atlantis "docker ps --format 'table {{.Names}}\t{{.Image}}\t{{.Status}}' > /volume1/docker/pre-upgrade-containers.txt"
ssh calypso "docker ps --format 'table {{.Names}}\t{{.Image}}\t{{.Status}}' > /volume1/docker/pre-upgrade-containers.txt"
# Export package list
ssh atlantis "synopkg list > /volume1/docker/pre-upgrade-packages.txt"
ssh calypso "synopkg list > /volume1/docker/pre-upgrade-packages.txt"
```
### Step 3: Backup Everything
**Critical**: Complete a full backup before proceeding.
```bash
# 1. Backup via HyperBackup (if configured)
# DSM UI → HyperBackup → Backup Now
# 2. Export DSM configuration
# DSM UI → Control Panel → Update & Restore → Configuration Backup → Back Up Configuration
# 3. Backup Docker volumes
cd ~/Documents/repos/homelab
./backup.sh
# 4. Snapshot (if using Btrfs)
# Storage Manager → Storage Pool → Snapshots → Take Snapshot
# 5. Verify backups
ls -lh /volume1/backups/
# Ensure backup completed successfully
```
### Step 4: Notify Users
If other users rely on your homelab:
```bash
# Send notification (via your notification system)
curl -H "Title: Scheduled Maintenance" \
-H "Priority: high" \
-H "Tags: maintenance" \
-d "DSM upgrade scheduled for [DATE/TIME]. Services will be unavailable for approximately 1-2 hours." \
https://ntfy.sh/REDACTED_TOPIC
# Or send notification via Signal/Discord/etc.
```
### Step 5: Plan Rollback Strategy
Document your rollback plan:
- [ ] Backup location verified: ___________
- [ ] Restore procedure tested: Yes/No
- [ ] Alternative access method ready (direct keyboard/monitor)
- [ ] Support contact available if needed
## Upgrade Procedure
### Step 1: Download DSM Update
**Option A: Via DSM UI (Recommended)**
1. Log in to DSM web interface
2. **Control Panel****Update & Restore**
3. **DSM Update** tab
4. If update available, click **Download** (don't install yet)
5. Wait for download to complete
6. Read release notes carefully
**Option B: Manual Download**
1. Visit Synology Download Center
2. Find your model (DS1823xs+ or DS723+)
3. Download appropriate DSM version
4. Upload via DSM → **Manual DSM Update**
### Step 2: Prepare for Downtime
```bash
# Stop non-critical Docker containers (optional, reduces memory pressure)
ssh atlantis
docker stop $(docker ps -q --filter "name=pattern") # Stop specific containers
# Or stop all non-critical containers
# Review which containers can be safely stopped
docker ps
docker stop container1 container2 container3
# Leave critical services running:
# - Portainer (for post-upgrade management)
# - Monitoring (to track upgrade progress)
# - Core network services (AdGuard, VPN if critical)
```
### Step 3: Initiate Upgrade
**Via DSM UI**:
1. **Control Panel****Update & Restore****DSM Update**
2. Click **Update Now**
3. Review release notes and warnings
4. Check **Yes, I understand I need to perform a backup before updating DSM**
5. Click **OK** to start
**Via SSH** (advanced, not recommended unless necessary):
```bash
# SSH to NAS
ssh admin@atlantis
# Start upgrade manually
sudo synoupgrade --start /volume1/@tmp/upd@te/update.pat
# Monitor progress
tail -f /var/log/messages
```
### Step 4: Monitor Upgrade Progress
During upgrade, you'll see:
1. **Checking system**: Verifying prerequisites
2. **Downloading**: If not pre-downloaded
3. **Installing**: Actual upgrade process (30-45 minutes)
4. **Optimizing system**: Post-install tasks
5. **Reboot**: System will restart
**Monitoring via SSH** (if you have access during upgrade):
```bash
# Watch upgrade progress
tail -f /var/log/upgrade.log
# Or watch system messages
tail -f /var/log/messages | grep -i upgrade
```
**Expected timeline**:
- Preparation: 5-10 minutes
- Installation: 30-45 minutes
- First reboot: 3-5 minutes
- Optimization: 10-20 minutes
- Final reboot: 3-5 minutes
- **Total**: 60-90 minutes
### Step 5: Wait for Completion
**⚠️ IMPORTANT**: Do not power off or interrupt the upgrade!
Signs of normal upgrade:
- DSM UI becomes inaccessible
- NAS may beep once (starting upgrade)
- Disk lights active
- NAS will reboot 1-2 times
- Final beep when complete
### Step 6: First Login After Upgrade
1. Wait for NAS to complete all restarts
2. Access DSM UI (may take 5-10 minutes after last reboot)
3. Log in with admin credentials
4. You may see "Optimization in progress" - this is normal
5. Review the "What's New" page
6. Accept any new terms/agreements
## Post-Upgrade Verification
### Step 1: Verify System Health
```bash
# SSH to NAS
ssh admin@atlantis
# Check DSM version
cat /etc.defaults/VERSION
# Should show new version
# Check system status
sudo syno_disk_check
# Check RAID status
cat /proc/mdstat
# Check disk health
sudo smartctl -a /dev/sda
# Verify storage pools
synospace --get
```
Via DSM UI:
- **Storage Manager** → Verify all pools are "Healthy"
- **Resource Monitor** → Check CPU, RAM, network
- **Log Center** → Review any errors during upgrade
### Step 2: Verify Packages
```bash
# Check all packages are running
synopkg list
# Compare with pre-upgrade package list
diff /volume1/docker/pre-upgrade-packages.txt <(synopkg list)
# Start any stopped packages
# DSM UI → Package Center → Installed
# Check each package, start if needed
```
Common packages to verify:
- [ ] Docker
- [ ] Synology Drive
- [ ] Hyper Backup
- [ ] Snapshot Replication
- [ ] Any other installed packages
### Step 3: Verify Docker Containers
```bash
# SSH to NAS
ssh atlantis
# Check Docker is running
docker --version
docker info
# Check all containers
docker ps -a
# Compare with pre-upgrade state
diff /volume1/docker/pre-upgrade-containers.txt <(docker ps --format 'table {{.Names}}\t{{.Image}}\t{{.Status}}')
# Start stopped containers
docker start $(docker ps -a -q -f status=exited)
# Check container logs for errors
docker ps --format "{{.Names}}" | xargs -I {} sh -c 'echo "=== {} ===" && docker logs --tail 20 {}'
```
### Step 4: Test Key Services
Verify critical services are working:
```bash
# Test network connectivity
ping -c 4 8.8.8.8
curl -I https://google.com
# Test Docker networking
docker exec [container] ping -c 2 8.8.8.8
# Test Portainer access
curl http://localhost:9000
# Test Plex
curl http://localhost:32400/web
# Test monitoring
curl http://localhost:3000 # Grafana
curl http://localhost:9090 # Prometheus
```
Via browser:
- [ ] Portainer accessible
- [ ] Grafana dashboards loading
- [ ] Plex/Jellyfin streaming works
- [ ] File shares accessible
- [ ] SSO (Authentik) working
### Step 5: Verify Scheduled Tasks
```bash
# Check cron jobs
crontab -l
# Via DSM UI
# Control Panel → Task Scheduler
# Verify all tasks are enabled
```
### Step 6: Test Remote Access
- [ ] Tailscale VPN working
- [ ] External access via domain (if configured)
- [ ] SSH access working
- [ ] Mobile app access working (DS File, DS Photo, etc.)
## Post-Upgrade Optimization
### Step 1: Update Packages
After DSM upgrade, packages may need updates:
1. **Package Center****Update** tab
2. Update available packages
3. Prioritize critical packages:
- Docker (if updated)
- Surveillance Station (if used)
- Drive, Office, etc.
### Step 2: Review New Features
DSM upgrades often include new features:
1. Review "What's New" page
2. Check for new security features
3. Review changed settings
4. Update documentation if needed
### Step 3: Re-enable Auto-Updates (if disabled)
```bash
# Via DSM UI
# Control Panel → Update & Restore → DSM Update
# Check "Notify me when DSM updates are available"
# Or "Install latest DSM updates automatically" (if you trust auto-updates)
```
### Step 4: Update Documentation
```bash
cd ~/Documents/repos/homelab
# Update infrastructure docs
nano docs/infrastructure/INFRASTRUCTURE_OVERVIEW.md
# Note DSM version upgrade
# Document any configuration changes
# Update troubleshooting docs if procedures changed
git add .
git commit -m "Update docs: DSM upgraded to X.X on Atlantis/Calypso"
git push
```
## Troubleshooting
### Issue: Upgrade Fails or Stalls
**Symptoms**: Progress bar stuck, no activity for >30 minutes
**Solutions**:
```bash
# If you have SSH access:
ssh admin@atlantis
# Check if upgrade process is running
ps aux | grep -i upgrade
# Check system logs
tail -100 /var/log/messages
tail -100 /var/log/upgrade.log
# Check disk space
df -h
# If completely stuck (>1 hour no progress):
# 1. Do NOT force reboot unless absolutely necessary
# 2. Contact Synology support first
# 3. As last resort, force reboot via physical button
```
### Issue: NAS Won't Boot After Upgrade
**Symptoms**: Cannot access DSM UI, NAS beeping continuously
**Solutions**:
1. **Check beep pattern** (indicates specific error)
- 1 beep: Normal boot
- 3 beeps: RAM issue
- 4 beeps: Disk issue
- Continuous: Critical failure
2. **Try Safe Mode**:
- Power off NAS
- Hold reset button
- Power on while holding reset
- Hold for 4 seconds until beep
- Release and wait for boot
3. **Check via Synology Assistant**:
- Download Synology Assistant on PC
- Scan network for NAS
- May show recovery mode option
4. **Last resort: Reinstall DSM**:
- Download latest DSM .pat file
- Access via http://[nas-ip]:5000
- Install DSM (will not erase data)
### Issue: Docker Not Working After Upgrade
**Symptoms**: Docker containers won't start, Docker package shows stopped
**Solutions**:
```bash
# SSH to NAS
ssh admin@atlantis
# Check Docker status
sudo synoservicectl --status pkgctl-Docker
# Restart Docker
sudo synoservicectl --restart pkgctl-Docker
# If Docker won't start, check logs
cat /var/log/docker.log
# Reinstall Docker package (preserves volumes)
# Via DSM UI → Package Center → Docker → Uninstall
# Then reinstall Docker
# Your volumes and data will be preserved
```
### Issue: Network Shares Not Accessible
**Symptoms**: Can't connect to SMB/NFS shares
**Solutions**:
```bash
# Check share services
sudo synoservicectl --status smbd # SMB
sudo synoservicectl --status nfsd # NFS
# Restart services
sudo synoservicectl --restart smbd
sudo synoservicectl --restart nfsd
# Check firewall
# Control Panel → Security → Firewall
# Ensure file sharing ports allowed
```
### Issue: Performance Degradation After Upgrade
**Symptoms**: Slow response, high CPU/RAM usage
**Solutions**:
```bash
# Check what's using resources
top
htop # If installed
# Via DSM UI → Resource Monitor
# Identify resource-hungry processes
# Common causes:
# 1. Indexing in progress (Photos, Drive, Universal Search)
# - Wait for indexing to complete (can take hours)
# 2. Optimization running
# - Check: ps aux | grep optimize
# - Let it complete
# 3. Too many containers started at once
# - Stagger container startup
```
## Rollback Procedure
⚠️ **WARNING**: Rollback is complex and risky. Only attempt if absolutely necessary.
### Method 1: DSM Archive (If Available)
```bash
# SSH to NAS
ssh admin@atlantis
# Check if previous DSM version archived
ls -la /volume1/@appstore/
# If archive exists, you can attempt rollback
# CAUTION: This is not officially supported and may cause data loss
```
### Method 2: Restore from Backup
If upgrade caused critical issues:
1. REDACTED_APP_PASSWORD
2. Restore from HyperBackup
3. Or restore from configuration backup:
- **Control Panel** → **Update & Restore**
- **Configuration Backup** → **Restore**
### Method 3: Fresh Install (Nuclear Option)
⚠️ **DANGER**: This will erase everything. Only for catastrophic failure.
1. Download previous DSM version
2. Install via Synology Assistant in "Recovery Mode"
3. Restore from complete backup
4. Reconfigure everything
## Best Practices
### Timing
- Schedule upgrades during low-usage periods
- Allow 3-4 hour maintenance window
- Don't upgrade before important events
- Wait 2-4 weeks after major DSM release (let others find bugs)
### Testing
- If you have 2 NAS units, upgrade one first
- Test on less critical NAS before primary
- Read community forums for known issues
- Review Synology release notes thoroughly
### Preparation
- Always complete full backup
- Test backup restore before upgrade
- Document all configurations
- Have physical access to NAS if possible
- Keep Synology Assistant installed on PC
### Post-Upgrade
- Monitor closely for 24-48 hours
- Check logs daily for first week
- Report any bugs to Synology
- Update your documentation
## Verification Checklist
- [ ] DSM upgraded to target version
- [ ] All storage pools healthy
- [ ] All packages running
- [ ] All Docker containers running
- [ ] Network shares accessible
- [ ] Remote access working (Tailscale, QuickConnect)
- [ ] Scheduled tasks running
- [ ] Monitoring dashboards functional
- [ ] Backups completing successfully
- [ ] No errors in system logs
- [ ] Performance normal
- [ ] Documentation updated
## Related Documentation
- [Infrastructure Overview](../infrastructure/INFRASTRUCTURE_OVERVIEW.md)
- [Emergency Access Guide](../troubleshooting/EMERGENCY_ACCESS_GUIDE.md)
- [Disaster Recovery](../troubleshooting/disaster-recovery.md)
- [Synology Disaster Recovery](../troubleshooting/synology-disaster-recovery.md)
- [Backup Strategies](../admin/backup-strategies.md)
## Additional Resources
- [Synology DSM Release Notes](https://www.synology.com/en-us/releaseNote/DSM)
- [Synology Community Forums](https://community.synology.com/)
- [Synology Knowledge Base](https://kb.synology.com/)
## Change Log
- 2026-02-14 - Initial creation
- 2026-02-14 - Added comprehensive troubleshooting and rollback procedures