homelab-optimized/docs/runbooks/synology-dsm-upgrade.md
Sanitized mirror from private repository - 2026-04-20 01:24:42 UTC

Synology DSM Upgrade Runbook

Overview

This runbook provides a safe procedure for upgrading DiskStation Manager (DSM) on Synology NAS devices (Atlantis DS1823xs+ and Calypso DS723+). The procedure minimizes downtime and ensures data integrity during major and minor DSM upgrades.

Prerequisites

  • DSM admin credentials
  • Complete backup of NAS (HyperBackup or external)
  • Backup verification completed
  • List of installed packages and their versions
  • SSH access to NAS (for troubleshooting)
  • Maintenance window scheduled (1-3 hours)
  • All Docker containers documented and backed up
  • Tailscale or alternative remote access configured

Metadata

  • Estimated Time: 1-3 hours (including backups and verification)
  • Risk Level: Medium-High (system-level upgrade)
  • Requires Downtime: Yes (30-60 minutes for upgrade itself)
  • Reversible: Limited (rollback is possible but complicated)
  • Tested On: 2026-02-14

Upgrade Types

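| Type | Example | Risk | Downtime | Reversibility |
|------|---------|------|----------|---------------|
| Patch Update | 7.2.1 → 7.2.2 | Low | 15-30 min | Easy |
| Minor Update | 7.2 → 7.3 | Medium | 30-60 min | Moderate |
| Major Update | 7.x → 8.0 | High | 60-120 min | Difficult |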
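
To decide which row applies, the two version strings can be compared field by field. The helper below is a minimal sketch; `upgrade_type` is a hypothetical name, and it assumes plain dotted versions such as `7.2.1` (identical versions are reported as `patch`).

```shell
#!/bin/sh
# upgrade_type FROM TO
# Hypothetical helper: prints major, minor, or patch for two dotted versions.
upgrade_type() {
    from=$1 to=$2
    f_major=$(echo "$from" | cut -d. -f1)
    t_major=$(echo "$to" | cut -d. -f1)
    # -s suppresses output when the string has no dot (no minor field)
    f_minor=$(echo "$from" | cut -s -d. -f2)
    t_minor=$(echo "$to" | cut -s -d. -f2)
    if [ "$f_major" != "$t_major" ]; then
        echo major
    elif [ "${f_minor:-0}" != "${t_minor:-0}" ]; then
        echo minor
    else
        echo patch
    fi
}

# Example:
# upgrade_type 7.2.1 7.2.2
```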

Pre-Upgrade Planning

Step 1: Check Compatibility

Before upgrading, verify compatibility:

# SSH to NAS
ssh admin@atlantis  # or calypso

# Check current DSM version
cat /etc.defaults/VERSION

# Check hardware compatibility
# Visit: https://www.synology.com/en-us/dsm
# Verify your model supports the target DSM version

# Check RAM requirements (DSM 7.2+ needs at least 1GB)
free -h

# Check disk space (need at least 5GB free in system partition)
df -h
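
The version and disk-space checks above can be wrapped in two small helpers so they fail loudly instead of relying on reading the output by eye. This is a sketch under assumptions: the function names are made up for illustration, and it assumes `/etc.defaults/VERSION` contains `key="value"` lines including `productversion` and `buildnumber`.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not Synology tooling.

# dsm_version [FILE]
# Prints "<productversion>-<buildnumber>" from a VERSION-style file.
# Assumes lines of the form key="value".
dsm_version() {
    file=${1:-/etc.defaults/VERSION}
    awk -F'"' '
        /^productversion=/ { pv = $2 }
        /^buildnumber=/    { bn = $2 }
        END { if (pv == "") exit 1; print pv "-" bn }
    ' "$file"
}

# has_free_kb PATH MIN_KB
# Succeeds if the filesystem holding PATH has at least MIN_KB available.
has_free_kb() {
    path=$1 min_kb=$2
    avail=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
    [ "$avail" -ge "$min_kb" ]
}

# Example on the NAS (5 GB = 5242880 KB):
# dsm_version
# has_free_kb / 5242880 || echo "less than 5 GB free on /"
```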

Step 2: Document Current State

Create a pre-upgrade snapshot of your configuration:

# Document installed packages
# DSM UI → Package Center → Installed
# Take screenshot or note down:
# - Package names and versions
# - Custom configurations

# Export Docker Compose files (already in git)
cd ~/Documents/repos/homelab
git status  # Ensure all configs are committed

# Document running containers
ssh atlantis "docker ps --format 'table {{.Names}}\t{{.Image}}\t{{.Status}}' > /volume1/docker/pre-upgrade-containers.txt"
ssh calypso "docker ps --format 'table {{.Names}}\t{{.Image}}\t{{.Status}}' > /volume1/docker/pre-upgrade-containers.txt"

# Export package list
ssh atlantis "synopkg list > /volume1/docker/pre-upgrade-packages.txt"
ssh calypso "synopkg list > /volume1/docker/pre-upgrade-packages.txt"
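
A redirect like the ones above will happily create an empty file if the command fails, which is easy to miss until the post-upgrade comparison. One possible guard is a small wrapper, sketched here with the hypothetical name `capture`, that refuses to accept a failed command or an empty result.

```shell
#!/bin/sh
# capture OUTFILE CMD [ARGS...]
# Hypothetical helper: runs CMD, writes its stdout to OUTFILE, and fails
# if CMD fails or the resulting file is empty.
capture() {
    out=$1
    shift
    if ! "$@" > "$out"; then
        echo "capture: command failed: $*" >&2
        return 1
    fi
    if [ ! -s "$out" ]; then
        echo "capture: $out is empty" >&2
        return 1
    fi
}

# Example, run on the NAS with the paths used above:
# capture /volume1/docker/pre-upgrade-packages.txt synopkg list
# capture /volume1/docker/pre-upgrade-containers.txt \
#     docker ps --format 'table {{.Names}}\t{{.Image}}\t{{.Status}}'
```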

Step 3: Backup Everything

Critical: Complete a full backup before proceeding.

# 1. Backup via HyperBackup (if configured)
# DSM UI → HyperBackup → Backup Now

# 2. Export DSM configuration
# DSM UI → Control Panel → Update & Restore → Configuration Backup → Back Up Configuration

# 3. Backup Docker volumes
cd ~/Documents/repos/homelab
./backup.sh

# 4. Snapshot (if using Btrfs)
# Storage Manager → Storage Pool → Snapshots → Take Snapshot

# 5. Verify backups (run on the NAS, or prefix with: ssh atlantis)
ls -lh /volume1/backups/
# Confirm the newest backup has a current timestamp and a non-zero size
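
The "verify backups" step can be made less dependent on eyeballing `ls` output. The sketch below checks that a directory contains at least one non-empty file modified recently. `backup_is_fresh` is a hypothetical name, and it relies on `find -mmin`, which GNU and BusyBox `find` provide but POSIX does not guarantee. It only checks freshness, not whether the backup can actually be restored.

```shell
#!/bin/sh
# backup_is_fresh DIR MAX_AGE_MIN
# Hypothetical helper: succeeds if DIR contains at least one non-empty
# regular file modified within the last MAX_AGE_MIN minutes.
backup_is_fresh() {
    dir=$1 max_age=$2
    [ -d "$dir" ] || return 1
    recent=$(find "$dir" -type f -size +0 -mmin "-$max_age" | head -n 1)
    [ -n "$recent" ]
}

# Example on the NAS (24 hours = 1440 minutes):
# backup_is_fresh /volume1/backups 1440 || echo "no recent backup found"
```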

Step 4: Notify Users

If other users rely on your homelab:

# Send notification (via your notification system)
curl -H "Title: Scheduled Maintenance" \
     -H "Priority: high" \
     -H "Tags: maintenance" \
     -d "DSM upgrade scheduled for [DATE/TIME]. Services will be unavailable for approximately 1-2 hours." \
     https://ntfy.sh/REDACTED_TOPIC

# Or send notification via Signal/Discord/etc.

Step 5: Plan Rollback Strategy

Document your rollback plan:

  • Backup location verified: ___________
  • Restore procedure tested: Yes/No
  • Alternative access method ready (direct keyboard/monitor)
  • Support contact available if needed

Upgrade Procedure

Step 1: Download DSM Update

Option A: Via DSM UI (Recommended)

  1. Log in to DSM web interface
  2. Go to Control Panel → Update & Restore
  3. Open the DSM Update tab
  4. If an update is available, click Download (don't install yet)
  5. Wait for download to complete
  6. Read release notes carefully

Option B: Manual Download

  1. Visit Synology Download Center
  2. Find your model (DS1823xs+ or DS723+)
  3. Download appropriate DSM version
  4. Upload via DSM → Manual DSM Update

Step 2: Prepare for Downtime

# Stop non-critical Docker containers (optional, reduces memory pressure)
ssh atlantis
docker stop $(docker ps -q --filter "name=pattern")  # Stop specific containers

# Or stop all non-critical containers
# Review which containers can be safely stopped
docker ps
docker stop container1 container2 container3

# Leave critical services running:
# - Portainer (for post-upgrade management)
# - Monitoring (to track upgrade progress)
# - Core network services (AdGuard, VPN if critical)
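
Rather than typing container names by hand, the list of containers to stop can be derived by filtering out the critical ones. This is a sketch only: the `KEEP_PATTERN` value is an assumption based on the services named above and must be adjusted to the real container names. `xargs -r` is a GNU/BusyBox extension that skips the command when the input is empty.

```shell
#!/bin/sh
# Name prefixes to keep running. This list is an assumption; adjust it
# to match your actual container names.
KEEP_PATTERN='^(portainer|adguard|prometheus|grafana)'

# noncritical
# Reads container names on stdin and prints those NOT matching KEEP_PATTERN.
noncritical() {
    # grep exits 1 when nothing is printed; that is not an error here
    grep -Ev "$KEEP_PATTERN" || true
}

# Example on the NAS (review the list before stopping anything):
# docker ps --format '{{.Names}}' | noncritical
# docker ps --format '{{.Names}}' | noncritical | xargs -r docker stop
```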

Step 3: Initiate Upgrade

Via DSM UI:

  1. Go to Control Panel → Update & Restore → DSM Update
  2. Click Update Now
  3. Review release notes and warnings
  4. Check Yes, I understand I need to perform a backup before updating DSM
  5. Click OK to start

Via SSH (advanced, not recommended unless necessary):

# SSH to NAS
ssh admin@atlantis

# Start upgrade manually
sudo synoupgrade --start /volume1/@tmp/upd@te/update.pat

# Monitor progress
tail -f /var/log/messages

Step 4: Monitor Upgrade Progress

During upgrade, you'll see:

  1. Checking system: Verifying prerequisites
  2. Downloading: If not pre-downloaded
  3. Installing: Actual upgrade process (30-45 minutes)
  4. Optimizing system: Post-install tasks
  5. Reboot: System will restart

Monitoring via SSH (if you have access during upgrade):

# Watch upgrade progress
tail -f /var/log/upgrade.log

# Or watch system messages
tail -f /var/log/messages | grep -i upgrade

Expected timeline:

  • Preparation: 5-10 minutes
  • Installation: 30-45 minutes
  • First reboot: 3-5 minutes
  • Optimization: 10-20 minutes
  • Final reboot: 3-5 minutes
  • Total: 60-90 minutes

Step 5: Wait for Completion

⚠️ IMPORTANT: Do not power off or interrupt the upgrade!

Signs of normal upgrade:

  • DSM UI becomes inaccessible
  • NAS may beep once (starting upgrade)
  • Disk lights active
  • NAS will reboot 1-2 times
  • Final beep when complete
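
Instead of refreshing the browser, a workstation can poll until the DSM web UI answers again. The helper below is a generic retry loop; `wait_until` is a hypothetical name, and the example assumes DSM is listening on its default HTTPS port 5001 with a certificate that may be self-signed (hence `-k`).

```shell
#!/bin/sh
# wait_until TRIES DELAY_SEC CMD [ARGS...]
# Hypothetical helper: re-runs CMD until it succeeds, up to TRIES attempts,
# sleeping DELAY_SEC between attempts.
wait_until() {
    tries=$1 delay=$2
    shift 2
    i=1
    while [ "$i" -le "$tries" ]; do
        if "$@" > /dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        [ "$i" -le "$tries" ] && sleep "$delay"
    done
    return 1
}

# Example from a workstation: poll every 30 s for up to 60 attempts.
# wait_until 60 30 curl -ksf -o /dev/null https://atlantis:5001/ \
#     && echo "DSM UI is back" || echo "DSM UI still unreachable"
```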

Step 6: First Login After Upgrade

  1. Wait for NAS to complete all restarts
  2. Access DSM UI (may take 5-10 minutes after last reboot)
  3. Log in with admin credentials
  4. You may see "Optimization in progress" - this is normal
  5. Review the "What's New" page
  6. Accept any new terms/agreements

Post-Upgrade Verification

Step 1: Verify System Health

# SSH to NAS
ssh admin@atlantis

# Check DSM version
cat /etc.defaults/VERSION
# Should show new version

# Check system status
sudo syno_disk_check

# Check RAID status
cat /proc/mdstat

# Check disk health
sudo smartctl -a /dev/sda

# Verify storage pools
synospace --get

Via DSM UI:

  • Storage Manager → Verify all pools are "Healthy"
  • Resource Monitor → Check CPU, RAM, network
  • Log Center → Review any errors during upgrade
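
Reading `/proc/mdstat` by eye is error-prone on a NAS with many arrays. In that file a healthy array shows a member map such as `[UU]`, while a missing member appears as an underscore, for example `[U_]`. The sketch below prints only the arrays with a missing member; `degraded_arrays` is a hypothetical name.

```shell
#!/bin/sh
# degraded_arrays [FILE]
# Hypothetical helper: prints the name of every md array whose member map
# contains an underscore (a missing or failed member).
degraded_arrays() {
    awk '
        /^md[0-9]+ :/      { name = $1 }     # remember the current array
        /\[[U_]*_[U_]*\]/  { print name }    # member map with a gap
    ' "${1:-/proc/mdstat}"
}

# Example on the NAS:
# degraded_arrays || true
# [ -z "$(degraded_arrays)" ] && echo "all md arrays have every member"
```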

Step 2: Verify Packages

# Check all packages are running
synopkg list

# Compare with pre-upgrade package list
diff /volume1/docker/pre-upgrade-packages.txt <(synopkg list)

# Start any stopped packages
# DSM UI → Package Center → Installed
# Check each package, start if needed

Common packages to verify:

  • Docker
  • Synology Drive
  • Hyper Backup
  • Snapshot Replication
  • Any other installed packages
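
A raw `diff` of the two package lists flags every package whose version changed, which is expected after an upgrade. To find packages that disappeared entirely, the names can be compared without versions. This sketch assumes each `synopkg list` line looks like `Name-version: description`; that format is an assumption, so check it against your saved file first. The function names are hypothetical.

```shell
#!/bin/sh
# pkg_names FILE
# Strips "-<version>: description" from each line, leaving package names.
# Assumes the version starts at the first "-" followed by a digit.
pkg_names() {
    sed 's/-[0-9][^:]*:.*$//' "$1" | sort -u
}

# missing_packages PRE POST
# Prints package names present in PRE but absent from POST.
missing_packages() {
    pre=$(mktemp) post=$(mktemp)
    pkg_names "$1" > "$pre"
    pkg_names "$2" > "$post"
    comm -23 "$pre" "$post"
    rm -f "$pre" "$post"
}

# Example on the NAS:
# synopkg list > /tmp/post-upgrade-packages.txt
# missing_packages /volume1/docker/pre-upgrade-packages.txt \
#     /tmp/post-upgrade-packages.txt
```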

Step 3: Verify Docker Containers

# SSH to NAS
ssh atlantis

# Check Docker is running
docker --version
docker info

# Check all containers
docker ps -a

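# Compare with pre-upgrade state (names and images only; the Status column
# includes uptime, so it always differs after a reboot)
diff <(awk '{print $1, $2}' /volume1/docker/pre-upgrade-containers.txt | sort) \
     <(docker ps --format 'table {{.Names}}\t{{.Image}}' | awk '{print $1, $2}' | sort)

# Start stopped containers (this starts every exited container, including
# any you stopped on purpose; pass names explicitly to be selective)
docker start $(docker ps -a -q -f status=exited)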

# Check container logs for errors
docker ps --format "{{.Names}}" | xargs -I {} sh -c 'echo "=== {} ===" && docker logs --tail 20 {}'
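
To restart only the containers that were running before the upgrade, the saved table can be compared against the current list of running names. This is a sketch with a hypothetical function name; it assumes the saved file is the `table` output written during pre-upgrade planning, with a header row and the container name in the first column.

```shell
#!/bin/sh
# not_running_now PRE_FILE
# Hypothetical helper: reads currently running container names on stdin
# and prints names listed in PRE_FILE that are absent from stdin.
not_running_now() {
    now=$(mktemp) before=$(mktemp)
    sort -u > "$now"                                      # stdin: current names
    awk 'NR > 1 { print $1 }' "$1" | sort -u > "$before"  # skip header row
    comm -23 "$before" "$now"
    rm -f "$now" "$before"
}

# Example on the NAS:
# docker ps --format '{{.Names}}' \
#     | not_running_now /volume1/docker/pre-upgrade-containers.txt \
#     | xargs -r docker start
```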

Step 4: Test Key Services

Verify critical services are working:

# Test network connectivity
ping -c 4 8.8.8.8
curl -I https://google.com

# Test Docker networking
docker exec [container] ping -c 2 8.8.8.8

# Test Portainer access
curl http://localhost:9000

# Test Plex
curl http://localhost:32400/web

# Test monitoring
curl http://localhost:3000  # Grafana
curl http://localhost:9090  # Prometheus

Via browser:

  • Portainer accessible
  • Grafana dashboards loading
  • Plex/Jellyfin streaming works
  • File shares accessible
  • SSO (Authentik) working
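
The individual `curl` calls above print whole response bodies, which makes failures hard to spot. One option is to reduce each probe to an OK/FAIL line based on the HTTP status code. The sketch below treats 2xx and 3xx as healthy, since some of these UIs answer the root URL with a redirect; the function names are hypothetical and the ports are the ones used in this runbook.

```shell
#!/bin/sh
# http_ok CODE
# Succeeds for 2xx and 3xx HTTP status codes.
http_ok() {
    case $1 in
        2[0-9][0-9] | 3[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# check_url NAME URL
# Prints OK or FAIL with the status code curl reported (000 = no response).
check_url() {
    code=$(curl -s -o /dev/null -m 10 -w '%{http_code}' "$2") || code=000
    if http_ok "$code"; then
        echo "OK   $1 ($code)"
    else
        echo "FAIL $1 ($code)"
        return 1
    fi
}

# Example on the NAS:
# check_url portainer  http://localhost:9000
# check_url plex       http://localhost:32400/web
# check_url grafana    http://localhost:3000
# check_url prometheus http://localhost:9090
```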

Step 5: Verify Scheduled Tasks

# Check cron jobs (DSM keeps system cron entries in /etc/crontab and does not
# ship the crontab command)
cat /etc/crontab

# Via DSM UI
# Control Panel → Task Scheduler
# Verify all tasks are enabled

Step 6: Test Remote Access

  • Tailscale VPN working
  • External access via domain (if configured)
  • SSH access working
  • Mobile app access working (DS File, DS Photo, etc.)

Post-Upgrade Optimization

Step 1: Update Packages

After DSM upgrade, packages may need updates:

  1. Open Package Center → Update tab
  2. Update available packages
  3. Prioritize critical packages:
    • Docker (if updated)
    • Surveillance Station (if used)
    • Drive, Office, etc.

Step 2: Review New Features

DSM upgrades often include new features:

  1. Review "What's New" page
  2. Check for new security features
  3. Review changed settings
  4. Update documentation if needed

Step 3: Re-enable Auto-Updates (if disabled)

# Via DSM UI
# Control Panel → Update & Restore → DSM Update
# Check "Notify me when DSM updates are available"
# Or "Install latest DSM updates automatically" (if you trust auto-updates)

Step 4: Update Documentation

cd ~/Documents/repos/homelab

# Update infrastructure docs
nano docs/infrastructure/INFRASTRUCTURE_OVERVIEW.md

# Note DSM version upgrade
# Document any configuration changes
# Update troubleshooting docs if procedures changed

git add .
git commit -m "Update docs: DSM upgraded to X.X on Atlantis/Calypso"
git push

Troubleshooting

Issue: Upgrade Fails or Stalls

Symptoms: Progress bar stuck, no activity for >30 minutes

Solutions:

# If you have SSH access:
ssh admin@atlantis

# Check if upgrade process is running
ps aux | grep -i upgrade

# Check system logs
tail -100 /var/log/messages
tail -100 /var/log/upgrade.log

# Check disk space
df -h

# If completely stuck (>1 hour no progress):
# 1. Do NOT force reboot unless absolutely necessary
# 2. Contact Synology support first
# 3. As last resort, force reboot via physical button
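
"No activity for more than 30 minutes" can be checked against the log files rather than the progress bar. The sketch below reports whether a log file has been written to recently; `log_active` is a hypothetical name, and it relies on `find -mmin` (GNU and BusyBox). A quiet log is only a hint, not proof that the upgrade has stalled.

```shell
#!/bin/sh
# log_active FILE MINUTES
# Hypothetical helper: succeeds if FILE exists and was modified within
# the last MINUTES minutes.
log_active() {
    [ -f "$1" ] || return 1
    [ -n "$(find "$1" -mmin "-$2")" ]
}

# Example on the NAS:
# log_active /var/log/messages 30 || echo "no log writes in 30 minutes"
```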

Issue: NAS Won't Boot After Upgrade

Symptoms: Cannot access DSM UI, NAS beeping continuously

Solutions:

  1. Check beep pattern (indicates specific error)

    • 1 beep: Normal boot
    • 3 beeps: RAM issue
    • 4 beeps: Disk issue
    • Continuous: Critical failure
  2. Try Safe Mode:

    • Power off NAS
    • Hold reset button
    • Power on while holding reset
    • Hold for 4 seconds until beep
    • Release and wait for boot
  3. Check via Synology Assistant:

    • Download Synology Assistant on PC
    • Scan network for NAS
    • May show recovery mode option
  4. Last resort: Reinstall DSM:

Issue: Docker Not Working After Upgrade

Symptoms: Docker containers won't start, Docker package shows stopped

Solutions:

# SSH to NAS
ssh admin@atlantis

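# Check Docker status
# (synoservicectl is the DSM 6 tool; on DSM 7 use synopkg instead,
#  e.g. "sudo synopkg status Docker". The package ID is an assumption
#  and may differ, e.g. ContainerManager on DSM 7.2)
sudo synoservicectl --status pkgctl-Docker

# Restart Docker
sudo synoservicectl --restart pkgctl-Docker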

# If Docker won't start, check logs
cat /var/log/docker.log

# Reinstall Docker package (preserves volumes)
# Via DSM UI → Package Center → Docker → Uninstall
# Then reinstall Docker
# Your volumes and data will be preserved

Issue: Network Shares Not Accessible

Symptoms: Can't connect to SMB/NFS shares

Solutions:

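# Check share services
# (synoservicectl is the DSM 6 tool; DSM 7 replaced it with synosystemctl,
#  and service names may differ, so adjust these commands to your version)
sudo synoservicectl --status smbd  # SMB
sudo synoservicectl --status nfsd  # NFS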

# Restart services
sudo synoservicectl --restart smbd
sudo synoservicectl --restart nfsd

# Check firewall
# Control Panel → Security → Firewall
# Ensure file sharing ports allowed

Issue: Performance Degradation After Upgrade

Symptoms: Slow response, high CPU/RAM usage

Solutions:

# Check what's using resources
top
htop  # If installed

# Via DSM UI → Resource Monitor
# Identify resource-hungry processes

# Common causes:
# 1. Indexing in progress (Photos, Drive, Universal Search)
#    - Wait for indexing to complete (can take hours)
# 2. Optimization running
#    - Check: ps aux | grep optimize
#    - Let it complete
# 3. Too many containers started at once
#    - Stagger container startup

Rollback Procedure

⚠️ WARNING: Rollback is complex and risky. Only attempt if absolutely necessary.

Method 1: DSM Archive (If Available)

# SSH to NAS
ssh admin@atlantis

# Check if previous DSM version archived
ls -la /volume1/@appstore/

# If archive exists, you can attempt rollback
# CAUTION: This is not officially supported and may cause data loss

Method 2: Restore from Backup

If upgrade caused critical issues:

  1. REDACTED_APP_PASSWORD
  2. Restore from HyperBackup
  3. Or restore from configuration backup:
    • Control Panel → Update & Restore
    • Configuration Backup → Restore

Method 3: Fresh Install (Nuclear Option)

⚠️ DANGER: This will erase everything. Only for catastrophic failure.

  1. Download previous DSM version
  2. Install via Synology Assistant in "Recovery Mode"
  3. Restore from complete backup
  4. Reconfigure everything

Best Practices

Timing

  • Schedule upgrades during low-usage periods
  • Allow 3-4 hour maintenance window
  • Don't upgrade before important events
  • Wait 2-4 weeks after major DSM release (let others find bugs)

Testing

  • If you have 2 NAS units, upgrade one first
  • Test on less critical NAS before primary
  • Read community forums for known issues
  • Review Synology release notes thoroughly

Preparation

  • Always complete full backup
  • Test backup restore before upgrade
  • Document all configurations
  • Have physical access to NAS if possible
  • Keep Synology Assistant installed on PC

Post-Upgrade

  • Monitor closely for 24-48 hours
  • Check logs daily for first week
  • Report any bugs to Synology
  • Update your documentation

Verification Checklist

  • DSM upgraded to target version
  • All storage pools healthy
  • All packages running
  • All Docker containers running
  • Network shares accessible
  • Remote access working (Tailscale, QuickConnect)
  • Scheduled tasks running
  • Monitoring dashboards functional
  • Backups completing successfully
  • No errors in system logs
  • Performance normal
  • Documentation updated

Additional Resources

Change Log

  • 2026-02-14 - Initial creation
  • 2026-02-14 - Added comprehensive troubleshooting and rollback procedures