# 🚨 Common Issues & Solutions

**🟢 Beginner-Friendly Troubleshooting Guide**

This guide covers the most frequent problems encountered in the homelab and their solutions. Issues are organized by category with step-by-step resolution instructions.

## 🎯 Quick Diagnosis

### 🔍 **First Steps for Any Problem**

1. **Check service status**: `docker ps` or `docker-compose ps`
2. **Review logs**: `docker-compose logs service-name`
3. **Verify connectivity**: Can you reach the service URL?
4. **Check resources**: `docker stats` for CPU/memory usage
5. **Test network**: `ping` and `curl` commands

---

## 🐳 Container Issues

### ❌ **Container Won't Start**

#### **Symptoms**

- Service shows as "Exited" in `docker ps`
- Error messages in logs about startup failures
- Service unreachable despite being "running"

#### **Common Causes & Solutions**

**🔧 Port Already in Use**

```bash
# Check what's using the port
sudo netstat -tulpn | grep :8080
# or
sudo lsof -i :8080
```

```yaml
# Solution: Change port in docker-compose.yml
ports:
  - "8081:8080"  # Use different external port
```

**🔧 Permission Issues (Synology)**

```bash
# Fix ownership for Synology NAS
sudo chown -R 1026:100 /volume1/docker/service-name
sudo chmod -R 755 /volume1/docker/service-name

# For other systems
sudo chown -R 1000:1000 ./service-data
```

**🔧 Missing Environment Variables**

```bash
# Check if .env file exists
ls -la .env

# Verify environment variables are set
docker-compose config

# Create missing .env file
cat > .env << 'EOF'
TZ=America/Los_Angeles
PUID=1026
PGID=100
EOF
```

**🔧 Image Pull Failures**

```bash
# Manually pull the image
docker pull image:tag

# Check if image exists
docker images | grep image-name

# Try a different image tag in docker-compose.yml:
#   image: service:stable   # Instead of :latest
```

---

### 🔄 **Container Keeps Restarting**

#### **Symptoms**

- Container status shows "Restarting"
- High restart count in `docker ps`
- Service intermittently available

#### **Solutions**

**🔧 Check Resource Limits**

```bash
# Monitor resource usage
docker stats --no-stream
```

```yaml
# Increase memory limit
deploy:
  resources:
    limits:
      memory: 2G  # Increase from 1G
```

**🔧 Fix Health Check Issues**

```bash
# Test health check manually
docker exec container-name curl -f http://localhost:8080/health
```

```yaml
# Adjust health check timing
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
  interval: 60s       # Increase from 30s
  timeout: 30s        # Increase from 10s
  start_period: 120s  # Increase startup time
```

**🔧 Database Connection Issues**

```bash
# Check database connectivity
docker exec app-container ping database-container

# Verify database is ready
docker exec db-container pg_isready -U username
```

```yaml
# Add proper depends_on
depends_on:
  database:
    condition: service_healthy
```

---

## 🌐 Network & Connectivity Issues

### 🚫 **Service Not Accessible**

#### **Symptoms**

- "Connection refused" or "Site can't be reached"
- Service running but not responding to requests
- Timeout errors when accessing web interface

#### **Solutions**

**🔧 Check Port Binding**

```bash
# Verify port is bound
docker port container-name

# Check if service is listening
docker exec container-name netstat -tulpn

# Test internal connectivity
docker exec container-name curl http://localhost:8080
```

**🔧 Firewall Issues**

```bash
# Check firewall status (Ubuntu/Debian)
sudo ufw status

# Allow port through firewall
sudo ufw allow 8080

# For Synology, check Control Panel > Security > Firewall
```

**🔧 Network Configuration**

```bash
# Check Docker networks
docker network ls

# Inspect network configuration
docker network inspect network-name

# Recreate network if needed
docker-compose down
docker network prune
docker-compose up -d
```

---

### 🔗 **Services Can't Communicate**

#### **Symptoms**

- App can't connect to database
- API calls between services fail
- "Name resolution failure" errors

#### **Solutions**

**🔧 Network Isolation**

```yaml
# Ensure services are on same network
networks:
  app-network:
    name: app-network

services:
  app:
    networks:
      - app-network
  database:
    networks:
      - app-network
```

**🔧 Service Discovery**

```bash
# Use container names for internal communication
DATABASE_HOST=database-container  # Not localhost

# Test name resolution
docker exec app-container nslookup database-container
```

---

### 🔴 **AdGuard Crash-Loop (bind: cannot assign requested address)**

#### **Symptoms**

- AdGuard container shows "Restarting" or "Up Less than a second" in `docker ps`
- Logs contain: `[fatal] starting dns server: configuring listeners: ... bind: cannot assign requested address`

#### **Cause**

AdGuard binds its DNS listener to a specific IP address stored in `AdGuardHome.yaml`. If the host's IP changes (DHCP reassignment, netplan change, or AdGuard briefly starts and rewrites the config to the current IP), the stored IP won't match the host and AdGuard will fail to bind.

#### **Diagnose**

```bash
# See what IP AdGuard is trying to bind to
docker logs AdGuard --tail 20

# See what IP the interface actually has
ip addr show eno1 | grep 'inet '

# See what's in the config file
sudo grep -A3 'bind_hosts' /home/vish/docker/adguard/config/AdGuardHome.yaml
```

#### **Fix**

```bash
# Update the config to match the actual interface IP
sudo sed -i 's/- 192.168.68.XXX/- 192.168.68.100/' /home/vish/docker/adguard/config/AdGuardHome.yaml

# Restart AdGuard
docker restart AdGuard
```

> **On concord-nuc**: `eno1` must have static IP `192.168.68.100`. If it reverted to DHCP, re-apply the static config with `sudo netplan apply`. See [concord-nuc README](../../hosts/physical/concord-nuc/README.md) for full details.
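The diagnose-and-fix steps above can be rolled into one small helper. This is a sketch, not part of the documented setup: it assumes the interface (`eno1`), config path, and container name (`AdGuard`) from this section, and the `current_ip`/`fix_adguard_bind` function names are hypothetical.

```bash
#!/bin/sh
# Sketch: re-point AdGuard's DNS bind address at the interface's live IPv4.
# Assumptions from the section above: interface eno1, config path, container "AdGuard".
CONFIG=/home/vish/docker/adguard/config/AdGuardHome.yaml
IFACE=eno1

# current_ip: print the first IPv4 address in `ip -4 addr show` output (read from stdin)
current_ip() {
  awk '/inet /{sub(/\/.*$/, "", $2); print $2; exit}'
}

fix_adguard_bind() {
  ip=$(ip -4 addr show "$IFACE" | current_ip)
  [ -n "$ip" ] || { echo "no IPv4 address on $IFACE" >&2; return 1; }
  # Rewrite any 192.168.68.x entry in the config to the live IP, then restart
  sudo sed -i "s/- 192\.168\.68\.[0-9]*/- $ip/" "$CONFIG"
  docker restart AdGuard
}

# Source this file and call fix_adguard_bind
```

Keeping the interface static (per the concord-nuc note above) is still the real fix; this only repairs the config after the fact.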
---

## 💾 Storage & Data Issues

### 📁 **Data Not Persisting**

#### **Symptoms**

- Configuration lost after container restart
- Uploaded files disappear
- Database data resets

#### **Solutions**

**🔧 Volume Mounting**

```yaml
# Ensure proper volume mounting
volumes:
  - /volume1/docker/service:/data:rw  # Host path:Container path
  - ./config:/app/config:rw           # Relative path
```

```bash
# Check volume exists
ls -la /volume1/docker/service
```

**🔧 Permission Issues**

```bash
# Fix volume permissions
sudo chown -R 1026:100 /volume1/docker/service
sudo chmod -R 755 /volume1/docker/service

# Check container user
docker exec container-name id
```

---

### 💿 **Disk Space Issues**

#### **Symptoms**

- "No space left on device" errors
- Services failing to write data
- Slow performance

#### **Solutions**

**🔧 Check Disk Usage**

```bash
# Check overall disk usage
df -h

# Check Docker space usage
docker system df

# Check specific directory
du -sh /volume1/docker/*
```

**🔧 Clean Up Docker**

```bash
# Remove unused containers, networks, images
docker system prune -a

# Remove unused volumes (CAUTION: This deletes data!)
docker volume prune

# Clean up logs
sudo truncate -s 0 /var/lib/docker/containers/*/*-json.log
```

---

## 🔐 Authentication & Access Issues

### 🚪 **Can't Login to Services**

#### **Symptoms**

- "Invalid credentials" errors
- Login page not loading
- Authentication timeouts

#### **Solutions**

**🔧 Default Credentials**

```bash
# Check service documentation for defaults
# Common defaults:
#   Username: admin, Password: "REDACTED_PASSWORD"
#   Username: admin, Password: "REDACTED_PASSWORD"
#   Username: admin, Password: "REDACTED_PASSWORD"

# Check logs for generated passwords
docker-compose logs service-name | grep -i password
```

**🔧 Reset Admin Password**

```bash
# For many services, delete config and restart
docker-compose down
rm -rf ./config/
docker-compose up -d

# Check service-specific reset procedures (command varies per service)
docker exec container-name reset-password admin
```

---

### 🔑 **SSL/TLS Certificate Issues**

#### **Symptoms**

- "Certificate not trusted" warnings
- HTTPS not working
- Mixed content errors

#### **Solutions**

**🔧 Nginx Proxy Manager**

```bash
# Access Nginx Proxy Manager
http://host-ip:81

# Add SSL certificate for domain
# Use Let's Encrypt for automatic certificates
```

**🔧 Self-Signed Certificates**

```bash
# Generate self-signed certificate
openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -days 365 -nodes

# Add to browser certificate store
# Or use HTTP instead of HTTPS for internal services
```

---

## 📊 Performance Issues

### 🐌 **Slow Service Response**

#### **Symptoms**

- Web interfaces load slowly
- API calls timeout
- High CPU/memory usage

#### **Solutions**

**🔧 Resource Allocation**

```yaml
# Increase resource limits
deploy:
  resources:
    limits:
      memory: 4G   # Increase memory
      cpus: '2.0'  # Increase CPU
```

**🔧 Database Optimization**

```bash
# Check database performance (pg_stat_activity is a view, queried via psql)
docker exec db-container psql -U user -c "SELECT * FROM pg_stat_activity;"

# Optimize database configuration
# Add indexes, tune memory settings
```

**🔧 Storage Performance**

```bash
# Check disk I/O
iostat -x 1

# Move to faster storage (SSD)
```

```yaml
# Use tmpfs for temporary data
tmpfs:
  - /tmp:size=1G
```

---

## 🔄 Update & Maintenance Issues

### 📦 **Update Failures**

#### **Symptoms**

- Container won't start after update
- New version missing features
- Configuration incompatibility

#### **Solutions**

**🔧 Rollback to Previous Version**

```bash
# Use a specific version tag in docker-compose.yml:
#   image: service:v1.2.3   # Instead of :latest

# Rollback with docker-compose
docker-compose down
docker-compose pull
docker-compose up -d
```

**🔧 Backup Before Updates**

```bash
# Backup configuration
cp -r ./config ./config.backup

# Backup database
docker exec db-container pg_dump -U user dbname > backup.sql

# Test update on copy first
cp -r service-dir service-dir-test
cd service-dir-test
# Test update here
```

---

### 🔄 **Watchtower Not Running**

#### **Symptoms**

- Containers not updating automatically
- Watchtower container in "Created" state
- No Watchtower logs or activity

#### **Solutions**

**🔧 Check Container Status**

```bash
# Check if Watchtower container exists
sudo docker ps -a | grep watchtower

# Check container state
sudo docker inspect watchtower --format '{{.State.Status}}'
```

**🔧 Start Watchtower Container**

```bash
# Start the container if it's stopped
sudo docker start watchtower

# Verify it's running
sudo docker ps | grep watchtower

# Check logs for startup
sudo docker logs watchtower --tail 20
```

**🔧 Test Watchtower API (if enabled)**

```bash
# Test API endpoint (should return 401 if secured)
curl -s -w "\nHTTP Status: %{http_code}\n" http://localhost:8082/v1/update

# Test with authentication token
curl -H "Authorization: Bearer your-token" http://localhost:8082/v1/update
```

**🔧 Automated Fix Script**

```bash
# Use the automated fix script
./scripts/fix-watchtower-atlantis.sh
```

**📋 Related Documentation**

- Incident Report: `docs/troubleshooting/watchtower-atlantis-incident-2026-02-09.md`
- Fix Script: `scripts/fix-watchtower-atlantis.sh`
- Status Check: `scripts/check-watchtower-status.sh`
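The check-then-start steps above can be sketched as a small watchdog. This is a hedged sketch, not part of the documented setup: the `needs_start`/`watchdog` helpers are hypothetical, and wiring it into cron or a scheduled task is an assumption.

```bash
#!/bin/sh
# Sketch: start Watchtower if it is not already running.
# Container name "watchtower" is taken from the section above.

# needs_start <state>: succeed (0) when the container should be started
needs_start() {
  case "$1" in
    running) return 1 ;;   # already up, nothing to do
    *)       return 0 ;;   # created / exited / restarting / empty (missing)
  esac
}

watchdog() {
  state=$(sudo docker inspect watchtower --format '{{.State.Status}}' 2>/dev/null)
  if needs_start "$state"; then
    sudo docker start watchtower
    sudo docker logs watchtower --tail 20
  fi
}

# Source this file and call `watchdog` (e.g. from a scheduled task)
```

For the incident this section documents, the provided `./scripts/fix-watchtower-atlantis.sh` remains the canonical fix.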
---

## 🌐 Tailscale Issues

### LAN host unreachable despite being on the same subnet

**Symptoms:**

- Can ping the gateway but not a specific LAN host
- SMB/NFS mounts time out silently
- `tracert`/`traceroute` to the host loops or times out immediately
- `Find-NetRoute` on Windows shows traffic routing via Tailscale instead of the local interface

**Cause:** A Tailscale node is advertising a subnet route that overlaps your local LAN (e.g. Calypso advertises `192.168.0.0/24`). Any node with `accept_routes: true` installs that route at a lower metric than the local interface, so traffic meant for LAN hosts goes into the Tailscale tunnel instead.

**Diagnose (Linux):**

```bash
# Check policy routing table for Tailscale-installed routes
ip route show table 52 | grep 192.168

# Check which peer is advertising the subnet
tailscale status --json | python3 -c "
import sys, json
d = json.load(sys.stdin)
for peer in d.get('Peer', {}).values():
    routes = peer.get('PrimaryRoutes') or []
    if routes:
        print(peer['HostName'], routes)
"
```

**Diagnose (Windows):**

```powershell
# Check which interface Windows uses to reach the host
Find-NetRoute -RemoteIPAddress 192.168.0.100 | Select-Object InterfaceAlias, NextHop

# Check route table for the subnet
Get-NetRoute -AddressFamily IPv4 |
  Where-Object { $_.DestinationPrefix -like '192.168.0*' } |
  Select-Object DestinationPrefix, NextHop, RouteMetric, InterfaceAlias
```

**Fix (Linux — immediate):**

```bash
sudo ip route del 192.168.0.0/24 dev tailscale0 table 52
```

**Fix (Linux — permanent):** Set `accept_routes: false` in the Tailscale config. For the TrueNAS SCALE app:

```bash
sudo midclt call app.update tailscale '{"values": {"tailscale": {"accept_routes": false, "reset": true}}}'
```

**Fix (Windows — permanent):**

```
tailscale up --accept-routes=false --login-server=https://headscale.vish.gg:8443
```

> **Note:** Nodes that genuinely need remote access to the `192.168.0.0/24` LAN (e.g. off-site VPS, remote laptop) should keep `accept_routes: true`. Nodes that are physically on that LAN should use `accept_routes: false`.

See the full incident report: `docs/troubleshooting/guava-smb-incident-2026-03-14.md`

---

### TrueNAS Tailscale app stuck in STOPPED / DEPLOYING after upgrade

**Symptoms:**

- App shows `STOPPED` state after a version upgrade
- App starts deploying but immediately exits
- Container logs show: `Error: changing settings via 'tailscale up' requires mentioning all non-default flags`

**Cause:** After a TrueNAS app version upgrade, the new container's startup script runs `tailscale up` with flags from the app config. If any flag in the stored Tailscale state differs from the app config (e.g. `accept_dns` was `false` at runtime but `true` in the app UI), `tailscale up` refuses to proceed.

**Fix:**

1. Set `reset: true` in the app config to clear the flag mismatch
2. Ensure all app config flags match the intended running state (especially `accept_dns`)
3. Start the app — it will apply a clean `tailscale up --reset ...`
4. Set `reset: false` after the app is running (optional, reset is idempotent)

```bash
sudo midclt call app.update tailscale '{"values": {"tailscale": {
  "accept_dns": false,
  "accept_routes": false,
  "advertise_exit_node": true,
  "hostname": "truenas-scale",
  "reset": true
}}}'
sudo midclt call app.start tailscale
```

---

## 🔐 Authentik SSO Issues

### Forward Auth redirect loop (`ERR_TOO_MANY_REDIRECTS`)

**Symptoms:** Browser shows an infinite redirect loop or `ERR_TOO_MANY_REDIRECTS` when accessing a service protected by Authentik Forward Auth.

**Cause 1 — Missing `X-Original-URL` header in NPM:** The Authentik outpost returns `500` because it can't detect the original URL. Check the Authentik server logs for:

```
failed to detect a forward URL from nginx
```

**Fix:** Add to the NPM advanced config for the affected proxy host:

```nginx
auth_request /outpost.goauthentik.io/auth/nginx;
proxy_set_header X-Original-URL $scheme://$http_host$request_uri;
```

**Cause 2 — Empty `cookie_domain` on proxy provider:** After a successful login, Authentik can't set the session cookie correctly, so the redirect loop continues.

**Fix:** Set `cookie_domain` on the provider via the Authentik API or UI (**Admin → Providers → [provider] → Advanced → Cookie Domain = `vish.gg`**):

```bash
AK_TOKEN=""
PK=12  # provider PK
PROVIDER=$(curl -s "https://sso.vish.gg/api/v3/providers/proxy/$PK/" -H "Authorization: Bearer $AK_TOKEN")
UPDATED=$(echo "$PROVIDER" | python3 -c "import sys,json; d=json.load(sys.stdin); d['cookie_domain']='vish.gg'; print(json.dumps(d))")
curl -s -X PUT "https://sso.vish.gg/api/v3/providers/proxy/$PK/" \
  -H "Authorization: Bearer $AK_TOKEN" -H "Content-Type: application/json" -d "$UPDATED"
```

> **Rule:** All Forward Auth proxy providers should have `cookie_domain: vish.gg`. When adding a new Forward Auth provider, always set this.

### SSL "not secure" for unproxied domains

Services that need direct internet access (Matrix federation, DERP relays, headscale) must be **unproxied in Cloudflare** (orange cloud off). The Cloudflare Origin Certificate (cert ID 1 in NPM) is only trusted by Cloudflare's edge — direct connections will show "not secure".

**Fix:** Issue a Let's Encrypt cert via the Cloudflare DNS challenge:

```bash
ssh matrix-ubuntu  # or any host with certbot + cloudflare.ini
sudo certbot certonly --dns-cloudflare \
  --dns-cloudflare-credentials /etc/cloudflare.ini \
  -d your.domain.vish.gg --email your-email@example.com --agree-tos
```

Then import the cert into NPM as a custom cert and update the proxy host. See `docs/troubleshooting/matrix-ssl-authentik-incident-2026-03-19.md` for full details.
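The `cookie_domain` rule earlier in this section can be audited across all providers at once. A hedged sketch: the list endpoint `/api/v3/providers/proxy/` and its paginated `{"results": [...]}` response shape are assumptions about the Authentik API (only the per-PK endpoint is confirmed above), and the function names are hypothetical.

```bash
#!/bin/sh
# Sketch: list Forward Auth proxy providers whose cookie_domain is empty.
# Endpoint and response shape are assumptions; pagination is ignored.
AK_TOKEN="your-token-here"  # hypothetical placeholder

# filter_empty_cookie_domain: read provider-list JSON on stdin,
# print "<pk> <name>" for every provider with no cookie_domain set
filter_empty_cookie_domain() {
  python3 -c '
import sys, json
for p in json.load(sys.stdin).get("results", []):
    if not p.get("cookie_domain"):
        print(p["pk"], p["name"])
'
}

audit_providers() {
  curl -s "https://sso.vish.gg/api/v3/providers/proxy/" \
    -H "Authorization: Bearer $AK_TOKEN" | filter_empty_cookie_domain
}

# Source this file and run audit_providers; any PK it prints
# can then be fixed with the PUT shown above.
```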
---

## 🤖 Ansible & Automation Issues

### 📋 **Playbook Failures**

#### **Symptoms**

- Ansible tasks fail with permission errors
- SSH connection failures
- Tasks timeout or hang

#### **Solutions**

**🔧 SSH Connectivity**

```bash
# Test SSH connection
ssh -i ~/.ssh/key user@host

# Check SSH key permissions
chmod 600 ~/.ssh/private_key

# Verify host in known_hosts
ssh-keyscan -H hostname >> ~/.ssh/known_hosts
```

**🔧 Permission Issues**

```bash
# Check sudo permissions
ansible host -m shell -a "sudo whoami"

# Add user to docker group
sudo usermod -aG docker username
```

```ini
# Fix Ansible inventory
[hosts]
hostname ansible_user=correct_user ansible_become=yes
```

---

## 🔍 Diagnostic Commands

### 🛠️ **Essential Commands**

**Container Diagnostics**

```bash
# List all containers
docker ps -a

# Check container logs
docker logs container-name --tail 50 -f

# Execute commands in container
docker exec -it container-name /bin/bash

# Check container resource usage
docker stats container-name

# Inspect container configuration
docker inspect container-name
```

**Network Diagnostics**

```bash
# Test connectivity
ping hostname
curl -I http://hostname:port

# Check DNS resolution
nslookup hostname
dig hostname

# Check port availability
telnet hostname port
nc -zv hostname port
```

**System Diagnostics**

```bash
# Check system resources
htop
free -h
df -h

# Check service status
systemctl status docker
systemctl status service-name

# Check logs
journalctl -u docker -f
tail -f /var/log/syslog
```

---

## 🆘 Emergency Procedures

### 🚨 **Service Down - Critical**

1. **Immediate Assessment**

   ```bash
   docker ps | grep service-name
   docker logs service-name --tail 20
   ```

2. **Quick Restart**

   ```bash
   docker-compose restart service-name
   # or
   docker-compose down && docker-compose up -d
   ```

3. **Check Dependencies**

   ```bash
   # Verify database is running
   docker ps | grep database

   # Check network connectivity
   docker exec service-name ping database
   ```

4. **Rollback if Needed**

   ```bash
   # Use last known good configuration
   git checkout HEAD~1 -- service-directory/
   docker-compose up -d
   ```

### 🔥 **Multiple Services Down**

1. **Check Host Status**

   ```bash
   # Check system resources
   free -h && df -h

   # Check Docker daemon
   systemctl status docker
   ```

2. **Restart Docker if Needed**

   ```bash
   sudo systemctl restart docker
   docker-compose up -d
   ```

3. **Check Network Issues**

   ```bash
   # Test internet connectivity
   ping 8.8.8.8

   # Check local network
   ping gateway-ip
   ```

---

## 📞 Getting Help

### 🔍 **Where to Look**

1. **Service logs**: Always check container logs first
2. **Official documentation**: Check the service's official docs
3. **GitHub issues**: Search for similar problems
4. **Community forums**: Reddit, Discord, forums
5. **This documentation**: Check other sections

### 📝 **Information to Gather**

- Container logs (`docker logs container-name`)
- System information (`uname -a`, `docker version`)
- Configuration files (sanitized)
- Error messages (exact text)
- Steps to reproduce the issue

### 🏷️ **Common Log Locations**

```bash
# Docker logs
docker logs container-name

# System logs
/var/log/syslog
/var/log/docker.log

# Service-specific logs
/volume1/docker/service/logs/
./logs/
```

---

## 📋 Prevention Tips

### ✅ **Best Practices**

- **Regular backups**: Automate configuration and data backups
- **Monitoring**: Set up alerts for service failures
- **Documentation**: Keep notes on configuration changes
- **Testing**: Test updates in non-production first
- **Version control**: Track configuration changes in Git

### 🔄 **Maintenance Schedule**

- **Daily**: Check service status, review alerts
- **Weekly**: Review logs, check disk space
- **Monthly**: Update containers, review security
- **Quarterly**: Full system backup, disaster recovery test

---

## 🐳 Synology DSM — Docker / gluetun Issues

### gluetun crashes immediately on Synology (`flushing conntrack` error)

**Symptoms**

- gluetun container exits with exit code 1 seconds after starting
- Logs show: `ERROR flushing conntrack: netfilter query: netlink receive: invalid argument`
- Any container using `network_mode: "service:gluetun"` fails with `namespace path: lstat /proc//ns/net: no such file or directory`

**Cause**

Synology DSM kernels do not ship the `nf_conntrack_netlink` module (`modprobe nf_conntrack_netlink` fails with "not found"). The gluetun `latest` Docker image (from ~2026-02-23, commit `625a63e`) introduced fatal conntrack flushing that requires this module.

**Fix**

Pin gluetun to `v3.38.0` (the last known good version on Synology) and use `privileged: true`:

```yaml
gluetun:
  image: qmcgaw/gluetun:v3.38.0  # do NOT use latest
  privileged: true               # replaces cap_add: NET_ADMIN
  devices:
    - /dev/net/tun:/dev/net/tun
  healthcheck:
    test: ["CMD-SHELL", "wget -qO /dev/null http://127.0.0.1:9999 2>/dev/null || exit 1"]
    interval: 10s
    timeout: 5s
    retries: 6
    start_period: 30s
```

For containers sharing gluetun's network (e.g. deluge), use `condition: service_healthy` to avoid the race condition:

```yaml
deluge:
  network_mode: "service:gluetun"
  depends_on:
    gluetun:
      condition: service_healthy
```

**Notes**

- The healthcheck hits gluetun's built-in health server at `127.0.0.1:9999`, which returns 200 when the VPN tunnel is up
- The gluetun volume mount (`/gluetun`) overwrites the container's `/gluetun` dir — do **not** use `["CMD", "/gluetun/healthcheck"]`, as that binary gets hidden by the mount
- With the WireGuard SPK installed (see below), v3.38.0 uses kernel WireGuard (`Using available kernelspace implementation`); the interface is still `tun0` in this version
- `latest` gluetun still crashes even with kernel WireGuard — the missing `nf_conntrack_netlink` module issue is unrelated to WireGuard

---

### Installing native kernel WireGuard on Synology (WireGuard SPK)

Installing the 3rd-party WireGuard SPK gives Docker containers native kernel WireGuard support instead of the slower userspace implementation.
**Atlantis status:** WireGuard SPK v1.0.20220627 installed and running (Feb 2026). No reboot required — loaded cleanly via `synopkg start`.

**Steps for v1000 platform (DS1823xs+), DSM 7.3:**

```bash
# Download SPK
wget 'https://www.blackvoid.club/content/files/2026/02/WireGuard-v1000-73-1.0.20220627.spk' -O /tmp/wireguard.spk

# Install (do NOT check "run after installation" if using DSM UI)
sudo /usr/syno/bin/synopkg install /tmp/wireguard.spk

# Start (fixes privilege and loads kernel module)
sudo /usr/syno/bin/synopkg start WireGuard

# Verify module loaded
lsmod | grep wireguard
```

**Make persistent on boot** — add to the `esynoscheduler` DB (or the DSM Task Scheduler UI); depends on `Docker mount propagation`:

```sql
INSERT INTO task (task_name, event, enable, owner, operation_type, operation, depend_on_task)
VALUES ('WireGuard module', 'bootup', 1, 0, 'script', '#!/bin/sh
/usr/syno/bin/synopkg start WireGuard', 'Docker mount propagation');
```

**Boot task chain on Atlantis:** `VPNTUN` (modprobe tun) → `Docker mount propagation` (mount --make-shared /) → `WireGuard module` (synopkg start WireGuard)

**Platform SPK URLs (DSM 7.3):** replace `v1000` with your platform (`r1000`, `geminilake`, `apollolake`, `denverton`, etc.):

`https://www.blackvoid.club/content/files/2026/02/WireGuard-{platform}-73-1.0.20220627.spk`

To find your platform: `cat /etc.defaults/synoinfo.conf | grep platform_name`

---

### Docker containers fail with `path / is mounted on / but it is not a shared or slave mount`

**Cause**

Synology DSM boots with the root filesystem mount as `private` (no propagation). Docker requires `shared` propagation for containers that use network namespaces or VPN tunnels (e.g. gluetun).
**Fix — temporary (lost on reboot)**

```bash
mount --make-shared /
```

**Fix — permanent (via DSM Task Scheduler)**

Create a new triggered task in DSM → Control Panel → Task Scheduler:

- Type: Triggered (bootup)
- User: root
- Script:

```sh
#!/bin/sh
mount --make-shared /
```

This has been applied to **Atlantis** and **Calypso** via the `esynoscheduler` DB directly. Task name: `Docker mount propagation`. **Setillo**: must be added manually via the DSM UI (SSH sudo requires an interactive terminal).

---

## arr-scripts / Lidarr / Deezer {#arr-scripts-lidarr-deezer}

arr-scripts runs as s6 services inside the Lidarr container. See [lidarr.md](../services/individual/lidarr.md) for the full setup.

### Scripts stuck in "is not ready, sleeping until valid response..." loop

**Cause**: `getArrAppInfo()` reads `arrApiKey` and `arrUrl` from `config.xml` using `xq | jq`. If `xq` was broken when the container first started, the variables are set to empty/wrong values and the `verifyApiAccess()` loop retries forever with stale values — it never re-reads them.

**Fix**: Restart the container. The scripts reinitialize with fresh variable state. If the restart loop persists, check the `xq` issue below first.

### Alpine `xq` vs Python yq `xq` conflict

**Cause**: Alpine's `xq` package (v1.x) outputs XML passthrough instead of converting to JSON. arr-scripts need `cat config.xml | xq | jq -r .Config.ApiKey` to work, which requires Python yq's `xq`.

**Symptom**: `cat /config/config.xml | xq | jq -r .Config.ApiKey` returns a parse error or empty string instead of the API key.

**Check**: `xq --version` inside the container — should show `3.x.x` (Python yq), not `1.x.x` (Alpine).

**Fix** (persistent via scripts_init.bash):

```bash
uv pip install --system --upgrade --break-system-packages yq
```

This installs Python yq's `xq` entry point at `/usr/bin/xq`, overriding Alpine's version.

### "ERROR :: Invalid audioFormat and audioBitrate options set..."
**Cause**: When `audioFormat="native"`, `audioBitrate` must be a word, not a number.

| audioBitrate value | Result |
|---|---|
| `"low"` | 128kbps MP3 (Deezer Free) |
| `"high"` | 320kbps MP3 (Deezer Premium) |
| `"lossless"` | FLAC (Deezer HiFi) |
| `"master"` | MQA (Tidal Master) |
| `"320"` | **INVALID** — causes this error |

**Fix**: In `/volume2/metadata/docker2/lidarr/extended.conf`, set `audioBitrate="high"`.

### "ERROR :: download failed, missing tracks..."

**Cause**: `deemix` is not installed (setup.bash fails silently on Alpine). The script finds a Deezer match but can't execute the download.

**Check**: `which deemix` inside the container — should return `/usr/bin/deemix`.

**Fix** (persistent via scripts_init.bash):

```bash
uv pip install --system --upgrade --break-system-packages deemix
```

### Album title matching always fails — "Calculated Difference () greater than 3"

**Cause**: `pyxdameraulevenshtein` is not installed. The distance calculation in `python -c "from pyxdameraulevenshtein import damerau_levenshtein_distance; ..."` fails silently, leaving `$diff` empty. Every `[ "$diff" -le "$matchDistance" ]` comparison then fails with `[: : integer expected`.

**Check**: `python -c "from pyxdameraulevenshtein import damerau_levenshtein_distance; print(damerau_levenshtein_distance('hello','hello'))"` — should print `0`.

**Fix** (persistent via scripts_init.bash):

```bash
uv pip install --system --upgrade --break-system-packages pyxdameraulevenshtein
```

### Why setup.bash fails to install packages

`setup.bash` uses `uv pip install` to install Python dependencies. On the Alpine version used by the linuxserver/lidarr image, some packages (yq, deemix, pyxdameraulevenshtein) fail to build due to missing setuptools or C extension issues. The failure is silent — setup.bash exits 0 regardless.

**Fix**: `scripts_init.bash` explicitly reinstalls all critical packages after setup.bash runs. This runs on every container start (it's in `custom-cont-init.d`), so it survives container recreates.

### ARL token expired

Deezer ARL tokens expire approximately every 3 months. Symptoms: downloads fail silently or deemix returns 0 tracks.

**Get a new token**:

1. Log in to deezer.com in a browser
2. DevTools → Application → Cookies → `arl` value
3. Update in `/volume2/metadata/docker2/lidarr/extended.conf`: `arlToken="..."`
4. Restart the lidarr container

### Checking arr-scripts service status

```bash
# Via Portainer console exec or SSH into container:
s6-svstat /run/service/custom-svc-Audio
s6-svstat /run/service/custom-svc-ARLChecker

# View live logs
docker logs lidarr -f

# Per-service log files inside container
ls /config/logs/Audio-*.txt
tail -f /config/logs/Audio-$(ls -t /config/logs/Audio-*.txt | head -1 | xargs basename)
```

---

## 📋 Next Steps

- **[Diagnostic Tools](diagnostics.md)**: Advanced troubleshooting tools
- **[Performance Tuning](performance.md)**: Optimize your services
- **[Emergency Procedures](emergency.md)**: Handle critical failures
- **[Monitoring Setup](../admin/monitoring.md)**: Prevent issues with monitoring

---

*Remember: Most issues have simple solutions. Start with the basics (logs, connectivity, resources) before diving into complex troubleshooting.*