I’d like to add my own observations (to what others have reported in older threads) that Duplicati on macOS is very slow (compared to Windows). I’ve run it on macOS High Sierra for over a year, and now switched to a new 2018 MacBook Air running Mojave. In both cases, during what I presume is the initial ‘compare’ stage when the backup is starting, it takes easily 30 to 60 minutes before the actual data upload starts, and during this time, the laptop runs hot and the fan goes into overdrive. This results in daily backup times that range anywhere from 45 minutes to 1.5 hours. In comparison, on my Windows 10 laptop, this is done in under ten minutes (total).
I am using client version 2.0.4.5_beta_2018-11-28 (which is the advertised stable build on the Duplicati homepage) with Mono 5.20.1.19. Mojave 10.14.5, 16GB RAM, 512GB storage (half of that used). I am uploading to NextCloud via WebDAV. AES-256, remote volume size is set to 100MB, with backup retention set to delete backups that are older than 12 months.
Any help or suggestions are appreciated.
Thanks!
P.S: Source size is ~190GB, backup size is ~280GB (227 versions)
I am just browsing through the release notes of the canary builds and came across this one.
In the notes, it says “Fixed CPU/memory issue on MacOS”. I’m assuming this is what I am experiencing.
Update: Nope! Updating to the latest canary build (2.0.4.18, from May 12) unfortunately didn’t change anything. It stays at “Starting backup” for 30 minutes (or longer), accompanied by lots of heat & loud fans.
I am experiencing the same thing. When trying to do a 1TB backup, I could never get there, so I broke my backup into 5 pieces. This allowed each job to complete, but I still get lots of heat and fans when each job runs.
I’m on Mojave 10.14.6 but Duplicati has been working fine here for well over a year now. Daily backup takes 5 minutes tops, total size is around 70 GB. The mono version I have installed right now is stable 6.0.0.319, maybe updating it is worth a try?
Thanks for the responses! Source size is ~190GB, backup size is ~280GB (227 versions). I am not familiar with the deduplication block size setting, so I almost certainly left it at the default.
Also, I recently updated mono to the latest stable version as well, and it didn’t change anything. Perhaps it’s the sheer number of versions or the size of the backup that are problematic (over time)?
I don’t know, your numbers don’t seem too bad. On my NAS I’m backing up 670GB of data and using default deduplication block size. Not a very powerful CPU in the NAS. My backups complete in about 10-15 minutes.
How large is your backup job’s sqlite database? Mine is about 3.7GB on the NAS.
I vacuum my database regularly - have you tried doing that? I have heard some people see performance gains by doing a vacuum.
I made some progress on this front. I first tried to vacuum the SQLite db, which didn’t seem to alleviate the problem. I then reviewed the settings of my backup configuration and changed the “Backup retention” setting from “Delete backups that are older than 12 months” to “Smart backup retention”. The first backup right after was down from ~1 hour to ~40 minutes. The next backup after was then down to just 4 minutes.
I’ll keep an eye on the situation, but changing the backup retention setting may have fixed my issue.
P.S.: I now see what happened – changing the backup retention brought the backup down from 227 to 14 versions, and from ~280GB to ~260GB. Surely the large number of versions was the culprit. I’m still confused how smart backup retention calculates the number of versions though.
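For what it's worth, here is a rough sketch of how this kind of time-based thinning can work. It is not Duplicati's actual code. The timeframes follow the smart retention description as I understand it (one backup per day for the last week, one per week for the last four weeks, one per month for the last twelve months), and the `POLICY` list, the `thin` helper and the 30-day "month" are my own assumptions for illustration.

```python
from datetime import datetime, timedelta

# (timeframe, minimum spacing) pairs; assumed from the UI description,
# not read from Duplicati's source.
POLICY = [
    (timedelta(days=7), timedelta(days=1)),
    (timedelta(weeks=4), timedelta(weeks=1)),
    (timedelta(days=365), timedelta(days=30)),  # a "month" approximated as 30 days
]

def thin(backups, now):
    """Return the timestamps that survive, newest first (hypothetical helper)."""
    kept = []
    for ts in sorted(backups, reverse=True):
        age = now - ts
        # Pick the spacing of the first (smallest) timeframe this backup falls into.
        interval = next((iv for frame, iv in POLICY if age <= frame), None)
        if interval is None:
            continue  # older than every timeframe, so it is dropped
        # Keep it only if it is far enough from the previously kept backup.
        if not kept or kept[-1] - ts >= interval:
            kept.append(ts)
    return kept

now = datetime(2019, 6, 1)
backups = [now - timedelta(days=d) for d in range(227)]  # 227 daily versions
kept = thin(backups, now)
print(len(backups), "->", len(kept))
```

Under these assumptions a long run of daily versions collapses to a small fraction of the original count, which is the same kind of drop I saw. The exact number Duplicati keeps depends on its real rules and on when the backups actually ran.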
Changing the backup retention setting likely expunged a good bit of data from the FileLookup table and probably other tables.
I have found at least one instance where a problematic SQL query was forcing a full scan of the entire (very, very large) table it was operating on; adding an index eliminated that performance issue. But because SQLite is not instrumented for performance analysis the way a ‘professional’ database like Oracle or Postgres is, it’s very difficult to pin this down.
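One tool SQLite does offer is EXPLAIN QUERY PLAN, which at least shows whether a query scans a whole table or uses an index. Below is a minimal sketch against a throwaway in-memory table. The `demo_file` table, its columns and the index name are made up for the example and are not Duplicati's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, only loosely inspired by a file-lookup table.
conn.execute(
    "CREATE TABLE demo_file (id INTEGER PRIMARY KEY, path TEXT, blockset_id INTEGER)"
)
conn.executemany(
    "INSERT INTO demo_file (path, blockset_id) VALUES (?, ?)",
    ((f"/data/file{i}", i % 1000) for i in range(10000)),
)

def plan(sql):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # the 4th column holds the detail text

query = "SELECT id FROM demo_file WHERE blockset_id = 42"

before = plan(query)  # no index on blockset_id yet, so expect a full table scan
conn.execute("CREATE INDEX demo_file_blockset ON demo_file (blockset_id)")
after = plan(query)   # the planner can now search the new index

print("before:", before)
print("after: ", after)
```

The exact wording of the plan text varies between SQLite versions, but a line starting with SCAN on a large table is the thing to look for, and a SEARCH ... USING INDEX line is what you hope to see after adding the index.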
Vacuuming the database improves performance mostly when large amounts of data have recently been deleted or “replaced”, because it rewrites the physical file to eliminate newly empty pages (fixed-length storage units within the data file) and regenerates ‘fragmented’ or ‘bloated’ indices, which makes them more accurate and useful. Most people do it for space reasons, but it should also improve the planner’s performance in at least some cases. My background is in enterprise databases, but I believe that SQLite will need to ‘look’ at empty pages in at least certain cases; though brief, this increases overhead and may be a factor here.
With that in mind, if you’ve not already, you might try vacuuming again and see if this brings further improvement. Also consider running ANALYZE, as described at SQLite Query Language: ANALYZE, as this will repopulate some metadata tables that the planner can use to (hopefully) make better plans. Note, however, that an ANALYZE is potentially as slow as a VACUUM.
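To make the empty-pages point concrete, here is a small sketch that deletes most of a table and then compares page counts before and after VACUUM and ANALYZE. It runs against a temporary file created just for the demo, with a made-up `demo` table. If you try the same statements on a real Duplicati database, I would do it on a copy and while Duplicati is not running.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.sqlite")
# isolation_level=None puts the connection in autocommit mode;
# VACUUM cannot run inside an open transaction.
conn = sqlite3.connect(path, isolation_level=None)
conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, payload TEXT)")

conn.execute("BEGIN")
conn.executemany(
    "INSERT INTO demo (payload) VALUES (?)", (("x" * 500,) for _ in range(20000))
)
conn.execute("COMMIT")

# Simulate a retention change that expunges most of the data.
conn.execute("DELETE FROM demo WHERE id > 2000")

def pages():
    """Return (total pages in the file, pages sitting unused on the freelist)."""
    total = conn.execute("PRAGMA page_count").fetchone()[0]
    free = conn.execute("PRAGMA freelist_count").fetchone()[0]
    return total, free

before = pages()
conn.execute("VACUUM")   # rewrite the file without the empty pages
conn.execute("ANALYZE")  # refresh the statistics the planner reads
after = pages()

print("pages (total, free) before:", before)
print("pages (total, free) after: ", after)
conn.close()
```

A large free count relative to the total before the VACUUM is a hint that vacuuming has something to reclaim.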
The PRAGMA optimize command will automatically run ANALYZE on individual tables on an as-needed basis. The recommended practice is for applications to invoke the PRAGMA optimize statement just before closing each database connection.
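As a sketch of that recommendation from the application side, the wrapper below runs PRAGMA optimize just before the connection is closed. The `optimized_connection` helper and the `demo` table are my own names for illustration; I am not claiming this is how Duplicati handles its connections.

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def optimized_connection(path):
    """Open a SQLite connection and run PRAGMA optimize before closing it."""
    conn = sqlite3.connect(path)
    try:
        yield conn
    finally:
        # Lets SQLite decide whether any table needs a fresh ANALYZE.
        conn.execute("PRAGMA optimize")
        conn.close()

with optimized_connection(":memory:") as conn:
    conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE INDEX demo_name ON demo (name)")
    conn.executemany(
        "INSERT INTO demo (name) VALUES (?)", ((f"n{i}",) for i in range(100))
    )
    conn.commit()
    rows = conn.execute(
        "SELECT count(*) FROM demo WHERE name = 'n7'"
    ).fetchone()[0]

print(rows)
```

Because the pragma only does work when SQLite thinks it is needed, it is usually much cheaper than running a full ANALYZE every time.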