Slow backups on macOS

I’d like to add my own observations to what others have reported in older threads: Duplicati on macOS is very slow compared to Windows. I ran it on macOS High Sierra for over a year and have now switched to a new 2018 MacBook Air running Mojave. In both cases, during what I presume is the initial ‘compare’ stage when the backup is starting, it takes easily 30 to 60 minutes before the actual data upload starts, and during this time the laptop runs hot and the fan goes into overdrive. This results in daily backup times anywhere from 45 minutes to 1.5 hours. In comparison, my Windows 10 laptop gets the whole backup done in under ten minutes.

I am using client version 2.0.4.5_beta_2018-11-28 (which is the advertised stable build on the Duplicati homepage) with Mono 5.20.1.19. Mojave 10.14.5, 16GB RAM, 512GB storage (half of that used). I am uploading to NextCloud via WebDAV. AES-256, remote volume size is set to 100MB, with backup retention set to delete backups that are older than 12 months.

Any help or suggestions are appreciated.

Thanks!

P.S: Source size is ~190GB, backup size is ~280GB (227 versions)

I was just browsing through the release notes of the canary builds and came across this one.

In the notes, it says “Fixed CPU/memory issue on MacOS”. I’m assuming this is what I am experiencing.

Update: Nope! Updating to the latest canary build (2.0.4.18, from May 12) unfortunately didn’t change anything. It stays at “Starting backup” for 30 minutes (or longer), accompanied by lots of heat & loud fans.

I am experiencing the same thing. When trying to do a 1TB backup, I could never get it to finish, so I broke the backup into five smaller jobs. That allowed each job to complete, but I still get lots of heat and fan noise whenever one of them runs.

Thanks Mark. I guess there’s no love here for macOS. Anyone? Anyone? Bueller? Bueller? :slight_smile:

Just bumping this topic back to the top. Any other macOS users here who have found a solution to this problem?

Another bump. Still an issue in the latest canary build.

I’m on Mojave 10.14.6 but Duplicati has been working fine here for well over a year now. Daily backup takes 5 minutes tops, total size is around 70 GB. The mono version I have installed right now is stable 6.0.0.319, maybe updating it is worth a try?

How much data are you backing up?

Did you make any changes to the default deduplication block size?

Thanks for the responses! Source size is ~190GB, backup size is ~280GB (227 versions). I am not familiar with the deduplication block size setting, so I’m fairly sure I never changed it from the default.

Also, I recently updated mono to the latest stable version as well, and it didn’t change anything. Perhaps it’s the sheer number of versions or the size of the backup that are problematic (over time)?

I don’t know, your numbers don’t seem too bad. On my NAS I’m backing up 670GB of data using the default deduplication block size, and the NAS doesn’t have a very powerful CPU, yet my backups complete in about 10-15 minutes.

How large is your backup job’s sqlite database? Mine is about 3.7GB on the NAS.

I vacuum my database regularly - have you tried doing that? I have heard some people see performance gains by doing a vacuum.
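
If you want to run the vacuum by hand, it’s a single SQL statement against the job database while Duplicati is idle. Here’s a minimal sketch in Python; the path is a placeholder (the real database lives in Duplicati’s data folder, and the job’s Database page in the web UI shows the exact file):

```python
import sqlite3

# Placeholder path: the real job database sits in Duplicati's data folder
# and has a random-looking name (see the job's Database page in the web UI).
DB_PATH = "/path/to/duplicati-job.sqlite"

con = sqlite3.connect(DB_PATH)
try:
    # VACUUM rewrites the whole file, dropping pages freed by deleted data.
    # Run it only while no backup job is using this database.
    con.execute("VACUUM")
finally:
    con.close()
```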

I made some progress on this front. I first tried to vacuum the SQLite db, which didn’t seem to alleviate the problem. I then reviewed the settings of my backup configuration and changed the “Backup retention” setting from “Delete backups that are older than 12 months” to “Smart backup retention”. The first backup right afterwards was down from ~1 hour to ~40 minutes, and the backup after that took just 4 minutes.

I’ll keep an eye on the situation, but changing the backup retention setting may have fixed my issue.

P.S.: I now see what happened – changing the backup retention brought the backup down from 227 to 14 versions, and from ~280GB to ~260GB. The large number of versions was surely the culprit. I’m still confused about how smart backup retention arrives at the number of versions it keeps, though.

Changing the backup retention setting likely expunged a good bit of data from the FileLookup table and probably other tables.

I have found at least one instance where a problematic SQL query was forcing a full scan of the entire (very, very large) table it was operating on; adding an index was effective in eliminating that performance issue. But because SQLite is not instrumented for performance analysis the way a ‘professional’ database like Oracle or Postgres is, it’s very difficult to pin this down.
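
To illustrate the kind of check I mean, EXPLAIN QUERY PLAN tells you whether SQLite will scan the whole table or use an index. The snippet below is only a sketch: the query, the Path filter and the index name are hypothetical stand-ins I made up for illustration, not the actual Duplicati query or schema.

```python
import sqlite3

con = sqlite3.connect("/path/to/duplicati-job.sqlite")  # placeholder path

# EXPLAIN QUERY PLAN reports "SCAN <table>" for a full table scan and
# "SEARCH <table> USING INDEX ..." when an index can be used.
# Hypothetical query/filter -- not the actual Duplicati statement.
for row in con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM FileLookup WHERE Path = ?",
    ("/some/file",),
):
    print(row)

# If the plan shows SCAN, an index on the filtered column turns it into a
# SEARCH. (Hypothetical index name/column -- check the real schema first.)
con.execute("CREATE INDEX IF NOT EXISTS IX_FileLookup_Path ON FileLookup (Path)")
con.commit()
con.close()
```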

Vacuuming the database improves performance mostly when large amounts of data have recently been deleted or “replaced”, because it rewrites the physical file to eliminate newly empty pages (fixed-length storage units within the data file) and regenerates ‘fragmented’ or ‘bloated’ indices, which makes them more accurate and useful. Most people do it for space reasons, but it should also help the planner in at least some cases. My background is in enterprise databases, but I believe SQLite still has to ‘look’ at empty pages in at least certain cases; though brief, that adds overhead and may be a factor here.

With that in mind, if you’ve not already, you might try vacuuming again and see if it brings further improvements. Also consider running ANALYZE, as described at SQLite Query Language: ANALYZE, since it re-populates some metadata tables that the planner can use to (hopefully) make better plans. Note, however, that an ANALYZE can potentially be as slow as a VACUUM.
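
If you’d rather run the ANALYZE by hand instead of through a GUI tool, it follows the same pattern as the vacuum sketch above; again a rough sketch with a placeholder database path:

```python
import sqlite3

con = sqlite3.connect("/path/to/duplicati-job.sqlite")  # placeholder path
try:
    # ANALYZE gathers statistics about tables and indices into the
    # sqlite_stat* tables; the planner consults them when choosing between
    # a full scan and an index lookup. On a big database this can take
    # roughly as long as a VACUUM.
    con.execute("ANALYZE")
finally:
    con.close()
```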

DB Browser for SQLite might be a way to force ANALYZE. I don’t run macOS, so I can’t test on that system.

Run PRAGMA optimize when closing database connection #3749 discussed that versus ANALYZE. It went into v2.0.4.18-2.0.4.18_canary_2019-05-12, meaning anyone running a recent canary has it, but it’s “iffy”:

The PRAGMA optimize command will automatically run ANALYZE on individual tables on an as-needed basis. The recommended practice is for applications to invoke the PRAGMA optimize statement just before closing each database connection.
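
For anyone who wants to try that recommended practice outside of Duplicati, it looks roughly like this; a minimal sketch with a placeholder path, not how Duplicati itself wires it up:

```python
import sqlite3

con = sqlite3.connect("/path/to/duplicati-job.sqlite")  # placeholder path
try:
    # ... normal queries against the job database would go here ...
    pass
finally:
    # Per the SQLite docs quoted above, PRAGMA optimize should run just
    # before the connection closes; it decides internally which tables (if
    # any) need a fresh ANALYZE, so it is usually far cheaper than running
    # ANALYZE unconditionally.
    con.execute("PRAGMA optimize")
    con.close()
```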