Will changing the remote volume size improve speed?

Hi all,

I’m using the default volume size of 50MB to back up over the network to my NAS. Duplicati version is 2.0.4.5_beta_2018-11-28 on Ubuntu, and the NAS is running unRAID, connected via 1GbE.

Backups are all OK, but I often carry out restore tests to make sure I can get stuff back if my Ubuntu machine dies (like it did yesterday). Right now I’m doing a restore test on my MacBook Pro because of that, and it is working, but it is SLOW.

Questions:

  1. Will increasing the remote volume size decrease how long each file takes to process? The live log shows each file getting processed, but I’m wondering if there’s some latency caused by using files that small.
  2. Why do I need to recreate the database every time I do a restore test on another machine? I did a restore yesterday, and the database got recreated on my MacBook Pro. Today I’m doing another restore and the database is being recreated all over again.
  3. Why does Duplicati use so much CPU? I guess it might be something to do with Mono, as right now I can see “mono-sgen64” taking up 360.1% CPU (on a quad-core machine). That’s quite a lot, IMO.

Duplicati is great, but I just want to understand whether I’m using it in the most effective way.

Thanks!

Can you say anything about where it is when it’s slow? The “Building partial temporary database ...” phase can sometimes get very slow for large backups (and has generated lots of forum postings), but once that’s done, the actual restoring of files proceeds. Have you restored any bits of files yet, or is it still building the database?

Choosing sizes in Duplicati discusses the --dblock-size remote volume size and the --blocksize used for deduplication, whose 100KB default can make for huge database tables, and possibly slow performance, on large backups.

I’d think you could probably raise both values. --dblock-size can be changed at any time, but it only applies to newly written dblocks (the workaround is a compact). --blocksize probably matters more for big backups, but it can’t be changed on an existing backup.
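In case it helps, here’s roughly what that could look like from the command line. This is only a sketch: the storage URL, source path, passphrase, and the 200MB figure are placeholders for your own setup, I’m writing duplicati-cli for the command name (depending on the install it may be mono Duplicati.CommandLine.exe instead), and the note about compact honoring the new size is my understanding rather than something I’ve verified.

```
# All values below are placeholders; adjust the URL, paths, and passphrase to your setup.
# Raise the remote volume size; this only affects dblocks written from now on.
duplicati-cli backup "file:///mnt/nas/backup" /home/me \
  --dblock-size=200MB \
  --passphrase="mysecret"

# Workaround for the existing 50MB volumes: a compact can rewrite them, and (as far
# as I understand it) any rewritten volumes will use the new --dblock-size.
duplicati-cli compact "file:///mnt/nas/backup" \
  --dblock-size=200MB \
  --passphrase="mysecret"

# --blocksize is deliberately omitted here: it can only be chosen when a backup is
# first created, not changed afterwards.
```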

I can’t say how much this would speed up the restore phase, but recreates and backups are believed to scale poorly when the administrative overhead gets high (which might matter more than network latency).

Restoring files if your Duplicati installation is lost possibly describes what you’re testing. Is that correct? Restoring this way builds a partial temporary database that’s (I think) tailored to the specific restore, meaning it gets created faster than the full database you would recreate if you wanted to continue backups.
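For reference, a minimal sketch of that kind of database-less restore from the command line (the GUI’s “Direct restore from backup files” is the same idea). The storage URL, passphrase, file pattern, and restore path are all placeholders:

```
# Placeholders throughout. This builds the partial temporary database first,
# then restores the matching files into --restore-path.
duplicati-cli restore "file:///mnt/nas/backup" "/home/me/Documents/*" \
  --restore-path=/tmp/restore-test \
  --passphrase="mysecret"
```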

Creating a longer-term database (at least long enough for a series of restores) might be what you’re after. The REPAIR command can do this, and adding --version (most recent is 0) can tame it a bit so it doesn’t spend its time processing all versions (which takes longer), if a single version with the full set of files is good enough for you.
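A sketch of that, again with every path and value a placeholder: --dbpath puts the recreated database somewhere you can keep for later restores, and --version=0 limits the work to the most recent version as described above.

```
# Recreate a keepable local database from the remote files (placeholders throughout).
duplicati-cli repair "file:///mnt/nas/backup" \
  --dbpath=/home/me/duplicati-test.sqlite \
  --version=0 \
  --passphrase="mysecret"

# Later restores can then reuse it by passing the same --dbpath.
```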

Don’t try having two backups (with two databases) share one remote destination; that can lead to huge messes. Perhaps that temptation is another reason why a partial temporary database is the less dangerous path.

Can’t say offhand, and it probably varies case by case (some people report slowness without much CPU usage). There’s a large amount of work involved, possibly including decryption, unzipping, lots of DB queries (you can enjoy seeing those if you go to About → Show log → Live → Profiling), and such things.
How the restore process works and Features (which come with performance prices) might give further ideas.
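If watching the live log is awkward, the same profiling output can be written to a file instead. A sketch with placeholder paths; --log-file and --log-file-log-level are the option names in recent 2.0.4.x builds, as far as I know:

```
# Capture a profiling-level log during a test restore so the slow DB queries
# are visible afterwards (all paths and the passphrase are placeholders).
duplicati-cli restore "file:///mnt/nas/backup" "*" \
  --restore-path=/tmp/restore-test \
  --log-file=/tmp/duplicati-profiling.log \
  --log-file-log-level=Profiling \
  --passphrase="mysecret"
```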

That’s a great attitude. There are definitely rough edges, complexities, and things one should learn about. One subtle point I found out after doing my own restore tests on the same system is that --no-local-blocks must be used to ensure the remote data is actually used, as opposed to Duplicati grabbing whatever blocks it can from the source files. Your test on a different machine is more purely reflective of disaster recovery. Always a wise thing to test before disaster occurs.
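In command-line terms that’s just one extra option on the test restore (same placeholder caveats as the earlier sketches); in the GUI it can be added as an advanced option.

```
# Force all blocks to come from the remote destination instead of being taken
# from matching local source files (placeholders as before).
duplicati-cli restore "file:///mnt/nas/backup" "*" \
  --restore-path=/tmp/restore-test \
  --no-local-blocks=true \
  --passphrase="mysecret"
```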