Extremely slow backup over LAN

Doesn’t the volume size cause problems as well?

Nope. You can change --dblock-size (aka “Upload volume size”) at any time without problems. However, just as with --zip-compression-level, the new value only applies to archives created going forward. Existing ones won’t automatically be re-created at the new size unless they get picked up by a compact process.

I believe there are some manual steps you can take to force a re-compress of everything, but I’ve never tried it myself. If you’re interested, let me know and I’ll try to find the related posts.


Considering most of my files are almost incompressible video files, perhaps reducing the compression level is the way to go.

Any idea about the giant discrepancy in file count?

Compression level won’t matter for already-compressed files. There is a list of file extensions that Duplicati checks, and if your file has one of those extensions it isn’t compressed at all.
You can see the file extensions here: duplicati/default_compressed_extensions.txt at master · duplicati/duplicati · GitHub

I’m not sure what could be causing the discrepancy in file counts. Maybe it’s a bug in the file counter? The counter runs as a separate process from the actual backup, so if the problem is only in the counter it shouldn’t affect the backup result. But I don’t know how you’d confirm that until the backup finishes and you move back to the NAS.

I was just looking at that file actually.

I have mostly m2ts and arw files, which are already compressed but not listed in that text file. I’ve edited the file to exclude those types from compression as well. Hopefully that speeds things up a bit.
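For reference, the file is just a plain list of extensions, one per line with a leading dot (at least in the version I have), so the lines I added look like this:

```
.m2ts
.arw
```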

I’m assuming the encryption process is more CPU-bound than the compression process?

Both are mostly CPU-bound. Compression seemed to have the biggest impact in the tests done by @dgcom here: Duplicati 2 vs. Duplicacy 2.

The default compression level is 9, and going down to 1 cut about 36% off his backup time in one of the tests. Another test using level 0 cut it by about 45%, and turning off encryption cut a further 7% (for a total of about 52% with level 0 compression and no encryption).
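If you want a rough feel for why the level matters so much on your own files, a plain-zlib sketch in Python like the one below shows the time/ratio trade-off. This is not Duplicati’s actual code path, and the sample file name is just a placeholder for one of your own large, already-compressed files:

```python
import time
import zlib

# Placeholder path - substitute one of your own large, already-compressed files.
data = open("sample.m2ts", "rb").read()

for level in (0, 1, 9):
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    print(f"level {level}: {len(compressed) / len(data):.3f} of original size in {elapsed:.2f}s")
```

On already-compressed video, the higher levels typically burn a lot of CPU for a ratio barely below 1.0.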

Of course these things could also depend on memory and disk speeds, or even network speeds if you end up bottlenecked there.


I don’t know if that file is dynamically loaded or not - if adding those extensions works for you, please let us know so we can try to get them into the official release file.

I restarted Duplicati after making the changes - just to be sure.

I ran into the restoring-from-a-different-OS issue, so I restarted the backup, but the speeds seem significantly better. The Pi only has a 100Mbps port, but I’m averaging around 45Mbps - still going to take 6 days to finish at this rate, but it’s half as long as the original estimate.

I’ll see if the transfer speed stays consistent over the next little while.

What settings did you end up with? Is it mostly compression that slowed you down?

I added the m2ts and arw extensions to the compression ignore list and that seems to have done the trick. So yes, it looks like it was the compression.

This is what the incoming traffic on the backup destination looks like:

It reports a 45Mbps average over 5 minutes. You can clearly see where each 50MB chunk starts and finishes.

Hmm, interesting. That’s some performance gain.

m2ts is likely compressed, but isn’t ARW raw, uncompressed video? That would compress well.

ARW is similar to Canon’s CR2. It should be an uncompressed image format, but by most reports it’s actually compressed. I did a quick (naive) compression test, zipping a few ARW files, and the resulting archive was barely smaller than the originals.
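For anyone who wants to repeat that kind of quick check, here’s a minimal Python sketch of the same idea (plain zipfile with deflate at the maximum level; not the SharpCompress path Duplicati actually uses, so treat the numbers as a rough indication only):

```python
import os
import sys
import zipfile

# Zip each file given on the command line at maximum deflate and report the ratio.
for path in sys.argv[1:]:
    original = os.path.getsize(path)
    with zipfile.ZipFile("ratio_test.zip", "w", zipfile.ZIP_DEFLATED, compresslevel=9) as zf:
        zf.write(path, os.path.basename(path))
    ratio = os.path.getsize("ratio_test.zip") / original
    print(f"{path}: {ratio:.3f} of original size")
    os.remove("ratio_test.zip")
```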

Hmm, I’m submitting a PR (#3021) to add m2ts, but I’d need to see some testing of compression ratios for ARW before adding that.

It would need to compress fairly poorly before it makes sense to skip compression for all users.


I’d be happy to help with that if you can give me any guidelines. I have TONS of ARW files to test with :slight_smile:

If you think your uploads are happening so fast that they have to wait for the next archive to be prepped, then you could try different --asynchronous-upload-limit settings (the default is 4) to control how many archive files “ahead” will be generated.

Though it sounds like you might be on the other side of things, where your CPU is so busy generating the next archive volume that it can’t keep up with the uploads already in progress.
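Conceptually (this is just an illustration of the idea, not Duplicati’s actual code), that setting acts like the bound on a producer/consumer queue between the volume builder and the uploader:

```python
import queue
import threading
import time

UPLOAD_LIMIT = 4                       # roughly what --asynchronous-upload-limit controls
volumes = queue.Queue(maxsize=UPLOAD_LIMIT)

def build_volumes():
    for i in range(10):
        time.sleep(0.1)                # stand-in for compressing/encrypting a volume
        volumes.put(f"dblock-{i}")     # blocks once UPLOAD_LIMIT volumes are waiting
    volumes.put(None)                  # signal that building is done

def upload_volumes():
    while (volume := volumes.get()) is not None:
        time.sleep(0.3)                # stand-in for the network transfer
        print("uploaded", volume)

threading.Thread(target=build_volumes).start()
upload_volumes()
```

If the builder is the slow side, raising the limit won’t help much; it mainly bounds how many finished volumes (and how much temp space) can pile up waiting for the network.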

Well, ideally we would want to test with SharpCompress directly since that’s what Duplicati uses.
But that kind of requires me to write a small C# wrapper that’ll make it easy to compress those files from the CLI and display the ratio.

Alternatively, it can be tested by creating a few backup jobs with just a couple of ARW files in them. Duplicati will deduplicate the files within a job, but if we have 3-4 jobs where the only difference is the compression level, we can see what kind of real-world results to expect at full/medium/no compression. This would also be a good way to see what kind of performance each compression level gives you :slight_smile:

If you set your --blocksize to be larger than your largest target test file then the only deduplication that would occur would be for exact files.

Of course a change like that could render any resulting time tests meaningless in the real world…
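To make the dedup point concrete, here’s a toy fixed-block hashing sketch (not Duplicati’s implementation; the 100KB default and SHA-256 are from memory): with a block size larger than every file, each file is a single block, so only byte-identical files share anything.

```python
import hashlib
import sys

BLOCK_SIZE = 100 * 1024   # Duplicati's default --blocksize, if memory serves

def block_hashes(path, block_size=BLOCK_SIZE):
    """Yield one hash per fixed-size block; a single block per file if block_size > file size."""
    with open(path, "rb") as f:
        while block := f.read(block_size):
            yield hashlib.sha256(block).hexdigest()

seen, total = set(), 0
for path in sys.argv[1:]:
    for digest in block_hashes(path):
        total += 1
        seen.add(digest)

print(f"{total} blocks, {len(seen)} unique ({total - len(seen)} deduplicated)")
```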

Well, my ARW files are all under 25MB each, which is smaller than the block size - I’m guessing dedup would always be running in that case. I could run the backup, delete the data, and redo it with a different compression level.

Also, I moved the temporary storage for Duplicati to a much faster drive and my transfer speeds went up significantly.

There are no more hard edges on each file transfer, plus the average and minimum speeds are up a lot. The previous location was a pretty old and fairly busy single HDD. The new one is an NFS share, but with an SSD cache.

Good catch on the drive performance bottleneck! :smiley:

Since you’re working on a Raspberry Pi I’m assuming this isn’t really useful to you, but I believe an option is being added to allow these temp files to be created entirely in memory (an alternative is to use a RAM drive).

It would be interesting if Duplicati could self-monitor how much time is spent waiting for a particular resource and offer suggestions on ways to improve backup performance. :thinking:

The backup jobs can’t share dedup, so it should be fine to keep them all during the test. But it’s always good to keep them in different folders so they don’t confuse each other :slight_smile: