I’m new to Duplicati, and I’m having trouble maximizing upload speed to Dropbox. I have a 10 Gbit connection (10 Gbit down / 2.5 Gbit up) and a MacBook Pro 14 M2 with which I’m testing the upload (roughly 5 GB of data), but I’m not able to go above 22 MB/s (175 Mbit/s).
Here is my configuration:
- Total data to upload: 5 GB (stored locally)
- Remote volume size: 125 MB
- Asynchronous upload limit: 0
- No bandwidth limits
- Encryption: disabled
- Compression level: 0
I found that CPU usage is capped at about 275%. Is there a way to increase the overall usage and thus the upload speed?
If your data is in a single file, Duplicati is limited by the time it takes to create a zip file, something rather difficult to parallelize across several cores (a core can’t create the end of the file before the beginning has been written). I am currently evaluating upload performance and I get better numbers than that, but I’m not yet confident enough in my conclusions to post the results.
I’m uploading multiple files. Can you please share some of your findings once they’re available?
Presumably that plural refers to asynchronous-upload-limit and asynchronous-concurrent-upload-limit.
concurrency-block-hashers and concurrency-compressors are other options that may parallelize more.
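As a sketch of how those options can be combined on the command line (the specific values here are assumptions to experiment with on a many-core machine, not tested recommendations, and the destination URL is a placeholder):

```shell
# Hypothetical tuning sketch; values are starting points to benchmark,
# not verified recommendations.
duplicati-cli backup "dropbox://Backup?authid=PLACEHOLDER" /source/path \
  --asynchronous-upload-limit=0 \
  --asynchronous-concurrent-upload-limit=8 \
  --concurrency-compressors=8 \
  --concurrency-block-hashers=8
```

The only way to know what actually helps on a given machine is to time a test backup while varying one option at a time.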
Possibly the database is a single-threaded bottleneck. You can go easier on it by increasing blocksize, if the reduced block-level deduplication doesn’t bother you. That also helps other things, e.g. .zip file packing.
How much changes between backups? Unless it’s the first backup, Duplicati has to walk the tree looking for changed files, then read through the files to find changed blocks. That can certainly slow upload speed, because only changed data is uploaded. If your comment is about the initial backup, then ignore this.
Thanks for the info. Yes, it’s my initial backup, but there are roughly 30 TB to upload…
About the other options, are there any suggested values for a machine with “a lot” of computational power? With the defaults, I’m using only about 25% of the total capacity.
No suggestion. Nobody volunteers to do extensive benchmarking on such machines.
You can be the first.
Now I’m confused about these different numbers. If you want a 30 TB backup, that’s vastly different.
My original understanding was that you had 5 GB on an SSD system (so fast). Now I’m less certain.
You are right, sorry for the confusion
The 5 GB are on my MacBook Pro to test the upload and find the optimal settings to saturate the CPU and/or the upload connection; 30 TB is the total amount of data that I need to back up.
I would not write here for just 5 GB: at 10 MB/s or 20 MB/s it’s a matter of minutes, but with 30 TB it’s a matter of days or weeks.
My (promised) report is not yet very good IMO, but I have posted it anyway because someone (the OP here) asked for it. I could possibly create a binary based on the (now posted) related PR.
Configuration tips for large (200TB) SME cloud backup has my personal thinking on backups that big, especially when lots of files are present. I don’t know if that’s the case for you, but it seems likely. The usual increase of blocksize helps keep the total block count below the generally recommended 1 million when files are large; however, each file also has a small metadata block, so a lot of files mean a lot of blocks, and thus slow running due to block overload, e.g. in SQL operations and other areas that scale with block count.
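To put rough numbers on that block count (a back-of-the-envelope sketch that assumes 30 TB of data and ignores the per-file metadata blocks mentioned above):

```shell
# Rough block-count estimate for 30 TiB at a few candidate blocksizes.
# Ignores per-file metadata blocks, so real counts are higher.
total_bytes=$((30 * 1024 * 1024 * 1024 * 1024))   # 30 TiB
for bs_bytes in $((100 * 1024)) $((1024 * 1024)) $((32 * 1024 * 1024)); do
  echo "blocksize $((bs_bytes / 1024)) KiB -> $((total_bytes / bs_bytes)) blocks"
done
```

With a 100 KiB blocksize that is over 300 million blocks; even 1 MiB leaves about 31 million, and only something around 32 MiB gets the count below the 1 million guideline.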
That’s not an immediate upload-speed result, and one can’t expect it to scale up linearly. It drops sharply.
Or far worse. I’m talking about restore speed. Duplicati does parallel uploads but single-threaded restores. Single-threaded network operations are limited by things like the speed of light (latency), whereas parallelizing can potentially help. You might want to reconsider whether you can tolerate a long restore delay; the other topic, however, was about a business, which would probably be losing revenue while its system was out.
If you can measure 10 Gbit down, that’s typically with a whole lot of parallel connections, and a single connection is far worse. The cloud provider probably couldn’t run that fast on a single connection either.
I was getting about 80 Mbit/s from OneDrive in a recent test on a 1000 Mbit link (also limited by WiFi), which would mean 30 TB at 10 Mbyte/second would take 3 million seconds (35 days), if there were no other limits. Faster might not help, as 10 Mbyte/second might already run into a hard drive’s (if used) maximum write rate.
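The 35-day figure checks out arithmetically; a quick sanity sketch, assuming a sustained 10 Mbyte/second with no other bottlenecks:

```shell
# 30 TB pushed at a sustained 10 MB/s, assuming no other limits.
seconds=$(( (30 * 1000 * 1000 * 1000 * 1000) / (10 * 1000 * 1000) ))
days=$(( (seconds + 43200) / 86400 ))   # round to the nearest day
echo "${seconds} seconds (~${days} days)"
```

Doubling the sustained rate would still leave it above two weeks, which is why the later point about planning and disaster-recovery timing matters.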
IMO something of this scale needs a lot of planning and testing, including a test of disaster recovery time. Although 30 TB is not the 200 TB case, it’s still a lot of data, and I’m not sure you can tune your way to a big speed gain.