Restore network speed limited to 100 Mbit/s

Hello,

I am using Duplicati to back up my Windows stations to Azure Blob storage.
I have no problem with backups; everything works fine. Thanks to the development team. :slight_smile:

My problem is with restore performance, which is limited to 100 Mbit/s.

I use the following command line to perform the restore:
Duplicati.CommandLine.exe restore "azure://Contener?auth-username=StorageAccount&auth-password=StorageAccountKey" "*" --passphrase="MyPassPhrase;)" --overwrite=false --restore-path=D:\TESTRestor\ --prefix=MyPrefix --tempdir="C:\Temp\Duplicati\" --dblock-size=500MB --blocksize=50MB

I tried adding the --throttle-download and --concurrency-max-threads options, but they don't change the network transfer speed, which is still limited to 100 Mbit/s.

Here is some contextual information:

  • I use Duplicati 2.0.6.104_canary_2022-06-15.
  • I have tried restoring on two different machines with fast i9-9980 CPUs and get the same performance. CPU load is about 10-20% during the restore, so the problem is not a CPU limitation.
  • The destination is an Azure Blob storage account. Other tools I use against the same storage get very good upload and download performance from the same machine.
  • The working drives are NVMe SSDs, so there is no saturation there.
  • My internet line is an 8 Gbit/s fibre connection, so no saturation is possible there either.
  • I get good backup performance, about 700 Mbit/s. :slight_smile:
  • The backed-up files are all very large, about 10 GB on average.
  • I use --dblock-size=500MB and --blocksize=50MB because the backup contains only large files that don't change and can't usefully be compressed.
  • Rebuilding the local database during the restore takes about 2 minutes; it is during the file download phase that the speed is limited to 100 Mbit/s.

Can you help me unblock the restore speed?

Thank you in advance and congratulations to the designer of the solution.
(Sorry for my English)

No one to help?
Can you share your performance metrics while restoring from cloud storage?
Thanks

Welcome to the forum @Ripod

Big Comparison - Borg vs Restic vs Arq 5 vs Duplicacy vs Duplicati is one of the better studies.
Posts to the forum don’t go to everyone, so only people who read a lot will notice your latest ask.
One concern I have with the above comparison (and with your test method) is whether they use

no-local-blocks

Duplicati will attempt to use data from source files to minimize the amount of downloaded data. Use this option to skip this optimization and only use remote data.

because without this, as little as possible goes over the network – which is the metric you chose.
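
If you want to repeat your own test with that shortcut turned off, it should just be a matter of appending the option to your restore command (other options as in your post, shortened here to "..."):

  Duplicati.CommandLine.exe restore "azure://Contener?auth-username=StorageAccount&auth-password=StorageAccountKey" "*" --no-local-blocks=true ...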

For whatever it’s worth, I ran a OneDrive restore with no-local-blocks and got about 60 Mbit/s.
That was seen in Task Manager with Update speed set to low, trying to visually average a graph.

The way one gets fast network performance is often through parallelism, just as with CPU cores.

Increase the number of concurrent requests suggests azcopy on 16 logical processors gets 256.
You can study whatever tools you use, or maybe even view the connection counts using netstat.
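
For a rough idea while a restore (or a test with another tool) is running, counting established HTTPS connections works; on Windows, something like the below (the ":443" filter assumes the transfers go over HTTPS):

  rem rough count of TCP connections involving port 443 while the transfer runs
  netstat -n | find /c ":443"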

  --asynchronous-concurrent-upload-limit (Integer): The number of concurrent
    uploads allowed
    When performing asynchronous uploads, the maximum number of concurrent
    uploads allowed. Set to zero to disable the limit.
    * default value: 4

works for backups. Restores are single-threaded. Watch About → Show log → Live → Information
If you want a pure Azure download test, use your URL in Duplicati.CommandLine.BackendTool.exe
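
For example, something along these lines with your URL (the dblock file name below is only a placeholder; LIST shows the real names, and timing a GET of one of your 500MB dblock files gives a raw single-stream download figure):

  Duplicati.CommandLine.BackendTool.exe LIST "azure://Contener?auth-username=StorageAccount&auth-password=StorageAccountKey"
  Duplicati.CommandLine.BackendTool.exe GET "azure://Contener?auth-username=StorageAccount&auth-password=StorageAccountKey" MyPrefix-b1234567890.dblock.zip.aes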

A network-only restore is usually limited by download rate, but keep no-local-blocks in mind,
because (depending on the setup) the local-block shortcut can make the measured download speed lower than the actual restore speed.

Hello ts678

Thank you for your reply.

The Big Comparison - Borg vs Restic vs Arq 5 vs Duplicacy vs Duplicati is very informative.

Kelly gets 3m28s for a 3.7 GB file restore from OneDrive, which works out to roughly 0.14 Gbit/s, so that's about the same performance I'm seeing in my tests.

I think you have hit the nail on the head: Duplicati is not multi-threaded during the restore.

I must admit I don’t understand the developers’ position on this:

  • A backup is a process that runs in the background; the faster the backup the better, admittedly, but there is no downtime during this phase. If the daily backups complete during the night, everything is fine.

  • However, if there is a major storage incident and users can no longer access their data, that is when you need a very fast process, with every performance option available, to get the files back to users as quickly as possible.

I hope the developers will take this into consideration and will soon offer a multi-threaded version of the restore.

While waiting for such an improvement, I plan to split the data to be restored across 6 to 8 separate processes. That is not easy to do, as I have no view of the amount of data in each directory in a DR scenario.
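
Something like the following is what I have in mind: one command window per top-level directory, each with its own restore path and temp directory (the source paths here are only placeholders for my real directories):

  start "" Duplicati.CommandLine.exe restore "azure://Contener?auth-username=StorageAccount&auth-password=StorageAccountKey" "D:\Data\Folder1\*" --passphrase="MyPassPhrase;)" --prefix=MyPrefix --restore-path=D:\TESTRestor\Part1\ --tempdir=C:\Temp\Duplicati1\
  start "" Duplicati.CommandLine.exe restore "azure://Contener?auth-username=StorageAccount&auth-password=StorageAccountKey" "D:\Data\Folder2\*" --passphrase="MyPassPhrase;)" --prefix=MyPrefix --restore-path=D:\TESTRestor\Part2\ --tempdir=C:\Temp\Duplicati2\

As far as I understand, each process will rebuild its own temporary database, but at least the downloads would then run in parallel.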

Thank you

It’s a question (IMO) of priorities. There is far more that could be done than can be done.
Sometimes one might work off of demand in forum and Issues. Or it might get personal.
“Every good work of software starts by scratching a developer’s personal itch.” concept.

As a free effort from volunteers, progress depends on them, and there are very few lately.
I’m here constantly asking for volunteers for code, test, docs, support, etc. It takes a lot…

On chronology, Support parallel uploads for a backend [$381] #2026 was opened in 2016.
Implement parallel uploads to a single backend #3684 was its answer which was in 2019.
I’m not sure if there’s an equivalent issue open for downloads. If so, please link from here.

You can read through that to view commentary from some developers not available today.
Original author (who has had other priorities lately – volunteers are permitted that) posted:

Wow, large refactoring and update.

The point there is that there is a fair amount of skill and time needed to do major evolution.
If you find something that does a good job of keeping your 8 Gb pipe full, please post back.
I’m certainly not familiar with all products, especially ones suited for your seeming usages.

For another case in point of how developers contribute things without centralized planning:

Implementing the feature to restore any version of a single file discusses a wish from 2014
where you can see how someone began it (maybe an interest?), had to pause, came back.
Display versions for a file and allow any version to be restored #4805 is in a queue awaiting
processing by a scarce developer with permission to review it and bring it into the program.

From what I can see, this is sometimes addressed with a local backup, plus a remote one
that will almost certainly transfer more slowly (speed of light…) but is a beneficial safeguard.
Some people call it a “hybrid” backup. Another popular concept is “3-2-1” (one copy off site).

Backblaze has a nice feature for those with slow connections: they'll fast-ship an 8 TB drive.
The target market for that likely doesn’t have a fast connection, even if the software used it…

They make some good blog posts, such as:

Server Backup 101: On-premises vs. Cloud-only vs. Hybrid Backup Strategies

You might still need to choose backup software carefully for performance that fits your needs.
Or design around limits. For example, a fast download to local might help speed up a restore.
Or you could have a cloud sync for latest copy, and historical restores use something slower.

EDIT:

rclone is good at concurrent file transfers, and supports lots of providers, including Azure Blob.
If you happen to be a developer, Azure “looks” like it can parallelize within a single file transfer.
Some other providers permit such things too, but each would need its own support in Duplicati.
Duplicati’s upload concurrency is at the file level, and almost every provider can manage that…
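
As a rough sketch (assuming an rclone remote named azblob has already been configured for the storage account; the names and flag value are just examples), a concurrent bulk download of the backup files could look like:

  rclone copy azblob:Contener D:\DuplicatiFiles --transfers 16

Once the files are local, a restore could use that folder as its backend, so the network-bound part runs at whatever parallel rate rclone achieves.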

Hello TS678
Thank you for the time spent answering me.
Have a nice day
