What are the best settings for restore of large file counts?

Maj1127 · September 16, 2022, 1:44pm

Good morning! We have used Duplicati to restore successfully for a while now, but we’ve hit a major problem. We have a total of nearly 160,000 dblock files that need to be converted back to their original file types. We’d like to make sure we have optimal settings for speed and reliability of the restore in this case.

A few things to note…

All of the dblock files and index files exist locally on SSDs, so everything is local as far as data transfer is concerned.
In the past, we have restored successfully, but the file counts were much smaller, less than 10k.
The reason for the large file count is that we had the backup job split each of the dblock files into 50MB files after compression.

Any advice or recommendations would be greatly appreciated. Thanks!

ts678 · September 17, 2022, 2:13pm

Welcome to the forum @Maj1127

So that’s about 8 TB of dblock files. What matters more is whether you are at the small default blocksize of 100 KB. Database SQL operations slow down when tracking more than a few million blocks. If you have your database intact, you might care less; otherwise (e.g. for disaster recovery) DB recreate may get slow.
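
To put rough numbers on that, here is a small sketch of the arithmetic. It assumes decimal units, that every dblock file is a full 50 MB, and that the backup used the default 100 KB blocksize; none of that is confirmed, and because dblock files hold compressed blocks the real block count could be higher. The function names are just for illustration.

```python
# Rough sizing only. Assumes full 50 MB dblock files and decimal units.
DBLOCK_COUNT = 160_000   # dblock file count from the original post
DBLOCK_SIZE_MB = 50      # remote volume size from the original post
BLOCKSIZE_KB = 100       # default blocksize (assumed, not confirmed)


def total_backup_tb(dblock_count, dblock_size_mb):
    """Total size of all dblock files in TB (1 TB = 1,000,000 MB)."""
    return dblock_count * dblock_size_mb / 1_000_000


def estimated_blocks(dblock_count, dblock_size_mb, blocksize_kb):
    """Approximate number of blocks the database has to track."""
    total_kb = dblock_count * dblock_size_mb * 1000
    return total_kb // blocksize_kb


if __name__ == "__main__":
    print(total_backup_tb(DBLOCK_COUNT, DBLOCK_SIZE_MB), "TB")
    print(estimated_blocks(DBLOCK_COUNT, DBLOCK_SIZE_MB, BLOCKSIZE_KB), "blocks")
```

Under those assumptions that is on the order of 80 million blocks, well past the "few million" point where SQL starts to slow down.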

There can be slow SQL even with a good database, but at least it’s less DB exercise without recreate.
Is the database on SSD as well? That might help performance, but there may be few experts for this size.
If anyone’s done one before, please feel free to help, otherwise this may have to rely on generic ideas.

Speed and reliability possibly trade off, and your choice may depend on what you need to get restored. Restoring a smaller critical set of files may be an option, and only the needed dblocks will be obtained. Because there is always the possibility of a restore failure, doing prioritized increments may add value. Total time might be longer, due to more total work, or shorter if scaling down speeds up SQL operations.
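
One way to picture the prioritized increments is to put the wanted paths in priority order and cut the list into fixed-size batches, restoring one batch at a time. This is only a sketch of the batching step; the helper name, the paths, and the batch size are made up, and it does not call Duplicati.

```python
from itertools import islice


def batches(paths, size):
    """Yield consecutive lists of at most `size` paths, preserving order."""
    it = iter(paths)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


if __name__ == "__main__":
    # Hypothetical paths, most critical first.
    wanted = ["critical/a.bak", "critical/b.bak", "archive/c.bak",
              "archive/d.bak", "archive/e.bak"]
    for number, chunk in enumerate(batches(wanted, 2), start=1):
        print(number, chunk)
```

If one batch fails, only that batch needs to be retried, which is the value of doing it in increments.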

Is the restore also going to local SSD? Are the original source files there too? Are you restoring over them or to a separate area? Duplicati by default will try to obtain blocks from the original source instead of downloading. no-local-blocks can stop this effort if you know there’s nothing there to be had. This can save that work.

On the other hand, if there are files that are almost as desired at original spot, Duplicati will only change whatever is needed in those files to make them as they were when that backup was created (less work).
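
A sketch of how a restore command line with no-local-blocks turned on might be put together. The executable name, the restore command, and the --restore-path, --no-local-blocks and --dbpath options are Duplicati command-line names as I understand them; the storage URL, folder paths and helper name are placeholders I made up. It only builds and prints the argument list, so check it against the output of "Duplicati.CommandLine.exe help restore" before running anything.

```python
def build_restore_command(storage_url, restore_path, patterns=("*",),
                          dbpath=None, exe="Duplicati.CommandLine.exe"):
    """Build (but do not run) an argument list for a Duplicati restore."""
    cmd = [exe, "restore", storage_url, *patterns]
    # Restore to a separate folder instead of the original location.
    cmd.append(f"--restore-path={restore_path}")
    # Skip the search for matching blocks in the original source files.
    cmd.append("--no-local-blocks=true")
    if dbpath is not None:
        # Point at an existing local database when one is available.
        cmd.append(f"--dbpath={dbpath}")
    return cmd


if __name__ == "__main__":
    # Placeholder locations, not taken from the thread.
    cmd = build_restore_command(r"file://D:\duplicati-backup", r"E:\restored")
    print(" ".join(cmd))
```

An encrypted backup would also need its passphrase supplied, which is left out here.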

Maj1127 · September 21, 2022, 1:04pm

Hey @ts678, Thanks for the response. To give some more details, the Duplicati files are all stored locally on a server containing all SSDs. The destination folder is also on SSDs within the same logical drive. So as far as that goes it is as fast as we could make it.

The problem we have run into in the past is that when we use the Duplicati restore function, we are only able to successfully restore certain numbers of files at a time. Otherwise the process fails. Granted, we weren’t using SSDs for those jobs. We really just want to make sure that this will complete in a decent timeframe. The files we are restoring are actually backup files, so we need all of them intact before a proper restore can be completed.

Whatever settings you would recommend for a quicker restore are appreciated and if nothing else will give us a good test case for future endeavors. Thanks again for your help!

ts678 · September 21, 2022, 3:07pm

What OS is this? Is its temporary folder on SSD? What about the Duplicati database (or is that missing)?

How so? Got any messages? Any idea what went wrong? I can’t guide you around an undescribed issue.

Duplicati was backing up backup files from some backup program? OK, if you need them all you need them all, however if it comes down to doing this in small loads (I don’t know why), that should still work.

Maj1127 · September 27, 2022, 1:25pm

Hey @ts678, We managed to get everything downloaded, but to answer the questions, it is Windows Server 2016 and it is not a temp folder. The Duplicati database was nonexistent on the target server as it is a disaster recovery scenario, so the database existed in the cloud only.

The process did not fail this time, but typically the errors state that there are missing dblock files in the source data, but I can see those files in the cloud storage.

And we did not need all of the files, but there was no way to determine which files in particular were necessary, so we restored all of them anyway.