Restore methods speed comparison

I’ve seen that there are two restore methods: New and Legacy.

My guess at how these methods work:

  • Legacy: read the files from the backup repository in some order, extract the blocks into the target files in the correct order, then update the metadata.
  • New: read the files concurrently from the backup repository, store them in a cache, and restore the files sorted by size, largest first.

I performed a restore test and saw that the restore was uncharacteristically slow, so I decided to compare the two methods.

The backup source contains images, source files, documents, various installation files, ISO images, etc. Some files are larger than 1 GB, but most are very small.

~410,000 files, ~1.3TiB.

The backup repository is another NAS on my home network, accessed over FTP on a 1 Gbit/s network.

Restoring with the legacy method, I saw network throughput averaging about 600 Mbit/s, except during the last part, probably because larger files were added to the repo last.

It took about 4 hours for the complete files to be written to the restore folder, and another 4 hours for the metadata update, file verification, or whatever else happens at the end to complete.

I then ran a restore using the new method.

I could see the network was fully saturated. The restore was started around 09:00 (AM). At midnight I checked progress and saw that I had 100 GB and ~370,000 files remaining to restore.

At 09:00 (AM) the next morning I stopped the restore. I still had ~300,000 files and 13 GB remaining to be restored.
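As a rough sanity check on those two readings, here is the arithmetic as a small Python sketch. It assumes GB means 10^9 bytes and that the rate stayed constant between midnight and 09:00; the variable names are just for illustration.

```python
# Progress readings taken from the restore UI (see above).
gb_left_midnight, gb_left_morning = 100, 13
files_left_midnight, files_left_morning = 370_000, 300_000
hours_between = 9  # midnight to 09:00

# Useful restored data rate, assuming 1 GB = 10**9 bytes.
restored_bytes = (gb_left_midnight - gb_left_morning) * 10**9
useful_mbit_s = restored_bytes * 8 / (hours_between * 3600) / 1e6

# Naive extrapolation for the remaining files at the same file rate.
files_per_hour = (files_left_midnight - files_left_morning) / hours_between
hours_left = files_left_morning / files_per_hour

print(f"useful throughput: {useful_mbit_s:.1f} Mbit/s")
print(f"estimated time for remaining files: {hours_left:.1f} h")
```

So roughly 21 Mbit/s of restored data came out of a link that looked saturated the whole time, and at that file rate the remaining files would have needed well over another day.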

The backups are configured with all default settings.

I believe the new method is supposed to be considerably faster. Am I missing something? Is there a misconfiguration?

Hi @Maor_Avni, welcome to the forum :waving_hand:

Yes, the new method is significantly faster, but it looks like we were too conservative with the cache size and too simplistic in the ordering strategy. What is happening is that the same files are downloaded multiple times, in an attempt to limit the amount of temporary disk space used.

The effect is that network traffic increases a lot and restore speed slows.
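To illustrate the effect, here is a toy simulation. It is not the actual restore code: the LRU eviction policy, the `count_downloads` helper, and the workload (100 files that each need data from the same 10 remote files) are all assumptions made for the example.

```python
from collections import OrderedDict


def count_downloads(requests, cache_size):
    """Count remote fetches for a sequence of remote-file requests
    when at most `cache_size` remote files are kept locally (LRU)."""
    cache = OrderedDict()
    downloads = 0
    for remote_file in requests:
        if remote_file in cache:
            cache.move_to_end(remote_file)  # mark as most recently used
            continue
        downloads += 1                      # cache miss: fetch again
        cache[remote_file] = True
        if len(cache) > cache_size:
            cache.popitem(last=False)       # evict least recently used
    return downloads


# Hypothetical workload: 100 restored files, each needing the same 10 remote files.
requests = [remote for _ in range(100) for remote in range(10)]

for size in (5, 10):
    print(f"cache of {size}: {count_downloads(requests, size)} downloads")
```

In this sketch, a cache that is slightly too small for the working set misses on every request, while a cache that fits all 10 remote files downloads each of them only once. That is the kind of gap a larger cache or a smarter ordering is meant to close.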

We have another discussion here where @carljohnsen lists some ideas for improvement:

There is also work on a PR that will address the issues:

Interesting. I’ll try increasing the cache size significantly and perform another restore test.