Restore stopped with database or disk is full

That sounds like there should be plenty of capacity for the restore process. Do you see any bottlenecks on the target drive when it’s being run with 12 writing processes?

I can see that’s true for your setup. I would argue that it’s not so much that the new restore flow doesn’t support larger volumes, but rather that re-downloading them is more expensive. With the current default cache size setting, I assumed that 100 downloaded volumes would be sufficient for most backups. However, it looks like it’s not uncommon for files to be spread across more than 100 volumes, leading to the cache issues. Restoring multiple files concurrently further stresses this cache boundary, causing the files to compete for resources. Changing configuration parameters is always a trade-off: if there weren’t a trade-off, the default would (or at the very least should) be the optimal value.
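To illustrate why exceeding the cache boundary is so costly, here is a minimal simulation. It assumes a plain LRU eviction policy and a made-up access pattern (two files whose blocks are spread across the same 120 volumes); neither is taken from Duplicati’s actual implementation, and `count_downloads` is a hypothetical helper.

```python
from collections import OrderedDict


def count_downloads(requests, cache_size):
    """Count volume downloads for a request sequence under an LRU cache.

    LRU is an assumption for illustration, not Duplicati's actual policy.
    """
    cache = OrderedDict()  # volume id -> None, ordered by recency of use
    downloads = 0
    for volume in requests:
        if volume in cache:
            cache.move_to_end(volume)  # cache hit: mark as most recently used
            continue
        downloads += 1  # cache miss: the volume has to be (re-)downloaded
        cache[volume] = None
        if len(cache) > cache_size:
            cache.popitem(last=False)  # evict the least recently used volume
    return downloads


# Hypothetical pattern: two files that each touch the same 120 volumes in order.
requests = list(range(120)) * 2

print(count_downloads(requests, cache_size=100))  # 240: every request misses
print(count_downloads(requests, cache_size=120))  # 120: each volume fetched once
```

In this toy case, a working set only 20 volumes larger than the cache doubles the number of downloads, which is the kind of cliff described above.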

Was this with the new or the legacy restore flow? If you try the new restore flow with --restore-file-processors set low as well, then it shouldn’t be requesting too many volumes at once (assuming that blocks haven’t been assigned to more than 100 volumes in a round-robin fashion). For the legacy flow, the issue is likely related to the SQLite error mentioned earlier.

The cache size configuration has been addressed in this PR where the default becomes “use as much cache as possible up until there’s only 1 GB left, then perform premature eviction”. It will take some time to make it to a release. Regarding changing the default concurrency parameters, I don’t have a good “new default” yet.
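As a rough sketch of that policy (not the code from the PR), the eviction decision could look like the following. The function names are hypothetical, and 1 GB is taken here as 10**9 bytes, which is an assumption.

```python
import shutil

# Assumption: "1 GB" interpreted as 10**9 bytes.
RESERVE_BYTES = 10**9


def must_evict(free_bytes, incoming_bytes, reserve=RESERVE_BYTES):
    """Return True if caching another volume would leave less than `reserve` free."""
    return free_bytes - incoming_bytes < reserve


def free_space(path="."):
    """Free bytes on the disk holding `path` (hypothetical cache location)."""
    return shutil.disk_usage(path).free


# Example: would a 50 MB volume still fit in the cache on this disk?
print(must_evict(free_space(), incoming_bytes=50 * 10**6))
```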

I understand that this information exists, but it needs updating (some of which I cannot do myself) to reflect the characteristics of the new restore flow. Larger volumes still help with backends that favor fewer but larger requests. This can also hold true for the new restore flow, given that there’s enough space on disk to hold the volumes to allow for parallelism. If there’s no premature eviction, then the only downside to larger volumes is that each volume can only be accessed sequentially. Thus smaller volumes allow for smaller resource locks, in turn increasing the parallel capabilities.

I think the primary issue for the new restore flow is the cache issue. As noted, the new default should help with this, and we’re still working on improving the cache management further. The legacy restore flow breaks due to the SQLite-related issue.

Right, makes sense given your external drive. The new restore flow can be tuned to work with this, as you’ve already found by setting --restore-volume-downloaders=1, which ensures that only one volume is downloaded at a time. However, with the cache issue, a single volume can be fetched multiple times, which I assume is causing the stalling.

I understand your frustration, and I’m sorry to hear that you’re considering stopping using Duplicati. I noticed that you’re using the old web interface: is it still an issue with the new web interface? Regarding information in general, the docs cover a lot of the settings and use cases. Are there any particular topics missing from these that you think would be helpful to add? Contributions are also welcome there.

This is exactly what I was trying to achieve with the new restore flow: parallelization of the restore process, which is described in detail in the corresponding blog post. Instead of serially looking at each volume and scattering the contained blocks across the target disk, the new restore flow starts by parallelizing on the file level: each file is restored in parallel. This also ensures sequential writes (assuming the whole file needs to be restored), which benefits any disk.

The flow starts with the FileLister communicating the files to restore to the FileProcessors (of which there are multiple running in parallel). For each file, the FileProcessor checks whether the target file exists (e.g. for picking up on a previous restore) and whether it has the correct size and hash; if not, it finds which blocks are missing. Each missing block is then requested from the block cache. If the block is there, it’s returned; if not, the block cache asks the volume cache for the missing block.

The volume cache then checks if the volume is already downloaded. If it is, it sends the volume to the VolumeDecompressors (of which there are multiple running in parallel), which extract the blocks and send them to the block cache. If the volume is not downloaded, it sends the request to the VolumeDownloaders (of which there are multiple running in parallel), which download the volume and send it to the VolumeDecryptors (of which there are multiple running in parallel), which decrypt the volume and send it back to the volume cache. The volume cache then sends the volume to the VolumeDecompressors, which extract the blocks and send them to the block cache. The block cache then sends the blocks back to the FileProcessor, which writes them to disk.

So the new design has a lot of parallelism, concurrency, pipelining, caching, and asynchrony built in. Each communication step is a FIFO queue, allowing for burst requests. The network of processes is tunable (--restore-file-processors=, --restore-volume-downloaders=, --restore-volume-decryptors=, --restore-volume-decompressors=), allowing for more or less parallelism depending on the bottleneck. The cache is tunable (--restore-volume-cache-hint=, --restore-cache-max=), allowing for more or less caching depending on the disk space and RAM available.
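For readers who prefer code, the flow described above can be sketched very roughly as follows. This is a heavily simplified toy, not Duplicati’s implementation: the data (`VOLUMES`, `FILES`) and all class and function names are hypothetical, the download, decrypt, and decompress stages are collapsed into a single step inside the cache, and only the FileLister to FileProcessor hop is modeled as a FIFO queue.

```python
import queue
import threading

# Hypothetical toy backup: volume id -> {block id: data}; file -> ordered block ids.
VOLUMES = {"v1": {"b1": b"he", "b2": b"ll"}, "v2": {"b3": b"o!"}}
BLOCK_TO_VOLUME = {b: v for v, blocks in VOLUMES.items() for b in blocks}
FILES = {"a.txt": ["b1", "b2", "b3"], "b.txt": ["b3", "b1"]}


class BlockCache:
    """Block cache that fetches a whole volume on a miss (stages collapsed)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.blocks = {}
        self.downloads = []  # record of which volumes were fetched

    def get(self, block_id):
        with self.lock:
            if block_id not in self.blocks:
                volume_id = BLOCK_TO_VOLUME[block_id]
                self.downloads.append(volume_id)
                # Stand-in for download + decrypt + decompress of the volume.
                self.blocks.update(VOLUMES[volume_id])
            return self.blocks[block_id]


def file_processor(file_queue, cache, restored):
    """Take files off the queue and assemble each one from its blocks in order."""
    while True:
        name = file_queue.get()
        if name is None:  # sentinel: no more files to restore
            break
        restored[name] = b"".join(cache.get(b) for b in FILES[name])


def restore(num_processors=2):
    file_queue = queue.Queue()  # FIFO between the lister and the processors
    cache = BlockCache()
    restored = {}
    workers = [
        threading.Thread(target=file_processor, args=(file_queue, cache, restored))
        for _ in range(num_processors)
    ]
    for w in workers:
        w.start()
    for name in FILES:  # the "FileLister" step
        file_queue.put(name)
    for _ in workers:  # one sentinel per worker
        file_queue.put(None)
    for w in workers:
        w.join()
    return restored, cache.downloads


restored, downloads = restore()
print(restored["a.txt"], restored["b.txt"], sorted(downloads))
```

Because this toy cache never evicts, each volume is fetched exactly once no matter how many processors run. The real flow has a bounded cache, which is where the re-download problem comes from.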

Given the complexity and immaturity of the new restore flow, and due to it having lighter resource requirements because of its serial nature, the legacy restore flow has been kept.

There is no need to ridicule me or the other contributors.

I’m unsure of what you mean: both the new and legacy restore flows treat local and remote backends the same, trying to maximize work effort. What’s missing on the local machine?

That’s definitely a bug, and I haven’t seen it before. Is this occurring in the legacy restore flow? Does the log file reflect the same counts? Does the new web interface show the same behavior? Are all of the warnings the same, and what are they? When I run a backup and restore with 2.2.0.3 from a local folder to a local folder, I see the correct results in the log, both for the old and new web interface.