Restore stops with "Volume count in Sleepable Dictionary volume table is not zero"

I’ve been using Duplicati for multiple years now (for the last six weeks 2.1.0.106_canary_2025-01-11, before that one of the betas) as part of a Docker stack (on a Raspberry Pi 5) to back up my smart home to Google Drive. Never a problem with backing up or restoring.

Now, for the first time (unfortunately in a really important situation), a restore stops (to be precise: it slows down to a complete halt) with the error message

Volume count in SleepableDictionarys volume table is not zero: 14726 Volumes: -1, 85, 87, 89, 91, 130, 132, 138, 140, 169, 171, 173, 177, 208, 215, 220, 237, 239, 241, 243, 245, 255, 257, 263, 265, 267, 269, 271, 291, 293, 295, 321, 323.

It also shows:
Block count in SleepableDictionarys block table is not zero: 14726

And

Error during file processing
System.NotSupportedException: Stream does not support reading.
   at System.IO.FileStream.ReadAsync(Byte[] buffer, Int32 offset, Int32 count, CancellationToken cancellationToken)
   at System.IO.Stream.ReadAsync(Byte[] buffer, Int32 offset, Int32 count)
   at Duplicati.Library.Main.Operation.Restore.FileProcessor.<>c__DisplayClass1_0.<<Run>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at CoCoL.AutomationExtensions.RunTask[T](T channels, Func`2 method, Boolean catchRetiredExceptions)

I did not find anything on that message. Does anyone have an idea on what this means and how I can fix it?

Ok, this is odd. Even though I updated to the most recent version, 2.1.0.106, I could only restore my backups by going back to 2.0.8.1_beta_2024-05-07.

Does this make sense to anyone?

Maybe the developer of the new restore can answer, but it’s gotten late for today:

That sounds like an issue with the new restore implementation.
As mentioned, you can switch back to the previous one by setting the advanced option --restore-legacy=true

Maybe @carljohnsen can explain what the issue is.

If so, I think this may be related to “broken” index files that are generated by 2.1.0.2. Due to some quirks, the index files may contain the same blocklist data multiple times. This was tripping up database recreates as well, so perhaps it is also doing unexpected things in the new restore flow.

Thanks!

Since my setup appears to be a pretty normal use case (with a fairly serious bug)…

  1. backups running for years on beta
  2. then updating to the most recent canary
  3. continuing backups on the most recent canary
  4. then restoring after some weeks with that canary

… I am happy to provide details for debugging. Just let me know how I can support.

The bug was pretty scary, though, since NONE of my backups seemed to work, and only out of desperation did I go back to the beta version (to restore what was backed up by the latest canary).

I’d be happy to help! Yes, that issue relates to the new restore flow.

When the restore starts, it constructs a plan for which blocks it needs to restore, followed by a similar plan for which volumes it needs to fetch. This plan drives the caching strategy: it tells the restore which blocks and volumes are needed multiple times and thus benefit from being cached. If they’re only needed once, there’s no need to spend resources keeping them in the caches. Once a block or volume is no longer needed, it is evicted from the block cache (residing in memory) or the volume cache (residing on disk), respectively.

The first error that you’re reporting is related to this restore plan; it states that according to the plan, 14726 volumes have not been fully read, followed by the first couple of volume IDs (for debugging purposes). The second statement is very similar, but reports that the block cache has not yet seen the expected number of reads. If the restore was successful, both counts should be 0. This error is meant as a post-restoration sanity check of whether the restoration went according to the plan.
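The bookkeeping described above can be sketched in Python (hypothetical names and structure, not Duplicati’s actual C# implementation): each block carries a reference count taken from the plan, cache entries are kept only while still needed, and a count of zero at the end is exactly the sanity check the error message performs.

```python
from collections import Counter

class RestorePlan:
    """Toy reference-counted cache plan (hypothetical sketch)."""

    def __init__(self, planned_reads):
        # How many times each block will be read during the restore.
        self.block_refs = Counter(planned_reads)
        self.cache = {}

    def read_block(self, block_id, fetch):
        data = self.cache.get(block_id)
        if data is None:
            data = fetch(block_id)  # e.g. download and unpack the volume
        self.block_refs[block_id] -= 1
        if self.block_refs[block_id] > 0:
            self.cache[block_id] = data     # still needed later: keep it
        else:
            self.cache.pop(block_id, None)  # last planned use: evict
            del self.block_refs[block_id]
        return data

    def sanity_check(self):
        # After a successful restore this must be zero; a non-zero
        # count is what the "... is not zero" error is reporting.
        return sum(self.block_refs.values())
```

If any stage crashes before all planned reads happen, `sanity_check()` comes back non-zero, which matches the 14726 reported in the error above.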

As to why the error occurs, it looks like it is due to the FileProcessor crashing when it tries to read a stream:

My guess is that the reports of incomplete caches are caused by one of the internal restore processes throwing an exception, which prematurely shuts down the operation and leaves the restoration incomplete.

As for a solution to the problem, I need to figure out what happened in the FileProcessor. I’ll get back to you if I need more information.


I found the error; it turns out that the file was opened in an incorrect mode when only part of the file needed restoring. The error has been fixed and a test capturing this behaviour has been added in PR 5886; the fix should be part of a following build. Thank you @covertboard for reporting the bug!
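For readers hitting the same stack trace: `NotSupportedException: Stream does not support reading` is what .NET throws when code reads from a stream that was opened write-only. A minimal Python analogue of this class of bug (illustrative only, with made-up names; not the actual fix in PR 5886):

```python
import io
import tempfile

def patch_file(path, readable):
    """A partial restore must read back the bytes it keeps, so the
    target file has to be opened both readable and writable.
    (Hypothetical illustration, not Duplicati's code.)"""
    mode = "r+b" if readable else "wb"  # "wb" is write-only (and truncates)
    with open(path, mode) as f:
        try:
            f.read(4)  # read the part of the file that stays as-is
            return True
        except io.UnsupportedOperation:
            return False  # same failure class as the .NET exception above

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"existing contents")

assert patch_file(tmp.name, readable=True)       # correct mode: read works
assert not patch_file(tmp.name, readable=False)  # wrong mode: read fails
```

The fix is simply to open the target with a read-and-write mode whenever existing bytes must be preserved.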

Thanks!

Will try it out again soon, since I have by now recovered from the shock of thinking that none of my backups worked anymore. :smiley: