Error on purge with 2.1.0.3_beta_2025-01-22

Thanks again for taking the time to debug this with me. I have spent some time with the bugreport database and have found an explanation: the timeouts.

It looks like the purge call set things up for failure, but the real failure happened in the backup operation. The backup attempted to fix some errors and, due to a bug, ended up removing all filelists/filesets.

I have now tracked down the problem, and it is caused by an unfortunate interplay between the timeout and some old abort handling code. More than 6 years ago there was a method to stop a running backup and a desire to make the stop graceful. The stop was signaled by some code throwing an OperationCanceledException, and the uploader exited gracefully once that happened.

The timeout handler will in some cases throw an OperationCanceledException instead of the correct TimeoutException. Additionally, a thrown OperationCanceledException is sometimes converted to a TaskCanceledException, marking the tasks as cancelled.

All these factors combine into a case where the upload can fail with an OperationCanceledException and this failure passes all checks, making Duplicati think it has completed correctly.
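To make the interplay concrete, here is a minimal sketch in Python. It is an analogy only, not Duplicati's code: the exception classes and the functions `upload_with_buggy_timeout`, `upload_with_fixed_timeout` and `run_uploader` are made up for illustration. It shows how a timeout that is raised as the "graceful stop" exception type gets swallowed by the old abort handling, so the run reports success.

```python
class OperationCanceled(Exception):
    """Stand-in for the cancellation signal used to request a graceful stop."""


class UploadTimeout(Exception):
    """Stand-in for a genuine timeout failure."""


def upload_with_buggy_timeout(name):
    # Hypothetical buggy handler: the timeout is raised as a cancellation.
    raise OperationCanceled(f"timed out while uploading {name}")


def upload_with_fixed_timeout(name):
    # Hypothetical fixed handler: the timeout is raised as a real failure.
    raise UploadTimeout(f"timed out while uploading {name}")


def run_uploader(upload, names):
    """Upload each name and return (completed_ok, uploaded_names)."""
    uploaded = []
    try:
        for name in names:
            upload(name)
            uploaded.append(name)
    except OperationCanceled:
        # Old abort handling: a cancellation is read as "the user asked
        # for a graceful stop", so the uploader exits without an error.
        pass
    return True, uploaded


ok, uploaded = run_uploader(upload_with_buggy_timeout, ["new.dlist"])
print(ok, uploaded)  # reports success although nothing was uploaded
```

With `upload_with_fixed_timeout`, the exception is not swallowed and the failure reaches the caller, which is the behavior the checks rely on.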

This should be caught in the post-backup verification, but since files were marked as “Uploading” the logic here considered the files to be incomplete and conveniently “fixes” it by removing them.

In total this makes it possible to have a backup that appears to succeed but actually has failed/missing uploads. Since the cleanup is quite robust, no validation checks are triggered after the cleanup, giving the appearance that all is working well, when it is in fact not.
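The way the cleanup hides the problem can be sketched the same way. Again this is a simplified Python illustration under assumptions, not the actual implementation: the database is modeled as a plain dict, the "Verified" state and the file names are placeholders, and `cleanup_incomplete` and `verify` are hypothetical helpers.

```python
def cleanup_incomplete(local_db):
    """Drop every record still marked 'Uploading', treating it as interrupted."""
    return {name: state for name, state in local_db.items() if state != "Uploading"}


def verify(local_db, remote_files):
    """Return the names the database expects but the remote does not have."""
    return sorted(name for name in local_db if name not in remote_files)


# The upload of new.dlist failed silently, so it is still marked 'Uploading'
# locally and does not exist on the remote.
local_db = {"old.dlist": "Verified", "new.dlist": "Uploading"}
remote_files = {"old.dlist"}

missing_before = verify(local_db, remote_files)
cleaned = cleanup_incomplete(local_db)
missing_after = verify(cleaned, remote_files)

print(missing_before)  # the failed upload would be detected here
print(missing_after)   # after the cleanup there is nothing left to detect
```

Because the cleanup removes exactly the records that would have failed verification, the check that runs afterwards has nothing to complain about.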

In your case, Duplicati found some failing uploads and corrected these, uploading a total of 4 new filelists. Due to the bug, these were silently failing, without Duplicati knowing it. But seeing that new filelists were “successfully” uploaded, it proceeded to delete the two older filesets, leaving you without any filelists/filesets.

We will be releasing a new stable release shortly that fixes this issue, as it is quite severe. In the meantime, you can use canary 2.1.0.108 or later, or simply set the timeout to a high value (or set it to 0 to disable it).

Note that setting a high timeout value only fixes the case related to the timeout. Other code may also throw OperationCanceledException, and this will result in similar lost files, although it is expected to be less frequent.
