Backups suddenly failing - 4 machines now

I’m running 2.0.4.4_canary_2018-11-14 on Fedora Server 29, and suddenly my backups are failing.

Initially the error was:

2018-11-22 14:03:01 +01 - [Error-Duplicati.Library.Main.Operation.TestHandler-FailedToProcessFile]: Failed to process file duplicati-20181122T130004Z.dlist.zip.aes
System.Exception: The file duplicati-20181122T130004Z.dlist.zip.aes was downloaded and had size 2097152 but the size was expected to be 2124781
   at Duplicati.Library.Main.BackendManager.GetForTesting (System.String remotename, System.Int64 size, System.String hash) [0x00065] in <365e0052627a43cbb8447250519b5a7f>:0
   at Duplicati.Library.Main.Operation.TestHandler.DoRun (System.Int64 samples, Duplicati.Library.Main.Database.LocalTestDatabase db, Duplicati.Library.Main.BackendManager backend) [0x0042f] in <365e0052627a43cbb8447250519b5a7f>:0

So I tried verify, repair, and recreate, and now backups are reporting this:

2018-11-22 18:13:30 +01 - [Error-Duplicati.Library.Main.Operation.BackupHandler-FatalError]: Fatal error
System.Exception: Unexpected number of remote volumes marked as deleted. Found 0 filesets, but 1 volumes

Any advice on how to fix this?

Now another machine has failed, this time a Windows server. I had this the other day on the same server and a reboot helped, but I can't keep rebooting each time this happens.

Log data:
2018-11-23 05:06:00 +01 - [Warning-Duplicati.Library.Main.Operation.Backup.MetadataPreProcess.FileEntry-ProcessingMetadataFailed]: Failed to process entry, path: D:\Program Files\MDaemon\App\CfNonScanWrn.dat
CoCoL.RetiredException: The channel is retired
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at CoCoL.Channel`1.<WriteAsync>d__30.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Duplicati.Library.Main.Operation.Backup.MetadataPreProcess.<>c__DisplayClass2_0.<<Run>b__0>d.MoveNext()
2018-11-23 05:06:00 +01 - [Warning-Duplicati.Library.Main.Operation.Backup.MetadataPreProcess.FileEntry-ProcessingMetadataFailed]: Failed to process entry, path: D:\Program Files\MDaemon\App\CFRules.dat
CoCoL.RetiredException: The channel is retired
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at CoCoL.Channel`1.<WriteAsync>d__30.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Duplicati.Library.Main.Operation.Backup.MetadataPreProcess.<>c__DisplayClass2_0.<<Run>b__0>d.MoveNext()

And now I'm at the point where I think I have a serious problem; another Windows server backup has failed:

Log data:
2018-11-23 08:03:44 +01 - [Error-Duplicati.Library.Main.Operation.TestHandler-FailedToProcessFile]: Failed to process file duplicati-20181122T070000Z.dlist.zip.aes
System.Exception: The file duplicati-20181122T070000Z.dlist.zip.aes was downloaded and had size 3145728 but the size was expected to be 3173821
   at Duplicati.Library.Main.BackendManager.GetForTesting(String remotename, Int64 size, String hash)
   at Duplicati.Library.Main.Operation.TestHandler.DoRun(Int64 samples, LocalTestDatabase db, BackendManager backend)

So I rebooted the server, which fixed things the last time, but this time I got a warning:

Log data:
2018-11-23 08:51:14 +01 - [Warning-Duplicati.Library.Main.Operation.Backup.UploadSyntheticFilelist-MissingTemporaryFilelist]: Expected there to be a temporary fileset for synthetic filelist (34, duplicati-b03bbc2eb91844294a763aafca75b4c4a.dblock.zip.aes), but none was found?

A 4th machine is now failing backups:

Log data:
2018-11-23 18:18:58 +01 - [Error-Duplicati.Library.Main.Operation.BackupHandler-FatalError]: Fatal error
CoCoL.RetiredException: The channel is retired
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at CoCoL.Channel`1.<WriteAsync>d__30.MoveNext()

I'm not sure about the "channel is retired" ones, but the size errors show a remarkably binary-even truncation (2097152 is exactly 2 MiB, and 3145728 exactly 3 MiB), which could be due to filesystem or network problems. What storage type is in use, and do all the machines use it? Can you examine the destination to see whether the file is truncated there? You can also try opening the file with SharpAESCrypt.exe from the Duplicati installation, or by installing the more user-friendly software from AES Crypt.

actual              actual hex      expected        difference      filename
2097152             200000          2124781         27629           duplicati-20181122T130004Z.dlist.zip.aes
3145728             300000          3173821         28093           duplicati-20181122T070000Z.dlist.zip.aes

The TestHandler messages that are getting the size error are probably from the "Verifying backend files" step that runs after a backup. This samples even old files; however, the ones having trouble look like recent backups. Duplicati does require reliable transfer and storage of backup files, and something seems to be going wrong there.


All the failing backups are using Wasabi, and each has its own bucket.

I’ll try the checks later today and report back.

I downloaded and tried to decrypt the file, the message it came back with was:

Error: Message has been altered, do not trust content

I tried to open the ZIP, and WinRAR says it's corrupt.

What length are you getting for that file right after download? Is it either the actual or the expected size that Duplicati reported?

What's somewhat odd is that Duplicati does an early file listing as a sanity check, and it checks the listed lengths. Here's an example error message from a test backup job log, where I added an extra byte to the end of a file for this test:

The file duplicati-20181123T193503Z.dlist.zip.aes was downloaded and had size 926 but the size was expected to be 925

If you can obtain a file listing with lengths directly from Wasabi, you can also see whether they match the download size. Unfortunately, some tools give sizes in larger units, whereas what I'd prefer is the exact byte size of the file. Here's an example of the larger-units problem, using a tool that you might also find useful for downloading files via Duplicati's own backend code:

C:\Program Files\Duplicati 2>Duplicati.CommandLine.BackendTool.exe help
Usage: <command> <protocol>://<username>:<password>@<path> [filename]
Example: LIST ftp://user:pass@server/folder

Supported backends: aftp,amzcd,azure,b2,box,cloudfiles,dropbox,file,ftp,googledrive,gcs,hubic,jottacloud,mega,msgroup,onedrivev2,sharepoint,openstack,rclone,s3,od4b,mssp,sia,ssh,tahoe,webdav
Supported commands: GET PUT LIST DELETE CREATEFOLDER

C:\Program Files\Duplicati 2>Duplicati.CommandLine.BackendTool.exe list "file://C:/Duplicati Backups/local test 7 encrypted"
Name    Dir/File        LastChange      Size
duplicati-20181123T193503Z.dlist.zip.aes        File    11/24/2018 10:19:31 AM  926 bytes
duplicati-20181123T193602Z.dlist.zip.aes        File    11/23/2018 2:36:03 PM   925 bytes
duplicati-20181124T005226Z.dlist.zip.aes        File    11/23/2018 7:52:26 PM   925 bytes
duplicati-ba10b1775b7194e5da729ee932459c479.dblock.zip.aes      File    11/23/2018 2:35:03 PM   1.40 KB
duplicati-bea036779eba84c48921bd4a3d65f122e.dblock.zip.aes      File    11/23/2018 2:36:02 PM   1.40 KB
duplicati-i63dd01a09b7e4203acf936d973f430ce.dindex.zip.aes      File    11/23/2018 2:36:02 PM   1005 bytes
duplicati-i7ab65cb97dce48a59845f69daf92216a.dindex.zip.aes      File    11/23/2018 2:35:04 PM   1005 bytes

C:\Program Files\Duplicati 2>

How are you even getting a ZIP to open? When I get the "altered" error, SharpAESCrypt doesn't produce a ZIP. WinRAR's notion of an encrypted ZIP is a different file format anyway, so I wouldn't expect it to open Duplicati's.
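In case it helps, the way I'd check a downloaded file is roughly this (just a sketch - I'm assuming the usual SharpAESCrypt argument order of e|d, passphrase, input file, output file, and the file name is simply the one from your log):

C:\Program Files\Duplicati 2>SharpAESCrypt.exe d MY_PASSPHRASE duplicati-20181122T130004Z.dlist.zip.aes test.zip

If that reports the "altered" message, the .aes file itself is damaged before any ZIP tool even gets involved.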

EDIT: A file-size issue seen in the file list before a backup apparently isn't fatal, but it adds warnings to a notification log or to a --log-file log, if either is used; otherwise I'm not sure whether the pre-backup warning is saved anywhere. From my log:

2018-11-24 10:19:44 -05 - [Profiling-Timer.Begin-Duplicati.Library.Main.Operation.BackupHandler-PreBackupVerify]: Starting - PreBackupVerify
2018-11-24 10:19:44 -05 - [Profiling-Timer.Begin-Duplicati.Library.Main.BackendManager-RemoteOperationList]: Starting - RemoteOperationList
2018-11-24 10:19:44 -05 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Started:  ()
2018-11-24 10:19:44 -05 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Completed:  (7 bytes)
2018-11-24 10:19:44 -05 - [Profiling-Timer.Finished-Duplicati.Library.Main.BackendManager-RemoteOperationList]: RemoteOperationList took 0:00:00:00.000
2018-11-24 10:19:44 -05 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteReader]: Starting - ExecuteReader: SELECT DISTINCT "Name", "State" FROM "Remotevolume" WHERE "Name" IN (SELECT "Name" FROM "Remotevolume" WHERE "State" IN ("Deleted", "Deleting")) AND NOT "State" IN ("Deleted", "Deleting")
2018-11-24 10:19:44 -05 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteReader]: ExecuteReader: SELECT DISTINCT "Name", "State" FROM "Remotevolume" WHERE "Name" IN (SELECT "Name" FROM "Remotevolume" WHERE "State" IN ("Deleted", "Deleting")) AND NOT "State" IN ("Deleted", "Deleting") took 0:00:00:00.000
2018-11-24 10:19:44 -05 - [Warning-Duplicati.Library.Main.Operation.FilelistProcessor-MissingRemoteHash]: remote file duplicati-20181123T193503Z.dlist.zip.aes is listed as Verified with size 926 but should be 925, please verify the sha256 hash "mhNWhG6TOLk+4q4rKlOicaM83ilRjkwGL4SGq3Sfb1s="
2018-11-24 10:19:44 -05 - [Profiling-Timer.Finished-Duplicati.Library.Main.Operation.BackupHandler-PreBackupVerify]: PreBackupVerify took 0:00:00:00.001

From the end of my notification email:

Log data:
2018-11-24 10:19:44 -05 - [Warning-Duplicati.Library.Main.Operation.FilelistProcessor-MissingRemoteHash]: remote file duplicati-20181123T193503Z.dlist.zip.aes is listed as Verified with size 926 but should be 925, please verify the sha256 hash "mhNWhG6TOLk+4q4rKlOicaM83ilRjkwGL4SGq3Sfb1s="
2018-11-24 10:19:45 -05 - [Warning-Duplicati.Library.Main.Operation.FilelistProcessor-MissingRemoteHash]: remote file duplicati-20181123T193503Z.dlist.zip.aes is listed as Verified with size 926 but should be 925, please verify the sha256 hash "mhNWhG6TOLk+4q4rKlOicaM83ilRjkwGL4SGq3Sfb1s="
2018-11-24 10:20:25 -05 - [Error-Duplicati.Library.Main.Operation.TestHandler-FailedToProcessFile]: Failed to process file duplicati-20181123T193503Z.dlist.zip.aes
System.Exception: The file duplicati-20181123T193503Z.dlist.zip.aes was downloaded and had size 926 but the size was expected to be 925
   at Duplicati.Library.Main.BackendManager.GetForTesting(String remotename, Int64 size, String hash)
   at Duplicati.Library.Main.Operation.TestHandler.DoRun(Int64 samples, LocalTestDatabase db, BackendManager backend)

I can't figure out the syntax for Duplicati.CommandLine.BackendTool.exe with my Wasabi bucket. What should I be passing as the parameters?

I got the ZIP by using the -do option, so it kept the decrypted file rather than deleting it, which is also the default behaviour of the AES Crypt tool I installed.

But does the size really matter? The file is corrupt anyway, so Duplicati is correct that there is a problem, and the size info it has is probably right. Can I somehow clean up the backup this file belongs to, so I can get things back to normal?

When the help text asks for <protocol>://<username>:<password>@<path>, I think it's generally looking for whatever the storage URL is for that storage type. You can see some of the URL in the Target URL box of the job's Commandline option, but to also get the authentication credentials you can use Export → As Command-line.
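As a rough sketch (the bucket, folder, and keys below are placeholders, and the exact endpoint and query options should come from your own export, since I haven't tested against Wasabi myself), an S3-compatible URL usually ends up looking something like this:

C:\Program Files\Duplicati 2>Duplicati.CommandLine.BackendTool.exe list "s3://my-bucket/my-folder?s3-server-name=s3.wasabisys.com&auth-username=MY_ACCESS_KEY&auth-password=MY_SECRET_KEY"

The LIST output should then let you compare sizes against the 2097152 and 3145728 that the downloads actually returned (bearing in mind the larger-units rounding I mentioned earlier).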

Do we know the file is corrupt? The WinRAR test doesn't count (I think) unless you can get it to decrypt any Duplicati .zip.aes file, and I doubt it can. Duplicati is not using ZIP-format encryption, but ZIP followed by encryption. Additionally, I'd prefer to figure out where the problem is happening, in the hope of keeping it from continuing. Right now all I know (unless you have some emails or logs like I showed) is that Duplicati couldn't download the file as it expected, and most likely it was truncated. Did that occur during upload, during storage, during download, or some other way? Getting back to normal requires having confidence in the network and the storage.

Duplicati.CommandLine.BackendTester.exe is one option for checking whether things are good now. Maybe it was simply something temporarily wrong, but I don't want to make things worse if the storage is still having problems.
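If you try it, point it at an empty scratch folder rather than the real backup destination, because it uploads, downloads, and deletes its own test files. Sketch only, with the same placeholder URL pieces as above:

C:\Program Files\Duplicati 2>Duplicati.CommandLine.BackendTester.exe "s3://my-bucket/backendtester-scratch?s3-server-name=s3.wasabisys.com&auth-username=MY_ACCESS_KEY&auth-password=MY_SECRET_KEY"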

There’s also no guarantee whatever went wrong didn’t hurt some other file. You could survey for damage by using The TEST command after reshaping your Export → As Command-line a little, or do that in the web UI. Using a larger number of samples takes longer, but the default of 1 sample seems low given known damage.
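The reshaping is basically changing the exported backup command into a test command and swapping the source path for a sample count, so it ends up looking roughly like this (placeholders again; keep whatever --dbpath, --passphrase, and other options your export already shows):

C:\Program Files\Duplicati 2>Duplicati.CommandLine.exe test "s3://my-bucket/my-folder?s3-server-name=s3.wasabisys.com&auth-username=MY_ACCESS_KEY&auth-password=MY_SECRET_KEY" 20 --dbpath="C:\path\to\job-database.sqlite" --passphrase=MY_PASSPHRASE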

Meanwhile I'll see what I can think of for the best recovery, assuming your Wasabi is OK now and the damage is small. You've already used most of the primary tools in your initial report. There are a few other tools, but not many.

You've got multiple problems, and some are pretty obscure (e.g. "channel is retired", which I hope is just fallout from something else going wrong first). The synthetic filelist is something an interrupted backup normally makes to record what it had gotten thus far, and that warning (note it wasn't an error) might not be much to worry about. "Unexpected number of remote volumes marked as deleted" has been seen a few times, but isn't understood.

Is this the most important system to get back up? What about the others? Are any of the Wasabi backups still OK? A quick Google search didn't spot any widespread problems there, so how did your systems all fail at once?

Another consideration is whether “back to normal” means backing up current data, or keeping access to old. Sometimes a new backup may be started, and the old moved to another Duplicati to see what can be saved. Sometimes people don’t care about old, or repair efforts make things worse, so it’s easier to just begin fresh. Until I hear more about your needs, I’m trying to proceed cautiously, but that definitely can be more difficult…

If anybody else following this has thoughts about a good way to resolve this, feel free to join in on the efforts.

I'm also using Wasabi and started experiencing the same issue you are seeing within the past few days. So it seems like something is going on with Wasabi, but I'm not sure yet. I'll let you know if/when I figure it out.

That ("The channel is retired") is unfortunately a follow-on error that happens when something else breaks first.

I am tagging @wasabi-jim in case he has some input.

Thanks everyone, a lot has been discussed and I’m not sure I can answer all of it.

I can say that they did not all fail at the same time per se, just on the same day, as they are all scheduled at staggered times overnight.

All the affected machines except one have a secondary Duplicati backup to a local server, so Wasabi is just for long-term storage and real disaster recovery. If you think it might be better to start new backups for them, then so be it; it wouldn't be the first time, e.g. when moving from CrashPlan I made the same decision to simply lose the old backups.

Also, I'm still struggling with the syntax for the various command-line tools, and with what to put, for a Wasabi bucket, in place of <protocol>://<username>:<password>@<path>.

Hi folks - thanks for the details above, and sorry you are seeing some problems. We will try to investigate based on the info here, but if the affected Duplicati users wouldn't mind sending a ticket request to support at wasabi.com, we can get back to you with a personalized reply.

Thanks,
Jim (part of the Wasabi team)

So my prior note didn't help? I tested it on Google Drive and Backblaze B2. I tried passing options afterwards without luck, so I think it all has to be on the URL, and Export → As Command-line seemed to be a way to get it. I haven't actually looked at the source code, but I had assumed it takes the URL apart in the standard way.

Ah OK, I've got that now. I tried the BackendTester and it wanted to delete everything, so I then ran just the TEST command - it asked me for my passphrase and nothing seems to be happening. I can enter more characters and returns, and it just sits there. This is on my Fedora box, btw.

I ended up deleting the three dlist files that were reporting a file-size mismatch, then deleted and recreated the database. I was able to run a successful backup after that. I think Wasabi was having some kind of issue, because I was having problems deleting objects. Seems to be okay for now, at least for me.
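In case it helps anyone else, what I ran was roughly along these lines (bucket, keys, paths, and the file name are placeholders; double-check the names before deleting anything, and as I understand it the repair step rebuilds the local database from the remote files once the old database is gone):

Duplicati.CommandLine.BackendTool.exe delete "s3://my-bucket/my-folder?s3-server-name=s3.wasabisys.com&auth-username=MY_ACCESS_KEY&auth-password=MY_SECRET_KEY" duplicati-<timestamp-of-bad-file>.dlist.zip.aes
Duplicati.CommandLine.exe repair "s3://my-bucket/my-folder?s3-server-name=s3.wasabisys.com&auth-username=MY_ACCESS_KEY&auth-password=MY_SECRET_KEY" --dbpath="/path/to/job-database.sqlite" --passphrase=MY_PASSPHRASE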

I don't know what to say about that. I just tested a b2:// URL from Windows and Linux, and mine worked; however, it said "*** Remote folder is not empty, aborting" when I tried to reuse a bucket and folder. It could not create a new bucket, but it could create a new folder (I just changed the name from an old URL). I got a few upload failures in both cases (regular Duplicati would probably just have retried those, up to 5 times).

That’s what I would have suggested (maybe after retrying a plain recreate again). You need that database.

Thinking about an earlier statement I made, I think that test with SharpAESCrypt -do plus WinRAR was an appropriate proof of a corrupt file, because you decrypted (as much as could be done) before trying to unzip. You might want to at least run one --log-file log to see if you get any warnings about bad lengths, like I did. Running for a while with a higher --backup-test-samples than the rather low default of 1 might also be good.
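On the command line that would look something like adding these options to the exported backup command (values here are just examples, plus whatever other options your export already includes; in the web UI the equivalent is adding the same options on the job's Options screen):

C:\Program Files\Duplicati 2>Duplicati.CommandLine.exe backup "s3://my-bucket/my-folder?s3-server-name=s3.wasabisys.com&auth-username=MY_ACCESS_KEY&auth-password=MY_SECRET_KEY" "D:\Data" --backup-test-samples=20 --log-file="C:\logs\duplicati.log"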

One thing I forgot to ask is whether anybody was on a recent Duplicati version. There was an update of the AWSSDK DLL in 2.0.3.24 (I think), and I’m hoping it still works fine with all of the S3-compatible providers…

So do you think I should simply delete the corrupt file as well, or can it be recovered somehow and re-uploaded?