Backups suddenly failing - 4 machines now


#4

So rebooted the server which got fixed the last time I rebooted but this time I got a warning:

Log data:
2018-11-23 08:51:14 +01 - [Warning-Duplicati.Library.Main.Operation.Backup.UploadSyntheticFilelist-MissingTemporaryFilelist]: Expected there to be a temporary fileset for synthetic filelist (34, duplicati-b03bbc2eb91844294a763aafca75b4c4a.dblock.zip.aes), but none was found?

#5

4th machine now failing backups:

Log data:
2018-11-23 18:18:58 +01 - [Error-Duplicati.Library.Main.Operation.BackupHandler-FatalError]: Fatal error
CoCoL.RetiredException: The channel is retired
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at CoCoL.Channel`1.<WriteAsync>d__30.MoveNext()

#6

I’m not sure about the “channel is retired” ones, but the size errors show a remarkably binary-even truncation which could possibly be due to filesystem or network problems. What storage type is in use, and do all use it? Can you examine the destination to see whether the file is truncated there? You can also try opening it using SharpAESCrypt.exe from the Duplicati installation or by installing more user-friendly software from AES Crypt.

actual              actual hex      expected        difference      filename
2097152             200000          2124781         27629           duplicati-20181122T130004Z.dlist.zip.aes
3145728             300000          3173821         28093           duplicati-20181122T070000Z.dlist.zip.aes

The TestHandler messages which are getting the size error are probably from Verifying backend files after a backup runs. This samples even old files, however the ones having trouble look like they’re recent backups. Duplicati does require reliable transfer and storage of backup files, but something seems to be going wrong.


#7

All the failing backups are using Wasabi and each has it’s own bucket.

I’ll try the checks later today and report back.


#8

I downloaded and tried to decrypt the file, the message it came back with was:

Error: Message has been altered, do not trust content

I tried to open the ZIP and WinRAR says it’s corrupt


#9

What length are you getting for that file right after download? Is it either the actual or expected Duplicati got?

What’s somewhat odd is that Duplicati does an early file listing as a sanity check, and it checks listed lengths. Example error message from a test backup job log when I added an extra byte to the end of a file for this test:

The file duplicati-20181123T193503Z.dlist.zip.aes was downloaded and had size 926 but the size was expected to be 925

If you could obtain a file listing with length directly from Wasabi, you can also see if it matches download size. Unfortunately sometimes tools give sizes in larger units, whereas what I’d prefer would be an exact size of file. Here’s an example of larger units, using a tool that you might also find useful to download with Duplicati code:

C:\Program Files\Duplicati 2>Duplicati.CommandLine.BackendTool.exe help
Usage: <command> <protocol>://<username>:<password>@<path> [filename]
Example: LIST ftp://user:pass@server/folder

Supported backends: aftp,amzcd,azure,b2,box,cloudfiles,dropbox,file,ftp,googledrive,gcs,hubic,jottacloud,mega,msgroup,onedrivev2,sharepoint,openstack,rclone,s3,od4b,mssp,sia,ssh,tahoe,webdav
Supported commands: GET PUT LIST DELETE CREATEFOLDER

C:\Program Files\Duplicati 2>Duplicati.CommandLine.BackendTool.exe list "file://C:/Duplicati Backups/local test 7 encrypted"
Name    Dir/File        LastChange      Size
duplicati-20181123T193503Z.dlist.zip.aes        File    11/24/2018 10:19:31 AM  926 bytes
duplicati-20181123T193602Z.dlist.zip.aes        File    11/23/2018 2:36:03 PM   925 bytes
duplicati-20181124T005226Z.dlist.zip.aes        File    11/23/2018 7:52:26 PM   925 bytes
duplicati-ba10b1775b7194e5da729ee932459c479.dblock.zip.aes      File    11/23/2018 2:35:03 PM   1.40 KB
duplicati-bea036779eba84c48921bd4a3d65f122e.dblock.zip.aes      File    11/23/2018 2:36:02 PM   1.40 KB
duplicati-i63dd01a09b7e4203acf936d973f430ce.dindex.zip.aes      File    11/23/2018 2:36:02 PM   1005 bytes
duplicati-i7ab65cb97dce48a59845f69daf92216a.dindex.zip.aes      File    11/23/2018 2:35:04 PM   1005 bytes

C:\Program Files\Duplicati 2>

How are you even getting a ZIP to open? When I get the “altered” error, SharpAESCrypt doesn’t make a ZIP. WinRAR’s plan for an encrypted ZIP file format is different anyway, so I wouldn’t expect it to open Duplicati’s.

EDIT: A file size issue on file list before a backup apparently isn’t fatal, but adds warnings to a notification log or a –log-file log, if either is used, otherwise I’m not sure if the warning before backup is saved. From my log:

2018-11-24 10:19:44 -05 - [Profiling-Timer.Begin-Duplicati.Library.Main.Operation.BackupHandler-PreBackupVerify]: Starting - PreBackupVerify
2018-11-24 10:19:44 -05 - [Profiling-Timer.Begin-Duplicati.Library.Main.BackendManager-RemoteOperationList]: Starting - RemoteOperationList
2018-11-24 10:19:44 -05 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Started:  ()
2018-11-24 10:19:44 -05 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Completed:  (7 bytes)
2018-11-24 10:19:44 -05 - [Profiling-Timer.Finished-Duplicati.Library.Main.BackendManager-RemoteOperationList]: RemoteOperationList took 0:00:00:00.000
2018-11-24 10:19:44 -05 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteReader]: Starting - ExecuteReader: SELECT DISTINCT "Name", "State" FROM "Remotevolume" WHERE "Name" IN (SELECT "Name" FROM "Remotevolume" WHERE "State" IN ("Deleted", "Deleting")) AND NOT "State" IN ("Deleted", "Deleting")
2018-11-24 10:19:44 -05 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteReader]: ExecuteReader: SELECT DISTINCT "Name", "State" FROM "Remotevolume" WHERE "Name" IN (SELECT "Name" FROM "Remotevolume" WHERE "State" IN ("Deleted", "Deleting")) AND NOT "State" IN ("Deleted", "Deleting") took 0:00:00:00.000
2018-11-24 10:19:44 -05 - [Warning-Duplicati.Library.Main.Operation.FilelistProcessor-MissingRemoteHash]: remote file duplicati-20181123T193503Z.dlist.zip.aes is listed as Verified with size 926 but should be 925, please verify the sha256 hash "mhNWhG6TOLk+4q4rKlOicaM83ilRjkwGL4SGq3Sfb1s="
2018-11-24 10:19:44 -05 - [Profiling-Timer.Finished-Duplicati.Library.Main.Operation.BackupHandler-PreBackupVerify]: PreBackupVerify took 0:00:00:00.001

From end of my notification email:

Log data:
2018-11-24 10:19:44 -05 - [Warning-Duplicati.Library.Main.Operation.FilelistProcessor-MissingRemoteHash]: remote file duplicati-20181123T193503Z.dlist.zip.aes is listed as Verified with size 926 but should be 925, please verify the sha256 hash "mhNWhG6TOLk+4q4rKlOicaM83ilRjkwGL4SGq3Sfb1s="
2018-11-24 10:19:45 -05 - [Warning-Duplicati.Library.Main.Operation.FilelistProcessor-MissingRemoteHash]: remote file duplicati-20181123T193503Z.dlist.zip.aes is listed as Verified with size 926 but should be 925, please verify the sha256 hash "mhNWhG6TOLk+4q4rKlOicaM83ilRjkwGL4SGq3Sfb1s="
2018-11-24 10:20:25 -05 - [Error-Duplicati.Library.Main.Operation.TestHandler-FailedToProcessFile]: Failed to process file duplicati-20181123T193503Z.dlist.zip.aes
System.Exception: The file duplicati-20181123T193503Z.dlist.zip.aes was downloaded and had size 926 but the size was expected to be 925
   at Duplicati.Library.Main.BackendManager.GetForTesting(String remotename, Int64 size, String hash)
   at Duplicati.Library.Main.Operation.TestHandler.DoRun(Int64 samples, LocalTestDatabase db, BackendManager backend)

#10

I can’t figure out the syntax for Duplicati.CommandLine.BackendTool.exe and my Wasabi bucket, what should I be passing as the parameters?

I got the zip by using the -do command so it kept the decrypted file rather than delete it which is also the default for the AEScrypt tool I also installed.

But does it matter about the size because the file is corrupt any way so Duplicati is correct in that there is a problem so the file size info it has is probably right. Can I somehow clean up the backup this file is from so I can get things back to normal?


#11

When the help text requests <protocol>://<username>:<password>@<path> I think it’s generally looking for whatever the storage URL is for that storage type. Although you can see some of the URL in Target URL of the job’s Commandline option, to also get authentication credentials you can Export --> As Command-line.

Do we know the file is corrupt? The WinRAR test doesn’t count (I think) unless you can get it to decrypt any Duplicati .zip.aes file, but I doubt it can. Duplicati is not using ZIP-format encryption, but ZIP then encryption. Additionally, I prefer to figure out where the problem is happening, in the hope of keeping it from continuing. Right now all I know (unless you have some emails or logs like I showed) is that Duplicati couldn’t download the file as it expected, and most likely it was truncated. Did that occur during upload, during storage, during download, or some other way? Getting back to normal requires having confidence in networks and storage.

Duplicati.CommandLine.BackendTester.exe is one option to see if things are good now. Maybe it was simply something temporarily wrong, but I don’t want to make things worse if the storage is still having any problem.

There’s also no guarantee whatever went wrong didn’t hurt some other file. You could survey for damage by using The TEST command after reshaping your Export --> As Command-line a little, or do that in the web UI. Using a larger number of samples takes longer, but the default of 1 sample seems low given known damage.

Meanwhile I’ll see what I can think of for best recovery assuming your Wasabi is OK now and damage is little. You’ve already used most of the primary tools on the initial report. There are a few other tools, but not many.

You’ve got multiple problems, and some are pretty obscure (e.g. “channel is retired” which I hope is just from something else going wrong first). The synthetic filelist is something an interrupted backup normally makes to show what it had gotten thus far, and that warning (note it wasn’t an error) might not be much to worry about. “Unexpected number of remote volumes marked as deleted” has been seen a few times, but isn’t understood.

Is this the most important system to get back up? What about the others? Are there any on Wasabi still OK? While a quick Google search didn’t spot any widespread problems there, how did your systems fail at once?

Another consideration is whether “back to normal” means backing up current data, or keeping access to old. Sometimes a new backup may be started, and the old moved to another Duplicati to see what can be saved. Sometimes people don’t care about old, or repair efforts make things worse, so it’s easier to just begin fresh. Until I hear more about your needs, I’m trying to proceed cautiously, but that definitely can be more difficult…

If anybody else following this has thoughts about a good way to resolve this, feel free to join in on the efforts.


#12

I’m also using Wasabi and started experiencing the same issue you are seeing within the past few days. So it seems like something is going on with Wasabi but not sure yet. I’ll let you know if/when I figure it out.


#13

That is unfortunately a “follow error” that happens when something else breaks.

I am tagging @wasabi-jim in case he has some input.


Channel retired for OneDrive backup on Debian
Repair and Rebuild database fails: Unexpected difference in fileset
Issue with backup completing 2.0.4.5_beta
#14

Thanks everyone, a lot has been discussed and I’m not sure I can answer all of it.

I can say that they did not all fail at the same time per-se, just the same day as they are all scheduled at staggered times over night.

All the machines affected except one have a secondary Duplicati backup to a local server, so Wasabi is just for long-term and real disaster recovery. If you think it might be better to start new ones for them then so be it, it’s not like it’s happened before i.e. moving from CrashPlan I made the same decision to simply lose the backups.

Also, I’m still struggling with the syntax for the various command-line tools and what to put, for a Wasabi bucket, in place of this <protocol>://<username>:<password>@<path>


#15

Hi Folks - thanks for the details about and sorry you are seeing some problems. We will try to investigate based on the above info but if the affected Duplicati users would not mind sending in ticket request to support at wasabi.com, we can get back to you with a personalized reply.

Thanks,
Jim (part of the Wasabi team)


#16

So my prior note didn’t help? I tested it on Google Drive and Backblaze B2. I tried options afterwards without luck, so I think it all has to be on the URL, and the Export --> As Command-line seemed to be a way to get it. Haven’t actually looked at source code though, but I had assumed it took the URL apart in the standard way.


#17

Ah ok, so I got that now. I tried the BackendTester and it wanted to delete everything, and I then ran just the Test - it asked me for my passphrase and nothing seems to be happening. I can enter more characters and returns, and it just sits there. This is on my Fedora box btw.


#18

I ended up deleting the three dlist files that were reporting a file size mismatch. Then I deleted and recreated the database. I was able to run a successful backup after that. I think Wasabi was having some kind of issue because I was having problems deleting objects. Seems to be okay for now at least for me.


#19

I don’t know what to say about that. I just tested to a b2:// URL from Windows and Linux, and mine worked, however it said *** Remote folder is not empty, aborting when I tried to reuse a bucket and folder, and could not create a new bucket but could create a new folder (I just changed the name from an old URL). Got a few upload failures both cases (regular Duplicati would probably just have retried those up to 5 times).

That’s what I would have suggested (maybe after retrying a plain recreate again). You need that database.

Thinking about an earlier statement I made, I think that test of SharpAESCrypt with do and WinRAR was an appropriate proof of a corrupt file, because you decrypted (as much as could be done) before trying unzip. You might want to at least run one --log-file log to see if you get any Warnings about bad lengths, like I did. Running with a higher –backup-test-samples than the rather low default of 1 for awhile might also be good.

One thing I forgot to ask is whether anybody was on a recent Duplicati version. There was an update of the AWSSDK DLL in 2.0.3.24 (I think), and I’m hoping it still works fine with all of the S3-compatible providers…


#20

So do you think I should simply delete the corrupt file as well or can it be recovered somehow and re-uploaded?


#21

Sorry, I mixed up my users. I had seen you had already done this series ultimately winding up with a recreate:

which means it’s probably not too extremely slow (like it is for some people). Did you save the old database? There’s enough info in it that (I think) repair can rebuild a missing remote dlist, but if database is gone and remote dlist is gone, you probably can’t get the dlist back unless somehow it’s now working right on backend. How the backup process works gets into details, but the dlist is basically a file listing plus references to data, which can be of varying age (old or new). If everything comes back together by deleting 20181122T130004Z then you lose the view of what all your files were like then. Did you survey enough to feel other files are OK?

Before deleting the file on the backend, please make sure it still looks bad. Did you ever ask Wasabi support for a personalized reply, as @wasabi-jim suggested? Maybe they can think of a way to get the file back, or possibly would want to study other files for similar problems. I’m not sure how much direct visibility they have even to file lengths to look for other file sizes suspiciously even in length in hex. Or you could offer them info.
–upload-verification-file holds file names, lengths and SHA-256 hashes for your files, but I’m not sure it runs when the backup is currently broken. If it won’t, it might still be a useful thing to have for any future problems.

If it turns out you have no old database, the problem still exists, Wasabi can’t help or explain, and you’ve got some confidence the damage is limited, then save off a just-in-case copy of the dlist, delete it, then recreate,
unless someone reading this has a better idea. I don’t know if there’s a way to preserve all of the backend in case this takes a few tries (similar idea to saving copies of the database, but takes space and provider tools) but you can, if you like, avoid actually changing things (until you see the plan) by adding a –dry-run to repair which is sometimes a little surprising in its plan, but with no local database it should just try to recreate that…

For anyone who hasn’t noticed, there’s another weird issue on Wasabi at Backup won’t complete with 2.0.4.4 resolved (perhaps dangerously) by seemingly using unencrypted HTTP. The actual cause of that is still more unclear, especially until additional reports come in pointing to some common factor to get suspicious about…


#22

This is what Wasabi told me:

Thanks for the note and reaching out to us.   We are sorry for the problem.     We are still investigating the root cause on our end but can you tell us whether or not it would be possible for you to re-upload the objects in question?   

From our observations in our logs, the 2 objects in question are:

[duplicati-20181122T130004Z.dlist.zip](http://duplicati-20181122t130004z.dlist.zip/).aes in bucket 59489 dp-maggie

and

[duplicati-20181122T070000Z.dlist.zip](http://duplicati-20181122t070000z.dlist.zip/).aes in bucket 54866 dp-marge

Thanks,
Jim

And when I asked where I can get those files from?

 you are correct in terms Duplicati generating those files and storing them in Wasabi. Let me reach out to the Duplicati folks and see what they recommend. Sorry for the inconvenience

Jim

To me that points at an issue at their end as I wasn’t the only one it seems.

I do have some local backups from each night though I’d need to look if they include the profile folder holding the database at the time - I use a service so it’s stored in the Windows folder somewhere. If I got the database files back, how could we rebuild the file from it?


#23

Let me see if I can get @JonMikelV in on this due to past experience (and tests) with rebuilding dlist (and I think dindex) files on the backend, however going this route would make me even more nervous about what might happen (see my point on copying off the destination and using --dry-run). There have been cases of restoring an old copy of the database from an image backup, doing repair, and having the result be making the destination look like the database (basically deleting new versions), whereas one would like the reverse.

Feel free to start seeing if you have a suitably recent local backup of the database for remote. Maybe it can also help in some other way. Meanwhile, maybe I can find some of these other forum posts that I’m recalling.