Hash mismatch on non-existent file


#1

I read through all the hash mismatch threads but can’t find anything that applies to my situation. I’m getting this error at the end of every run:

2018-11-20 15:35:09Z - Warning: Operation Get with file duplicati-b7ebba9d6fa424f959a49934a04d0f206.dblock.zip attempt 5 of 5 failed with message: Hash mismatch on file "C:\WINDOWS\TEMP\dup-50c2b314-295f-4f67-8935-05708d538f0a", recorded hash: FnrKZDR/CpPOYwrqwl+niHSKTp/DUsAx77NJqQ+z1bY=, actual hash Tq6vpP+Kl5J4xrzosCN6Bje8aNXNqblANo9PYNhUv7U=
Duplicati.Library.Main.BackendManager+HashMismatchException: Hash mismatch on file "C:\WINDOWS\TEMP\dup-50c2b314-295f-4f67-8935-05708d538f0a", recorded hash: FnrKZDR/CpPOYwrqwl+niHSKTp/DUsAx77NJqQ+z1bY=, actual hash Tq6vpP+Kl5J4xrzosCN6Bje8aNXNqblANo9PYNhUv7U=
   at Duplicati.Library.Main.BackendManager.DoGet(FileEntryItem item)
   at Duplicati.Library.Main.BackendManager.ThreadRun()

2018-11-20 15:35:13Z - Error: Fatal error
Duplicati.Library.Main.BackendManager+HashMismatchException: Hash mismatch on file "C:\WINDOWS\TEMP\dup-50c2b314-295f-4f67-8935-05708d538f0a", recorded hash: FnrKZDR/CpPOYwrqwl+niHSKTp/DUsAx77NJqQ+z1bY=, actual hash Tq6vpP+Kl5J4xrzosCN6Bje8aNXNqblANo9PYNhUv7U=
   at Duplicati.Library.Main.AsyncDownloader.AsyncDownloaderEnumerator.AsyncDownloadedFile.get_TempFile()
   at Duplicati.Library.Main.Operation.CompactHandler.DoCompact(LocalDeleteDatabase db, Boolean hasVerifiedBackend, IDbTransaction& transaction, BackendManager sharedBackend)
   at Duplicati.Library.Main.Operation.DeleteHandler.DoRun(LocalDeleteDatabase db, IDbTransaction& transaction, Boolean hasVerifiedBacked, Boolean forceCompact, BackendManager sharedManager)
   at Duplicati.Library.Main.Operation.BackupHandler.CompactIfRequired(BackendManager backend, Int64 lastVolumeSize)
   at Duplicati.Library.Main.Operation.BackupHandler.Run(String[] sources, IFilter filter)

This is backing up unencrypted over a LAN. I went looking for the file listed in the log (b7ebba*) and it doesn't exist anywhere. The Find command doesn't find it, and manually browsing the drive in Windows doesn't show the file either.

So what's happening? How am I getting a hash mismatch on something that isn't even there? And, more importantly, is this a real error, as in my backups may be corrupted, or is it a bug of some kind I can safely ignore?

List-broken-files shows no broken files, and Verify shows no errors either. Thanks.

ETA: I ran AFFECTED and, after seeing which files would be affected, I also got this error, in case it's relevant:

OnFinish callback True failed: Cannot access a disposed object. Object name: 'SQLiteConnection'. => Cannot access a disposed object. Object name: 'SQLiteConnection'.

Return code: 0


#2

Did you ever find the seemingly non-existent dblock file mentioned? Does the timestamp and/or name on the C:\WINDOWS\TEMP file ever change? Normally I'd have thought the temp file would get a new name and a current timestamp on every download, but that still leaves the question of where it's coming from. You could install Sysinternals Process Monitor to watch duplicati-b7ebba9d6fa424f959a49934a04d0f206.dblock.zip.

Is the TEMP file actually there? Because it appears you're not running encryption, you could also just copy it somewhere, add a .zip suffix, and see whether a test by a zip program such as 7-Zip reports any problems with it.
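If you do catch the TEMP file, you could also hash it yourself and compare against the values in the error message. Duplicati's "recorded hash" / "actual hash" values look like Base64-encoded SHA-256 digests, so a small script along these lines (a sketch; point the path at your copied file) reproduces that format:

```python
import base64
import hashlib

def duplicati_hash(path):
    """Base64-encoded SHA-256 of a file -- the same format Duplicati
    prints as "recorded hash" / "actual hash" in its error messages."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")

# Example (path taken from the error message above):
# print(duplicati_hash(r"C:\WINDOWS\TEMP\dup-50c2b314-295f-4f67-8935-05708d538f0a"))
```

If the output matches the "actual hash" from the log, you've captured the same bytes Duplicati read.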

I'm not sure what to say about AFFECTED. I can find no other reports in the Duplicati forum or GitHub issues; however, I found some on the general Internet (although not enough that I can figure out how the error relates to Duplicati).

Do you have any other backup jobs running on the same Duplicati installation? Is destination an SMB share?

For testing purposes, you could set --backup-test-samples=0 in your job to skip verifying backend files after the backup (does that visually appear to be the failure spot?), but verification is supposed to be sampled, not the same file over and over again. If need be, we could probably open up the database to see what it says about that file.

Running the regular backup from the web UI Commandline option can sometimes give a better view of an issue, especially with high logging such as --log-level=Information, or maybe even Profiling (which produces a lot of output).
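For example, something along these lines (a sketch; the source, destination, and log path are placeholders, and exact option names vary a bit between versions):

```shell
# Placeholders for the real job's source and destination.
# On newer canary builds, --log-file-log-level / --console-log-level
# replace the older --log-level.
Duplicati.CommandLine.exe backup "file://\\server\share\backup" "C:\Data" ^
  --log-file="C:\logs\duplicati-job2.log" ^
  --log-level=Information
```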


#3

No, it only “exists” in the log but doesn’t exist anywhere else.

Does the timestamp and/or name on the C:\WINDOWS\TEMP file ever change?
Yes. This is a daily backup so the name and timestamp change.

I have a process monitor, but where would I tell it to look? That file isn't on any of my systems.

The log records 5 failed attempts in a row (failure 1 of 5, 2 of 5, etc.) with that "ghost" dblock file and 5 different TEMP files, none of which are left over in the TEMP directory at the end of the process.

Yes, one other job running twice a day to the same destination and it works fine. The destination is a Windows Network drive on my LAN.

The only difference between the two jobs (other than the times they run, obviously) is that the job with the hash mismatch is backing up one very large file (around 100 GB), whereas the other job is a variety of regular-sized files. No clue if that makes a difference, but it's literally the only distinction between the two jobs, and it's actually the reason I split them into two jobs: I wanted the "one big file" job to run at a time when the machine was unused, whereas I didn't care when the other job ran.

I tried running with verification and hash checks disabled, and that allowed the job to finish a few times, though once I still got a hash error on that file anyway.

What would I look for in the database?

And I’ll try to up my log level and see if there’s more detail about that file then.


#4

You don't need to give a full path. Just use a "contains" filter to see if Process Monitor sees the file somewhere you don't.
I just ran a test, and it seems to pick up even accesses to shares, and gives me the UNC path.
Because you get this at the end of a long backup, you might also want Filter --> Drop Filtered Events set so you don't collect an overwhelming amount of data, although dropping may mean more runs with other filters.

I suppose you could also watch C:\WINDOWS\TEMP although the file name won’t be predictable. Doing this on another backup right after a completed backup would lower the number of files you’d see going through.

Alternatively, if we can convince ourselves that this is the sampled test of backend files, run that separately.
The TEST command gives the syntax, but the easiest way to run it is from the backup's web UI Commandline, probably just changing the Commandline arguments box from pathnames to 1, for the number of samples.

Were they --no-backend-verification and --skip-file-hash-checks? I hope they didn't allow any silent damage.

Look for duplicati-b7ebba9d6fa424f959a49934a04d0f206.dblock.zip as Name in the RemoteVolume table, and perhaps see if it's in any unusual State. Normally most remote volumes are in Verified, but there are others. VerificationCount is (I think) what forces the random test sample to wander around to even out the sampling, which in your case doesn't seem very even. Are you always seeing this one dblock file getting a hash error?
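That lookup can be sketched with a few lines of Python (the column names are as described above, but the schema is an assumption on my part, and you should run this against a copy of the job database, not the live file):

```python
import sqlite3

def volume_status(db_path, name):
    """Return (Name, State, VerificationCount) for one remote volume,
    or None if it's not in the RemoteVolume table at all.
    Sketch only: schema assumed; use a copy of the job database."""
    con = sqlite3.connect(db_path)
    try:
        return con.execute(
            "SELECT Name, State, VerificationCount FROM RemoteVolume "
            "WHERE Name = ?", (name,)).fetchone()
    finally:
        con.close()

# Example usage against a copied database file:
# print(volume_status(r"C:\copy-of-job2.sqlite",
#                     "duplicati-b7ebba9d6fa424f959a49934a04d0f206.dblock.zip"))
```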

Windows will use weak database encryption by default, so you might need to run with --unencrypted-database on the Server or Tray Icon to turn that off before opening the database with something like DB Browser for SQLite.

Your TEMP location makes me think you've installed Duplicati as a Windows service; if so, you can uninstall and reinstall the service with that additional option for the Server. The current options are visible in the service properties under Path to executable.

When you can get a good look at a sequential log, that might also give further clues as to what's happening. Right now, as you noted, it's complaining about a seemingly nonsensical backend Get. How did that actually go? Process Monitor might show whether the file is somewhere, how much was read, and then how much was written. There's a Size field in the database, and also a Hash, but we probably can't hash that file before it vanishes.


#5

It lists itself as Verified with a VerificationCount of 0. Still doesn’t actually exist on the Target drive if that matters.

Yes. My process to date has been to do a list-broken-files followed by a purge-broken-files (I never have anything broken, BTW, though some files get marked for future deletion). Then I run a Repair on the database, turn backend verification and hash checks back on, and run the backup again. This allows it to complete without error, though at some point that same ghost-file hash mismatch shows up again and I have to repeat the process. Should I run a Verify to check for corruption, or does the completed backup after verification and hash checks are back on mean that everything is solid?
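From the command line, that cycle looks roughly like this (a sketch; the target URL and --dbpath are placeholders for the job's real settings):

```shell
# Placeholders for the real destination and local database path.
Duplicati.CommandLine.exe list-broken-files  "file://\\server\share\backup" --dbpath="C:\path\to\job2.sqlite"
Duplicati.CommandLine.exe purge-broken-files "file://\\server\share\backup" --dbpath="C:\path\to\job2.sqlite"
Duplicati.CommandLine.exe repair             "file://\\server\share\backup" --dbpath="C:\path\to\job2.sqlite"
```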

I’ll report back on the other questions you asked as I gather the data.

TEST command shows no errors.


#6

Do the other ones have higher counts? Possibly it’s hammering this one to try to get it to catch up with them.
FilterByVerificationCount might be the code, and this gets used by SelectTestTargets in the same source file.
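The idea behind that wandering sample could be sketched like this (an illustration of the described behavior, not Duplicati's actual code):

```python
import random

def select_test_targets(volumes, samples=1):
    """Sketch of verification-count-based sampling: prefer volumes
    verified the fewest times, so random sampling 'wanders' and evens
    out coverage over many runs. Illustrative only -- it mimics the
    behavior described above, not Duplicati's implementation."""
    lowest = min(v["VerificationCount"] for v in volumes)
    candidates = [v for v in volumes if v["VerificationCount"] == lowest]
    return random.sample(candidates, min(samples, len(candidates)))
```

With everything at count 0, every volume is a candidate every time, which would explain a purely random repeat hit on one file.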

Whether or not it truly exists doesn't matter if Duplicati sees it there and is apparently able to read and write something. The empty-file hash is 23b159aee24e6c6fae71127a3f2d5d7dcb94475bfeed454e4b8288c9191f3f2e, which is I7FZruJObG+ucRJ6Py1dfcuUR1v+7UVOS4KIyRkfPy4= after Base64 encoding, yet your error said
actual hash Tq6vpP+Kl5J4xrzosCN6Bje8aNXNqblANo9PYNhUv7U=, and I doubt that it could hash a missing file, so I conclude something got written. How is Duplicati connecting to the share? If you're running it as a Windows service, it has a potentially completely different set of mappings than what you have. The only reasonable way I know to see what's going on is to use Process Monitor to see what Duplicati sees.

Does the ghost file survive a Windows restart? Possibly there are some cached leftovers that a restart would clear. What's interesting is that just before the hash check I see a size check, so maybe the ghost has the correct size.

Verify would add confidence, and the higher the sample count the better (and slower). Ultimately, you can also say "all" instead of a number; "all" allows an extremely high count, but the actual number is probably capped by the actual file count. For a better, slower verify, add --full-remote-verification to have it look inside the volumes instead of just verifying the hash. Instead of spending lots of time on "all", you could also just raise --backup-test-samples after a return to normal, which I think we're not at yet, correct (much as I would like to see that happen)? Still further debugging to do?

Thank you. This certainly looks odd so far, and maybe some further information will help understand its path.

This seems kind of odd. Even though it’s sampled with a count of 1 by default, I’m not sure why it’s not being tripped up by the same ghost file the way backup is, unless it was run differently, e.g. from the command line.


#7

Actually most of them were zero.

By IP address.

Yes.

It’s not entirely clear. The last two runs haven’t thrown an error. Then again, I’ve had this happen before where it seems like all is fine and then a few days later it throws the hash mismatch on that same file so not sure at this point. I’ll run a Verify with full-remote-verification and see if it throws any errors.


#8

With everything else also at a low VerificationCount (e.g. 0), maybe your random failure is just due to randomly picking that file. There's still the mystery of where it comes from, so I still suggest Process Monitor, even if it means filtering and discarding everything except that file. It can be tuned later if anything interesting shows up about that file…

Running a --log-file log on either your backup or your server at at least --log-level=Information (if on the 2.0.3.3 beta; if newer, use --log-file-log-level) would be another fairly painless way to keep an eye on usages of the file.

--full-remote-verification is better but slower, and it's no substitute for a "full"-as-in-"all" check if you're trying to survey everything; however, actually using "all" on the verification could take a while, so you might want to survey a smaller number.


#9

I’m not sure exactly how to have Process Monitor do what you’re asking. Is there a “watch” feature of some kind that I’m unaware of?


#10

Watch is pretty much its main feature. Sysinternals Process Monitor is what I mean, in case there’s another.

When running, it needs enough privilege to monitor Duplicati. If Duplicati is the Tray Icon, that's easy; if it's a service, then you probably need "Run as Administrator". Use the Filter option to set Path contains the dblock file name, then Add. Set Filter --> Drop filtered events (if you like, to avoid collecting too much). There's a Capture toggle and a Clear button.


#11

Ah - oops, I mixed it up with Sysinternals Process Explorer. I have the right one now and will see if that file shows up at some point.


#12

Okay I think I know what’s going on and can point you to the source of the bug. I have two backup jobs from the same computer - different source files/folders and different target directories but same machine. Job 1 always backs up fine with no issues whatsoever. Job 2 is the one with the hash mismatch.

After using Process Monitor, I noticed that the nonexistent file actually does exist - in Job 1 (the one with no errors). I began to wonder if Job 1 was leaving some "residue" in the Windows/Temp directory and then, when Job 2 started, it picked up the residue, said "hey, this file isn't part of my backup job," and threw the error.

To test this, I stopped the Duplicati service, deleted everything in Windows/Temp, and ran Job 2. No errors. To test further, I used the tempdir option to give Job 2 its own temp directory, then ran both jobs back to back. Once again, no errors.
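For anyone else hitting this: the workaround is the --tempdir advanced option, which can be set per job (the paths here are placeholders for my real ones):

```shell
# Placeholder paths; --tempdir can also be added per job in the
# web UI under Advanced options.
Duplicati.CommandLine.exe backup "file://\\server\share\job2" "C:\BigFile" ^
  --tempdir="C:\DuplicatiTemp\Job2"
```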

So there's the issue. I think the solution, other than having per-job temp directories, is either a full cleanup of the Temp directory upon job completion or something within Duplicati that can distinguish Temp files created by a prior job from those created by the currently running job.


#13

Interesting, and surprising to me. I don't know how the temporary-file tracking is done, but I have 97 leftover files now with no visible problem other than wasted space. See "Duplicati temp dup-xxxx files not being deleted". There are enough jobs sharing this system that it seems like there statistically ought to be blended leftovers.

On the other hand, your finding is interesting and maybe someone will chase it. I’m glad you found a way out.

EDIT: We looked over the job 2 database at the RemoteVolume row for the ghost dblock file, which was there. Hearing that it belongs to job 1, I wonder what job 1's RemoteVolume table thinks about that file. Collisions in destination filenames are theoretically possible, I suppose, but is there any common history at all between these two jobs? Collisions in temp file names may also be theoretically possible, but as you noted, the temp filename changes a lot. One common point (I assume) is that both jobs share a Duplicati server; maybe some cross-connect is there, although your separating the temp files may show otherwise. Basically, this is confusing my limited understanding.

Although it may be impractical, if you can come up with reproducible steps, a GitHub issue can track that well.