Internal consistency check failed, generated index block has wrong hash

drwtsn32 · October 21, 2019, 9:56pm

I am playing around with the latest Canary 2.0.4.31 with a test copy of one of my production backup sets.

I have done various purge operations, backups, database recreations, restores, etc.

But now I’m testing the deletion of dindex files to see if they are recreated properly. I deleted 1550 dindex files then ran another backup. Duplicati detected the missing files and prompted me to Repair. I ran the Repair and 1548 of the 1550 files were recreated without issue. Two of the files caused this type of error:

2019-10-21 14:37:32 -07 - [Error-Duplicati.Library.Main.Operation.RepairHandler-CleanupMissingFileError]: Failed to perform cleanup for missing file: duplicati-i40c3ea119a344bfa8d5e8e7faed60c24.dindex.zip.aes, message: Internal consistency check failed, generated index block has wrong hash, JK08SX0qCZzPHTVQ2tSqICHbTx0ddPiRp4YcXwHjkB4= vs yuJVEmzJrjO2iD9aWhWzi7XKPJ+SKReFmnrEIFYOLOs=
2019-10-21 14:37:32 -07 - [Error-Duplicati.Library.Main.Operation.RepairHandler-CleanupMissingFileError]: Failed to perform cleanup for missing file: duplicati-i94d4b285035b4fc59d77625e182ae5ce.dindex.zip.aes, message: Internal consistency check failed, generated index block has wrong hash, i+LMfmCATFXwu7A3YGHVGkU0iqD4NH1rqzZfW1Q3xO0= vs tpI5x47zReN2kh4Cf1Uu8gPLqsXhgYLBXf3Vs0k/Y2o=

Anyone know what the root cause of this is, and if there is any way to recover?

ts678 · October 22, 2019, 12:41am

Internal consistency check failed, generated index block has wrong hash #3202 had multiple people working on a similar issue, but the error can happen in two places, so I’m not sure yours is the same. Yours was in running Repair, but at least some of the other ones were in Backup. The code is similar.

github.com

duplicati/duplicati/blob/91505e62238a68f020d99bd9a7c1a2be8e3b25cf/Duplicati/Library/Main/Operation/Common/IndexVolumeCreator.cs#L150-L157


      
          if (options.IndexfilePolicy == Options.IndexFileStrategy.Full)

              foreach(var b in await database.GetBlocklistsAsync(blockvolume.ID, options.Blocksize, options.BlockhashSize))

              {

                  var bh = Convert.ToBase64String(h.ComputeHash(b.Item2, 0, b.Item3));

                  if (bh != b.Item1)

                      throw new Exception(string.Format("Internal consistency check failed, generated index block has wrong hash, {0} vs {1}", bh, b.Item1));

                  w.WriteBlocklist(b.Item1, b.Item2, 0, b.Item3);

              }

github.com

duplicati/duplicati/blob/91505e62238a68f020d99bd9a7c1a2be8e3b25cf/Duplicati/Library/Main/Operation/RepairHandler.cs#L324-L333


      
              if (m_options.IndexfilePolicy == Options.IndexFileStrategy.Full)

                  foreach (var b in db.GetBlocklists(volumeid, m_options.Blocksize, hashsize))

                  {

                      var bh = Convert.ToBase64String(h.ComputeHash(b.Item2, 0, b.Item3));

                      if (bh != b.Item1)

                          throw new Exception(string.Format("Internal consistency check failed, generated index block has wrong hash, {0} vs {1}", bh, b.Item1));

          

                      w.WriteBlocklist(b.Item1, b.Item2, 0, b.Item3);

                  }

          }

I’m not familiar with this code, but the GitHub issue can probably supplement my/your attempt to read.

Was the production backup made before 2.0.4.31 even though you’re playing with a copy on 2.0.4.31?

Is the problem described reproducible on 2.0.4.31? If so, can you get a better log and stack trace of it?

If you have a debugger or a debug build, some of the variables in the code might add to the message, although if you get a DB browser you can probably also use the “should-have” hash to get some clues.

A DB bug report might also help, although some recent attempts have found that they’re slow to create.

Does it happen on 2.0.4.30? I’d prefer to know that it’s an old but rare bug, not a regression in 2.0.4.31.

drwtsn32 · October 22, 2019, 1:34am

Yes, the backups were made prior to 2.0.4.31.

Unknown as I cannot run my exact tests due to 2.0.4.30 having a problem with the Purge operation.

I will dig deeper, perhaps trying your suggestion of using the debugger. Thanks…

drwtsn32 · October 22, 2019, 3:22am

Was poking around the job database and I noticed that two hashes on the right are not unique in the BlocklistHash table.

For instance:

JK08SX0qCZzPHTVQ2tSqICHbTx0ddPiRp4YcXwHjkB4= vs yuJVEmzJrjO2iD9aWhWzi7XKPJ+SKReFmnrEIFYOLOs=

Capture

Is it ok for a particular hash to appear more than once in this table?

ts678 · October 22, 2019, 3:38pm

Possibly in some cases, but it may be rare, and I’m not an expert. Nevertheless, let me try explanations.

The BlocklistHash Index is likely for blocklists that exceed 1 block for the file, like BlocksetEntry Index is for file data that exceeds 1 block. Big files are not unusual, and you can look up BlocksetID in File table to see if the path makes sense. The SHA-256 hash for a block is stored directly in the blocklist without a Base64 encoding, so at 32 bytes you can get about 3200 of them in a block, each representing a 100KB source file block, so the file here is probably at least 1 GB if I did the math right. You might have two files which have a common section appearing midway. Is the Hash for earlier and later Index values different between the two files? Two files that are totally identical usually just point to the same blockset I believe.

Duplicati’s deduplication mechanism may worry most about large savings such as duplicated source file data blocks. Duplicate blocks with the small data block hashes might not be worth another dedup stage.

drwtsn32 · October 22, 2019, 7:42pm

Thanks for your thoughts. I am very unfamiliar with Duplicati at this level. I’ll try to do more debugging when I have time, and also see if the problem is reproducible on a previous version of Duplicati - like 2.0.4.30 as you suggested. I won’t be able to run the purge tests but I can test the dindex recreation to see if it fails.

ts678 · October 22, 2019, 8:14pm

Depending on your install situation, I was thinking you might be able to just drop 2.0.4.30 where 2.0.4.31 was having trouble with Repair to see if 2.0.4.30 has trouble too. Basically, don’t do all, but start from fail. Whatever you like though.

drwtsn32 · October 22, 2019, 8:21pm

Interesting idea, I will give it a shot…thanks

drwtsn32 · October 23, 2019, 2:34am

Have confirmed it’s not a problem with 2.0.4.31 specifically. I downgraded to 2.0.4.30 and it has the same error. I then reset my test by deleting the test backup data and recopying from a production backup. I then just deleted the dindex files and ran a repair. This time 3 files were flagged with consistency check failures: the same 2 files as before and now a third file that was created in my production backup after I did my earlier test.

The machine running the production backup data I’m testing is using version 2.0.4.29. Wonder if there’s some problem with that version that is leading to incorrect dindex files.

Can someone else running 2.0.4.29 or newer do a test where you delete your dindex files and run a repair?

ts678 · October 23, 2019, 11:45am

Tested this on 2.0.4.31 for backup made on 2.0.4.31. Repair put back 10 dindex files that I had deleted.

v2.0.4.31-2.0.4.31_canary_2019-10-19

Fixed a retry error where uploaded dindex -files would reference non-existing dblock files, thanks @warwickmm

although I don’t know (and haven’t looked at) code enough to say whether that might be related. If your default backup logs show RetryAttempts at anything other than 0, then the possibility of that bug opens.

EDIT:

Possibly this would be the easiest one to track down, e.g. look at its timestamp on destination, work out what backup run made it, look for oddities e.g. non-zero RetryAttempts. Ideally you kept additional logs.

drwtsn32 · October 26, 2019, 4:41am

Just to follow up on this. I repeated my tests using 2.0.4.32 and didn’t have any issues at all. I was able to run multiple purge operations, database recreations, dindex file recreations, etc and everything is working well.

I don’t think any changes were made between 2.0.4.31 and 2.0.4.32 to account for this, so maybe my earlier tests were faulty somehow.