Backup valid, but still unrestorable?

This is likely related somehow to compaction and to this data set getting “broken”, whatever that turns out to mean.

The initial situation was caused by a terminated compaction (generated by earlier versions, which did have the journal issue). I’ve updated since then, before running repair, compact, and the rest of the operations. My activity was triggered by the index error below.

Missing file: duplicati-i12764fec0f664e6185a416acf7b1850e.dindex.zip.aes
Found 1 files that are missing from the remote storage, please run repair

And repair uploaded a 541-byte file; as said, I’ve earlier connected this to the restore failure.

  Listing remote folder ...
  Uploading file (541 bytes) ...

As we all know, this is basically an empty placeholder, which shouldn’t hurt.

After this, compaction finished without problems.

And then we’re in the situation I described a bit earlier.

I also decrypted the “broken” b0366 … file; zip verification passes, and extracting it also completes without any problems.
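
In case it matters, by “zip verify” I mean something like this with standard unzip (-t only tests the archive without writing anything):

    # test the zip structure and CRCs without extracting anything
    unzip -t duplicati-b0366966e93db4129b6ce9f5798fbe9d2.dblock.zip
    # and then a full extraction into a scratch directory
    unzip duplicati-b0366966e93db4129b6ce9f5798fbe9d2.dblock.zip -d /tmp/b0366-check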

If necessary, I can privately share the dlist and the b0366 index (not the data), if anyone thinks that would be helpful. Those must be kept private and deleted as soon as they are no longer strictly needed for debugging / analysis, but technically they do not contain any particularly sensitive information.

… Then to the testing and fixing …

I took a copy of the backup set and tried to restore it, just as before. → Fail

Then I ran repair to rebuild the database locally, and then I ran list-broken-files, purge-broken-files and repair again.
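
For anyone following along, the sequence was roughly this, sketched with Duplicati.CommandLine.exe (the target URL, passphrase and database path are placeholders, not the real ones):

    # rebuild the local database from the remote files
    Duplicati.CommandLine.exe repair "file:///path/to/backup-copy" --passphrase=... --dbpath=test.sqlite
    # list the source files that reference data that is no longer restorable
    Duplicati.CommandLine.exe list-broken-files "file:///path/to/backup-copy" --passphrase=... --dbpath=test.sqlite
    # drop those entries from the filesets, then run repair again to bring things back in sync
    Duplicati.CommandLine.exe purge-broken-files "file:///path/to/backup-copy" --passphrase=... --dbpath=test.sqlite
    Duplicati.CommandLine.exe repair "file:///path/to/backup-copy" --passphrase=... --dbpath=test.sqlite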

And then I decrypted the dlist files, so now I can diff them against the original one from the “broken” set. There are some differences, because data got deleted in the process.
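
If someone wants to reproduce the comparison: a dlist volume is just a zip with a manifest and a filelist.json inside, so after decrypting both versions it’s roughly this (file names are placeholders; jq is only there to pretty-print so diff has lines to work with):

    # pull the file list out of the original and the repaired dlist and normalize the layout
    unzip -p dlist-original.zip filelist.json | jq . > filelist-original.json
    unzip -p dlist-repaired.zip filelist.json | jq . > filelist-repaired.json
    # the differences should be exactly the entries that purge-broken-files dropped
    diff filelist-original.json filelist-repaired.json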

Restore now “works”: run repair, remove some data from the backups, and then run restore. But now at least some data (i.e. one large file) is missing. It’s still not a good result at all. Restore should work without deleting data from the backup.

Technically the content of the data file does not matter; what matters is its zip directory. If you can share that (it’s basically a list of hashes…), that’s enough. Like this:

unzip duplicati-b5dd653ae13bd4a1f837f0c9daa8cc1f7.dblock.zip
Archive:  duplicati-b5dd653ae13bd4a1f837f0c9daa8cc1f7.dblock.zip
  inflating: manifest                
  inflating: Ma3-QBwsRK0_BJSfeLtsnwAIZZHt6jIOzfu5o3otqJY=  
  inflating: AAFRcFERQnUtY1EC6zDh1lvNRwfzxKXYAmG5gqogqwI=  
  inflating: SRLfjQeJoK24g05Jww_vTuReIOpxyx9np8Hfum9pFKo=  
  inflating: GREhNzIj0EAgYjatcoSvR3J0s5-vy7lctkcSvUyFTIE=  
  inflating: CUJRLy8eWJ9R1lPZpk5Wi0esDqB1daiFMZXArhC4LPM=  
  inflating: z8LlKVvckpKACdgauIB-_oDcRK1lPe3OQYoa45wJaxc=  
  inflating: LJK1DgZH1lnz0vMq8Q_U6wbTe0WrJU026JBl56jpHqo=  
  inflating: KYdoJQxT2oIDxM7jZDlD0ocEhQogIFW1mRvVW37ffuQ=  
  inflating: u8vkG0O4L01uhzeJQyEZqEz_2Z_GFLBzXy0m6Lvn8Yg=  
  inflating: ZkkgPs4esd_qam2Syb3w9j5R2nGw2Hy7N_Y1SCyrXVk=  
  inflating: msTy9q-kd_8E0SSmS7W-ad-e7s8cjIls0ISdLuwDIbI=  
  inflating: -ZuLHRZdRosror_hwXHU4PpVvQ8sxQXl997JlxU_q2o=  
  inflating: YNTn65Qjs_bsx_eMp9nYR_a0S4ZSl6Ow1dfUIPwobcM=  
  inflating: m5f-9yMcej4oI_ZGMQf6quyqkvPqpUWuirSPwey-4YI=  
  inflating: F7_5qJbDY9qehN8SgatEiZWUvtscVAzVKL3_hq_vAUs=  
  inflating: 50yObb4sRUnzefC_-GQgJvuYDiK1AOHF0m2G9LSqH0A=  

(etc… the full list should be available, with the dindex file)
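
And since the entry names should be just the URL-safe Base64 of each block’s SHA-256 hash, the zip directory can even be cross-checked against the extracted content, roughly like this (run in the directory where the dblock was unpacked; the manifest is skipped because only block entries end with =):

    # recompute every block's hash and compare it against the entry name it was stored under
    for f in *=; do
        h=$(openssl dgst -sha256 -binary < "$f" | base64 | tr '+/' '-_')
        [ "$h" = "$f" ] || echo "hash mismatch: $f"
    done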

In this report of what works and what doesn’t, it would be helpful to distinguish whether the action comes before or after the failed restore. Yes, it should be detected beforehand, but fixing up an unexpected failure is good too.

Please see what (if anything) seems more effective after it’s actually broken. This might depend on the process. I forget whether the database from a Direct restore can be inspected, but one from a normal recreate certainly can.

I’ll need to read the rest of this again, since it’s been over a week now. I might edit in some more notes.

Well, whenever I run a restore, I run it with the no-local-db option, because restoring with the local db is kind of pointless, because … yeah, no need to repeat obvious things.
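
So every restore test is effectively a bare-metal rehearsal, roughly this (URL, passphrase and target directory are placeholders):

    # restore everything without touching the existing local database,
    # i.e. the same path a real disaster-recovery restore would take
    Duplicati.CommandLine.exe restore "file:///path/to/backup-copy" "*" \
        --no-local-db --restore-path=/tmp/restore-test --passphrase=...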

Here is the build (valid for 3 days):

But I do have great news.

2024-01-15 08:31:31 +02 - [Warning-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-WrongBlocklistHashes]: duplicati-i2f4f78e1f1e44044aadd3b82a9c6b315.dindex.zip.aes had invalid blocklists which could not be used. Consider deleting this index file and run repair to recreate it.

It detected the offending dindex file and successfully restored the backup.

Just to be 100% sure, I retested with exactly the same command line on the canary version, and it fails, saying:

2024-01-15 08.44.53 +02 - [Warning-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-UpdatingTables]: Unexpected changes caused by block duplicati-b0366966e93db4129b6ce9f5798fbe9d2.dblock.zip.aes

This is good, really good.

It just makes me smile how good my earlier educated guesses about the causes of the issue were. :wink: As I said, I’ve been there and done that, the broken transaction stuff I mean.

That is good to hear (and lucky timing that we just found this). That unfortunately confirms my theory that this bug will happen without any warning in existing backups, but luckily it can be fixed in retrospect.

Thanks for reporting; I hope to get a Canary out during the next month with this fix.

This one was deficient SQL, but as an SQL dev you’ve probably seen that too.

I guess this means at DB recreate time? As a hidden defect in a dindex, it’s hard to see otherwise. Does anyone want to actually look at the dindex to see whether it fits the predicted corruption, or is it already certain?

Does that mean after the current one goes Beta, or is it a second Canary (meaning a change of plan)?

EDIT 1:

An earlier “in retrospect” chance would be a test operation using --full-remote-verification, maybe even taking advantage of the latest value (whose name I forgot, and which isn’t in the manual yet), which doesn’t require the heavy load of downloading all the dblocks. Maybe this works now; I didn’t test.

EDIT 2:

The Canary option value is ListAndIndexes, according to the command-line help. Also mentioned above.

EDIT 3:

The test operation would also want all as the sample count if this were done; otherwise it would be hit-or-miss whether it saw the problem. The repair command seems like it wouldn’t be able to see it, as it wouldn’t be looking inside dindex files…
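
Putting the pieces together, the check I have in mind would be roughly this (URL and passphrase are placeholders; ListAndIndexes is the value from the Canary’s command-line help mentioned in EDIT 2):

    # 'all' samples every remote volume; ListAndIndexes should verify dlist and dindex
    # contents without the heavy download of the dblock files
    Duplicati.CommandLine.exe test "file:///path/to/backup" all \
        --full-remote-verification=ListAndIndexes --passphrase=...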

EDIT 4:

Or the other view is that one should occasionally recreate the database to see what happens. If it says to fix the dindex, then do it; the original database should have the right info. Has anyone tested the recreated database?