Backup storage corruption, any way to continue using backup set?

Hi,

One of my backups is on an external USB drive. During a backup a few weeks ago, the drive got disconnected. The USB drive has an NTFS filesystem, which got corrupted. Some files (dblock.zip.aes) were missing or empty; a filesystem check found and repaired errors. The operating system is Ubuntu 20.04.4, Duplicati 2.0.6.3_beta_2021-06-17.
I went through a whole set of operations:

  • create new backup or delete the last backup: failed due to missing remote files
  • repair: error due to missing dblock files
  • repair with --rebuild-missing-dblock-files: failed with several messages: “Repair not possible, missing {n} blocks. If you want to continue working with the database, you can use the list-broken-files and purge-broken-files commands to purge the missing data from the database and the remote storage.”
  • list-broken-files / purge-broken-files: this failed with a message ErrorID: CannotPurgeWithOrphans
    Unable to start the purge process as there are 10321 orphan file(s)
    Return code: 100
  • So I went ahead with Delete and Recreate: this let me run backup and delete operations again, but a verify failed, so I ended up deleting this new backup version, and also the one that got interrupted a few weeks ago. One odd thing I noticed: the sqlite database of this backup was 2.7GiB before the delete and repair, and 3.x GiB after the repair… I’m not sure what could make it grow just by “reverse engineering” what’s on the remote storage.
  • Verify continues to fail with:
    Failed to process file duplicati-i0dfebe65094d4fc7b8990b8f14539041.dindex.zip.aes => Invalid header marker.
    Three files seem to have this invalid header marker.

I can now run a backup, but it ends with these messages and a non-zero return code, even though it states the backup was successful.

Verifying remote backup ...
Remote backup verification completed
  Downloading file (unknown) ...
  Downloading file (unknown) ...
  Downloading file (unknown) ...
  Downloading file (unknown) ...
  Downloading file (unknown) ...
Failed to process file duplicati-i0dfebe65094d4fc7b8990b8f14539041.dindex.zip.aes => Invalid header marker
  Downloading file (unknown) ...
  Downloading file (unknown) ...
  Downloading file (unknown) ...
  Downloading file (unknown) ...
  Downloading file (unknown) ...
Failed to process file duplicati-i6ed0c1338d6742a1aab4e8fe76cc047b.dindex.zip.aes => Invalid header marker
  Downloading file (unknown) ...
  Downloading file (unknown) ...
  Downloading file (unknown) ...
  Downloading file (unknown) ...
  Downloading file (unknown) ...
Failed to process file duplicati-i28de5ac9b340448caf6ba2c234c9409d.dindex.zip.aes => Invalid header marker
  Downloading file (92,54 MB) ...
  Downloading file (187,54 KB) ...
  Downloading file (249,91 MB) ...
  0 files need to be examined (0 bytes)
  Duration of backup: 03:23:32
  Remote files: 4473
  Remote size: 542,24 GB
  Total remote quota: 1,82 TB
  Available remote quota: 242,81 GB
  Files added: 38799
  Files deleted: 19611
  Files changed: 4710
  Data uploaded: 65,39 GB
  Data downloaded: 342,63 MB
Backup completed successfully!
Return code: 3

What is the situation now? As I understand it, the dindex files are just a lookup table to go from a block hash to the right dblock file quickly; that sounds to me like the dindex files could be regenerated if you “process” all dblock files. But how can this be done?
In addition, is there any chance there are more faults in the remote storage that have gone undetected by list-broken-files and verify?
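Roughly what I imagine, as a minimal Python sketch. This is purely hypothetical and only models my understanding: a decrypted dblock volume is treated as a plain zip file whose entry names are block hashes, and the “index” is just a hash-to-volume map:

```python
# Hypothetical sketch: rebuilding a block-hash -> dblock-volume lookup
# by scanning dblock volumes, modeled here as in-memory zip archives
# whose entry names are block hashes. Not Duplicati's actual code.
import io
import zipfile

def build_index(volumes):
    """volumes: dict of volume name -> zip bytes.
    Returns a dict mapping each block hash to the volume containing it."""
    index = {}
    for name, data in volumes.items():
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            for entry in zf.namelist():
                index[entry] = name  # each entry name is a block hash
    return index
```

If something like this is what a Repair does, then every dblock would have to be downloaded once to rebuild the missing dindex files.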


Ouch. Trying to recover from broken partial bits will be tough.
Do you have space to relegate this one to old-file restores by

Duplicati.CommandLine.RecoveryTool.exe

This tool can be used in very specific situations, where you have to restore data from a corrupted backup. The procedure for recovering from this scenario is covered in Disaster Recovery.

and this tool doesn’t look at dindex files at all, and it does its best if some of the dblocks are missing.

Probably from file damage. An intact file has AES at the start.
How many files did you ask to verify in Advanced options?
Verification is sampled, so there can be other damaged files.

backup-test-samples
backup-test-percentage
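A quick way to spot this kind of gross corruption yourself, without a full verify: an intact AES Crypt file begins with the bytes AES. A minimal Python sketch (a heuristic only; it catches a damaged header but not corruption deeper in the file):

```python
# Heuristic check for AES Crypt (.aes) files: an intact file starts
# with the three-byte marker b"AES". A failed check means the header
# is damaged; a passed check does NOT prove the rest of the file is OK.
def has_aes_header(path):
    with open(path, "rb") as f:
        return f.read(3) == b"AES"
```

Running that over every *.aes file on the USB drive would at least find all files with the “Invalid header marker” problem up front.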

Running Repair (when things are in running shape…) will do that, based on DB contents.
Delete and Recreate lost DB information, and IIRC (I might test) seeing the dblock files didn’t help.
I’m actually surprised Recreate finished with missing dblock files. You mentioned no messages.

The repair did show errors (I didn’t include all the messages in my post because my notes file is >2000 lines):

-> completed with 13 errors:
2022-06-02 23:55:09 +02 - [Error-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-MissingFileDetected]: Remote file referenced as duplicati-be58d04abcd824fae90187a3b3297a8cf.dblock.zip.aes by duplicati-ie8f01d8c7976446582e6e077b07d2aff.dindex.zip.aes, but not found in list, registering a missing remote file
2022-06-02 23:55:13 +02 - [Error-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-MissingFileDetected]: Remote file referenced as duplicati-be5c6fbf87e48487d8c80e015223cfb5a.dblock.zip.aes by duplicati-iea0f279c7afe4bf1a52b35b1a551cf9e.dindex.zip.aes, but not found in list, registering a missing remote file
2022-06-02 23:57:06 +02 - [Error-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-MissingFileDetected]: Remote file referenced as duplicati-be5c3ca7c2025473ba8cb269f5dbc9079.dblock.zip.aes by duplicati-i9f915657366e45b3b559399b046193c0.dindex.zip.aes, but not found in list, registering a missing remote file
2022-06-03 00:00:46 +02 - [Error-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-MissingFileDetected]: Remote file referenced as duplicati-bdd8944a86423434bbdc1a30147c20e75.dblock.zip.aes by duplicati-iccb0b97f33894919892f4a1c0025db8d.dindex.zip.aes, but not found in list, registering a missing remote file
2022-06-03 00:06:32 +02 - [Error-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-MissingFileDetected]: Remote file referenced as duplicati-be5f73dfdd52042d1965391a9424b2cc7.dblock.zip.aes by duplicati-i12401a4742404a2cb9df4c599fa0df58.dindex.zip.aes, but not found in list, registering a missing remote file
2022-06-03 00:08:33 +02 - [Error-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-MissingFileDetected]: Remote file referenced as duplicati-be5dd8efcd37e40a49cf3ee1d15e48f13.dblock.zip.aes by duplicati-ifaa9ccf615874954b577c60697c486e5.dindex.zip.aes, but not found in list, registering a missing remote file
2022-06-03 00:09:25 +02 - [Error-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-MissingFileDetected]: Remote file referenced as duplicati-be60437f589984c23a987286116af49d3.dblock.zip.aes by duplicati-i27c15306906e45b38b8b9a3484c8a2f2.dindex.zip.aes, but not found in list, registering a missing remote file
2022-06-03 00:10:27 +02 - [Error-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-MissingFileDetected]: Remote file referenced as duplicati-be587eec0cd4b43039839782aba473528.dblock.zip.aes by duplicati-iefc32b81f63d40d9a443a81664f90ba0.dindex.zip.aes, but not found in list, registering a missing remote file
2022-06-03 00:11:47 +02 - [Error-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-IndexFileProcessingFailed]: Failed to process index file: duplicati-i0dfebe65094d4fc7b8990b8f14539041.dindex.zip.aes
2022-06-03 00:12:28 +02 - [Error-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-MissingFileDetected]: Remote file referenced as duplicati-be5a830e18c73498186c4c63fb32c30ef.dblock.zip.aes by duplicati-i366755e2b2ae4cf0b73251d5f9563a11.dindex.zip.aes, but not found in list, registering a missing remote file
2022-06-03 00:13:20 +02 - [Error-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-IndexFileProcessingFailed]: Failed to process index file: duplicati-i6ed0c1338d6742a1aab4e8fe76cc047b.dindex.zip.aes
2022-06-03 00:14:04 +02 - [Error-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-IndexFileProcessingFailed]: Failed to process index file: duplicati-i28de5ac9b340448caf6ba2c234c9409d.dindex.zip.aes
2022-06-03 00:14:20 +02 - [Error-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-MissingFileDetected]: Remote file referenced as duplicati-be6115c059c2a4f2b8c0671220fe24dfe.dblock.zip.aes by duplicati-i010424e8ed3840b5a931aa1b52cb4366.dindex.zip.aes, but not found in list, registering a missing remote file

I’ll run a verify now with --backup-test-percentage=100.
I have to say I skipped this option on the verify, because both the name and the documentation seem to suggest it only affects the verification after a backup operation. I kind of assumed that “verify” does a 100% verify all the time.

After a backup is completed, some (dblock, dindex, dlist) files from the remote backend are selected for verification. Use this option to specify the percentage (between 0 and 100) of files to test. If the backup-test-samples option is also provided, the number of samples tested is the maximum implied by the two options. If the no-backend-verification option is provided, no remote files are verified.
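If I read that correctly, the number of files tested is the maximum implied by the two options. A minimal sketch of that documented rule (the exact rounding Duplicati uses may differ; this is just my reading):

```python
# Sketch of the documented sampling rule: the number of remote files
# tested after a backup is the maximum implied by --backup-test-samples
# (a count) and --backup-test-percentage (a percentage of all files).
# Rounding behavior is an assumption, not taken from Duplicati's code.
import math

def samples_to_test(total_files, test_samples=1, test_percentage=0):
    from_percentage = math.ceil(total_files * test_percentage / 100)
    return max(test_samples, from_percentage)
```

So with my 4473 remote files, --backup-test-percentage=100 should test all of them, while the default is a single sample per run.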

Oh, that Verify went surprisingly quick:

  Downloading file (249,92 KB) ...
  Downloading file (249,92 KB) ...
  Downloading file (249,92 KB) ...
  Downloading file (249,92 KB) ...
  Downloading file (249,92 KB) ...
Failed to process file duplicati-i0dfebe65094d4fc7b8990b8f14539041.dindex.zip.aes => Invalid header marker
  Downloading file (481,83 KB) ...
  Downloading file (481,83 KB) ...
  Downloading file (481,83 KB) ...
  Downloading file (481,83 KB) ...
  Downloading file (481,83 KB) ...
Failed to process file duplicati-i6ed0c1338d6742a1aab4e8fe76cc047b.dindex.zip.aes => Invalid header marker
  Downloading file (488,42 KB) ...
  Downloading file (488,42 KB) ...
  Downloading file (488,42 KB) ...
  Downloading file (488,42 KB) ...
  Downloading file (488,42 KB) ...
Failed to process file duplicati-i28de5ac9b340448caf6ba2c234c9409d.dindex.zip.aes => Invalid header marker
  Downloading file (84,50 MB) ...
  Downloading file (88,53 KB) ...
  Downloading file (249,95 MB) ...
duplicati-i0dfebe65094d4fc7b8990b8f14539041.dindex.zip.aes: 1 errors
	Error: Invalid header marker

duplicati-i6ed0c1338d6742a1aab4e8fe76cc047b.dindex.zip.aes: 1 errors
	Error: Invalid header marker

duplicati-i28de5ac9b340448caf6ba2c234c9409d.dindex.zip.aes: 1 errors
	Error: Invalid header marker

duplicati-20210614T123543Z.dlist.zip.aes: 17931 errors
	Extra: /etc/java-11-openjdk/security/default.policy
	Extra: /etc/java-11-openjdk/security/java.security
	Extra: /home/tom/.cache/chromium/Default/Cache/01e9c4f33b3090df_0
	Extra: /home/tom/.cache/chromium/Default/Cache/06979b06c170bbf9_0
	Extra: /home/tom/.cache/chromium/Default/Cache/06dfbc4a477f7ea2_0
	Extra: /home/tom/.cache/chromium/Default/Cache/089add6663a2fc41_0
	Extra: /home/tom/.cache/chromium/Default/Cache/0b0cdeab0ef9fcb4_0
	Extra: /home/tom/.cache/chromium/Default/Cache/0c19838925457cca_0
	Extra: /home/tom/.cache/chromium/Default/Cache/0d0ae5570c9ed2f4_0
	Extra: /home/tom/.cache/chromium/Default/Cache/0f6c8f2582f28f48_0
	... and 17921 more

Return code: 3

Just from the run time of this verify, I know it couldn’t have checked all 500GiB of files.

PS, to come back to your question:

Do you have space to relegate this one to old-file restores by RecoveryTool

No, I don’t have drives with enough free disk space to restore entire backups now, and I currently have no need to recover specific files or folders. My question is mostly whether I can continue taking backups with this set, or should just restart entirely and lose the past year’s monthly backups.

--backup-test-percentage applies to backup and not verify.

so the thing to do would be to run a backup with a larger --backup-test-percentage to verify more files.
This verification is still limited by the loss of records of what should be there. That was in the database.

What I see so far are dindex files that likely got corrupted, and some whose indexed dblock is gone.
What might not be found is how many dblock files are totally lost (along with the dindex for each such file).
I would have expected Recreate to issue an error if a dlist couldn’t find a referenced block though.

How the backup process works explains this, but basically the dlist references blocks in dblock files.

The dindex files ideally tell Recreate the dblock location of every block forming a file in the dlist files.
If a dindex file is missing, Duplicati searches all the dblock files, and it may take lots of downloading.
A slow Internet connection could easily take days, so Verify button doesn’t just download all the files.
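A minimal sketch of that lookup logic (hypothetical data structures, not Duplicati's actual code): the dindex-built map answers most queries cheaply, and only a miss forces the expensive scan of every dblock volume.

```python
# Sketch of block lookup during Recreate. `index` is the hash -> volume
# map built from dindex files (may be incomplete if dindex files are
# missing or corrupt). The fallback loop models the expensive path:
# every dblock volume must be fetched and searched.
def locate_block(block_hash, index, dblock_volumes):
    """dblock_volumes: dict of volume name -> set of block hashes."""
    if block_hash in index:
        return index[block_hash]          # cheap: answered by dindex
    for name, blocks in dblock_volumes.items():
        if block_hash in blocks:          # expensive: full volume scan
            return name
    return None                           # block is truly lost
```

That fallback is why a missing dindex turns into hours or days of downloading on a slow connection, while an intact one costs almost nothing.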

The TEST command (e.g. run in Commandline) can verify files as you wish, without running backup.

Just for followup, in case anyone ends up reading this after running into a similar situation:
I have not been able to get this backup working again (as in getting through a backup-and-verify cycle without a bunch of errors in at least one of them). So I wiped the USB disk and started over with this backup, losing about two years of backups.
I would have preferred Duplicati to handle corruption more gracefully: e.g. marking backups as “incomplete”, maybe even listing files that have now become unrecoverable. But if I run a new backup, at least I can be sure that all files in the new backup set will be OK.

PS: I find the information on how the backup process works really interesting, but it unfortunately stops too early: e.g. there’s nothing in there explaining how the local (sqlite) database is populated, or what tools exist (if any) for recovering data in different scenarios.

The nearest thing is that an interrupted backup is marked “partial”; “incomplete” doesn’t really fit “broken”.
One also can’t really identify which backup version is incomplete, because the new ones build on the old.


That’s the list-broken-files that you tried, unfortunately finding it stopped by some deeper problem.

Few people want deep information, but the contents of the database are described; how the code gets there is described in the code itself. FWIW, here’s some developer-level documentation on the database:

Developer documentation
Documentation for the local database format
Is there an ER Diagram and Data Dictionary?
Database rebuild

So would we all, but until some people from the community skill up and dive in, all we can do is hope.
As a free community project, Duplicati only exists and perhaps improves thanks to volunteers helping.

There are lots of openings in development, documentation, test, forum, etc. Thanks to those who help.