Restore operation failed. Failed to retrieve file xxx.dblock.zip.aes & Message has been altered, do not trust content

Yeah, I was confused too about why that would happen.

When it comes to backups, I am very careful. I ran a lot of verification jobs before I went ahead and wiped the Linux disk. Not only that, I even counted the number of files produced by Duplicati and made a txt listing the exact name and size of each file, and more.

Actually, I have a new problem. It turns out there are more corrupted files than I expected. It says restoration was successful, but when I went ahead and looked through the files, some are literally just 0 bytes.

Additionally, Duplicati won't restore the files that are genuinely 0 bytes if I restore from the main root of the backup; I have to restore subfolders one by one.

Honestly, what kind of a piece of s*** is this software? No way am I recommending it. I understand it is in beta, too. But apparently hidden options and flags really say a lot about where it's heading.

A sample of 1 set is enough to cover the 1 dlist, but falls way short of covering multiple dindex and dblock.

VerificationCount at 0 means it was never actually sampled, I think. Yes, one can adjust the sample size…

The PR for the above can help, but instead of trying to guess a dblock count, one has to guess the right percentage, assuming the goal is to avoid repeated sampling (which slows things down) while still finding files that break over time.

What said that? You were using the RecoveryTool, but I don't think it says that. It also offers little in the way of restore selection.

fixed various issues including

but the fix was last month after 2.0.8.1. It’s in v2.0.9.100_canary_2024-05-30, but that just hit Canary, meaning it generally has more unknowns in it than the Beta. OTOH it also has various new bug fixes.

This part sounds like they are but shouldn’t be zero.

This part sounds like they’re intended. I’m confused.

EDIT 1:

I sabotaged a RecoveryTool index.txt file by editing a Base64 SHA-256 hash at the start of a line. That got:

0: C:\backup restore\backup source\A.txt (1 bytes) error: System.Exception: Unable to locate block with hash: VZrq0IJk1XldOQlxjN0Fq9SVcuhP5VWQ7vMaiKCP3/0=
   at Duplicati.CommandLine.RecoveryTool.Restore.HashLookupHelper.ReadHash(String hash)
   at Duplicati.CommandLine.RecoveryTool.Restore.Run(List`1 args, Dictionary`2 options, IFilter filter)

and left me with no file rather than an empty file or a correct one with the letter A in it. How are you restoring?
Regular restore does sometimes falsely declare success, but only after making a lot of noise before that.

EDIT 2:

You should be able to click whatever you like in the GUI. IIRC, CommandLine is harder. Which did you do?

EDIT 3:

So Linux Duplicati? I see some restore work on Windows. Be careful about mixing OS types for restoring.

EDIT 4:

It “sounds” like Windows previously was just an SMB storage destination, but OP mentioned a Windows restore (following a test restore on Linux, and Linux delete). Which OS is the current restore problem on?

I am talking about now. I used the GUI again. It said restoration was successful, yet some files were corrupt.

I didn’t understand what you meant. Look, some files, like books in .pdf, were restored as 0 bytes even though they aren't 0-byte files. Other files are truly 0-byte files, like files created by programs and intended to be there, yet Duplicati won't restore those when restoring the whole backup by selecting the main root folder; you have to select at least the subfolders inside the main backup point. The same goes for empty folders.

Which restoration attempt?

I was talking about the GUI.

In short, Duplicati on both Linux & Windows:

Source: Linux
Destination: Windows

Transfer method: mounting the Windows SMB share on Linux as a folder with mount -t cifs. Then, in the Duplicati GUI, the Local drive/folder option is selected and pointed at the mounted folder.
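For illustration, that kind of mount is typically along these lines (server address, share name, credentials, and mount point are placeholders, not the ones actually used):

sudo mkdir -p /mnt/backup-target
sudo mount -t cifs //192.168.1.10/backups /mnt/backup-target -o username=winuser,uid=1000,gid=1000
# the Duplicati GUI destination then points at /mnt/backup-target via the Local drive/folder option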

What happened: Backed up Linux. Tested a restore on Linux in preparation to wipe the disk. Restoration was successful, confirmed by comparing the directories and files of both the backup source and the restored assets. I used the Properties menu. I used commands like tree, diff, rsync, etc. I manually checked the most important assets, and they were intact. In the Duplicati GUI I ran all sorts of commands, pressed "verify backup", and used the other GUI elements related to verification. I ran a test one last time. I rechecked using the previously mentioned methods, and all were good. Duplicati was idle. I quit Duplicati on Linux, unmounted the SMB share successfully, and turned off Linux. Went to sleep. Woke up. Didn't turn on Linux; I booted a live OS from a USB and wiped the Linux disk. Days later, I went to restore from Windows. And here we go.

With what database? Backup on Linux Duplicati would have one with info on backup to Windows SMB.
Location would be in ~/.config/duplicati folder, for whatever user Duplicati was run as (e.g. root or you).

Preserving the old database and putting it back after wipe is probably the nicest way to get going again.
Restoring files if your Duplicati installation is lost could create one temporarily, or Repair for longer use.
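As a sketch of that "preserve and put back" idea (the per-job sqlite filename is randomly generated, so the one below is made up; Duplicati-server.sqlite holds the server settings):

ls ~/.config/duplicati/
# e.g. ABCDEFGHIJ.sqlite  Duplicati-server.sqlite
mkdir -p /mnt/backup-target/duplicati-db-copy
cp ~/.config/duplicati/*.sqlite /mnt/backup-target/duplicati-db-copy/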

I’m not sure if doing all this cross-platform complicates things, and there’s also the manual dblock work, which an old database (if you had one somehow) wouldn’t know of. In my image, note DB holds a hash.

I can poke a bit to see if I can repro an issue but generally I select whatever (even the root) and it works.

What happened to the RecoveryTool work? Did that work out OK, but now you’re testing the usual way?

If need be, there is some good logging at the restored file level, e.g. to pick one that is coming out wrong.
Verbose level would be good: About → Show log → Live, or for larger amounts of logging, the log-file and log-file-log-level options.
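For example, as two extra options on the restore (the log path is only an example):

--log-file=/tmp/duplicati-restore.log
--log-file-log-level=Verbose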

I’m not on Linux much, but I set up the following tree, pointed to the top line in Restore GUI, and restored:

$ ls -lRn BackupRestore
BackupRestore:
total 8
drwxrwxr-x 2 1000 1000 4096 Jun  9 21:21 sub1
drwxrwxr-x 2 1000 1000 4096 Jun  9 21:21 sub2

BackupRestore/sub1:
total 4
-rw-rw-r-- 1 1000 1000 0 Jun  9 21:21 empty1.bin
-rw-rw-r-- 1 1000 1000 6 Jun  9 21:20 file1.txt

BackupRestore/sub2:
total 4
-rw-rw-r-- 1 1000 1000 0 Jun  9 21:21 empty1.bin
-rw-rw-r-- 1 1000 1000 6 Jun  9 21:21 file1.txt

but I guess you’re saying you would have been missing the desired 0 byte files (are others OK) for that?

The FIND command (a.k.a list) can show what size the database has for any file. Is it size 0 there or not?
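Roughly like this from a command line, with a filter that matches the file you care about (the storage URL and passphrase are placeholders, and you may need --dbpath=&lt;job database&gt; if it doesn't find one on its own):

Duplicati.CommandLine.exe find "file://D:\duplicati-backup" "*CODE*" --passphrase=&lt;passphrase&gt;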

We haven’t yet gone digging directly in the database, and I’d prefer we don’t, so I’m proposing other ways.

EDIT 1:

Below are some examples of things Restore would say at Verbose log level. If it ended early, that might explain missing empty files. They're a special case, handled after regular files. This log is reverse-chronological.


Jun 9, 2024 10:19 PM: The operation Restore has completed
Jun 9, 2024 10:19 PM: Testing restored file integrity: /tmp/BackupRestore/sub1/empty1.bin
Jun 9, 2024 10:19 PM: Testing restored file integrity: /tmp/BackupRestore/sub1/file1.txt
Jun 9, 2024 10:19 PM: Testing restored file integrity: /tmp/BackupRestore/sub2/empty1.bin
Jun 9, 2024 10:19 PM: Testing restored file integrity: /tmp/BackupRestore/sub2/file1.txt
Jun 9, 2024 10:19 PM: Patching metadata with remote data: /tmp/BackupRestore/sub2/file1.txt
Jun 9, 2024 10:19 PM: Patching metadata with remote data: /tmp/BackupRestore/sub2/empty1.bin
Jun 9, 2024 10:19 PM: Patching metadata with remote data: /tmp/BackupRestore/sub2/
Jun 9, 2024 10:19 PM: Patching metadata with remote data: /tmp/BackupRestore/sub1/file1.txt
Jun 9, 2024 10:19 PM: Patching metadata with remote data: /tmp/BackupRestore/sub1/empty1.bin
Jun 9, 2024 10:19 PM: Patching metadata with remote data: /tmp/BackupRestore/sub1/
Jun 9, 2024 10:19 PM: Patching metadata with remote data: /tmp/BackupRestore/
Jun 9, 2024 10:19 PM: Restoring empty file "/tmp/BackupRestore/sub1/empty1.bin"
Jun 9, 2024 10:19 PM: Restoring empty file "/tmp/BackupRestore/sub2/empty1.bin"
Jun 9, 2024 10:19 PM: Recording metadata from remote data: /tmp/BackupRestore/sub2/file1.txt
Jun 9, 2024 10:19 PM: Recording metadata from remote data: /tmp/BackupRestore/sub2/empty1.bin
Jun 9, 2024 10:19 PM: Recording metadata from remote data: /tmp/BackupRestore/sub2/
Jun 9, 2024 10:19 PM: Recording metadata from remote data: /tmp/BackupRestore/sub1/file1.txt
Jun 9, 2024 10:19 PM: Recording metadata from remote data: /tmp/BackupRestore/sub1/empty1.bin
Jun 9, 2024 10:19 PM: Recording metadata from remote data: /tmp/BackupRestore/sub1/
Jun 9, 2024 10:19 PM: Recording metadata from remote data: /tmp/BackupRestore/
Jun 9, 2024 10:19 PM: Backend event: Get - Completed: duplicati-b663e535bd645448eb918b6f100d206fa.dblock.zip (2.43 KB)
Jun 9, 2024 10:19 PM: Backend event: Get - Started: duplicati-b663e535bd645448eb918b6f100d206fa.dblock.zip (2.43 KB)
Jun 9, 2024 10:19 PM: 1 remote files are required to restore
Jun 9, 2024 10:19 PM: Target file is patched with some local data: /tmp/BackupRestore/sub1/file1.txt
Jun 9, 2024 10:19 PM: Target file is patched with some local data: /tmp/BackupRestore/sub2/file1.txt
Jun 9, 2024 10:19 PM: Target file does not exist: /tmp/BackupRestore/sub2/file1.txt
Jun 9, 2024 10:19 PM: Target file does not exist: /tmp/BackupRestore/sub1/file1.txt
Jun 9, 2024 10:19 PM: Creating folder: /tmp/BackupRestore/sub1/
Jun 9, 2024 10:19 PM: Creating folder: /tmp/BackupRestore/sub2/

In light of the experience detailed in this topic, I have added a new PR that inverts the local block logic. With this change, local block usage is now opt-in, rather than opt-out, which will hopefully prevent issues like this going forward.

I have added a new PR that uses 0.1% as the default number of remote files to test.

That is a good change. I was working on a selection on the restore page, which defaults to no local blocks, but this way it will also help command line users.

I read through the documentation and found two more options that seem relevant:

full-remote-verification

full-block-verification

Not exactly sure what the difference is, but should we enable at least full-remote-verification? I don't care if the whole backup is downloaded after a backup. I back up just once a week, about 20 GB, at night, and it's fine to download it again.
Threads like this make me a bit nervous :slight_smile:

There was no database involved. It is a direct restore. I guess you might be a bit confused, so tell me what exactly you want to know. For now, I am restoring from Windows via both the GUI, using the restore section, and the CLI, using Duplicati.CommandLine.RecoveryTool.exe.
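(For reference, the RecoveryTool flow is roughly the sequence below; the storage URL, passphrase, and paths are placeholders, and the exact arguments should be double-checked against the tool's own help output.)

Duplicati.CommandLine.RecoveryTool.exe download "file://D:\duplicati-backup" D:\recovery --passphrase=&lt;passphrase&gt;
Duplicati.CommandLine.RecoveryTool.exe index D:\recovery
Duplicati.CommandLine.RecoveryTool.exe list D:\recovery
Duplicati.CommandLine.RecoveryTool.exe restore D:\recovery 0 --targetpath=D:\restored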

No, neither the GUI nor the CLI worked well. This is the CLI output of the list command (with the missing book as an example):

/home/user/Base/Jellyfin/Library/Books/CODE: The Hidden Language Of Coumputer Hardware And Software.pdf (12.87 MB)

When restoring (GUI & CLI), it says restoration is successful. Going to the file location, this is what I found:

CODE

That’s it. CODE. A 0 byte file.

There were other errors like:

2024-06-10 22:32:45 +03 - [Error-Duplicati.Library.Main.Operation.RestoreHandler-RestoreFileFailed]: Failed to restore file: “E:\Lawnchair Backup May 29, 2024 9:23:29 AM.lawnchairbackup”. Error message was: The filename, directory name, or volume label syntax is incorrect.

IOException: The filename, directory name, or volume label syntax is incorrect.

@ts678 @kenkendk I will run intensive tests on my backup using different tools, methods, and versions, and document every type of error I come across. But before I start, I want you guys to check my methods. Tell me if anything is missing, what I could add, and what would be unnecessary or redundant.

Overview:

Platform: Windows 10
Duplicati version: 2.0.8.1_beta_2024-05-07

Available backup versions (using list command):
0: 5/29/2024 9:55:16 AM
1: 5/28/2024 6:32:23 PM

Both versions are going to go through the same restoration tests as listed below. The reason is that I found some minor differences in errors when restoring from different versions.

For the GUI, I will paste the log copied via the "Copy log" button within the "Complete log" section. For the CLI, I will paste only the errors that show up.

Other additions:

  1. When the restore operation finishes: in File Explorer, with hidden files shown, I will use the "Properties" option from the context menu, providing the "Size", "Size on disk", and "Contains" attributes. I will ignore the extra layers of folders by moving the subfolders and files directly to a root folder.

  2. I will provide a count of files for both backup versions using the list command in Duplicati.CommandLine.RecoveryTool.exe

Things I expect to go wrong:

Decryption errors. "The filename, directory name, or volume label syntax is incorrect" errors. Corrupted files. True 0-byte files and empty folders not restored (later on, I expect them to be restored if done by selecting sub-folders of the main backup root).

Restoration tests for each version (0 & 1):

Original (Encrypted) Duplicati backup files:

A) GUI: Using ONLY: GUI > Restore > Direct restore from backup files:

  1. Restore by selecting main root

  2. Restore one by one subfolders (within the main root)

  3. Re-restore the files that failed to restore. This will be done in two parts:
    First, by selecting all un-restored files (at least for one subfolder)
    Second, by selecting only one file for at least 5 samples.

B) CLI:
Suggest steps

Manually decrypted Duplicati backup files:

A) GUI:
Same as in:

A) #GUI

B) CLI:
Suggest steps

Hopefully, I didn’t miss out on any other methods and options. Tell me more of what I can do and add.

This might be a problem because you transferred from Linux to Windows. AFAIK Windows doesn't allow : in filenames, because it is the drive separator. That is a good issue to find; it should be fixable by replacing invalid characters when restoring on a different OS.

This probably means your restore setup doesn’t have an E: drive, so you have to select that drive separately and choose a target folder for it. Otherwise it won’t know where to put the files. This could probably also be handled better, by putting each drive in a subfolder if there are multiple.

I was typing up a similar reply. Here it is.

possibly because

is an illegal filename on Windows. That really didn’t get a message? Source:

Naming Files, Paths, and Namespaces (Microsoft)

The following reserved characters:

  • : (colon)

might be the same problem, if that’s the Linux file name, except this time Windows’ error was reported.

You can probably test with a Command Prompt to verify that Windows rejects file names like the above.
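For instance, something along these lines (the filename is only an illustration; depending on the command and filesystem, Windows may reject it outright or quietly create a 0-byte CODE file, since NTFS can treat text after a colon as an alternate data stream name):

rem try to create a file with a colon in its name
echo test > "CODE: The Hidden Language.pdf"
rem then check what, if anything, actually got created
dir CODE*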

Oh, how did I not think of that, lol. Yes, indeed it didn't give a message, just said restoration was successful. Only the Lawnchair… file showed the error.

You know what guys? I will move the Duplicati backup files to another Linux installation I have. I will update you guys with what I get.

EDIT:
I checked again by restoring CODE and Lawnchair files. No error showed up for CODE. Only for Lawnchair. (Windows BTW)

So I tried the restoration process again from a Linux OS. The only thing that changed is that I am now able to restore files with characters that are illegal on Windows.

The same issues persist for:
0-byte files and empty folders.
Missing/corrupt files. (I can ignore these, as the damage has been done. It is just that Duplicati didn't detect it earlier when I ran numerous tests and checks :frowning: )

I decided to end this discussion.

It appears that 99.99% (if not 100%) of assets are recoverable due to other copies being available on different mediums (lucky me. But what if there were none?).

Though, the mental exhaustion I will be going through is unbearable. Having to count asset by asset. Watch video by video. Uncompress zip by zip, tar by tar (and then do the same for the contents within). Open document by document, txt by txt. You know the rest, just to make sure no further damage occurred.

With Duplicati, I expected leg-on-leg moments. No worries. No paranoia. However, I feel it is kind of my fault. I should have gone through an intensive reading of the Advanced Options section of the Duplicati manual. Though you guys really did me dirty by stating:

Those additional options should only be used with care. For normal operation none of them should ever be required. Use this alphabetical list as a reference to find the advanced options that fit your needs.
~ Advanced Options

LOL.

Forget what I previously called Duplicati. It is actually a pretty awesome tool. The features it offers are really decent and useful.

Going forward, I will make sure to enable the following options within the configuration of every backup set (can be enabled globally for all backups within Settings > Default options):

  1. backup-test-percentage. I will set it to: 100

  2. block-hash-algorithm. Not that necessary. I will set it to: SHA512. EDIT: don’t use as it may introduce more bugs and instabilities.

  3. file-hash-algorithm. Not that necessary. I will set it to: SHA512. EDIT: don’t use as it may introduce more bugs and instabilities.

  4. full-block-verification. Not sure about it. It is confusing me a bit with full-remote-verification. I will set it to: true

  5. full-remote-verification. Set to: true

  6. list-verify-uploads. Set to: true

  7. no-local-blocks. Absolutely set to: true

  8. no-local-db. Not sure, but I will set it to: true

  9. upload-verification-file. Seems interesting. I will set it to: true

Other options, especially logging and mailing, seem useful, and I may enable them in the future.

For now, I feel confident that these options are enough to begin with, and will hopefully keep my backup sets solid and intact, with little to no failure.
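For reference, as raw advanced options (the same thing Settings > Default options stores, or what you would pass on a CLI invocation), my picks would look roughly like this; I left out no-local-db and the hash-algorithm options here since I'm unsure about them:

--backup-test-percentage=100
--full-remote-verification=true
--full-block-verification=true
--list-verify-uploads=true
--no-local-blocks=true
--upload-verification-file=true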

I recommend everyone reading this go through them - Advanced Options - one by one (or at least skim them and see what piques your interest).

I would welcome corrections and further guidance on what else is out there, and on whether I went overkill with these options.

Thanks!

That is very fortunate and good to hear. Also shows how important it is to have redundancy.

This is especially frustrating, because you should have the hashes of all the files (the list files were not broken) and at least that check should just work.

While I don’t expect this to cause any issues, for maximum stability I wouldn't change the hash. The defaults are tested much more, so you might unwittingly uncover more bugs. Also, the bigger hashes take up more space with little benefit.

Otherwise, what I would recommend for stable backups, if your data doesn’t change much, is to keep all versions. In the past, there were several issues with the process of removing and compacting data that is no longer used. Even though the problems you had don’t quite fit with these previously fixed ones, that part is probably still the riskiest.

Just out of curiosity, do you have a rough estimate of how long the broken backup existed since the first version? There were quite a few bugs discovered and fixed in the Duplicati releases over time, so it would be good to know how recently your problems might have occurred.

These two switches, --full-remote-verification and --full-block-verification work together, and are relevant if you fear that the internal structure of the files is broken.

With --full-remote-verification the download process will decrypt and decompress the files, and check that the contents inside the zip file match the expected hash, and that all blocks are present in the compressed file. The --full-block-verification option changes the number of tested blocks inside the compressed file from 20% to 100%.

However, since the hash of the outer volume is recorded and checked with the normal verification, corruption will be detected by the normal backend consistency checks.

In that case I would use the TEST command separately, and here you can use --full-remote-verification and --full-block-verification. Once a volume has passed the test, there is generally no need to verify it again, as the hash check will reveal modifications.
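A sketch of what such a separate TEST run could look like from the command line (the storage URL, passphrase, and database path are placeholders; "all" makes it sample every remote volume):

Duplicati.CommandLine.exe test "file://Z:\duplicati-backup" all --passphrase=&lt;passphrase&gt; --dbpath=&lt;path to job database&gt; --full-remote-verification=true --full-block-verification=true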

I have the same feeling. To summarize, there were a few things that conspired to make problems here:

  1. Storage somehow corrupted a number of remote volumes
  2. The check to catch these was running on too low a number of volumes
  3. The restore process was used to test the restore, but was using the local files as source, not the remote

I have submitted a PR that addresses (2) by increasing the number of volumes to 0.1% for each backup. It may require a higher number if the storage medium is not actively managed and checked for errors, but for a balanced approach with users that are on cloud provider storage, I think a percentage at least scales with the backup size.

For (3) I have submitted a PR that disables the logic by default. While it is (presumably) faster to use local blocks, it is clear that the user would expect a successful restore to be an indication that everything is working correctly, and this topic shows it is not.

I really think this is the ideal. We cannot expect people to read and understand the advanced options. The default should be as secure and efficient as possible. Anything offered as additional options should really only be choices that require some understanding of the tradeoffs.

In your particular case, I think it was mostly an attempt to optimize the restore process that got too clever, so it clashed with a reasonable expectation that “restore on machine X means the backup is sound”.

The --backup-test-percentage could also be higher by default, but since you pay for the download from cloud storage, this has been set a bit too low.

I totally understand the desire to set these. I would say, as @Jojo-1000 mentions, that not many people have tested another block/file hash algorithm, so I would caution against changing that.

The option --full-remote-verification would certainly catch the errors, but only if the --backup-test-percentage is high enough. The --full-block-verification option increases the amount of testing to test every single block in the archive.

Both of these options are intended to check for internal errors in Duplicati, and the corruption you have seen would be detected by the regular hash checks, had --backup-test-percentage been high enough. I think setting it to 100% is a bit over the top, but the actual value depends on how many new files you generate for each backup. Duplicati will default to checking new files, so the percentage needs to be at least higher than the percentage of files added in each backup in order to effectively check existing files.

As mentioned, I would recommend setting up an additional TEST operation that runs less frequently than the backup, and this can use 100% coverage. There is no support for this in the UI as it stands though.

The option --list-verify-uploads is a check for backends that fail to store files. By using this option, each upload will be followed by a listing of the remote destination to check that the files are really there.

The option --no-local-db is not supported for backups, only for restores. When restoring, you can use this option to not rely on the local indexed copy of data, but instead build a fresh database just for the restore operation.
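A rough example of such a restore (the storage URL, passphrase, and paths are placeholders):

Duplicati.CommandLine.exe restore "file://Z:\duplicati-backup" "*" --passphrase=&lt;passphrase&gt; --no-local-db=true --restore-path=/tmp/restore-test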

The --upload-verification-file is most useful if you have some kind of access to the storage server, and can run the python verification tool on the destination. If this can be done, you can perform much the same checks as --backup-test-percentage=100 on the remote storage, and thus be much more efficient.
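If you do have that access, running the check is roughly (the script ships in Duplicati's utility-scripts folder; the path below is only an example):

python DuplicatiVerify.py /path/to/the/remote/backup/folder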

Yeah I kinda felt it is overkill. I will fallback to SHA256.

Actually, I decided to go with a separated-backup-sources approach. Some data and files, including photos, seem to change only once a year (if at all). I think doing so would minimize the risk of a new block of data or a compact operation corrupting static assets that live alongside regularly modified data (I will be keeping all backup versions as you suggested). It would also help with running prolonged tests, especially for big volumes of data in the long run, without wasting time retesting data that has not changed.

Do you mean from the creation of the backup until the discovery of the problem? If so, the first backup/version was on 29 May 2024 at 9:55:16 AM. I discovered the issue on 8 Jun 2024 at 9:10 AM. So you can say it existed for almost 9 days, 23 hours, 14 minutes and 44 seconds after the first backup version, according to timeanddate.

To be fair, I think such an option is quite good. Maybe suggest to the user what type of approach to choose when setting up a restore operation, and give them a short description of the pros and cons of these options.

Yeah, I think it would be safer and better to add a TEST GUI element that executes the test command. Most people may miss it since it only exists under Commandline. Also, they may confuse the "Verify files" button with a full backup check like what test would do.

Hm, sorry, but I still don’t seem to really understand it. I ran a TEST command using the full package:

  • backup-test-percentage = 100
  • full-block-verification = true
  • full-remote-verification = true
  • no-local-blocks = true

And I get this output:

Running commandline entry
Finished!

            
  Listing remote folder ...
  Downloading file duplicati-20240610T215939Z.dlist.zip (761,50 KB) ...
  Downloading file duplicati-iadaec9fcc78a40d7a827b8ba05748224.dindex.zip (222,10 KB) ...
  Downloading file duplicati-bcb340916c5ce4471999ab3d4367c648a.dblock.zip (2,00 GB) ...
Examined 3 files and found no errors
Return code: 0

So it downloaded just a single dblock file, right? I would have expected the whole backup (40 GB) to be downloaded, extracted, and rehashed.

Is there a way to see which command line was actually used for the command? Same question for the regular backup job. I would like to see what really happened. I have set log-file-log-level to "verbose" but still can't find it.

I ran it again and it picked a different dblock. Even with an error this time (!):

  Listing remote folder ...
  Downloading file duplicati-20240610T215939Z.dlist.zip (761,50 KB) ...
  Downloading file duplicati-id47670d5c1e74e8fbee31b62b5c3d87a.dindex.zip (1,59 MB) ...
  Downloading file duplicati-bbb1b7e5b0a454e4abd471a34d67894c3.dblock.zip (2,00 GB) ...
duplicati-id47670d5c1e74e8fbee31b62b5c3d87a.dindex.zip: 2 errors
        Extra: ELLWpZNLk05pLAj7252gwrBr498jJoZnaSjo1opSsSs=
        Extra: ug/SSt/fkhKrNjT+Vr+mwwCAGypuATSmmoavmvS5Nj0=

This is … frightening :frowning:

The TEST command

<samples> specifies the number of samples to be tested. If “all” is specified, all files in the backup will be tested.

It looks like you left it at the default sample of 1, so got one of each kind. If you want more, ask as you wish.

Little point in testing the same one again, so it moves on, based on test tracking.

Typically a false positive. If all is Extra, you’re probably fine. If Missing seen, then worry.

test all with full-remote-verification shows “Extra” hashes from error in compact #4693

I think this is a usability problem that is caused by the option system in Duplicati. The parser checks that the option --backup-test-percentage is indeed valid and supported, but it does not warn that the option does nothing for the current operation. I think this should be fixed (@ts678 has the solution, using the sample count argument).

This is indeed what the TEST command is for, validating that the internal consistency is correct. The problem here is that the blocks are no longer needed and have been removed from the local database, but they have not been correctly removed from the remote volume, so the test identifies that there are some additional blocks. It is a known issue that means you are storing more data than needed, but it does not affect the ability to restore.