Restore operation failed. Failed to retrieve file xxx.dblock.zip.aes & Message has been altered, do not trust content

Yeah, I was confused too about why that would happen.

When it comes to backups, I am very careful. I ran a lot of verification jobs before I went ahead and wiped the Linux disk. Not only that, I even counted the number of files produced by Duplicati and made a txt listing the exact name and size of each file, and more.

Actually, I have a new problem. It turns out there are more corrupted files than I expected. It says restoration was successful, but when I went ahead and looked through the files, some are literally just 0 bytes.

Additionally, Duplicati won't restore the files that are genuinely 0 bytes if I restore from the main root of the backup; I have to restore subfolders one by one.

Honestly, what kind of a piece of s*** is this software? No way am I recommending it. I understand it is in beta, too. But apparently hidden options and flags really say a lot about where it's heading.

A sample of 1 set is enough to cover the 1 dlist, but falls way short of covering multiple dindex and dblock.

VerificationCount at 0 means it was never actually sampled, I think. Yes, one can adjust the sample size…

The PR for the above can help, but instead of trying to guess a dblock count, one has to guess the right percentage, assuming the goal is to avoid repeated sampling (which slows things down) while still finding files that break over time.

What said that? You were using the RecoveryTool, but I don't think it says that. It also offers little in the way of restore selection.

fixed various issues including

but the fix was last month after 2.0.8.1. It’s in v2.0.9.100_canary_2024-05-30, but that just hit Canary, meaning it generally has more unknowns in it than the Beta. OTOH it also has various new bug fixes.

This part sounds like they are but shouldn’t be zero.

This part sounds like they’re intended. I’m confused.

EDIT 1:

I sabotaged a RecoveryTool index.txt file by editing a Base64 SHA-256 hash at the start of a line. That got:

0: C:\backup restore\backup source\A.txt (1 bytes) error: System.Exception: Unable to locate block with hash: VZrq0IJk1XldOQlxjN0Fq9SVcuhP5VWQ7vMaiKCP3/0=
   at Duplicati.CommandLine.RecoveryTool.Restore.HashLookupHelper.ReadHash(String hash)
   at Duplicati.CommandLine.RecoveryTool.Restore.Run(List`1 args, Dictionary`2 options, IFilter filter)

and left me with no file rather than an empty file or a correct one with the letter A in it. How are you restoring?
Regular restore does sometimes falsely declare success, but only after making a lot of noise before that.

EDIT 2:

You should be able to click whatever you like in the GUI. IIRC, CommandLine is harder. Which did you do?

EDIT 3:

So Linux Duplicati? I see some restore work on Windows. Be careful about mixing OS types for restoring.

EDIT 4:

It “sounds” like Windows previously was just an SMB storage destination, but OP mentioned a Windows restore (following a test restore on Linux, and Linux delete). Which OS is the current restore problem on?

I am talking about now. I used the GUI again. It said restoration was successful, yet some files were corrupt.

I didn’t understand what you meant. Look, some files, like books in .pdf, were restored as 0 bytes even though they aren't 0-byte files. Other files are truly 0-byte files, like files created by programs and intended to be there, yet Duplicati won't restore those when restoring the whole backup by selecting the main root folder; you have to select at least the subfolders inside the main backup point. The same goes for empty folders.

Which restoration attempt?

I was talking about the GUI.

In short, Duplicati on both Linux & Windows:

Source: Linux
Destination: Windows

Transfer method: mounting the Windows SMB share on Linux as a folder with mount -t cifs. Then, in the Duplicati GUI, the Local drive/folder option is selected and pointed at the mounted folder.
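For illustration, that kind of mount is typically along these lines (server address, share name, credentials, and mount point are placeholders, not the ones actually used):

sudo mkdir -p /mnt/backup-target
sudo mount -t cifs //192.168.1.10/backups /mnt/backup-target -o username=winuser,uid=1000,gid=1000
# the Duplicati GUI destination then points at /mnt/backup-target via the Local drive/folder option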

What happened: Backed up Linux. Tested a restore on Linux in preparation to wipe the disk. Restoration was successful, confirmed by comparing the directories and files of both the backup source and the restored assets. I used the Properties menu. I used commands like tree, diff, rsync, etc. I manually checked the most important assets, and they were intact. In the Duplicati GUI I ran all sorts of commands, pressed "verify backup", and used the other GUI elements related to verification. I ran a test one last time. I rechecked using the previously mentioned methods, and all were good. Duplicati was idle. I quit Duplicati on Linux, unmounted the SMB share successfully, and turned off Linux. Went to sleep. Woke up. Didn't turn on Linux; I booted a live OS from a USB and wiped the Linux disk. Days later, I went to restore from Windows. And here we go.

With what database? Backup on Linux Duplicati would have one with info on backup to Windows SMB.
Location would be in ~/.config/duplicati folder, for whatever user Duplicati was run as (e.g. root or you).

Preserving the old database and putting it back after wipe is probably the nicest way to get going again.
Restoring files if your Duplicati installation is lost could create one temporarily, or Repair for longer use.
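As a sketch of that "preserve and put back" idea (the per-job sqlite filename is randomly generated, so the one below is made up; Duplicati-server.sqlite holds the server settings):

ls ~/.config/duplicati/
# e.g. ABCDEFGHIJ.sqlite  Duplicati-server.sqlite
mkdir -p /mnt/backup-target/duplicati-db-copy
cp ~/.config/duplicati/*.sqlite /mnt/backup-target/duplicati-db-copy/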

I’m not sure if doing all this cross-platform complicates things, and there’s also the manual dblock work, which an old database (if you had one somehow) wouldn’t know of. In my image, note DB holds a hash.

I can poke a bit to see if I can repro an issue but generally I select whatever (even the root) and it works.

What happened to the RecoveryTool work? Did that work out OK, but now you’re testing the usual way?

If need be, there is some good logging at the restored file level, e.g. to pick one that is coming out wrong.
Verbose level would be good: About → Show log → Live, or for larger amounts of logging, the log-file and log-file-log-level options.
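For example, as two extra options on the restore (the log path is only an example):

--log-file=/tmp/duplicati-restore.log
--log-file-log-level=Verbose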

I’m not on Linux much, but I set up the following tree, pointed to the top line in Restore GUI, and restored:

$ ls -lRn BackupRestore
BackupRestore:
total 8
drwxrwxr-x 2 1000 1000 4096 Jun  9 21:21 sub1
drwxrwxr-x 2 1000 1000 4096 Jun  9 21:21 sub2

BackupRestore/sub1:
total 4
-rw-rw-r-- 1 1000 1000 0 Jun  9 21:21 empty1.bin
-rw-rw-r-- 1 1000 1000 6 Jun  9 21:20 file1.txt

BackupRestore/sub2:
total 4
-rw-rw-r-- 1 1000 1000 0 Jun  9 21:21 empty1.bin
-rw-rw-r-- 1 1000 1000 6 Jun  9 21:21 file1.txt

but I guess you’re saying you would have been missing the desired 0 byte files (are others OK) for that?

The FIND command (a.k.a list) can show what size the database has for any file. Is it size 0 there or not?
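Roughly like this from a command line, with a filter that matches the file you care about (the storage URL and passphrase are placeholders, and you may need --dbpath=&lt;job database&gt; if it doesn't find one on its own):

Duplicati.CommandLine.exe find "file://D:\duplicati-backup" "*CODE*" --passphrase=&lt;passphrase&gt;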

We haven’t yet gone digging directly in the database, and I’d prefer we don’t, so I’m proposing other ways.

EDIT 1:

Below are some examples of things Restore would say at Verbose log level. If it ended early, that might explain missing empty files. They're a special case, handled after regular files. This log is reverse-chronological.


Jun 9, 2024 10:19 PM: The operation Restore has completed
Jun 9, 2024 10:19 PM: Testing restored file integrity: /tmp/BackupRestore/sub1/empty1.bin
Jun 9, 2024 10:19 PM: Testing restored file integrity: /tmp/BackupRestore/sub1/file1.txt
Jun 9, 2024 10:19 PM: Testing restored file integrity: /tmp/BackupRestore/sub2/empty1.bin
Jun 9, 2024 10:19 PM: Testing restored file integrity: /tmp/BackupRestore/sub2/file1.txt
Jun 9, 2024 10:19 PM: Patching metadata with remote data: /tmp/BackupRestore/sub2/file1.txt
Jun 9, 2024 10:19 PM: Patching metadata with remote data: /tmp/BackupRestore/sub2/empty1.bin
Jun 9, 2024 10:19 PM: Patching metadata with remote data: /tmp/BackupRestore/sub2/
Jun 9, 2024 10:19 PM: Patching metadata with remote data: /tmp/BackupRestore/sub1/file1.txt
Jun 9, 2024 10:19 PM: Patching metadata with remote data: /tmp/BackupRestore/sub1/empty1.bin
Jun 9, 2024 10:19 PM: Patching metadata with remote data: /tmp/BackupRestore/sub1/
Jun 9, 2024 10:19 PM: Patching metadata with remote data: /tmp/BackupRestore/
Jun 9, 2024 10:19 PM: Restoring empty file "/tmp/BackupRestore/sub1/empty1.bin"
Jun 9, 2024 10:19 PM: Restoring empty file "/tmp/BackupRestore/sub2/empty1.bin"
Jun 9, 2024 10:19 PM: Recording metadata from remote data: /tmp/BackupRestore/sub2/file1.txt
Jun 9, 2024 10:19 PM: Recording metadata from remote data: /tmp/BackupRestore/sub2/empty1.bin
Jun 9, 2024 10:19 PM: Recording metadata from remote data: /tmp/BackupRestore/sub2/
Jun 9, 2024 10:19 PM: Recording metadata from remote data: /tmp/BackupRestore/sub1/file1.txt
Jun 9, 2024 10:19 PM: Recording metadata from remote data: /tmp/BackupRestore/sub1/empty1.bin
Jun 9, 2024 10:19 PM: Recording metadata from remote data: /tmp/BackupRestore/sub1/
Jun 9, 2024 10:19 PM: Recording metadata from remote data: /tmp/BackupRestore/
Jun 9, 2024 10:19 PM: Backend event: Get - Completed: duplicati-b663e535bd645448eb918b6f100d206fa.dblock.zip (2.43 KB)
Jun 9, 2024 10:19 PM: Backend event: Get - Started: duplicati-b663e535bd645448eb918b6f100d206fa.dblock.zip (2.43 KB)
Jun 9, 2024 10:19 PM: 1 remote files are required to restore
Jun 9, 2024 10:19 PM: Target file is patched with some local data: /tmp/BackupRestore/sub1/file1.txt
Jun 9, 2024 10:19 PM: Target file is patched with some local data: /tmp/BackupRestore/sub2/file1.txt
Jun 9, 2024 10:19 PM: Target file does not exist: /tmp/BackupRestore/sub2/file1.txt
Jun 9, 2024 10:19 PM: Target file does not exist: /tmp/BackupRestore/sub1/file1.txt
Jun 9, 2024 10:19 PM: Creating folder: /tmp/BackupRestore/sub1/
Jun 9, 2024 10:19 PM: Creating folder: /tmp/BackupRestore/sub2/

In light of the experience detailed in this topic, I have added a new PR that inverts the local block logic. With this change, local block usage is now opt-in, rather than opt-out, which will hopefully prevent issues like this going forward.

I have added a new PR that uses 0.1% as the default number of remote files to test.

That is a good change. I was working on a selection on the restore page, which defaults to no local blocks, but this way it will also help command line users.

I read through the documentation and found two more options that seem relevant:

full-remote-verification

full-block-verification

Not exactly sure what the difference is, but should we enable at least full-remote-verification? I don't care if the whole backup is downloaded after a backup. I back up just once a week, about 20 GB, at night, and it's fine to download it again.
Threads like this make me a bit nervous :slight_smile:

There was no database involved. It is a direct restore. I guess you might be a bit confused, so tell me what exactly you want to know. For now, I am restoring from Windows via both the GUI, using the restore section, and the CLI, using Duplicati.CommandLine.RecoveryTool.exe.
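(For reference, the RecoveryTool flow is roughly the sequence below; the storage URL, passphrase, and paths are placeholders, and the exact arguments should be double-checked against the tool's own help output.)

Duplicati.CommandLine.RecoveryTool.exe download "file://D:\duplicati-backup" D:\recovery --passphrase=&lt;passphrase&gt;
Duplicati.CommandLine.RecoveryTool.exe index D:\recovery
Duplicati.CommandLine.RecoveryTool.exe list D:\recovery
Duplicati.CommandLine.RecoveryTool.exe restore D:\recovery 0 --targetpath=D:\restored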

No, neither the GUI nor the CLI worked well. This is the CLI output of the list command (with the missing book as an example):

/home/user/Base/Jellyfin/Library/Books/CODE: The Hidden Language Of Coumputer Hardware And Software.pdf (12.87 MB)

When restoring (GUI & CLI), it says restoration is successful. Going to the file location, this is what I found:

CODE

That’s it. CODE. A 0 byte file.

There were other errors like:

2024-06-10 22:32:45 +03 - [Error-Duplicati.Library.Main.Operation.RestoreHandler-RestoreFileFailed]: Failed to restore file: “E:\Lawnchair Backup May 29, 2024 9:23:29 AM.lawnchairbackup”. Error message was: The filename, directory name, or volume label syntax is incorrect.

IOException: The filename, directory name, or volume label syntax is incorrect.

@ts678 @kenkendk I will run intensive tests on my backup using different tools, methods, and versions, and document every type of error I come across. But before I start, I want you guys to check my methods. Tell me if anything is missing, what I could add, and what would be unnecessary or redundant.

Overview:

Platform: Windows 10
Duplicati version: 2.0.8.1_beta_2024-05-07

Available backup versions (using list command):
0: 5/29/2024 9:55:16 AM
1: 5/28/2024 6:32:23 PM

Both versions are going to go through the same restoration tests as listed below. The reason is that I found some minor differences in errors when restoring from different versions.

For the GUI, I will paste the log copied via the "Copy log" button within the "Complete log" section. For the CLI, I will paste only the errors that show up.

Other additions:

  1. When the restore operation finishes: in File Explorer, with hidden files shown, I will use the "Properties" option from the context menu, providing the "Size", "Size on disk", and "Contains" attributes. I will ignore the extra layers of folders by moving the subfolders and files directly to a root folder.

  2. I will provide a count of files for both backup versions using the list command in Duplicati.CommandLine.RecoveryTool.exe

Things I expect to go wrong:

Decryption errors. "The filename, directory name, or volume label syntax is incorrect" errors. Corrupted files. True 0-byte files and empty folders not restored (later on, I expect them to be restored if done by selecting sub-folders of the main backup root).

Restoration tests for each version (0 & 1):

Original (Encrypted) Duplicati backup files:

A) GUI: Using ONLY: GUI > Restore > Direct restore from backup files:

  1. Restore by selecting main root

  2. Restore one by one subfolders (within the main root)

  3. Re-restore the files that failed to restore. This will be done in two parts:
    First, by selecting all un-restored files (at least for one subfolder)
    Second, by selecting only one file for at least 5 samples.

B) CLI:
Suggest steps

Manually decrypted Duplicati backup files:

A) GUI:
Same as in:

A) #GUI

B) CLI:
Suggest steps

Hopefully, I didn’t miss out on any other methods and options. Tell me more of what I can do and add.

This might be a problem because you transferred from Linux to Windows. AFAIK Windows doesn't allow : in filenames, because it is the drive separator. That is a good issue to find; it should be fixable by replacing invalid characters when restoring on a different OS.

This probably means your restore setup doesn’t have an E: drive, so you have to select that drive separately and choose a target folder for it. Otherwise it won’t know where to put the files. This could probably also be handled better, by putting each drive in a subfolder if there are multiple.

I was typing up a similar reply. Here it is.

possibly because

is an illegal filename on Windows. That really didn’t get a message? Source:

Naming Files, Paths, and Namespaces (Microsoft)

The following reserved characters:

  • : (colon)

might be the same problem, if that’s the Linux file name, except this time Windows’ error was reported.

You can probably test with a Command Prompt to verify that Windows rejects file names like the above.
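For instance, something along these lines (the filename is only an illustration; depending on the command and filesystem, Windows may reject it outright or quietly create a 0-byte CODE file, since NTFS can treat text after a colon as an alternate data stream name):

rem try to create a file with a colon in its name
echo test > "CODE: The Hidden Language.pdf"
rem then check what, if anything, actually got created
dir CODE*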

Oh, how did I not think of that, lol. Yes, indeed it didn't give a message, just said restoration was successful. Only the Lawnchair… file showed the error.

You know what guys? I will move the Duplicati backup files to another Linux installation I have. I will update you guys with what I get.

EDIT:
I checked again by restoring CODE and Lawnchair files. No error showed up for CODE. Only for Lawnchair. (Windows BTW)

So I tried the restoration process again from a Linux OS. The only thing that changed is that I am now able to restore files with characters that are illegal on Windows.

The same issues persist for:
0-byte files and empty folders.
Missing/corrupt files. (I can ignore these, as the damage has been done. It is just that Duplicati didn't detect it earlier when I ran numerous tests and checks :frowning: )

I decided to end this discussion.

It appears that 99.99% (if not 100%) of assets are recoverable due to other copies being available on different mediums (lucky me. But what if there were none?).

Though, the mental exhaustion I will be going through is unbearable. Having to count asset by asset. Watch video by video. Uncompress zip by zip, tar by tar (and then do the same for the contents within). Open document by document, txt by txt. You know the rest, just to make sure no further damage occurred.

With Duplicati, I expected leg-on-leg moments. No worries. No paranoia. However, I feel it is kind of my fault. I should have gone through an intensive reading of the Advanced Options section of the Duplicati manual. Though you guys really did me dirty by stating:

Those additional options should only be used with care. For normal operation none of them should ever be required. Use this alphabetical list as a reference to find the advanced options that fit your needs.
~ Advanced Options

LOL.

Forget what I previously called Duplicati. It is actually a pretty awesome tool. The features it offers are really decent and useful.

Going forward, I will make sure to enable the following options within the configuration of every backup set (can be enabled globally for all backups within Settings > Default options):

  1. backup-test-percentage. I will set it to: 100

  2. block-hash-algorithm. Not that necessary. I will set it to: SHA512. EDIT: don’t use as it may introduce more bugs and instabilities.

  3. file-hash-algorithm. Not that necessary. I will set it to: SHA512. EDIT: don’t use as it may introduce more bugs and instabilities.

  4. full-block-verification. Not sure about it. It is confusing me a bit with full-remote-verification. I will set it to: true

  5. full-remote-verification. Set to: true

  6. list-verify-uploads. Set to: true

  7. no-local-blocks. Absolutely set to: true

  8. no-local-db. Not sure, but I will set it to: true

  9. upload-verification-file. Seems interesting. I will set it to: true

Other options, especially logging and mailing, seem useful, and I may enable them in the future.

For now, I feel confident that these options are enough to begin with, and will hopefully keep my backup sets solid and intact, with little to no failure.
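For reference, as raw advanced options (the same thing Settings > Default options stores, or what you would pass on a CLI invocation), my picks would look roughly like this; I left out no-local-db and the hash-algorithm options here since I'm unsure about them:

--backup-test-percentage=100
--full-remote-verification=true
--full-block-verification=true
--list-verify-uploads=true
--no-local-blocks=true
--upload-verification-file=true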

I recommend everyone reading this go through them - Advanced Options - one by one (or at least skim them and see what piques your interest).

I would welcome corrections and further guidance on what else is out there, and on whether I went overkill with these options.

Thanks!

That is very fortunate and good to hear. Also shows how important it is to have redundancy.

This is especially frustrating, because you should have the hashes of all the files (the list files were not broken) and at least that check should just work.

While I don’t expect this to cause any issues, for maximum stability I wouldn't change the hash. The defaults are tested much more, so you might unwittingly uncover more bugs. Also, the bigger hashes take up more space with little benefit.

Otherwise, what I would recommend for stable backups, if your data doesn’t change much, is to keep all versions. In the past, there were several issues with the process of removing and compacting data that is no longer used. Even though the problems you had don’t quite fit with these previously fixed ones, that part is probably still the riskiest.

Just out of curiosity, do you have a rough estimate of how long the broken backup existed since the first version? There were quite a few bugs discovered and fixed in the Duplicati releases over time, so it would be good to know how recently your problems might have occurred.

These two switches, --full-remote-verification and --full-block-verification work together, and are relevant if you fear that the internal structure of the files is broken.

With --full-remote-verification the download process will decrypt and decompress the files, and check that the contents inside the zip file match the expected hash, and that all blocks are present in the compressed file. The --full-block-verification option changes the number of tested blocks inside the compressed file from 20% to 100%.

However, since the hash of the outer volume is recorded and checked with the normal verification, corruption will be detected by the normal backend consistency checks.

In that case I would use the TEST command separately, and here you can use --full-remote-verification and --full-block-verification. Once a volume has passed the test, there is generally no need to verify it again, as the hash check will reveal modifications.
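A sketch of what such a separate TEST run could look like from the command line (the storage URL, passphrase, and database path are placeholders; "all" makes it sample every remote volume):

Duplicati.CommandLine.exe test "file://Z:\duplicati-backup" all --passphrase=&lt;passphrase&gt; --dbpath=&lt;path to job database&gt; --full-remote-verification=true --full-block-verification=true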

I have the same feeling. To summarize, there were a few things that conspired to make problems here:

  1. Storage somehow corrupted a number of remote volumes
  2. The check to catch these was running on too low a number of volumes
  3. The restore process was used to test the restore, but was using the local files as source, not the remote

I have submitted a PR that addresses (2) by increasing the number of volumes to 0.1% for each backup. It may require a higher number if the storage medium is not actively managed and checked for errors, but for a balanced approach with users that are on cloud provider storage, I think a percentage at least scales with the backup size.

For (3) I have submitted a PR that disables the logic by default. While it is (presumably) faster to use local blocks, it is clear that the user would expect a successful restore to be an indication that everything is working correctly, and this topic shows it is not.

I really think this is the ideal. We cannot expect people to read and understand the advanced options. The default should be as secure and efficient as possible. Anything offered as additional options should really only be choices that require some understanding of the tradeoffs.

In your particular case, I think it was mostly an attempt to optimize the restore process that got too clever, so it clashed with a reasonable expectation that “restore on machine X means the backup is sound”.

The --backup-test-percentage could also be higher by default, but since you pay for the download from cloud storage, this has been set a bit too low.

I totally understand the desire to set these. I would say, as @Jojo-1000 mentions, that not many people have tested another block/file hash algorithm, so I would caution against changing that.

The option --full-remote-verification would certainly catch the errors, but only if the --backup-test-percentage is high enough. The --full-block-verification option increases the amount of testing to test every single block in the archive.

Both of these options are intended to check for internal errors in Duplicati, and the corruption you have seen would be detected by the regular hash checks, had --backup-test-percentage been high enough. I think setting it to 100% is a bit over the top, but the actual value depends on how many new files you generate for each backup. Duplicati will default to checking new files, so the percentage needs to be at least higher than the percentage of files added in each backup in order to effectively check existing files.

As mentioned, I would recommend setting up an additional TEST operation that runs less frequently than the backup, and this can use 100% coverage. There is no support for this in the UI as it stands though.

The option --list-verify-uploads is a check for backends that fail to store files. By using this option, each upload will be followed by a listing of the remote destination to check that the files are really there.

The option --no-local-db is not supported for backups, only for restores. When restoring, you can use this option to not rely on the local indexed copy of data, but instead build a fresh database just for the restore operation.
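A rough example of such a restore (the storage URL, passphrase, and paths are placeholders):

Duplicati.CommandLine.exe restore "file://Z:\duplicati-backup" "*" --passphrase=&lt;passphrase&gt; --no-local-db=true --restore-path=/tmp/restore-test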

The --upload-verification-file is most useful if you have some kind of access to the storage server, and can run the python verification tool on the destination. If this can be done, you can perform much the same checks as --backup-test-percentage=100 on the remote storage, and thus be much more efficient.
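If you do have that access, running the check is roughly (the script ships in Duplicati's utility-scripts folder; the path below is only an example):

python DuplicatiVerify.py /path/to/the/remote/backup/folder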

Yeah I kinda felt it is overkill. I will fallback to SHA256.

Actually, I decided to go with a separated-backup-sources approach. Some data and files, including photos, seem to change only once a year (if at all). I think doing so would minimize the risk of a new block of data or a compact operation corrupting static assets that live alongside regularly modified data (I will be keeping all backup versions as you suggested). It would also help with running prolonged tests, especially for big volumes of data in the long run, without wasting time retesting data that has not changed.

Do you mean from the creation of the backup until the discovery of the problem? If so, the first backup/version was on 29 May 2024 at 9:55:16 AM. I discovered the issue on 8 Jun 2024 at 9:10 AM. So you can say it existed for almost 9 days, 23 hours, 14 minutes and 44 seconds after the first backup version, according to timeanddate.

To be fair, I think such an option is quite good. Maybe suggest to the user what type of approach to choose when setting up a restore operation, and give them a short description of the pros and cons of these options.

Yeah, I think it would be safer and better to add a TEST GUI element that executes the test command. Most people may miss it since it only exists under Commandline. Also, they may confuse the "Verify files" button with a full backup check like what test would do.

Hm, sorry, but I still don’t seem to really understand it. I ran a TEST command using the full package:

  • backup-test-percentage = 100
  • full-block-verification = true
  • full-remote-verification = true
  • no-local-blocks = true

And I get this output:

Running commandline entry
Finished!

            
  Listing remote folder ...
  Downloading file duplicati-20240610T215939Z.dlist.zip (761,50 KB) ...
  Downloading file duplicati-iadaec9fcc78a40d7a827b8ba05748224.dindex.zip (222,10 KB) ...
  Downloading file duplicati-bcb340916c5ce4471999ab3d4367c648a.dblock.zip (2,00 GB) ...
Examined 3 files and found no errors
Return code: 0

So it downloaded just a single dblock file, right? I would have expected the whole backup (40 GB) to be downloaded, extracted, and rehashed.

Is there a way to see which command line was actually used for the command? Same question for the regular backup job. I would like to see what really happened. I have set log-file-log-level to "verbose" but still can't find it.

I ran it again and it picked a different dblock. Even with an error this time (!):

  Listing remote folder ...
  Downloading file duplicati-20240610T215939Z.dlist.zip (761,50 KB) ...
  Downloading file duplicati-id47670d5c1e74e8fbee31b62b5c3d87a.dindex.zip (1,59 MB) ...
  Downloading file duplicati-bbb1b7e5b0a454e4abd471a34d67894c3.dblock.zip (2,00 GB) ...
duplicati-id47670d5c1e74e8fbee31b62b5c3d87a.dindex.zip: 2 errors
        Extra: ELLWpZNLk05pLAj7252gwrBr498jJoZnaSjo1opSsSs=
        Extra: ug/SSt/fkhKrNjT+Vr+mwwCAGypuATSmmoavmvS5Nj0=

This is … frightening :frowning:

The TEST command

<samples> specifies the number of samples to be tested. If “all” is specified, all files in the backup will be tested.

It looks like you left it at the default sample of 1, so got one of each kind. If you want more, ask as you wish.

Little point in testing the same one again, so it moves on, based on test tracking.

Typically a false positive. If all is Extra, you’re probably fine. If Missing seen, then worry.

test all with full-remote-verification shows “Extra” hashes from error in compact #4693

I think this is a usability problem that is caused by the option system in Duplicati. The parser checks that the option --backup-test-percentage is indeed valid and supported, but it does not warn that the option does nothing for the current operation. I think this should be fixed (@ts678 has the solution, using the sample count argument).

This is indeed what the TEST command is for, validating that the internal consistency is correct. The problem here is that the blocks are no longer needed and have been removed from the local database, but they have not been correctly removed from the remote volume, so the test identifies that there are some additional blocks. It is a known issue that means you are storing more data than needed, but it does not affect the ability to restore.