Corrupted backups

After many successful tests with relatively simple test data, I tried with some “real data” and ran into the following two problems:

a) Versioning does not seem to work … even going back 10 or 20 days in the historical backups, the latest changes still appear in the backup, as if the latest version were the only “real” backup.

The effect is that even if I go back 10 or more days, I still find the latest changes in the file(s), which unfortunately corrupted the system.

b) A data file seems to be corrupt, sometimes with the following error message, sometimes without:

2020-06-14 17:40:08 +02 - [Error-Duplicati.Library.Main.Operation.RestoreHandler-RestoreFileFailed]: Failed to restore file: “H:\Data\LEISTUNGR2\LEISTUNG.TPS”. File hash is DsSJdVCKlnUDWg+hiVU40WtOGFSNNoBqeCCyGSBF+E0=, expected hash is kudCnIrOrNNv695Am6M0nOgcau3SxhARSkPFu5M6iwQ=

In this second case, the file is unreadable by the application.

Does anybody have an idea how to solve this?

Is the application shut down and guaranteed-not-able-to-start while doing the restore?

One way for Duplicati to find its restored file looking unexpected is that something else is changing it.
That could also explain intermittent inconsistencies, if something only changes the files sometimes.
It is also a good recipe for the application to get upset while Duplicati is changing a file it has open.
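If it is unclear whether something else touches the file during a restore, a quick-and-dirty way to watch it is to poll its size and modification time. Just a sketch in Python (stop it with Ctrl+C, and point the path at whichever file you want to watch, the original or the restored copy):

```
import os
import time

# Watch a file and print whenever its size or modification time changes.
PATH = r"H:\Data\LEISTUNGR2\LEISTUNG.TPS"   # original file, or the restored copy

last = None
while True:
    st = os.stat(PATH)
    current = (st.st_size, st.st_mtime)
    if current != last:
        print(time.strftime("%H:%M:%S"), "size:", st.st_size, "mtime:", st.st_mtime)
        last = current
    time.sleep(1)
```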

What happens if H:\Data\LEISTUNGR2\LEISTUNG.TPS is restored to a private folder?

The remark that the latest changes “unfortunately corrupted the system” also worries me, although I don’t know whether “system” means the application, or when the corruption started.

Again, running a restore to a separate folder for a close look would be the safest check.
This would ensure the file is in the backup, and let you look at its time stamp and contents.
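If it helps with that close look: the hashes in the error message are 44-character Base64 strings, which looks like Duplicati’s SHA-256 file hash (an assumption on my part). A small Python sketch to hash the restored copy yourself and compare it against the two values in the error:

```
import base64
import hashlib

def file_hash_b64(path, chunk_size=1 << 20):
    """Base64-encoded SHA-256 of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")

# Example path only -- point this at the copy restored to the separate folder.
print(file_hash_b64(r"C:\RestoreTest\LEISTUNG.TPS"))
```

If the output matches the “expected hash” from the error message, the file on disk now matches what was backed up, so the earlier mismatch was probably something touching the file at verification time; if it still matches the reported “file hash”, the restored content genuinely differs from what Duplicati recorded.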

I always restore to a new folder, never on top of the original files.

Here too I restored to a new folder.
Unfortunately, the time stamp after restoring is the date and time of the restore.
The file is an encrypted data file, so I can’t compare the content other than by opening it with the application, and the application claims the file is damaged.

Are issues a and b two different things? I’m trying to match “can’t compare” with the earlier remark that “the file is unreadable by the application”.

There’s a third outcome, based on the time stamp. If you restore a backup of any age, the original time stamps should be restored.

Can you get correct time stamps for any file, i.e. one that restores without giving the restore error?

If no time stamps restore, do you have the –skip-metadata option checked? That would cause that result.
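To check the restored time stamps without relying on Explorer, something like this works (the path is just an example):

```
import os
from datetime import datetime

# Example path -- use a file restored to the separate folder.
info = os.stat(r"C:\RestoreTest\LEISTUNG.TPS")
print("modified:", datetime.fromtimestamp(info.st_mtime))
print("created: ", datetime.fromtimestamp(info.st_ctime))  # creation time on Windows
```

A modified time equal to the restore time rather than the original file time would point at metadata not being restored.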

https://github.com/duplicati/duplicati/blob/master/Duplicati/Library/Main/Operation/RestoreHandler.cs

looks as though the restored-file integrity test happens after the file timestamp gets set.
You can see the restore activity in some detail at About → Show log → Live → Verbose, or in a (large) file via –log-file with –log-file-log-level=verbose.

By default the restore itself will obtain blocks from the original source file location, so the question of whether the source file might be changing during a restore is still open, even if the restore goes to a different, new folder.
–no-local-blocks avoids that source-block optimization and fetches everything from the backup instead.
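If you want to try that outside the GUI, a minimal sketch of such a restore, wrapped in Python only to keep the examples in one language; the program path, storage URL, passphrase and target folder are placeholders, not your real settings:

```
import subprocess

# All values below are placeholders -- adjust to your own installation and backup.
cmd = [
    r"C:\Program Files\Duplicati 2\Duplicati.CommandLine.exe",
    "restore",
    "file://X:/DuplicatiBackup",                 # backend URL of the backup
    r"H:\Data\LEISTUNGR2\LEISTUNG.TPS",          # file to restore
    r"--restore-path=C:\RestoreTest",            # restore into a separate folder
    "--no-local-blocks=true",                    # fetch all blocks from the backup
    "--version=0",                               # 0 is the newest version, I believe
    r"--log-file=C:\RestoreTest\restore.log",
    "--log-file-log-level=verbose",
    "--passphrase=YOUR_PASSPHRASE",              # only if the backup is encrypted
]
subprocess.run(cmd, check=True)
```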

Is this application similar to a database? Databases often have special needs for backups and restores. They may need you to dump with a special tool, rely on VSS or other snapshot technologies, and so on.
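If it turns out the database part matters, one common pattern is to stop the service before the backup and start it again afterwards, for example via Duplicati’s –run-script-before / –run-script-after advanced options. A rough sketch of such a helper, kept in Python only for consistency with the other examples (on Windows those options normally point at a small .bat or .exe doing the same, and the default instance name MSSQLSERVER is an assumption on my part):

```
import subprocess
import sys

# Rough sketch: stop or start the SQL Server service around a backup run.
# "MSSQLSERVER" is the default instance name; a named instance would be
# something like "MSSQL$YOURINSTANCE". Must run with administrator rights.
SERVICE = "MSSQLSERVER"

def main() -> None:
    action = sys.argv[1] if len(sys.argv) > 1 else "stop"   # "stop" or "start"
    if action not in ("stop", "start"):
        raise SystemExit("usage: sqlservice.py [stop|start]")
    subprocess.run(["net", action, SERVICE], check=True)

if __name__ == "__main__":
    main()
```

The script name and argument handling are made up for the sketch; the idea is simply to call it with “stop” before the backup and “start” after it.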

Yes, they are, and issue b is resolved. I didn’t wait long enough to access the files. They become visible after a relatively short time in the new location, but with the date and time stamp of the restore … and after a veeeery long time the restore process finishes completely, and then the time and date stamps are correct and the files are not damaged. So that part was clearly my fault for not giving the restore process all the time it needed.

No. See above … the error was not waiting for the process to finish.

Yes, it is a kind of database, but not one running as a service, and at backup and restore time the application is shut down.

For issue a I will try a clean restore again for different dates and see what happens. The problem there is that there is also a kind of database application changing the content of the files, but without touching the date and time stamp, so there is no way to tell from the file when it was last modified. The application uses an MS-SQL database that I stop before and restart after the backup … but according to the software vendor, the database only holds temporary data, so only the files have to be backed up. I have a call open with the vendor to get this confirmed.

Issue a (as I perceive it now) is that the current data is somehow damaged, so certain functions can’t be performed, which is why I would like to revert to a day before the files got damaged, and it seems that this doesn’t work. But frankly speaking, I’m not sure it isn’t an issue with the database … especially now that I have learned that issue b was just a handling problem; in that case, the versioning seems to be working absolutely fine.

Because Duplicati only uploads changes to files (which might just be the initial creation), bits of files get spread around in many dblock files. I think the restore method doesn’t redundantly download a given dblock file every time it needs blocks from it, but instead distributes all the needed blocks whenever it downloads a dblock.

What this means is that the restore happens in pieces, as file blocks and timestamps get onto the drive. Just looking at the file sitting there might not reveal that it isn’t fully restored yet, even though some of it is already there.
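Roughly the idea, as a purely illustrative sketch (the data structures and helper names are made up; this is not Duplicati’s actual code):

```
# Illustrative only: each dblock volume is downloaded once, and its blocks are
# scattered into every target file that needs them; files are only complete
# (and get their metadata) once all volumes holding their blocks are processed.

def restore(needed_blocks, download_volume, write_block, set_metadata):
    for volume_name, entries in needed_blocks.items():
        volume = download_volume(volume_name)            # one download per dblock
        for target_file, offset, block_hash in entries:
            write_block(target_file, offset, volume[block_hash])
    set_metadata()                                        # time stamps come last

# Toy demo with in-memory "volumes" and one target file.
volumes = {
    "dblock-1": {"h1": b"AAA", "h3": b"CCC"},
    "dblock-2": {"h2": b"BBB"},
}
needed = {
    "dblock-1": [("file.tps", 0, "h1"), ("file.tps", 6, "h3")],
    "dblock-2": [("file.tps", 3, "h2")],
}
restored = {"file.tps": bytearray(9)}

def write_block(name, offset, data):
    restored[name][offset:offset + len(data)] = data

restore(needed, volumes.__getitem__, write_block, lambda: print("metadata set last"))
print(restored["file.tps"])   # bytearray(b'AAABBBCCC')
```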

The topic “MS SQL does not backup modified files” explains how one can use –disable-filetime-check to help there; however, I’m not sure of its implications, e.g. does it have to look through every file in the source area? Possibly a very limited private backup for the troublesome files would be a workaround, if necessary.