My story: backup DB broke when I needed it

Hi, I want to share my recent experience with Duplicati 2. It all went wrong, but I assume there is nothing I can do about it now, so I am not asking for support, just sharing my sadness.

I built a new computer and used the backup to transfer all user data. Since it was Win 8 → Win 11, I first restored into a temporary folder and then decided what to copy into which new folder. I didn’t delete the temp data immediately.
I used the same backup destination for the new machine; it was mostly the same data anyway (some folder names changed).

There is an issue with the new computer: it bluescreens somewhat frequently. During all that, some files (a random selection throughout several folders) broke: they had the correct file size and everything, but contained only NUL bytes. I identified the broken ones, copied them back from the temporary folder again, then re-ran the backup. The temporary folder is not part of the backup source.

Due to the computer issues I decided to reinstall Windows, and I decided to rely on the backup again for that. Big mistake. When I restore from the backup, the files are broken again, same pattern. This is true both for the backup version taken after I had restored the files from the temporary folder and for the version from before they got broken.

Now it seems the files are lost for good.

Some responses I can think of:
“You shouldn’t rely on a backup when you plan a clean install, use a removable medium for that” → True, I hate myself now for not doing that.
“When the PC has issues like a bluescreen, you can’t blame a piece of software for failing” → Nope, a backup solution is meant for when things go wrong; it should be robust in these cases, so that it doesn’t delete old data that was already backed up.
“Duplicati is Beta, you can’t rely on it” → Yes.

Your biggest problem is that you’re relying on one backup solution. That’s always going to bite you. It is just how it is.

Let’s say the drive it was on failed; then you’d be in the same situation. Let’s say the OS installer accidentally formatted the backup drive because of a bug; then you’d be in the same situation, because weird things like that happen… And so on.

If you accidentally sync broken files to it and the backup solution cleans up all the good ones, then that too isn’t specifically the backup solution’s fault, but it’s another situation where you need multiple backups.

Learn from it in a better way. If you only use one then you will find yourself in the same situation again and again.

I’ve seen others make the same mistakes, so it’s not original.

One isn’t ever good enough 🙂

There are lots of options, as long as you are willing.

Make sure you verify that on a working computer, i.e. with Direct restore from backup files.

While you might wish backup software to overpower all possible computer failures, it can’t be done.
Duplicati restore does verify that it restores files correctly (SHA-256 check). After that, it’s unknown.
If the computer is prone to wiping files to NUL sometime later, that’s not something Duplicati can fix.

Restoring files if your Duplicati installation is lost is how you can test a restore on another computer.
I take it your old one is gone and you require a few files from backup? There are other ways as well, but they are more awkward. Do you have enough space to restore the whole backup somewhere?
Duplicati.CommandLine.RecoveryTool.exe can do that, but version and file selection is complicated.

The COMPARE command in GUI Commandline can show if file contents changed between backups.
If I read you correctly, you think the most recent backup of the fixed files is the same as the backup before it.

Examining file records in the database might help, but you’d need to operate a GUI as directed here.
We can probably even figure out if a file was NULs (how far into the file – all of it?) at time of backup.

I was doing this already; this was how I restored to the new PC in the first place. It is not crashing all the time, only sometimes, so it was “working” enough to get through the direct restore.

I would at least expect Duplicati not to also break the same files in a past backup version, one that was OK before.
I assume that happened during a compacting step that was somehow using a broken local database.

I know how to get files from a backup using direct restore; the backup isn’t that big, and there is plenty of disk space to store several versions. I already did this, and this is how I know that even old backup versions from before the files got NULed now contain NUL files.

Some things in the original sequence weren’t said, and are still kind of unclear. For example:

The other option would have been to use Repair to make a permanent database for the future.
Don’t have two computers on one destination, though. Direct restore won’t run backups,
and it won’t do compacts either, so at some point you must have switched to a permanent database.

presumably by careful analysis as below? Totally NUL may be hard to see, but size may help.

Any warnings or errors on Restore? You could run with About → Show log → Live → Verbose.
That will detail what’s done to each file. Restore patches files a block at a time via downloads,
meaning a download can sometimes extend a file, and Windows then makes it a sparse file with the
correct length and NUL bytes everywhere a block hasn’t yet been written. Or maybe NULs got written.
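As a rough illustration of that sparse-file effect (just a sketch of the general mechanism, not Duplicati’s actual restore code; the file name is made up), writing a block past the current end of a file leaves the skipped range reading back as NUL bytes:

```python
# Sketch of the effect only, not Duplicati's restore code: writing a block
# past the current end of file leaves the earlier, never-written range
# reading back as NUL bytes while the length looks correct.
import os

path = "demo.bin"                 # hypothetical scratch file
block = b"x" * (100 * 1024)       # one 100 KiB block of real data

with open(path, "wb") as f:
    f.seek(3 * len(block))        # jump to where block number 3 belongs
    f.write(block)                # blocks 0-2 were never written

print(os.path.getsize(path))      # the full 409600-byte length is reported
with open(path, "rb") as f:
    head = f.read(len(block))     # read where block 0 should be
print(head == b"\x00" * len(block))  # True: the gap reads as NULs
os.remove(path)
```

Whether Windows actually marks the file sparse depends on the filesystem and flags, but either way the range that was never written reads back as NULs while the full length shows.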

Looking in the database with DB Browser for SQLite can relatively easily show whether the backup
of a file consists of repeating blocks. One would find repeats of the same block for the file in question.
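If scripting is easier than clicking, the same check can be sketched with Python’s sqlite3 module. The table and column names (File, BlocksetEntry, Block) are the ones discussed here, but treat the exact schema as an assumption since it can differ between Duplicati versions, and the paths below are placeholders:

```python
# Sketch only: list the blocks behind one backed-up path and flag repeats.
# Assumes a copy of the Duplicati job database and the File / BlocksetEntry /
# Block tables discussed in this thread; schema details can vary by version.
import sqlite3
from collections import Counter

db_path = r"C:\temp\duplicati-copy.sqlite"      # hypothetical copy of the job DB
wanted = r"C:\Users\me\Pictures\example.jpg"    # hypothetical suspect file

con = sqlite3.connect(db_path)
rows = con.execute(
    """
    SELECT f.BlocksetID, be."Index", be.BlockID, b.Hash
    FROM File f
    JOIN BlocksetEntry be ON be.BlocksetID = f.BlocksetID
    JOIN Block b ON b.ID = be.BlockID
    WHERE f.Path = ?
    ORDER BY f.BlocksetID, be."Index"
    """,
    (wanted,),
).fetchall()

counts = Counter(block_id for _, _, block_id, _ in rows)
for blockset_id, index, block_id, block_hash in rows:
    print(blockset_id, index, block_id, block_hash)
print("blocks used more than once:", {k: v for k, v in counts.items() if v > 1})
```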

Assuming you now have a database, you can post a link to a bug report or look at the DB with me.
The bug report has pathnames sanitized, so maybe you can tell me a file length to narrow it down.

If you’d rather not post or look, but would rather try a different restore, there’s the option detailed in the link I gave.

Actually, you can see two lengths in right-click Properties. I’m looking for Size, but you can probably see Size on disk too. If it’s smaller than Size, then you probably got a sparse file, but it’s unclear just how. Interrupting a Restore (or maybe having it bail out prematurely – so watch the logs) could probably do it.

The final step (I think) is setting timestamps according to the backup records. If the timestamp is the restore time, it didn’t do that.

https://github.com/duplicati/duplicati/blob/master/Duplicati/Library/Main/Operation/RestoreHandler.cs
is what gets run. The Logging.Log lines show what logs at different levels from this file. One of those lines
shows the file verification against the SHA-256 recorded at backup time. Can you see that line for your file?
This wouldn’t prove that backup damage hadn’t occurred, but database investigation could test that.

Here’s a backup of a 1024000-byte file of NULs at the default blocksize of 100 KiB, so expecting 10 blocks.
Here’s the database view, and it’s easy to see that the file is made of 10 blocks which are all the same.
The BlocksetID represents the file, and the series of BlockID values represents the fixed-size blocks it has:

[screenshot: database view showing the file’s 10 identical blocks]

This could show if a file really got backed up as (or DB somehow turned into) a run of the same block.
Knowing what BlockID 2 actually contains comes from knowing it was all NULs, so the hash for that is

[screenshot: the hash of the all-NUL block]

and we could search the backup for other files that have NUL blocks in them by using the column header filter:

[screenshot: filtering the Block table on that hash]

In this test backup, of course, there is only one file, but if several of yours went NUL, they would show.
This behavior isn’t one I can recall having heard of (feel free to search), so something else is going on.
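For reference, here is a small sketch of how that hash and search could be reproduced outside DB Browser, assuming the default 100 KiB blocksize and that block hashes are stored as base64-encoded SHA-256 (adjust the blocksize if your backup uses a non-default one; the database path is a placeholder):

```python
# Sketch: compute the hash of an all-NUL 100 KiB block and find every backed-up
# path that contains such a block. Assumes the Block / BlocksetEntry / File
# tables shown above and base64-encoded SHA-256 block hashes; the database
# path is a placeholder for a copy of the job database.
import base64
import hashlib
import sqlite3

blocksize = 100 * 1024   # Duplicati default; adjust if the backup uses another
nul_hash = base64.b64encode(hashlib.sha256(b"\x00" * blocksize).digest()).decode()
print("hash of an all-NUL block:", nul_hash)

con = sqlite3.connect(r"C:\temp\duplicati-copy.sqlite")
rows = con.execute(
    """
    SELECT DISTINCT f.Path
    FROM Block b
    JOIN BlocksetEntry be ON be.BlockID = b.ID
    JOIN File f ON f.BlocksetID = be.BlocksetID
    WHERE b.Hash = ?
    """,
    (nul_hash,),
).fetchall()
for (path,) in rows:
    print(path)   # every backed-up path that contains at least one NUL block
```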

Whether you proceed with an investigation or a different restore is up to you. I can help with either one.

Sorry, I guess I didn’t describe that right. Let me try again in a list of actions I took:

  • New computer, which replaces the old one. The old one is gone, so at no point did two computers access the same backup location.
  • Direct restore on the new one, restoring from last backup of the old one (cloud storage), which is from 2022-12-07, into Downloads\backup.
  • Manual copy (not move) from Downloads\backup into the new folder structure, most importantly My Documents, Pictures, AppData\roaming\Thunderbird.
  • Set up backup on the new PC, destination is the same folder in the same cloud storage. Daily schedule, 6 month retention, Downloads folder is excluded.
  • Over the next 2.5 months it turns out that the new PC has issues (one Bluescreen every few days).
  • During that time some files get corrupted (random files from various sub-folders of Pictures). They keep their file sizes, dates, and names, but the content is broken. I didn’t check all of them, but it seems they now only contain NUL bytes.
  • With file comparison and careful selection, I restore all the broken files again from Downloads\backup (which just happens to be still there) while keeping the intentionally modified files.
  • Now with repaired personal files, I manually trigger a new backup run. There are no error or warning messages, so I assume that the now latest backup (2023-02-19) is OK, with the repaired files.
  • As part of the investigation and attempted repair of the Bluescreen issues, I wipe the SSD and re-install Windows.
  • I perform a direct restore from 2023-02-19, into Downloads\backup. Lots of warnings and errors occur, but the restore process still finishes. Among the files in Downloads\backup the same files (at least it seems so, I don’t have a full list) are broken in the same way again.
  • I perform a direct restore from 2022-12-07 into Downloads\backup_2022-12-07 and another one from 2022-12-30 into Downloads\backup_2022-12-30. They have the same issues: Warnings and an error, the same files are broken.

The main point is that the original backup from 2022-12-07 was confirmed good. Then the problematic PC added more backup versions, and during that process the same backup version from 2022-12-07 stopped being fine.
I would understand if only new backups are broken, because the PC was behaving erratically.

… Update, hours after writing that above …

So right now I’ve downloaded the complete backup folder (the dlist/dblock/dindex files) from cloud storage onto a different PC for testing. I’ve created a “dummy” backup job with the local copy as backup destination, just so that I can build (“repair”) the database here. I hope this makes sense.
I got three warnings and one error during repair:
Found 1 missing volumes; attempting to replace blocks from existing volumes
Found 1 missing volumes; attempting to replace blocks from existing volumes
Remote file referenced as duplicati-b1cc610cfbedf4170a8119a847caa952d.dblock.zip.aes by duplicati-ib12048286fe647b18414d72393fbd5c3.dindex.zip.aes, but not found in list, registering a missing remote file
I got many more warnings when I tried the direct restore on the “broken” PC, but I don’t have the log available right now.

I am now trying to do a restore from this database on this “spare” PC. I started with the oldest backup version (2022-09-11), which I hadn’t tried on the broken PC before. So far it looks like the files are OK here … which is strange, but reassuring. I will now try to restore a backup version that was broken when restored on the bluescreening PC. It takes surprisingly long, so I will probably not have a result today.

Well, if you compact your backups, there is a risk. Since Duplicati does deduplication, it can compact old volumes to make room on the backend. It’s not really compacting individual files; what happens is, for example, that if you have oldfile1 that is 40% full and oldfile2 that is 50% full, it will remove them and create a new volume storing the data of both. Unfortunately, that means the old data transits through the computer while being rewritten to the backend, and if the computer has a serious problem (such as a memory failure), it can indeed corrupt old data.

You got a damaged backup somehow. The question is whether the damage will actually get in the way.
Errors in dindex files are avoided when using the RecoveryTool, as it uses the dblock files directly, without the index.

Well, I guess we’ll see how the current attempt goes. You can watch About → Show log → Live → Warning.
Direct restore is (I think) not as good as regular restore at keeping the rather thin log file in its database.
Either one can add log-file=<path> and log-file-log-level=verbose to get more, but it’s too late on this try.

As I wrote in my step-by-step summary, I once replaced the broken files with the correct ones and ran a manual backup, expecting the correct ones to get into the latest version of the backup.
Maybe that was wrong? The files I replaced had the same size, dates, same everything, just different content. Would Duplicati detect those as changed files? From the performance I can’t imagine it would hash every single file on every run.
Assuming that it doesn’t, that would explain why the latest backup still contained the broken files.

Now on the other computer (not the one with the bluescreen issues, where the backup was performed) it seems like the old backups (at least 2022-12-07) are in fact consistent, and all new ones (from when the broken computer started to write its backups) have just NULs in some files. That would imply that it must have been the first manual copying from Downloads\backup into the proper folders (as described in the 3rd step above) that broke them somehow.

That still doesn’t explain how the broken PC got back the broken files when restoring the apparently good backup.

At least it seems like I will get all my files back, once I manually sort them again.

A few questions are left:
The Compare-like function is neat, but is there also a quicker way to figure out in which versions a particular file changed? I can’t find a function like this in the CLI, and I also looked at the SQLite database, but I don’t understand how the versions work; it seems to be related to the Fileset table?

It is nice that Duplicati is robust enough that a broken file in the backup doesn’t render it completely useless, but is there a way to figure out what the missing dblock was supposed to contain?

It does not check the contents by default (that would be slow), but it does look at file metadata such as timestamps, which you would be hard pressed not to change with what you did. Having noticed that something changed, reading through the file is the next step, to find out exactly which blocks changed and upload those.
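As a rough sketch of that idea only (this is not Duplicati’s code, just the concept of a metadata check deciding whether a file gets re-read; the demo file is a throwaway temp file):

```python
# Sketch of the concept only: decide per file whether a re-read is needed by
# comparing size and modification time against the previous backup's records.
import os
import tempfile

def needs_rescan(path: str, snapshot: dict) -> bool:
    """Return True when size or modification time differ from the snapshot."""
    st = os.stat(path)
    return snapshot.get(path) != (st.st_size, st.st_mtime_ns)

# Tiny demo: record a file's metadata, then bump its timestamp.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
path = tmp.name
st = os.stat(path)
snapshot = {path: (st.st_size, st.st_mtime_ns)}
print(needs_rescan(path, snapshot))   # False: size and mtime unchanged, skip it
os.utime(path, None)                  # touch: timestamp changes, content doesn't
print(needs_rescan(path, snapshot))   # most likely True: file would be re-read
os.remove(path)
```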

[screenshot: job log excerpt]
The excerpt is an example of a job log file: it examines the source area, has to open a few files, and finds changes.
If you somehow still had the old database after the SSD wipe, it could be examined quite carefully for clues.
Recreating the database could also have clues, but it’s not clear how well that will work at the moment.
If you had a database, a compare would be the somewhat user-friendly way to show the modified files.

Specific details of what blocks changed would not be in the output. That needs a DB bug report or you driving a database browser under direction, but again there is the question of how to get the database.

Without that, your question can be somewhat answered by downloading the dlist files for two backups (noticing that the timestamp in the filename is 24-hour UTC), decrypting them if needed with a tool, and finding that file in filelist.json. If its information changed, that suggests that some change was backed up.

How the backup process works describes what those files contain, but you’d still have to fish around different dates to find a change point.
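If you want to try the dlist route, here is a rough sketch, assuming the dlist has already been decrypted to a plain zip and that the entries in filelist.json carry path/size/hash/time fields; treat the field names and paths as assumptions to verify against your own files:

```python
# Sketch: pull one path's entry out of an already-decrypted dlist zip.
# The field names inside filelist.json ("path", "size", "hash", "time") are
# assumptions to verify against your own files; paths below are placeholders.
import json
import zipfile

dlist_path = r"C:\temp\example.dlist.zip"        # decrypted dlist volume
wanted = r"C:\Users\me\Pictures\example.jpg"     # file to look up

with zipfile.ZipFile(dlist_path) as z:
    entries = json.loads(z.read("filelist.json"))

for entry in entries:
    if entry.get("path") == wanted:
        # Run this against two dlist files; if size or hash differ, a change
        # to that file was backed up between those two dates.
        print({key: entry.get(key) for key in ("path", "size", "hash", "time")})
```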

It’s difficult, but that’s how life is without a database. If you’d like to try any of these methods, just ask…

On the other hand, are you saying you have a database and are looking at tables? We can continue that.
The advantage is that rather than having to look in all backups, you can identify where a path changed.

The File table has all versions of all paths. Type yours into the filter at the top of the Path column. You’ll get a row for each version. Take the ID, filter for it in the FilesetEntry table’s FileID column, take the FilesetID to the ID column of the Fileset table, then take the Timestamp and turn it into a time, e.g. at epochconverter.com. Someday, the feature to restore any version of a single file will hopefully see its PR put into a release:
Display versions for a file and allow any version to be restored #4805 is the PR awaiting someone to commit it.
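Until then, here is a sketch of that manual lookup using Python’s sqlite3, following the File → FilesetEntry → Fileset chain just described; the database path and file path are placeholders, and the schema may differ between Duplicati versions:

```python
# Sketch: list every backup version (Fileset timestamp) that contains a path,
# following File -> FilesetEntry -> Fileset as described above. Table and
# column names are assumptions that may differ between Duplicati versions;
# the database and file paths are placeholders.
import sqlite3
from datetime import datetime, timezone

con = sqlite3.connect(r"C:\temp\duplicati-copy.sqlite")
wanted = r"C:\Users\me\Pictures\example.jpg"

rows = con.execute(
    """
    SELECT fs.Timestamp, f.BlocksetID
    FROM File f
    JOIN FilesetEntry fe ON fe.FileID = f.ID
    JOIN Fileset fs ON fs.ID = fe.FilesetID
    WHERE f.Path = ?
    ORDER BY fs.Timestamp
    """,
    (wanted,),
).fetchall()

previous = None
for timestamp, blockset_id in rows:
    when = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    changed = blockset_id != previous   # a new BlocksetID means the content changed
    print(when.isoformat(), "changed" if changed else "same content as previous")
    previous = blockset_id
```

A change in BlocksetID between two versions means the file content changed between those backups, which is the quicker answer to your earlier question.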

An alternative way to find files using a 102400-byte block of NULs was shown earlier, if I tested correctly.
Once you find a thought-to-be-broken version of a path, you could look at its BlocksetEntry entries.

Those are my ideas for a discussion of how the internals connect.

Documentation for the local database format has a slightly stale picture that might help with my words.