Restore Backup with only dblock and dindex files

Hi all,

I have a partly complete backup with only dblock and dindex files (no dlist). I also DON’T have the local database. Is there any chance of restoring some files out of this “backup”? I’ve tried the repair command and the rescue tool, but nothing has worked so far…

Best regards,
Oliver

Hi Oliver,
I’m afraid that without the dlist or local db, the data is just unconnected blobs, with no easy way to reassemble the original files :frowning:

The data is there theoretically but realistically it’s useless without something to map the blobs together.

Do you still have the drive and the computer (if so, what OS)? Maybe there’s another way to get something.

Ha, I just found out that I do indeed still have the sqlite database. But what now? I have a local copy of the (decrypted) dblock files from the recovery tool. When I now run a repair it deletes the dblock files, which is not exactly what I want :slight_smile: Do I need the original (encrypted) files, and if yes, what command can I use to get my files out?

Ok, I can do a repair with the encrypted files. But what now? I’d like to have a list of the files in the backup, as I don’t need to restore every file. With the command line utility and find with * as a filter I get the error:
Mono.Data.Sqlite.SqliteException (0x80004005): SQLite error
Expression tree is too large (maximum depth 1000)

I used the SQLite Browser to extract the file list and compare with the existing files. Next step: Extract the few missing files which are in the backup but not on the local drive. Or even before: Locally encrypt the downloaded dblock files again, as this seems to be faster than downloading the encrypted ones.
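For what it’s worth, a minimal sketch of pulling that file list straight out of the job database with Python’s sqlite3, assuming the paths live in a File table with a Path column (the database filename below is just a placeholder):

```python
# Hypothetical sketch: list every backed-up path known to the job database.
# Assumes a "File" table with a "Path" column; adjust the filename to your copy.
import sqlite3

db = sqlite3.connect("copy-of-job-database.sqlite")  # placeholder name
for (path,) in db.execute("SELECT Path FROM File ORDER BY Path"):
    print(path)
db.close()
```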

Hi @Sparx82, welcome to the forum!

I doubt I can help much with the recovery, but it looks like you’re already in pretty good hands.

However, I’m curious how you got to this point… Did Duplicati somehow chew up all your dlist files? (I sure hope not, but if so we would want to know that!)

Also, I thought I’d ask others why you would need to re-encrypt the files… Does the local db know anything about the encryption of the files other than the file name (and maybe encryption type)?

If it’s just file name, a few quick database updates should get around that. At least in theory & based on how I THINK that stuff works… :slight_smile:

Well, long story short: it wasn’t Duplicati’s fault. While I was uploading data to my backup, I decided to change the motherboard of my NAS, and for that I interrupted the upload. I was not aware of the fact that my (software) RAID 5 would no longer work after a motherboard change. It was not a problem, as I had a backup of all my data in another place (not uploaded with Duplicati). But some files there were corrupt, and I’m now trying to get these out of my partly complete Duplicati backup.

I re-encrypted the files because I need the encrypted ones for the repair command but the rescue tool decrypted them after downloading. And re-encrypting was faster than re-downloading.

The timing is unclear, but if the recovered Duplicati job database is from the other backup, then it is possibly older than the Duplicati data at the storage destination. This might explain the dblock deletions: repair keeps the database and the destination aligned, so it removes destination files that its stale database view doesn’t know about. Were the deleted dblocks only the newer-dated files? Fortunately you experimented on a copy (a good idea).

If you have a stale database, then you remain in the unfortunate state of not having filename info for the new files; however, you might be able to recover old versions of the files that turned out to be corrupted in your other backup.

If somehow you have a database and destination that are pretty close, then there is more hope of recovering latest-theoretically-possible versions, but it could take much doing. Normally an interrupted backup is noticed at the next backup, and a synthetic filelist is supposed to get uploaded to reflect progress before interruption. Although I think there may be a bug that prevents that, it’s far easier to get that built by code than by hand…

Extreme recoveries can take both skill and will. How far is it worth going? Skills with code and databases help.

How the backup process works shows the ordinary plan, and may help assess difficulty of doing manual work.
EDIT: For further technical background, you can read Disaster Recovery and How the restore process works.

Restoring files if your Duplicati installation is lost is probably the easiest path to get a few older files restored now that it sounds clearer that you have older dlist files, just not the most recent (uploaded at backup’s end).

Another question is whether you wanted your old Duplicati backup reassembled as well as possible given the age of the job database. Recovering the Duplicati-server.sqlite from the other backup might help that, but it’s still some reassembly. Alternatively you could just get what you’re willing to dig out, then start a fresh backup.

In the meantime I found out why the dblock files were deleted on my first try: I had pointed to the wrong local sqlite database. So here is what I have now:

  1. An old backup created with Cloudberry, stored on Backblaze. That means I don’t have a complete old backup created with Duplicati.

  2. A more recent backup created with Duplicati on OneDrive, which is not complete. There I have a local sqlite database (which I guess is somewhat incomplete as well) and the backup with dblock and dindex, but NO dlist files. If I open the sqlite database using SQLite Browser, I see a lot, including the files which have been backed up so far.

I have now tried to restore some files from my partly complete backup. How can I do that? A normal restore says that there’s no backup, and the repair exits without doing anything. Is there any other possibility?

That must have been a manual copy, but comparing its date with the dblock and dindex dates might tell how close they are. However, the lack of a dlist implies the initial backup didn’t finish, so the database would be incomplete. Possibly you saw evidence to the contrary; if so, please mention it, so we don’t head down an unsuited path.

You’re presumably looking at the Path in the File table, wishing you could get those files without the laborious manual work I described (which I think is only feasible at small scale, or by finding/writing specialized tools to help). What would be nice (assuming you have a backup that’s incomplete due to interruption, and otherwise fine) would be a way to ask Duplicati to make the synthetic filelist that it would otherwise create before the next backup; however, that code looks like it’s tied to the backup operation. Judging by your testing, a repair won’t do the equivalent.

I suspect that the feature does not yet exist, in any friendly form. Perhaps someone will stop by to correct me. You could Search or read the Features category if you like (probably after your crisis passes) and put that in.

If you’re not interested in the options I painted as very challenging, there may be the possibility of getting the RAID 5 running again, if there is still data on the drives for it to read, as an alternative to restoring from backup.

For a Duplicati restore, it might be possible to have the incomplete backup become complete by having it run a trivial backup configuration, declare victory, and put a dlist out. Unlike some other backup programs, trimming the source configuration between backups doesn’t implicitly cause deletions; however, I’m not certain how things work if it’s changed in the middle of an interrupted backup. If you go this route, you should at least add and check --dry-run, which I’m fairly sure will protect the destination (even better, copy off the destination). For the database, see if you can tell whether the one you saved on OneDrive seems close to the dblock and dindex files in recentness. The external timestamp may be off, depending on whether your copy method kept it. You can look inside at the RemoteVolume table and match up the latest files if you like. There might have been a few in progress when the backup was interrupted. This explains State values such as Uploading or Uploaded.
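If it helps, a quick non-Duplicati way to peek at that table is a few lines of Python against a copy of the database; the column names (Name, Type, State) are my assumption, so verify them in your own file first:

```python
# Rough sketch: list what the job database believes is at the destination,
# so it can be compared against the dblock/dindex files actually on OneDrive.
# Column names are assumptions; check them in SQLite Browser first.
import sqlite3

db = sqlite3.connect("copy-of-job-database.sqlite")  # work on a copy, not the original
for name, vtype, state in db.execute("SELECT Name, Type, State FROM RemoteVolume ORDER BY Name"):
    print(f"{state:12} {vtype:8} {name}")
db.close()
```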

Should you prefer to carefully take your chances on moving Duplicati forward enough for it to make its partial backup available through normal means, you could build a new job, or change your old one if you imported it. Here is a technique you could try to change the job’s database path to a copy (adding safety) of the old job database. Alternatively, just rename your copy of the old database into the future path before Duplicati puts its file there. Your trivial backup can be any little file you like. With luck, the new backup will add that and finish, leaving you a backup with the old and new files that you can restore through the normal restore mechanisms.

The database and the uploaded files are from the same backup run, which (as I stated above) I interrupted myself to install a new motherboard into my server (new gadgets can’t wait :wink: )

Exactly true. Too bad there’s no way to get Duplicati to do that, but maybe I’ll even try to write such a tool, as all the information needed should be in the database, right?

Not possible, as they have most probably already been overwritten. But the missing files are not so important; it’s more a challenge for myself to get those files than a life-saving necessity.

Thanks for all the other hints and explanations as well! I think I’m going to try to write an extraction tool… a nice task for the Christmas holidays :wink:

That helps. I caught the interruption part you wrote earlier, but it wasn’t clear that the database save was from that time.

I gave you a method of getting Duplicati to do that by letting its backup complete. You don’t want to try it? Since it sounds like you won’t miss the backup if it gets damaged, you could even omit the safeguards. :wink:
Just move your old database in, make a proper destination config and a minimal source one, and back up.

Mostly, although some things (such as the passphrase for the backup) are intentionally stored separately.
Independent restore program points to a Python script that does a restore without depending on Duplicati code; however, it appears to start with dlist files (which you don’t have). If I had to try a recovery by hand, I would probably do it by building a dlist file, then letting Duplicati go off to actually gather all the data.

The basics of the file format are in the “How the * process works” articles that I mentioned above, and the primary content is a filelist.json file, as described in the article. I can give non-authoritative DB advice… Alternatively, do your own exploring of the database. I suggest the tables Fileset, File, Blockset, and Metadataset.

Making a local test backup of a small file and one larger than 100k would be a good warm-up experiment.
Below is a dlist entry from mine, so basically the job is to get the right values from the DB and write JSON.

{"type":"File","path":"C:\\backup source\\length1.txt","hash":"a4ayc/80/OGda4BO/1o/V0etpOqiLx1JwB5S3beHW0s=","size":1,"time":"20181204T200947Z","metahash":"5Rc4hdEFxvYIaXOfV7VteFa2hb5MVqWJxpPAiWG2MJk=","metasize":137}
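If you go the build-a-dlist route, a rough, non-authoritative sketch of collecting such entries from the database might look like the one below. The join through Metadataset and the Blockset columns (FullHash, Length) are my guesses and need checking against your own database, this only emits "File" entries, and a real dlist is (I believe) a zip archive holding filelist.json plus a manifest, so there is more to do around it:

```python
# Sketch: build dlist-style filelist.json entries from the job database.
# Table/column names beyond File, Blockset and Metadataset (mentioned above)
# are assumptions; the "time" value is left as a placeholder here.
import json
import sqlite3

db = sqlite3.connect("copy-of-job-database.sqlite")
query = """
    SELECT f.Path, b.FullHash, b.Length, mb.FullHash, mb.Length
    FROM File f
    JOIN Blockset b ON b.ID = f.BlocksetID
    JOIN Metadataset m ON m.ID = f.MetadataID
    JOIN Blockset mb ON mb.ID = m.BlocksetID
"""
entries = []
for path, file_hash, size, meta_hash, meta_size in db.execute(query):
    entries.append({
        "type": "File",
        "path": path,
        "hash": file_hash,
        "size": size,
        "time": "20181204T200947Z",  # placeholder; see the UNIX-time note further down
        "metahash": meta_hash,
        "metasize": meta_size,
    })
db.close()

with open("filelist.json", "w") as out:
    json.dump(entries, out)
```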

I hope you have some fun whichever way you go. If you come up with a useful tool, feel free to contribute. Developers and GitHub will probably be the pathway to actually get your tool put into the source archives.

I have now created a Python script to extract the needed files. If I find the time, I’ll enhance it and make it available for everyone. For the moment it’s just a hack, programmed in an hour and not polished enough to share :slight_smile:

But anyway, the steps to get the files are the following (all can be found in the SQLite database):

  1. Find the file in the “File” table and get the corresponding BlocksetID
  2. In the table “BlocksetEntry”, find the BlocksetID and get the BlockIDs in the correct order (Index)
  3. In the table “Block”, look up the BlockIDs from step two (in this table the column is just ID) and save the Hash and VolumeID
  4. Now go to the table “RemoteVolume”, look up the VolumeID from step three (again just ID in this table) and get the dblock filenames

Now you have all the information. Decrypt the necessary dblock zip files, get the blocks and concatenate them to get the original file.
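Sketched in Python, those lookups come out to roughly the following (a simplified illustration, not the full script; the BlocksetEntry column names are assumed to be BlocksetID, Index and BlockID, so verify them against your own database):

```python
# Sketch of steps 1-4 for a single file: path -> ordered block hashes + dblock names.
import sqlite3

def blocks_for_file(db_path, wanted_path):
    db = sqlite3.connect(db_path)
    # Step 1: find the file and its BlocksetID
    (blockset_id,) = db.execute(
        "SELECT BlocksetID FROM File WHERE Path = ?", (wanted_path,)).fetchone()
    # Steps 2-4: blocks in order, each with its hash and the dblock volume holding it
    rows = db.execute("""
        SELECT be."Index", bl.Hash, rv.Name
        FROM BlocksetEntry be
        JOIN Block bl ON bl.ID = be.BlockID
        JOIN RemoteVolume rv ON rv.ID = bl.VolumeID
        WHERE be.BlocksetID = ?
        ORDER BY be."Index"
    """, (blockset_id,)).fetchall()
    db.close()
    return rows  # list of (index, block hash, dblock filename)
```

Concatenating is then a matter of opening each decrypted dblock zip, reading the entry named after the block hash, and appending the bytes in Index order.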

Nice job, amazingly fast. You might have no interest in the full metadata load, but to get timestamps, perhaps:

Take File table ID, lookup as Fileset table ID, take Timestamp, convert UNIX time
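The UNIX-time conversion, assuming the stored value is in seconds and targeting the “time” format from my example entry above, is a one-liner:

```python
# Convert a UNIX timestamp (seconds, assumed UTC) into the dlist "time" format.
from datetime import datetime, timezone

def to_dlist_time(unix_seconds):
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc).strftime("%Y%m%dT%H%M%SZ")

print(to_dlist_time(1543954187))  # -> 20181204T200947Z
```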

You might also run into a character translation of the / and + characters before using the hash as an entry name in the zip. Exactly where and how this happens is of interest to me, because it has signs of correlation with a compact bug.

You’re right, I forgot to mention that: I have to replace + with - and / with _. I’ll have a closer look at this!
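For reference, the swap is just moving from the standard to the URL-safe base64 alphabet:

```python
# Translate a base64 block hash into the name used for the zip entry,
# i.e. the URL-safe base64 alphabet (+ becomes -, / becomes _).
def hash_to_zip_name(block_hash):
    return block_hash.replace("+", "-").replace("/", "_")

print(hash_to_zip_name("a4ayc/80/OGda4BO/1o/V0etpOqiLx1JwB5S3beHW0s="))
# -> a4ayc_80_OGda4BO_1o_V0etpOqiLx1JwB5S3beHW0s=
```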