I was hoping to get more clarification on the following: I have a large number of dindex and dblock files from an interrupted backup. The backup was close to being completed so they should contain a good amount of data. Unfortunately I have lost access to the original data and these dindex and dblock files are all that remains. There were no dlist files created. I have attempted using the cli recovery tool but get the following error when using the restore command:
Program crashed:
System.InvalidOperationException: Sequence contains no elements
at System.Linq.Enumerable.First[TSource](IEnumerable`1 source)
at Duplicati.CommandLine.RecoveryTool.Restore.Run(List`1 args, Dictionary`2 options, IFilter filter)
at Duplicati.CommandLine.RecoveryTool.Program.RealMain(String[] _args)
Would there be any way to get around this and restore part of my data? I have tried the python script and it seems to not do anything when pointed to my data. Any help would be greatly appreciated!
What do you mean by that? I hope you understand that to use this script, you need to download the files from your remote backend and get them onto a local drive.
EDIT: I missed the part where you said that you don't have any dlist files. You can't hope to recover from a Duplicati backup where there are no dlist files. No list of files backed up == what you have is a bunch of binary data without any pointer to file names. That said, your backup data must have been the result of a first-time backup; otherwise it would already have some dlist files.
If that includes the entire drive with your user profile, then you probably lost the database as well.
Your data is available in default 100 KB blocks if tiny pieces will help without any reassembly map,
which would be from either dlist files or the database. There's actually a bit of help in the dindex by
having blocklist information which can help form files you could rummage in, but names are in dlist.
Restore Backup with only dblock and dindex files was a similar dilemma, and I remembered a tool
had been made, but I had forgotten that a database was found and it used that, not the raw dindex.
It might be possible (if somebody has the time and skill) to bend an existing tool into just assigning
placeholder names to files, if it can put the files together from blocks (which would be a large help).
Do you do Python? If so, you could be the one who makes the tool you need, with some guidance.
To illustrate, I backed up a ten-block file whose blocks are identified by 32 byte hashes in a dindex.
The blocks are NUL because they're from a past test, but basically after Base64 it's a file in dblock. https://cryptii.com/pipes/hex-to-base64 of the SHA-256 f627ca4c2c322f15db26152df306bd4f983f0146409b81a4341b9b340c365a16
is (I'm using their RFC 4648 version which might be the right one, as something filename-safe is needed)
9ifKTCwyLxXbJhUt8wa9T5g_AUZAm4GkNBubNAw2WhY
and here it is in the dblock .zip file:
and grep can also help if you know what strings could help take you to whatever data youāre after.
If you have the space, you could unzip all the dblock files to a folder and go see what you can get.
That would be OK for single-block (default 100 KB) files. Large ones would benefit from assembly however this could be done by concatenating the blocks that you obtained from unzips of dblocks.
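The hex-to-Base64 step from the example above can also be done locally. This is a minimal sketch, assuming the block names use the URL-safe RFC 4648 alphabet as guessed in the post; the helper name `block_name_from_hex` is made up for illustration.

```python
import base64


def block_name_from_hex(sha256_hex: str) -> str:
    """Return the URL-safe Base64 form of a hex-encoded SHA-256 digest."""
    raw = bytes.fromhex(sha256_hex)
    if len(raw) != 32:
        raise ValueError("expected a 32-byte SHA-256 digest")
    # urlsafe_b64encode uses '-' and '_' instead of '+' and '/',
    # which keeps the result usable as a file name.
    return base64.urlsafe_b64encode(raw).decode("ascii")


name = block_name_from_hex(
    "f627ca4c2c322f15db26152df306bd4f983f0146409b81a4341b9b340c365a16"
)
print(name)
```

Python keeps the trailing `=` padding, while the string quoted above has none, so it is worth checking the entry names in an actual dblock .zip to see which form they use.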
Duplicati Database recreate performance has more hints and a Python script if there's a volunteer
wishing to extend it into a super-emergency recovery tool or maybe someday a blocksize changer.
Changing blocksize might start with what you need, which is extracting your existing files, however
blocksize change (which you don't need here) would also require repackaging them into new ones.
Is this still of interest? One bit of somewhat good news is I found a program TrID - File Identifier that can guess-add a file extension for formats it knows.
You'd still be on your own with inventing file names and sorting out versions.
You'd get every version of every file (basically the entire backup) to sort out.
This also assumes that the unfinished backup has no other problems with it.
My understanding of what's maybe "normal" is also evolving as I look at this.
Part of my hope here is to be able to spot "not-normal" quickly, so it can be looked at sooner.
User seems to have lost interest, so recovery tool will not be pushed further until someone cares.
Below is what there is so far, and it might have other uses, e.g. if other restore methods disagree.
This one is very straightforward partly because it has to NOT read dlist. It reads dindex blocklists,
basically taking care of the problem of how to put blocks in right sequence to make the large files.
It also has the user do some manual decrypt and unzip. Some duplicates might occur, but ideally
the content of the same file name is the same. If any difference is seen, that would be concerning.
It grew up on Windows (which surprisingly does have a tar command), but is probably portable...
Hey, I'm interested in this. For some reason, my entire backup folder on Google Drive has everything BUT the dlist files; no idea why this flaky backup software never included those. I tried using the Python script, but it seems to do nothing even after following the instructions included in the comments of the file... Do I have to decrypt all the files first, and if so, how would I go about that?
"# If your files need decrypting, decrypt with AES Crypt or SharpAESCrypt.exe."
AES Crypt GUI is probably easier if you can run it, else look in Duplicati's install.
AES Crypt can multi-select in Windows File Explorer, and is on the right-click menu.
C:\Users\conta\Desktop\SLS-GoogleDrive>py reassemble6.py
Reassembling large multi-block files
Cleaning up blocks for large files
Cleaning up blocklist file blocks
Cleaning up file metadata blocks
Folder shows GoogleDrive, but all these files are downloaded and copied.
One of the while loops had incorrect syntax according to the interpreter: it used just a =, so I replaced it with a :=. It appears to have worked, but I never really analyzed the code, so I'm not sure.
Also, I want to apologize if I was rude in my previous responses. I foolishly lost my company logo's Illustrator file, so I'm kinda stressed out about that haha, didn't mean to push that on you.
Okay, so I decrypted my files to .zips and re-ran the script. It has the same output as before, but nothing in the file folder. Looking in Task Manager, it seems it's only really using 5 MB of RAM and up to 20% CPU, with hardly any disk usage.
Maybe I'm misunderstanding the purpose of this script.
I have a folder with what's left of my backup (dindex.zip.aes, dblock.zip.aes). I'm missing my dlist files.
In that same folder I have list, block, and file folders and the python script.
The list folder has my decrypted dindex files, the block folder has my decrypted dblock files.
I run the script in place next to the three folders. It takes time and seems to be working; about 5 minutes later the script finishes. I check the file folder and there are no files in there, and the other two folders have as many files as they did before...
Am I missing something, or would it not do anything if my backup is beyond saving?
Did you prepare the right way, as per example in the script, looking like the example layout above?
You can't just decrypt and leave .zip files there. There's content extraction. Did you do all of that?
EDIT 1:
After you prepare folder, it reassembles files per dindex file directions, from blocks in the dblocks.
You have manually extracted the directions into the list folder, and blocks into the block folder.
Files do not have original names, as those were in the dlist files you don't have, but you have files.
Figuring out what sort of files you have can be attempted using a file identifier program, e.g. TrID.
This sounds like you didn't set things up. Do you see any of the same names in list and block?
One way to test is to copy all block somewhere, then try to copy in all list. Should be a lot of dups.
There's certainly a small chance that things are so bad there are no dups, but it seems unlikely...
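The copy test above can also be done without copying anything. This is a minimal sketch, assuming the `list` and `block` folders sit next to the script as in the setup described here; the function name `shared_names` is made up for illustration.

```python
from pathlib import Path


def shared_names(list_dir, block_dir):
    """Return the file names that appear in both the list and block folders."""
    list_names = {p.name for p in Path(list_dir).iterdir() if p.is_file()}
    block_names = {p.name for p in Path(block_dir).iterdir() if p.is_file()}
    return list_names & block_names


if __name__ == "__main__":
    list_dir, block_dir = Path("list"), Path("block")
    if list_dir.is_dir() and block_dir.is_dir():
        common = shared_names(list_dir, block_dir)
        print(f"{len(common)} names appear in both list and block")
    else:
        print("expected 'list' and 'block' folders next to this script")
```

A count of zero would suggest the folders were not prepared as the script expects, or that something else is wrong with the backup.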
EDIT 2:
Here was my setup before a run. You can check some of the files in list to see they're in block:
Quoting the instructions in the script. I literally copied and pasted (excluding the comment at end):
# Example:
# for %i in (*.dindex.zip) do tar -xf %i list/* (ignore list not found error)
# mkdir block
# cd block
# for %i in (..\*.dblock.zip) do tar -xf %i (delete manifest, if you wish to)
# cd ..
# mkdir file
# Run the script by typing its name. It runs quietly, but inspect for errors.
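For anyone without the Windows `for` and `tar` commands, the same preparation can be sketched in Python with the standard `zipfile` module. This is only an illustration of the quoted steps, not part of the script: the function name `extract_backup` is made up, and it assumes the files are already decrypted to `*.dindex.zip` and `*.dblock.zip`.

```python
import zipfile
from pathlib import Path


def extract_backup(src_dir, work_dir):
    """Unpack dindex list/ entries and dblock blocks into list/ and block/."""
    src, work = Path(src_dir), Path(work_dir)
    block_dir = work / "block"
    block_dir.mkdir(parents=True, exist_ok=True)
    (work / "list").mkdir(exist_ok=True)
    (work / "file").mkdir(exist_ok=True)  # output folder for the script

    # Same idea as: tar -xf %i list/*
    for archive in sorted(src.glob("*.dindex.zip")):
        with zipfile.ZipFile(archive) as zf:
            for info in zf.infolist():
                if info.filename.startswith("list/") and not info.is_dir():
                    zf.extract(info, work)

    # Same idea as: cd block, then tar -xf on every dblock (manifest skipped)
    for archive in sorted(src.glob("*.dblock.zip")):
        with zipfile.ZipFile(archive) as zf:
            for info in zf.infolist():
                if info.filename != "manifest" and not info.is_dir():
                    zf.extract(info, block_dir)
```

Calling `extract_backup(".", ".")` in the folder holding the decrypted .zip files should give the same `list`, `block`, and `file` layout as the commands above. A dindex with no `list/` entries is simply skipped, which matches the "ignore list not found error" note.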
I'm really displaying my poor reading comprehension right now, sorry.
Extracting the files from the zips now. It'll take a bit, but just to make sure... The dblock archives are pretty self-explanatory. As for the dindex files, I'm guessing what I really need from those are the files in the list folder of each archive?
So the result would be the block and list folders having only files with a Base64-looking unique ID as the filename and no file extension? That's what I gather from your directory tree example, but I just want to make sure.
Got it to work, recovered over 7000 files out of 10000. (the other 3000 still have an unknown filetype)
Unfortunately, the specific file I was looking for wasn't among the 7k.
Besides that, your python tool is really good, would be cool to see it worked into Duplicati as a potential recovery method.
Correct. There are other ways to do that, but the easy one is to just ask for the list folder as tar -xf %i list/*. Since you used a few technical terms already, I'll describe it further...
What I referred to non-technically as "directions" identifies blocks by their SHA-256 hash, in
the required order. That's why the script reads 32 bytes (256 bits) at a time, then gets block.
Very close. Blocks are kept as files in a .zip, but because a file name can't be raw SHA-256,
Base64 is done first. Duplicates sometimes occur due to reuses, but "unique id" is very close.
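The read-32-bytes-then-get-block idea just described could look roughly like the sketch below. It is not the actual script: the function name `reassemble` is made up, it handles one blocklist file only, and it assumes block file names are URL-safe Base64 including any `=` padding.

```python
import base64
from pathlib import Path

HASH_SIZE = 32  # bytes in one SHA-256 digest


def reassemble(blocklist_path, block_dir, out_path):
    """Concatenate the blocks named by one blocklist, in blocklist order.

    Returns the number of blocks that could not be found.
    """
    hashes = Path(blocklist_path).read_bytes()
    missing = 0
    with open(out_path, "wb") as out:
        for offset in range(0, len(hashes), HASH_SIZE):
            digest = hashes[offset:offset + HASH_SIZE]
            # A raw digest cannot be a file name, so it is Base64-encoded first.
            name = base64.urlsafe_b64encode(digest).decode("ascii")
            block = Path(block_dir) / name
            if block.is_file():
                out.write(block.read_bytes())
            else:
                missing += 1
    return missing
```

A nonzero return value means the output has gaps and should not be trusted. A file big enough to need more than one blocklist would also need those lists joined in the right order, which this sketch does not attempt.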
I guess that means TrID couldn't figure it out. There are other possible ways to look based on format.
The file you were looking for is maybe not recognized by TrID. Can you make a new test one to see if TrID can identify it properly?
In a brief web search, I tried to find Illustrator's format and got general info, but the format seems to vary. PDF vs. AI gets deep; however, the possibility remains that your file got identified as another extension.
Possibly true, but how do you know for sure? I guess it depends on how hard you want to try to find it.
Regardless, thanks for giving this emergency-use tool a try (and finding a typo; maybe I'll fix the upload).
EDIT: Searching for strings, e.g. with findstr, might be an option, if you know what Illustrator might use.
I should say that small files (100 KiB or less) don't need reassembly, so look in blocks too.
I don't know if there's any chance that an Illustrator file would be so small. I don't use that.
.ai files are embedded in PDF, so a tool such as "file" (Linux) identifies them as PDF files. They are essentially text files and contain the string "illustrator", which makes them quite easy to search for.
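A string search like that can be sketched in Python for systems without findstr or grep. The function name `find_candidates` is made up, and the search string is just the "illustrator" hint from the post above.

```python
from pathlib import Path


def find_candidates(folders, needle=b"illustrator", chunk_size=1 << 20):
    """Return files in the given folders whose bytes contain needle (any case)."""
    needle = needle.lower()
    keep = len(needle) - 1  # overlap kept so matches across chunks are found
    hits = []
    for folder in folders:
        if not Path(folder).is_dir():
            continue
        for path in sorted(Path(folder).iterdir()):
            if not path.is_file():
                continue
            tail = b""
            with open(path, "rb") as fh:
                while True:
                    chunk = fh.read(chunk_size)
                    if not chunk:
                        break
                    window = (tail + chunk).lower()
                    if needle in window:
                        hits.append(path)
                        break
                    tail = window[-keep:] if keep > 0 else b""
    return hits
```

Running `find_candidates(["file", "block"])` covers both the reassembled files and the single-block ones. Any hit is only a candidate, since other PDFs may mention the same word.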