I am looking for a certain content in a file that might or might not have been included in my backup at some point in time. I don’t know the file name and the backup is very large.
To do this I would like to restore all versions of all files in the backup and then simply grep over them. However, even if I specify --all-versions=true and select the root directory in the file selection dialog, only the latest backup version is restored. It seems that deleted files are not restored if I also select non-deleted files.
Am I missing something here? It seems this is a relatively straightforward problem. (And I don’t care if the solution is slow, or needs a lot of space.)
Searches in all backup sets, instead of just searching the latest.
You’re doing a restore. Duplicati unfortunately doesn’t reject options that don’t apply to your action.
A Duplicati version is a point-in-time view of what was there at the time. Deleted files are not there.
What you wish for is a rare request, but it’s come up at least twice before. The key idea from these:
Do you know where in the file the data would be? If it’s in the first block, you could just grep the raw blocks, which would answer the question of whether the file is in the backup. Finding the file might be the next step. This would definitely be a more technical method than just extracting all versions of all files with a script.
blocksize default is 100 KiB unless you changed it. You can also find your string in later blocks if you’re careful about the possibility that it might be broken across two blocks. So maybe you’d run two greps…
Ah, I hadn’t realized all-versions applies to FIND only (which presumably powers the file selection dialog).
I played with this some more and I believe the current behaviour (with all-versions enabled) is at least confusing. Say you have two files, EXISTING and DELETED. Now this:
will restore DELETED. But this:
will only restore EXISTING. That might be expected behaviour to a developer, but to a user it’s a bug.
Thanks for the block grep suggestion, but I don’t know where in the file the content is, and it might be after the first block. It also seems nontrivial to handle the case where it’s split across a block boundary.
Right now ISTM the easiest solution is a script that restores all N versions to …/dir$N and then to grep across all those dirs.
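A rough sketch of what such a script could look like in Python. Assumptions, none of which are confirmed in this thread: the CLI is on the PATH as `duplicati-cli` (on Windows it would be `Duplicati.CommandLine.exe`), `restore` accepts `--version=N` (with 0 as the most recent backup) and `--restore-path`, and leaving out the filename restores everything. The helper names and the `dirN` layout are made up for illustration, and an encrypted backup would need its passphrase passed as well.

```python
import subprocess
from pathlib import Path


def build_restore_command(storage_url, version, target_root, cli="duplicati-cli"):
    """Build the argument list for restoring one backup version into its own directory."""
    # Assumed layout: <target_root>/dir0, <target_root>/dir1, ...
    target = Path(target_root) / f"dir{version}"
    return [
        cli,                            # assumed executable name
        "restore",
        storage_url,
        f"--version={version}",         # assumed: 0 is the most recent backup
        f"--restore-path={target}",
    ]


def restore_all_versions(storage_url, version_count, target_root):
    """Restore versions 0..version_count-1, one directory per version."""
    for version in range(version_count):
        command = build_restore_command(storage_url, version, target_root)
        subprocess.run(command, check=True)  # stop at the first failed restore


def grep_tree(root, needle):
    """Yield every file under root whose raw bytes contain needle."""
    for path in sorted(Path(root).rglob("*")):
        # Reads each file fully into memory: slow and hungry, but simple.
        if path.is_file() and needle in path.read_bytes():
            yield path
```

Something like `restore_all_versions(url, n, "/some/scratch/dir")` followed by `list(grep_tree("/some/scratch/dir", b"needle"))` would then list every restored copy that contains the content, along with the version directory it came from.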
How, step by step, do you get the deleted file to show up in the GUI so you can check?
So if your long pattern is JFDFJSDLVMFIOUJSFIOASDJ, grep for the start of it and then for the end of it.
No matter where it splits across blocks, wouldn’t one of the greps identify some block?
To clarify, start and end don’t mean chop out the middle, but split the total string in two.
Regardless, do as you like. You don’t care about speed or space, so just script restore.
That will save a second step of mapping the block to a file, unless the desired answer is in the block itself.
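For reference, the two-grep idea can be sketched in Python as below. This assumes the blocks are already available as decrypted, uncompressed byte strings and that the pattern is shorter than one block; the function names are invented for the example. Wherever a block boundary cuts the pattern, at least one of the two halves lands entirely inside a single block, so searching for each half separately flags at least one block.

```python
def split_pattern(pattern):
    """Split pattern into a first and a second half; nothing is dropped."""
    if len(pattern) < 2:
        raise ValueError("pattern must be at least 2 bytes long")
    middle = len(pattern) // 2
    return pattern[:middle], pattern[middle:]


def candidate_blocks(blocks, pattern):
    """Return the indices of blocks that contain either half of the pattern."""
    first, second = split_pattern(pattern)
    hits = []
    for index, block in enumerate(blocks):
        # A block holding the whole pattern also holds both halves,
        # so no separate check for the full pattern is needed.
        if first in block or second in block:
            hits.append(index)
    return hits
```

A hit only marks a candidate block. With a short pattern the halves are shorter still, so unrelated blocks can match too, and each hit still has to be mapped back to a file.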
To see deleted files in the GUI I just enabled all-versions in the general settings. Then I click on the backup and select “restore”, and the latest version (which is selected by default) shows the deleted files. I guess that runs FIND in the background, whose result is inconsistent with RESTORE for the all-versions option (because RESTORE ignores this option).
As for grepping, I see what you mean, but if I split my pattern in two I suspect I’d get many false positives (it’s short). I may be wrong, but scripting the whole thing still sounds easier.