Fastest way to restore part of a backup

I’ve had a catastrophic failure.

I’ve pulled down all the files from the cloud to local disk. However, using the GUI, even with the latest Canary version, it is on track to take 5-6 days to recreate the database. I just need the files.

What is the absolute fastest way to restore a 50GB file from the ~700GB backup?

Hello and welcome to the forum!

Duplicati does need to recreate the local database before you can do restores. This process shouldn’t take too long, but it can under some complicated circumstances. What version were you using before your disaster?

I am hardly an expert, but I have experimented with database recreation over the past year. Some older versions of Duplicati may not have written dindex files properly. When you recreate the database, Duplicati encounters these dindex files and decides it needs to read some or all of the dblocks (a very lengthy process) in order to rebuild the database.

There is a way to rewrite the dindex files without the corruption, but it requires an intact database. With recent canary versions this year, I have been able to fix my dindex files, then do a database recreation, and have it complete in only 15 minutes or so on a 500GB backup.

In your case you may be forced to let the database recreation finish. If you go to About -> Show Log -> Live and choose Verbose in the dropdown, what does it show? You should see something like:

... processing blocklist volume X of Y

What do those numbers show?

Thanks for the reply.

Before the disaster I was using the latest Beta.

I tried to Restore using the latest beta, but looking at About>Show Log there, it looked like it would take about 20-23 days to recreate the database.

I have since downloaded the Canary version and am trying a restore there. It is currently at 150/14138, on track to take 4-5 days, which still seems excessive.

I am trying to use the recovery tool, but have not been successful in getting the syntax to restore just the one file I need.

Restoring files if your Duplicati installation is lost is the standard path. It builds a custom partial temporary database. That should finish faster than a full database recreate, which would have to account for all files and all versions. You can point it at the downloaded backup files to gain a little extra speed from the local access.

You’re in somewhat strange territory if you’re downloading blocklist volumes, so I don’t know if the Direct restore from backup files would suffer the same fate of having to look for missing block information. Anything on the progress bar past 90% is the last exhaustive search for info that may or may not be found. Please record any error messages if it finishes with a complaint. We don’t want to do a rerun to get those.

I don’t know if I’ve done a partial before. I’ve done a complete, but I don’t know how much space you have. The help text says “Use the filters, --exclude, to perform a partial restore.”, but it crashes on me if exclude matches something. Possibly a bug, but what I was really hoping for you was that --include would work…

If you have another system, you can certainly try different methods in parallel to see which works, and fast.

EDIT:

My crash on --exclude match:

Using set 0 with timestamp 10/3/2020 1:14:33 PM
Building lookup table for file hashes
Index file has 5 hashes in total
Building lookup table with 4 entries, giving increments of 1
Computing restore path
Restoring 1 files to C:\tmp\RecoveryToolRestore
Removing common prefix C:\backup source\length1.txt\ from files
 error: System.ArgumentOutOfRangeException: startIndex cannot be larger than length of string.
Parameter name: startIndex
   at System.String.Substring(Int32 startIndex, Int32 length)
   at Duplicati.CommandLine.RecoveryTool.Restore.MapToRestorePath(String path, String prefixpath, String restorepath)
   at Duplicati.CommandLine.RecoveryTool.Restore.Run(List`1 args, Dictionary`2 options, IFilter filter)

I can avoid a crash (and have exclude work) by omitting --targetpath and letting it restore to original path. Information on the theory behind that (and the workaround that works here for that sort of crash) is here. Depending on how different the desired path is from the undesired ones, it might be possible to exclude.

That’s the procedure I was following; however, the beta version was painfully slow at recreating the database. The latest canary version seems to be at least 4x faster, but will still take 5 days at this rate. It’s using about 20% CPU, 7.5 GB of RAM, and reading from local disk. It’s not bottlenecked by anything that I can see.
Currently using the main GUI.

I think there may be a bug: my original backup was made from a Linux Docker container, so when I do a Commandline --list, the folder hierarchy has the format /folder1/folder2/folder3. Even when I try to do a full restore of that into a Windows folder, it fails, saying it can’t parse the folder structure.

I tried something like this; according to the documentation, after the backup location you can enter a specific file to recover.

C:\Program Files\Duplicati 2>Duplicati.CommandLine.RecoveryTool.exe restore \\TOWER\Media\backups\microsoftbackup "/backups/192.168.2.2_seagate_backup_plus_drive/file1" --targetpath="C:\Users\name\Desktop\myrcovery\"

Program crashed:
System.Exception: Failed to parse the segment: /backup, invalid integer
at Duplicati.Library.Utility.Timeparser.ParseTimeInterval(String datestring, DateTime offset, Boolean negate)
at Duplicati.CommandLine.RecoveryTool.List.SelectListFile(String time, String folder)
at Duplicati.CommandLine.RecoveryTool.Restore.Run(List`1 args, Dictionary`2 options, IFilter filter)
at Duplicati.CommandLine.RecoveryTool.Program.RealMain(String[] _args)

I’ve been trying various ways of restoring this. The Canary build via GUI is the only one that seems to be working, although it is still recreating the partial database… I’ll make sure to capture any errors in case it doesn’t make it to 100%… in 4 days.

Got a link? While maybe there are some discrepancies, what I found in the documentation just has

Restoring files using the Recovery Tool

Duplicati.RecoveryTool.exe restore <localfolder> [version] [options]

<localfolder> is a required option. It should point to the location where your downloaded remote files are stored. Optionally add --targetpath to specify where files must be restored to, otherwise the files are restored to their original locations. Use filters or the --exclude option to perform a partial restore. See exclude and APPENDIX D Filters for more information.

The command’s help is a little closer, but it’s not clear what version being a filename is supposed to do.
The code looks like you can give it the filename of a dlist file rather than imply it from a numeric version.

Restore
-------
Duplicati.RecoveryTool.exe restore <localfolder> [version] [options]

Restores all files to their respective destinations. Use --targetpath to choose another folder where the files are restored into.
Use the filters, --exclude, to perform a partial restore.
Version can be either a number, a filename or a date. If omitted the most recent backup is used.

Code says if it can’t find the dlist filename, it tries it as a number. That explains “/backup, invalid integer”.

One other issue which may or may not matter is that on Windows you need a double backslash before the closing double quote; otherwise a single backslash will escape the double quote instead of ending the string:

C:\>help "hi"
This command is not supported by the help utility.  Try "hi /?".

C:\>help "hi\"
This command is not supported by the help utility.  Try "hi" /?".

C:\>help "hi\"ho"
This command is not supported by the help utility.  Try "hi"ho /?".

This commonly bites people on filters where a trailing backslash is how to signify a folder in filter syntax.

\\TOWER\Media\backups\microsoftbackup is local disk? If it’s a share, it’s slow, but better than cloud.

I can’t find that Duplicati message, although sometimes Microsoft messages are shown inside Duplicati’s messages. Cross-OS restores do require not going to the original path, but it sounds like you heard that earlier.

If you have an interest in trying RecoveryTool, filter syntax might be able to turn --exclude into an --include using some exotic regular expression features. The below uses zero-width negative lookahead assertion:

--exclude="[(?!^C:\\backup source\\short\.txt$).*]"

I’m asking to include C:\backup source\short.txt by excluding everything (i.e. the .* part) that isn’t that path.
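To sanity-check that lookahead trick outside Duplicati, the same pattern shape can be tried in any PCRE engine, for example GNU grep’s -P mode (the paths below are made up purely for the demo; Duplicati’s own filter syntax wraps the regex in square brackets as shown above):

```shell
# Made-up paths; only /data/keep.txt should survive an "exclude everything else" pattern.
printf '%s\n' '/data/keep.txt' '/data/drop1.txt' '/data/drop2.txt' > paths.txt

# Count the lines the exclude pattern matches, i.e. everything that is NOT the kept path:
grep -cP '^(?!/data/keep\.txt$)' paths.txt    # prints 2 (the two drop*.txt lines)
```

The negative lookahead consumes nothing; it only succeeds when the whole line is not the path being kept, which is what turns an exclude into an effective include.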

EDIT:

Added a backslash in front of the dot in the regular expression. Not critical, but that matches literal dot.
Also confirmed that filename of a dlist file for a version is taken, and if I typo it, it rejects it as a number.

What you have quoted I’m interpreting as restore just a single file:

Version can be either a number, a **filename** or a date. If omitted the most recent backup is used.

But maybe I’m reading that wrong.

It is an UnRAID server where the SMB share sits, on which the Windows VM is hosted as well. So it’s using the SMB protocol, but not actually going out onto the wire, just fetching within a virtual network. It’s the best I can do at the moment.

Here is the command I’m trying with your regex, where the exclude really should be an include (thanks for crafting that for me). I think it’s still an issue where the original backup was on Linux but the restore is into Windows. We need to try to figure out the syntax for that…

C:\Program Files\Duplicati 2>Duplicati.CommandLine.RecoveryTool.exe restore \\TOWER\Media\backups\microsoftbackup 0 --exclude="[(?!^/backups/192.168.2.2_seagate_backup_plus_drive/UnRAID/Docker Backup/2020-08-26@01.00/CA_backup.tar.gz$).*]" --targetpath="C:\Users\name\Desktop\myrcovery\\"
Building lookup table for file hashes
Index file has 8214887 hashes in total
Building lookup table with 2047 entries, giving increments of 4013
Computing restore path
Restoring 1 files to C:\Users\name\Desktop\myrcovery\
Removing common prefix /backups/192.168.2.2_seagate_backup_plus_drive/UnRAID/Docker Backup/2020-08-26@01.00/CA_backup.tar.gz\ from files
 error: System.ArgumentOutOfRangeException: startIndex cannot be larger than length of string.
Parameter name: startIndex
   at System.String.Substring(Int32 startIndex, Int32 length)
   at Duplicati.CommandLine.RecoveryTool.Restore.MapToRestorePath(String path, String prefixpath, String restorepath)
   at Duplicati.CommandLine.RecoveryTool.Restore.Run(List`1 args, Dictionary`2 options, IFilter filter)

If you take the more obvious options out, you have “Version can be … a filename”.
It’s still a version. The question was how that happens. The answer is dlist’s date.
As I tested, it works with dlist filename, e.g. duplicati-20201003T171434Z.dlist.zip
As tested by both of us, it fails given an arbitrary filename, on the conversion, e.g.:

Program crashed:
System.Exception: Failed to parse the segment: typo-duplicati-20201003T171433Z.dli, invalid integer
   at Duplicati.Library.Utility.Timeparser.ParseTimeInterval(String datestring, DateTime offset, Boolean negate)
   at Duplicati.CommandLine.RecoveryTool.List.SelectListFile(String time, String folder)
   at Duplicati.CommandLine.RecoveryTool.Restore.Run(List`1 args, Dictionary`2 options, IFilter filter)
   at Duplicati.CommandLine.RecoveryTool.Program.RealMain(String[] _args)

Your startIndex error was commented on earlier, and the workaround for me was to not use --targetpath, meaning it will restore to original path. What’s different for you (and might change behavior) is cross-OS restore. I’m hoping it will do something helpful like create the sort-of-equivalent path-from-root for you, or perhaps you helping it with manual creation will help, or maybe it just won’t be able to run cross-platform.

It looks like the regex works as intended, but I think the cross-platform difference may be biting us. I’ll try doing it on Linux tomorrow.

C:\Program Files\Duplicati 2>Duplicati.CommandLine.RecoveryTool.exe restore \\TOWER\Media\backups\microsoftbackup 0 --exclude="[(?!^/backups/192.168.2.2_seagate_backup_plus_drive/UnRAID/Docker Backup/2020-08-26@01.00/CA_backup.tar.gz$).*]"
Building lookup table for file hashes
Index file has 8214887 hashes in total
Building lookup table with 2047 entries, giving increments of 4013
Restoring 1 files to original position
Removing common prefix /backups/192.168.2.2_seagate_backup_plus_drive/UnRAID/Docker Backup/2020-08-26@01.00/CA_backup.tar.gz\ from files
error: System.IO.IOException: The filename, directory name, or volume label syntax is incorrect.

I’ve tried duplicati-cli, and it just seems to go through the same steps the GUI goes through.

I don’t see an equivalent to the Duplicati RecoveryTool on Linux.

11427/14138 on the main GUI restore… let’s hope it goes through.

Find your Duplicati installation, maybe in /usr/lib/duplicati, and run mono Duplicati.CommandLine.RecoveryTool.exe

If you look at the text contents of duplicati-cli, you’ll see it’s just a convenience script that does a similar thing.

Here’s the heart of duplicati-cli. On some systems you need to actually say mono to get the .exe running:

EXE_FILE=${INSTALLDIR}/Duplicati.CommandLine.exe
APP_NAME=Duplicati.CommandLine

exec -a "$APP_NAME" mono "$EXE_FILE" "$@"

Duplicati.CommandLine is basically the CLI version of what the GUI does. I hope GUI restore works.

Cool I did not know about mono…

Thank you for all your help… trying the command gives a rather cryptic error.

/usr/lib/duplicati$ mono Duplicati.CommandLine.RecoveryTool.exe help restore | head - 10
==> standard input <==
Duplicati Recovery Tool

This tool performs a recovery of as much data as possible in small steps that must be performed in order.
We recommend that you use Duplicati.CommandLine.exe to do the restore, and rely only on this tool if all else fails.

The steps to perform are:

head: cannot open ‘10’ for reading: No such file or directory
@testserver:/usr/lib/duplicati$ mono Duplicati.CommandLine.RecoveryTool.exe restore 0 --exclude='[(?!^/backups/192.168.2.2_seagate_backup_plus_drive/UnRAID/Docker Backup/2020-08-26@01.00/CA_backup.tar.gz$).*]' --target-path="/home/restore/"
Folder not found: /usr/lib/duplicati/0

mono Duplicati.CommandLine.RecoveryTool.exe restore 0 --exclude='[(?!^/backups/192.168.2.2_seagate_backup_plus_drive/UnRAID/Docker Backup/2020-08-26@01.00/CA_backup.tar.gz$).*]'
Folder not found: /usr/lib/duplicati/0

Which cryptic error?

The standard input text is due to a space. Maybe you meant head -10 but typed head - 10.

head: cannot open ‘10’ for reading: No such file or directory is the same typo.
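The difference is easy to reproduce with a throwaway file (the file name and contents here are made up):

```shell
# "head -2" prints the first 2 lines of the file.
# "head - 2" instead reads stdin ("-") and then tries to open a file literally
# named "2", producing the "cannot open" error seen above.
printf 'l1\nl2\nl3\n' > demo.txt
head -2 demo.txt    # prints l1 and l2
```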

head man page

head [OPTION]… [FILE]…
Description

Print the first 10 lines of each FILE to standard output. With more than one FILE, precede each with a header giving the file name. With no FILE, or when FILE is -, read standard input.

made me focus on solving “one file” portion, but maybe there’s a general issue with syntax here, which is:

Duplicati.RecoveryTool.exe restore <localfolder> [version] [options]

The item after restore is the folder, which you have (I hope) filled with download and index steps already.
Folder not found: /usr/lib/duplicati/0 is because you said the folder is 0 while in /usr/lib/duplicati
Basically you left off the folder to restore from, so the version number took its place and confused things…

Recovering by using the Duplicati Recovery tool gives the general flow, and for help text just use the help option and scroll through it or pipe to a pager. Because you already made an SMB folder for the Windows test, Linux can probably use it, and save some time compared to downloading from a far remote location.
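For reference, my understanding of the overall flow on Linux is roughly the sketch below. The subcommand names come from the tool’s own help text; the URL and folder paths are placeholders, and the download step can be skipped if the backup files were already copied locally:

```shell
# Placeholder URL and paths; run from the Duplicati install folder.
mono Duplicati.CommandLine.RecoveryTool.exe download "file:///mnt/backup" /tmp/work   # fetch and decrypt the remote files
mono Duplicati.CommandLine.RecoveryTool.exe index /tmp/work                           # build index.txt mapping block hashes to dblock files
mono Duplicati.CommandLine.RecoveryTool.exe restore /tmp/work 0 --targetpath=/tmp/out # restore the most recent version
```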

I’m not sure if --target-path will choke with a startIndex error. It might work, as it seems to be in code that handles drive letters (which you don’t have in either source files or your folder). It’s worth trying it anyway.

It was just late at night and I wasn’t thinking straight; I’ve managed to get the syntax to work.

So we got to the end of the GUI restore, and I am just a bit agitated.

I gave the VM 300GB of a local disk to restore the 50GB file, with the 700GB backup on the network share.

Just got home and the local disk is full with duplicati metadata. It’s still trying to pull down blocks, but can’t because all 300GB are full with duplicati files.

  1. Does the computer I’m restoring from require at least the size of the backup + the size of the restore? In this scenario, 700 + 50?

  2. The restore hasn’t failed, but it’s attempting to keep pulling down blocks. Any way to salvage this 5 day restore?

  3. Can you help me understand why the restore is so slow? Sure, it could take 15-24 hours and that would make sense. But 5 days just to generate an index? It is not bottlenecked anywhere in terms of resources, reading effectively from local disk…

  4. Can I steal the temporary index file that was created and use it somehow? I’m not keen on having another 5 day restore once again fail.

Above syntax issue was in RecoveryTool, but then we head to GUI:

seems to conflict with below. I’m not sure which one is discussed…

which now sounds more like Duplicati.CommandLine.RecoveryTool.exe. Here’s a sample index.txt file:

474MCuFW3/VCw1x00cxfz2o44JIw9N+RzUVxW02nx7Q=, duplicati-ba303c5deab3d494d888f3afc3dff51a1.dblock.zip
9nqxCtTkxTEhtqX+TanBDd7pBbl403iNJyPXv6y+KKk=, duplicati-ba303c5deab3d494d888f3afc3dff51a1.dblock.zip
h8rHSAL59e8cLlPlVnGMV9P8rpwpBqmvYiD5vHMrRcs=, duplicati-ba303c5deab3d494d888f3afc3dff51a1.dblock.zip
manifest, duplicati-ba303c5deab3d494d888f3afc3dff51a1.dblock.zip
uba1z2XgTiXR9zNGHuFb5wqN3TnMIdYpVbtHli/9nwA=, duplicati-ba303c5deab3d494d888f3afc3dff51a1.dblock.zip

It looks like it should be portable, except maybe for line endings. Windows uses CRLF, Linux just LF, however there are editors and utilities to fix that if need be. It’s a map of block hash to containing file.

So if you want to try RecoveryTool on Linux, you can probably use or adapt (line ends) prior index.txt.
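If the endings do turn out to need fixing, stripping the carriage returns is a one-liner; the index.txt content below is invented just to show the conversion:

```shell
# Invented two-line index.txt written with Windows CRLF line endings:
printf 'hash1, duplicati-b1.dblock.zip\r\nhash2, duplicati-b1.dblock.zip\r\n' > index.txt
# Drop every CR so Linux tools see plain LF endings:
tr -d '\r' < index.txt > index-unix.txt
```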

was the previous RecoveryTool problem, and hope was that it would be avoided by running on Linux.
I don’t think there are more downloads needed (everything is already local via SMB), just extractions.

The downloading part sounds more like GUI restore behavior, but I’m not certain what step you’re on.
“Building partial temporary database” does downloads of dlist and dindex files, and has to locate the needed blocks to their containing dblock file very much like index.txt does, but in an SQLite database.

Ordinarily the index files say where all blocks are, but a missing block can force a full dblock search. Regular recreate shows this slow spot at the 90% to 100% range on progress bar. and you can also observe the action in About --> Show log --> Live --> Information to see what files you’re processing.
The “processing blocklist volume” mentioned earlier is dblock files. Search was also described here.

Having to download all dblock files is not normal. If needed, it adds a lot to the time to create the DB.
If the slow spot is before the 70% mark or so on progress bar, time is probably from inserting blocks.
Default block size is 100 KB, so if that’s what you have, full recreate of 700 GB has 7 million inserts.
Partial (as in direct restore from backup files) should be faster, but might still need full dblock search.
There’s also some overhead on downloading, decrypting, and unzipping the files to extract contents.
If the hope was that moving to Linux would make the DB better than Windows does, I don’t see how.
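The 7 million figure above is just the block count, i.e. backup size divided by block size:

```shell
# 700 GB of source data at the 100 KB default block size:
echo $(( 700 * 1000000000 / 100000 ))   # prints 7000000, i.e. ~7 million blocks (and inserts)
```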

There is no index to steal, but there is a partial temporary database (not really any way to reuse that).

In terms of downloads, if you’re in DB creation, then see above. If you’re in actual restore that follows, downloads should only be the dblock files that are needed for whatever is being restored. By default, there’s even an attempt to look at original source file paths to see if any blocks are available there, so theoretically one could do a “restore” without downloads. I’m pretty sure these dblock files download immediately before blocks are pulled out and put in the file being rebuilt in the restore area. It will look somewhat strange as it’s reassembled a block at a time. If you don’t see it, you’re not restoring yet…

If you can say whether you’re asking about RecoveryTool, GUI direct restore, both, or something else, perhaps I can answer a little better, however I’m not familiar with every last detail of the processing…
Please clarify what steps you’ve done. I see “got to the end”, but also hear talk of stuck-in-the-middle.

I’m also kind of curious why GUI is (maybe) being tried. I thought next try was RecoveryTool on Linux.

Appreciate your patience.

I am attempting 2 restores in parallel

  1. Windows via GUI
  2. The CommandLine RecoveryTool on Linux, in case it finished faster (or just in case this happened, where the GUI method failed/stalled/didn’t work, so I wouldn’t have to wait 5 days to start from scratch and would have another attempt running in the background)

Regarding the Windows GUI

  • We are stuck at Recreating Database (because there is no more disk space, we had 260 GB free when we started)
  • We are 90.00000036% progress in Duplicati.
  • Messages under Verbose logging:

Oct 7, 2020 7:34 PM: Backend event: Get - Retrying: duplicati-b69c64479d7914c91bff7264ce753b6b7.dblock.zip.aes (49.99 MB)
Oct 7, 2020 7:34 PM: Operation Get with file duplicati-b69c64479d7914c91bff7264ce753b6b7.dblock.zip.aes attempt 3 of 5 failed with message: There is not enough space on the disk.
Oct 7, 2020 7:34 PM: Backend event: Get - Started: duplicati-b69c64479d7914c91bff7264ce753b6b7.dblock.zip.aes (49.99 MB)
Oct 7, 2020 7:34 PM: Backend event: Get - Retrying: duplicati-b69c64479d7914c91bff7264ce753b6b7.dblock.zip.aes (49.99 MB)
Oct 7, 2020 7:34 PM: Operation Get with file duplicati-b69c64479d7914c91bff7264ce753b6b7.dblock.zip.aes attempt 2 of 5 failed with message: There is not enough space on the disk.

And obviously not changing due to space requirements.

  • C:\Users\user\AppData\Local\Temp is 263 GB, and filled with files prefixed with dup-. The majority are 50 MB (my block size), plus a couple of files that are 1.3 and 1.8 GB.
  • I have been watching the restore like a hawk for the past 5 days, and only on the 5th day did it start downloading blocks locally. For the past 4 days the disk on which the operating system resides had nothing going on.

My hope was to take the temporary database the GUI restore created in order to not lose this restore process, and have to wait another 5 days (cause I understand I probably have to restart). I get that’s probably not possible.

Regarding the Linux Commandlinerecoverytool,
I am currently running the download command to decrypt the files, and will initiate an index creation in the morning. The Linux VM restoring this data only has a 60GB drive, so I hope I can use the appropriate flags to keep the indexing from stalling due to space constraints again.

Concrete Questions
Is there any way to salvage the Windows GUI restore? We’re at 90%, with a full drive (why is still a mystery). If not, then this restore has effectively failed; that’s what I was implying earlier.

Does the VM that is doing the restore require at least BackupSize + RestoreSize worth of hard drive space (700GB + 50GB in my scenario)? This is unintuitive considering I did a partial restore via the GUI, not a full restore, yet it is still using 260GB of local space.

Why is index creation so slow? 5 days seems excessive to read and process 700GB.

That explains the mix. I assume GUI is not DB Recreate, but “Direct restore from backup files”.
I just tested Recreate on latest Canary and 2.0.5.1, from a local folder destination of a bad backup.
Because it was bad (I had a broken one from some attempts to break it), it downloaded dblocks;
however, they never accumulated. Watching with Sysinternals Process Monitor on a 50 MB file,
the file was written (slowly, because I have a throttle), read in less than 1/10 second, then deleted.

Searching in GitHub Issues and forum with Google found neither question nor issue like yours.

You’ll clearly need the restore size at least once. Based on my Recreate test, dblock searches
looking for missing blocks require no more space, yet you’re seemingly seeing files build up…

Because my files were transient, the view from dir /od dup-* showed empty files cycling by.
The view from File Explorer sorted by date showed the same thing. No file build-ups ever seen.

Direct restore test file action was hard to follow in this last-second experiment, so less certain.

For a Recreate, it looks like the DB gets put together in its normal location. For DB used for the
Direct restore from backup files, the DB is one of the dup- Temp files. Process Monitor
can hint it’s an SQLite DB because of the file suffixes of -journal and -wal sometimes shown.
Ultimately, opening it with DB Browser for SQLite shows it’s a DB. The one I looked into was the
one made in preparation for showing you the version dropdown and file list. That’s what it made.

Ideally it should be allowed to finish. I don’t know if you have any way to add drive space right now.

Where the database is may depend on whether you’re doing Recreate or Direct restore, as above.
If you can identify it and steal it, you might be able to get it to a somewhat working shape, however
what you probably want is your entire file, with all its blocks, and so the block-search is concerning.

The RecoveryTool also cannot completely restore a file if some of its blocks are missing, but it will
try harder (and with more tolerance for oddities) than regular restore. I forget how/if it signals issue.
Best plan if RecoveryTool succeeds is probably to test the file to see if everything appears working.

Assuming you don’t mean RecoveryTool index.txt, but the time up to the 5th day when dlist and dindex files
downloaded, it’s probably the 7 million SQL INSERT operations mentioned earlier. The chance to check
for SQL operation time is gone now, but the way would have been to use a log at profiling level to observe.

One question in the above is where your database was forming. You might still be able to see activity
using Process Monitor and filtering to include paths including journal or wal. I’m not sure if downloads generate database records when they don’t complete though. Sysinternals Process Explorer can list
open files from the Find menu item, so you can see if a dup- file is always open. It might be your DB.

After running:

sudo mono Duplicati.CommandLine.RecoveryTool.exe restore /mnt/smb/msbackup2/decrypted/ 3 --exclude='[(?!^/backups/192.168.2.2_seagate_backup_plus_drive/UnRAID/Docker Backup/).*]' --target-path="/mnt/smb/msbackup2/restore2"

I GOT MY FILES BACK! A bit older than I wanted, but that was ok.
This process was lightning fast.
Decrypt of ~650GB took 9 hours
sudo mono Duplicati.CommandLine.RecoveryTool.exe restore /mnt/smb/msbackup2/decrypted/ took less than 10 minutes.

And the recovery above took 15-20 minutes. The key was using any version other than 0. It looks like that version may have gotten corrupted somehow, so the software was working hard to repair it. Using previous versions worked much faster!

Thanks for all your help ts678.
