Unable to complete restore over NFS - Duplicati.Commandline.Recovery.exe

Hi there,

I need help with an issue that’s preventing me from restoring my Duplicati backup. To make a long story short, I was using Duplicati to back up all my files to Backblaze B2 on a regular basis up until the start of October. At that point all of my drives failed at the same time, and I lost the database file for my backup server, which wasn’t backed up, as I assumed only the dataset itself was needed to complete a quick restore.

The total dataset in question is around 4TB and contains around 1,651,300 individual files spread over 18,673 dblock files of 150MB each. I have yet to recover these files, and I’m at a loss as I keep running into issue after issue.

When this occurred, I first tried to simply restore the files via the Duplicati Web UI on a fresh installation with new drives. Without the database, and with the blocks encrypted and being pulled directly from B2, Duplicati crashed after a whole two weeks without ever completing the index rebuild. I believe a canary update came out during that time, but I didn’t see any improvement after trying again with that specific patch version for another whole week.

Seeing as the Web UI route wasn’t getting me anywhere, I set out to follow the Disaster Recovery guide and attempt a command-line restore: https://duplicati.readthedocs.io/en/stable/08-disaster-recovery

Out of frustration, I downloaded all the files from Backblaze onto my rebuilt host, which was now running FreeNAS. This download alone cost me around $100 CAD.

One issue I did not factor in is that I unfortunately wasn’t able to install Duplicati on the host directly, or via a jail. I can’t financially afford to pull this dataset from the cloud again for a few more months, nor do I have enough space to move it elsewhere, so everything needs to be done directly from where the files are now stored.

As such, I’ve resorted to creating an NFS share on the FreeNAS host where my dataset is stored, and I have an Ubuntu VM on the same machine connected to that share and running Duplicati’s Recovery tool. The VM runs Ubuntu 18.04.3 LTS with 4 cores and 8GB of RAM assigned to it.

I first had to decrypt each file. This process failed about 3-4 times due to out-of-bounds exceptions, but I managed to decrypt every archive.
The same thing happened while indexing: if the indexing process stopped for whatever reason, Duplicati was unable to resume the indexing session, as it gave an error about running out of memory.

After going through all this trouble, I now have all the dblock.zip files decrypted and indexed. I was able to start a partial restore and can see all the file names, so the whole dataset appears to be intact.

However, I have been running into the following problem mid-restore, here’s one example:

331239: /mnt/data/Restored/srv/shared/Backups/BungeePC/BungeePC-Mar2019/Documents/Native Instruments/Massive/Sounds/!Dubstep/Dubstep lead 1/lead 3 (4).nmsv (2.74 KB) error: System.IO.IOException: Too many open files
at System.IO.FileStream..ctor (System.String path, System.IO.FileMode mode, System.IO.FileAccess access, System.IO.FileShare share, System.Int32 bufferSize, System.Boolean anonymous, System.IO.FileOptions options) [0x0025f] in <8f2c484307284b51944a1a13a14c0266>:0
at System.IO.FileStream..ctor (System.String path, System.IO.FileMode mode, System.IO.FileAccess access, System.IO.FileShare share) [0x00000] in <8f2c484307284b51944a1a13a14c0266>:0
at (wrapper remoting-invoke-with-check) System.IO.FileStream:.ctor (string,System.IO.FileMode,System.IO.FileAccess,System.IO.FileShare)
at Duplicati.CommandLine.RecoveryTool.Restore+CompressedFileMRUCache.ReadBlock (System.String filename, System.String hash) [0x00020] in <3f063307a1b340f3ac9162f9c12d9a98>:0
at Duplicati.CommandLine.RecoveryTool.Restore+HashLookupHelper.ReadHash (System.String hash) [0x00068] in <3f063307a1b340f3ac9162f9c12d9a98>:0
at Duplicati.CommandLine.RecoveryTool.Restore+HashLookupHelper.WriteHash (System.IO.Stream sw, System.String hash) [0x00000] in <3f063307a1b340f3ac9162f9c12d9a98>:0
at Duplicati.CommandLine.RecoveryTool.Restore.Run (System.Collections.Generic.List`1[T] args, System.Collections.Generic.Dictionary`2[TKey,TValue] options, Duplicati.Library.Utility.IFilter filter) [0x0041e] in <3f063307a1b340f3ac9162f9c12d9a98>:0

After a certain number of files, Duplicati starts throwing System.IO.IOException: Too many open files for every remaining file; the rest of the restore fails and the error spams the console.
As this is all running under Linux, I attempted to raise the file descriptor limit on both the VM and FreeNAS host as high as I could to try and resolve the issue, but this did not fix the problem.

I am super close to getting my files back but unsure where to go from there. Does anyone have any idea how I can get past this error message?

Any help is appreciated.

Welcome to the forum @Bungeetaco

Can you say more about that? I think there’s both a per-user limit (in hard and soft flavors) set with ulimit, and a system-wide limit in /proc/sys/fs/file-max. How to set ulimit and file descriptors limit on Linux Servers

I’m not sure it’s normal to have to raise the open file limit. It might be a bug, but you need a workaround. Possibly during some run you could use lsof or fuser to try to find out which process is holding what open.
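For what it’s worth, a quick generic check (nothing Duplicati-specific) of the limits the restore process itself inherits can be done with a few lines of Python run from the same shell. The soft value is the one "Too many open files" is measured against, and an unprivileged process can only raise it as far as its hard limit:

import resource

# Per-process limit on open file descriptors, reported as (soft, hard).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open files: soft=%d hard=%d" % (soft, hard))

# A process may raise its own soft limit up to the hard limit; raising the
# hard limit itself needs root (e.g. via limits.conf) before starting mono.
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
print("soft limit is now", resource.getrlimit(resource.RLIMIT_NOFILE)[0])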

Hi @ts678,

Thank you for reading through my post; I probably should have mentioned the current system limits in it as well.

Here are the current limits on my VM:

bungee@restoredafiles:~$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 31545
max locked memory (kbytes, -l) 16384
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 31545
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

I did look into the guide you posted above during my troubleshooting, but that did not resolve the problem. I think Duplicati might actually be hitting the kernel hard limit, which is scaled to the amount of RAM I have (8GB), so it will only go as far as roughly 800K files, which is what I’ve observed.

lsof | wc -l returned the following when it started failing:
6625

That number seems closer to the stack size, which again is bound to the amount of RAM. Seeing as I have almost 17,000 files, maxing out the RAM on my host isn’t a possibility.

I’ve tried to approach this a different way and ran into a different problem.

I’ve removed one of the two 10TB drives from the ZFS pool that contains my dblock files. That left me with one drive still holding all the files, available over Samba, and one drive that I reformatted to NTFS and plugged into my PC, where that Samba share is mounted as a network drive.

When I attempt the restore on my Windows PC, this is the error I get with the following command:

.\Duplicati.CommandLine.RecoveryTool.exe restore Z:\Duplicati\Decrypted\ 0 --targetpath="J:\Restored"

1: J:\Restored (9.84 KB) done!
error: System.IO.IOException: The target file "J:\Restored" is a directory, not a file.
at System.IO.File.InternalCopy(String sourceFileName, String destFileName, Boolean overwrite, Boolean checkHost)
at Duplicati.CommandLine.RecoveryTool.Restore.Run(List`1 args, Dictionary`2 options, IFilter filter)

If I try the same thing with just J: instead of J:\Restored for the targetpath, I get the following:

Parameter name: path
at System.IO.Directory.CreateDirectory(String path)
at Duplicati.CommandLine.RecoveryTool.Restore.Run(List`1 args, Dictionary`2 options, IFilter filter)

Z: is the Samba share containing the dblock files, and J: is the 10TB drive I took out of my FreeNAS host to restore the files to.

Any ideas why this isn’t working? I’m following the official documentation.
This is also the reason I tried switching to the Linux VM in the first place.

I’m losing track of the systems here. Is the backup server the one whose drives failed, taking out not only the source files but also the Duplicati database? In what way is it a backup server? Is there another backup going to it?

One reason I ask is that another option to try is Independent restore program in Python. How’s your Python? Does it look like this restores timestamps or just the files, and how essential are the file timestamps? Duplicati backup files don’t care about their own timestamps, but I don’t know whether your backup will care.

We’re getting into some rarely used things. Possibly your open-file problem is only in the RecoveryTool, and a regular Duplicati restore from the GUI or CLI would work – if you could get it installed, and if other things held.

Keeping NFS in the mix potentially adds more questions. Can’t most VM systems share host files with the VM in some limited way (I guess you want to be super-sure nothing you do breaks your only download)?

But the latest approach sounds like a Windows PC with access to backup files via either Samba or USB.

Is it only dblock files? Regular operations also need dlist and dindex files. I think the RecoveryTool does without dindex, but it needs dlist. There’s also a step in the directions about building an index file RecoveryTool-style. There was talk earlier of indexing, so I suspect there’s an index, but it wouldn’t hurt to verify what’s there…

IIRC the index file is used to find blocks (identified by hash), which are then pulled out to rebuild the original files. Judging by the source, yours looks like it probably got as far as making a temporary file, then choked on the copy-out.
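To illustrate the mechanism (this is only a conceptual sketch in Python, not Duplicati’s actual code): the restore looks each block hash up in the index to find which dblock archive holds it, and keeps recently used archives open in a cache so the same zip isn’t reopened for every block. If such a cache never closes handles, a 1024 soft limit on open files runs out long before 18,673 dblock archives do:

import zipfile
from collections import OrderedDict

class ArchiveCache:
    """Toy most-recently-used cache of open dblock archives (illustrative only)."""
    def __init__(self, max_open=64):
        self.max_open = max_open
        self.open_archives = OrderedDict()  # archive filename -> open ZipFile

    def read_block(self, archive, block_hash):
        z = self.open_archives.pop(archive, None)
        if z is None:
            z = zipfile.ZipFile(archive)    # each open archive costs one file descriptor
        self.open_archives[archive] = z     # mark as most recently used
        # Without this eviction step, every archive ever touched stays open,
        # and the per-process descriptor limit is eventually exhausted.
        while len(self.open_archives) > self.max_open:
            _, oldest = self.open_archives.popitem(last=False)
            oldest.close()
        # Assumes blocks are stored under their hash, as the
        # ReadBlock(filename, hash) call in the stack trace suggests.
        return z.read(block_hash)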

Is there any other context output before that to show where this was? The possible failure (and source code which might have made messages) is below, just guessing by seeing where a Copy might occur:

Documentation shows the following example of what I mean by context – what led up to the problem?

Sorting index file ... done!  
Building lookup table for file hashes  
Index file has 2047 hashes in total  
Building lookup table with 2046 entries, giving increments of 1  
Computing restore path  
Restoring 75 files to C:\Restore  
Removing common prefix C:\Users\User\ from files

Your command taken in isolation looks to me as documented, but I haven’t set up a test. First, does the output you get leading up to the error look sane for your situation based on the example shown above?

Because you’re doing a cross-platform restore (original has forward slashes, restore has backslashes), I’m wondering whether that difference could be tripping things up. Regular GUI restore can deal with it, somewhat, but IIRC it won’t let you restore to original file paths – you have to point it to a restore folder.

How the backup process works shows how paths, slashes, and maybe drive letters appear in dlist files.

EDIT: Not a perfect test, but I tried the RecoveryTool on an existing Windows backup, restored to Windows, and it worked (as it has been doing for a while, AFAIK). Restoring an existing Linux backup to Windows did not work, so I custom-tailored one with two small files. This failed as shown below at first; then I edited filelist.json in the dlist file to add a drive letter (leaving the slash differences alone), and it worked. This seems a really ugly workaround, though, and perhaps trying a RecoveryTool restore to Linux would be a better approach.

C:\ProgramData\Duplicati\duplicati-2.0.4.36_canary_2019-12-08>Duplicati.CommandLine.RecoveryTool.exe restore C:\RecoveryTool --targetpath="C:\RecoveryTool_targetpath"
Sorting index file ... done!
Using set 0 with timestamp 12/9/2019 12:45:53 PM
Building lookup table for file hashes
Index file has 5 hashes in total
Building lookup table with 4 entries, giving increments of 1
Computing restore path
Restoring 2 files to C:\RecoveryTool_targetpath
0: C:\RecoveryTool_targetpath (2 bytes) done!
 error: System.IO.IOException: The target file "C:\RecoveryTool_targetpath" is a directory, not a file.
   at System.IO.File.InternalCopy(String sourceFileName, String destFileName, Boolean overwrite, Boolean checkHost)
   at Duplicati.CommandLine.RecoveryTool.Restore.Run(List`1 args, Dictionary`2 options, IFilter filter)
1: C:\RecoveryTool_targetpath (2 bytes) done!
 error: System.IO.IOException: The target file "C:\RecoveryTool_targetpath" is a directory, not a file.
   at System.IO.File.InternalCopy(String sourceFileName, String destFileName, Boolean overwrite, Boolean checkHost)
   at Duplicati.CommandLine.RecoveryTool.Restore.Run(List`1 args, Dictionary`2 options, IFilter filter)

C:\ProgramData\Duplicati\duplicati-2.0.4.36_canary_2019-12-08>Duplicati.CommandLine.RecoveryTool.exe restore C:\RecoveryTool --targetpath="C:\RecoveryTool_targetpath"
Sorting index file ... done!
Using set 0 with timestamp 12/9/2019 12:45:53 PM
Building lookup table for file hashes
Index file has 5 hashes in total
Building lookup table with 4 entries, giving increments of 1
Computing restore path
Restoring 2 files to C:\RecoveryTool_targetpath
0: C:\RecoveryTool_targetpath\C/backup source/A.txt (2 bytes) done!
1: C:\RecoveryTool_targetpath\C/backup source/B.txt (2 bytes) done!

C:\ProgramData\Duplicati\duplicati-2.0.4.36_canary_2019-12-08>

Notepad-edit filelist.json with drive letter added in front of the original Linux forward-slashed pathname:

[{"type":"File","path":"C:/backup source/A.txt","hash":"BvlhuAK8Ru4WhVXwZtKPTw6a/fP4gXTB7m+d4AT8MKA=","size":2,"time":"20191209T174422Z","metahash":"GSk37v/IInH+jzfV2NXAPguQlbs7VN0GFhtH10EoqFw=","metasize":193},{"type":"File","path":"C:/backup source/B.txt","hash":"wM3nf6j++X1HbBCq09LVT8wvM2FA0HNlHC3Mzx43n9Y=","size":2,"time":"20191209T174436Z","metahash":"F52HR7pEdAbPn9p6rK/6SHkzB69QBf/WTmek90qzvV0=","metasize":193}]

EDIT 2: Drive letter used as dummy doesn’t matter. It turns into a folder (without a colon) in targetpath. This info is more for any developer who wants to dig in. I still hope that this workaround is not required.
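If hand-editing JSON at your file counts is too error-prone, the same edit can be scripted. This is only a sketch of the manual Notepad change above: it assumes you have already extracted filelist.json from a copy of the dlist, and it prefixes a dummy drive letter onto every original Linux-style path before you put the file back into the zip:

import json

# Hypothetical file names; always work on a copy of the extracted filelist.json.
with open("filelist.json", encoding="utf-8") as f:
    entries = json.load(f)

for entry in entries:
    # Mirror the manual edit: "/srv/..." becomes "C:/srv/...", slashes untouched.
    if entry["path"].startswith("/"):
        entry["path"] = "C:" + entry["path"]

with open("filelist-edited.json", "w", encoding="utf-8") as f:
    json.dump(entries, f)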

If any other users or developers have other ideas, feel free to join in on the action with your thoughts…

There’s only one box. This box was running a VM, and in this VM, Duplicati ran. This was the backup server in question. The drives where this installation was stored died. I then got new drives to replace the ones that died, and those drives are connected to the same machine that my backup server used to run on.

The way I had it setup, all my other VMs, including my fileserver, were available over NFS, and were mounted on the Duplicati VM. Duplicati saw these mounted folders as local ones and would back them up on a regular basis.

I did try the script you linked during my troubleshooting. Unfortunately, I was not able to get it to work on two different systems (Linux and Windows); it seems the modules have been updated and are no longer recognized by the script:

C:\Users\r00t>pip3 install pyAesCrypt
Collecting pyAesCrypt
Downloading https://files.pythonhosted.org/packages/c5/99/ebae89da08a443d0191eff631d66b71955eec7e370e2a309d85e499ca18a/pyAesCrypt-0.4.3-py3-none-any.whl
Collecting cryptography
Downloading https://files.pythonhosted.org/packages/a4/9c/9ad11ad293bb75d37fa63a0d1f47ae7f6ccda32bc7452bf2ffb788ca753f/cryptography-2.8-cp37-cp37m-win_amd64.whl (1.5MB)
|████████████████████████████████| 1.5MB 1.3MB/s
Collecting cffi!=1.11.3,>=1.8
Downloading https://files.pythonhosted.org/packages/71/71/6011c17e8a523be0be20149ed7ba7e8671db7ec6fc52868e4f9a49d527f8/cffi-1.13.2-cp37-cp37m-win_amd64.whl (175kB)
|████████████████████████████████| 184kB 3.3MB/s
Requirement already satisfied: six>=1.4.1 in c:\users\gabriel\appdata\local\programs\python\python37\lib\site-packages (from cryptography->pyAesCrypt) (1.12.0)
Collecting pycparser
Downloading https://files.pythonhosted.org/packages/68/9e/49196946aee219aead1290e00d1e7fdeab8567783e83e1b9ab5585e6206a/pycparser-2.19.tar.gz (158kB)
|████████████████████████████████| 163kB 6.4MB/s
Installing collected packages: pycparser, cffi, cryptography, pyAesCrypt
Running setup.py install for pycparser ... done
Successfully installed cffi-1.13.2 cryptography-2.8 pyAesCrypt-0.4.3 pycparser-2.19

C:\Users\r00t>pip3 install ijson
Collecting ijson
Downloading https://files.pythonhosted.org/packages/bb/f9/9dd57fe468e033610b979adcb45b159b76309ea4df63f70a10cfb75d71ae/ijson-2.5.1.tar.gz
Installing collected packages: ijson
Running setup.py install for ijson ... done
Successfully installed ijson-2.5.1

Z:\Duplicati>python3 ./restore_from_python.py
Traceback (most recent call last):
  File "./restore_from_python.py", line 14, in <module>
    import ijson
ModuleNotFoundError: No module named 'ijson'

Z:\Duplicati>pip install ijson
Requirement already satisfied: ijson in c:\users\gabriel\appdata\local\programs\python\python37\lib\site-packages (2.5.1)

Z:\Duplicati>pip install pyaescrypt
Requirement already satisfied: pyaescrypt in c:\users\gabriel\appdata\local\programs\python\python37\lib\site-packages (0.4.3)
Requirement already satisfied: cryptography in c:\users\gabriel\appdata\local\programs\python\python37\lib\site-packages (from pyaescrypt) (2.8)
Requirement already satisfied: six>=1.4.1 in c:\users\gabriel\appdata\local\programs\python\python37\lib\site-packages (from cryptography->pyaescrypt) (1.12.0)
Requirement already satisfied: cffi!=1.11.3,>=1.8 in c:\users\gabriel\appdata\local\programs\python\python37\lib\site-packages (from cryptography->pyaescrypt) (1.13.2)
Requirement already satisfied: pycparser in c:\users\gabriel\appdata\local\programs\python\python37\lib\site-packages (from cffi!=1.11.3,>=1.8->cryptography->pyaescrypt) (2.19)

Z:\Duplicati>python ./restore_from_python.py
Traceback (most recent call last):
  File "./restore_from_python.py", line 23, in <module>
    from pyaescrypt import pyAesCryptDecrypt, fail_with_msg
ModuleNotFoundError: No module named 'pyaescrypt'

I even made sure to try both python and python3, as the pyaescrypt module uses Python 3 from what I could gather.

Unfortunately no; this is a VM, so it doesn’t have any visibility into the host.

That’s correct. I now have a 10TB drive plugged into my Windows desktop PC; it was previously mirroring the data on the other drive in my host, which holds the Duplicati backup.

I do have a dlist.zip file for each individual backup date, but I do not have any dindex.zip files. I checked my Backblaze storage to see if I could find the dindex files there, but I couldn’t, so it seems they were never created during backup. I did, however, generate the index.txt file in the process of using the RecoveryTool.

Yes, the console did push out the same exact output as you quoted. It took about 5-10 minutes for the restore task to actually start.

Even if that were the case, since I’m doing it on Windows, the path I’m typing uses the same type of slashes as the rest of the filesystem, as shown below. I don’t know how that translates when the program actually runs; I’m not a programmer.

https://rayquaza.bungee.systems/s/jXDc9yzrPPMt22N/preview

I’m very confused.

That shows another option which at least used to work. I mention it, but we’re already swimming in options.

Python is going through a rough transition from Python 2 to Python 3, but I don’t know if that’s what’s behind your troubles. This would probably be worth a GitHub issue in hope of getting that script back.

Why is the Migration to Python 3 Taking So Long?
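One mundane possibility (a guess, since I can’t see your setup): on Windows, python, python3, and pip3 can each resolve to a different installation, so pip3 may have installed ijson and pyAesCrypt into an interpreter other than the one actually running the script. A quick way to rule that out is to let the interpreter you invoke do its own installing:

import subprocess, sys

# Show which interpreter this really is...
print(sys.executable)

# ...and install the script's dependencies into that exact interpreter,
# sidestepping any python/python3/pip3 mismatch on the PATH.
subprocess.check_call([sys.executable, "-m", "pip", "install", "ijson", "pyAesCrypt"])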

The “thus” might depend on what exact VM it is. VirtualBox can happily do VirtualBox/SharedFolders.

That’s bizarre. They should have been, and they’re even audited as part of after-backup verifications.

It’s not the path you type there. It’s the paths that Duplicati originally backed up, which are in your dlist file and are presumably forward slashes, because the use of NFS hints against Windows…

What was the view of the NFS paths from the Duplicati VM? If forward slashes, then you’re attempting a cross-OS restore using the RecoveryTool. I couldn’t find previous notes on this combination, but it wasn’t working when I tried it.

EDIT:

How the restore process works shows the "path": "C:\\data\\mydoc.txt" as a Windows example. Your dlist file might have forward slashes. You can certainly open up the zip file and peek at filelist.json; however, unless the Duplicati VM ran Windows, you probably have forward slashes in your path listings.

EDIT 2:

One reason why extracting a filelist.json might be more worthwhile is to create a reduced version that hopefully won’t exhaust system resources such as file handles. The format is really quite simple, and
How the backup process works gets into it a bit more than the Restore article did. Here’s what I used:

[{"type":"File","path":"Z:/backup source/A.txt","hash":"BvlhuAK8Ru4WhVXwZtKPTw6a/fP4gXTB7m+d4AT8MKA=","size":2,"time":"20191209T174422Z","metahash":"GSk37v/IInH+jzfV2NXAPguQlbs7VN0GFhtH10EoqFw=","metasize":193},{"type":"File","path":"Z:/backup source/B.txt","hash":"wM3nf6j++X1HbBCq09LVT8wvM2FA0HNlHC3Mzx43n9Y=","size":2,"time":"20191209T174436Z","metahash":"F52HR7pEdAbPn9p6rK/6SHkzB69QBf/WTmek90qzvV0=","metasize":193}]

EDIT 3: Actually, the above was my Linux dlist after the add-a-drive-letter experiment. Ignore the Z: prefix.

The example above doesn’t include a Folder entry, but the article does, under Processing a folder, before it gets to files.

So you could take a copy of some dlist (never mess with the originals), save its filelist.json, delete the unneeded curly-brace entries, put filelist.json back into your dlist copy, take it to Linux, and see whether that fixes the open-file issues.
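As a sketch of that procedure (assuming the dlist copy is already decrypted, i.e. a plain .zip, and using a made-up folder prefix as the filter), the trimming can also be scripted rather than done by hand. The file names here are placeholders; whatever you feed the RecoveryTool would presumably need to keep Duplicati’s normal dlist naming:

import json, zipfile

# Hypothetical names: work only on a copy of one dlist, never the original.
src = "copy-of-duplicati-20191001T000000Z.dlist.zip"
dst = "trimmed-duplicati-20191001T000000Z.dlist.zip"

with zipfile.ZipFile(src) as z:
    entries = json.loads(z.read("filelist.json"))

# Keep only one folder's worth of entries (example prefix), so the restore
# handles far fewer files and holds far fewer dblock archives open at once.
wanted = [e for e in entries if e["path"].startswith("/srv/shared/Backups/BungeePC/")]

# zipfile cannot replace a member in place, so write a new archive that
# carries every other member (e.g. the manifest) over unchanged.
with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
    for name in zin.namelist():
        data = json.dumps(wanted).encode() if name == "filelist.json" else zin.read(name)
        zout.writestr(name, data)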

As a test on Windows, I just chopped a 408-file restore down to a 148-file restore with a Notepad edit. I don’t think Notepad is the best tool for editing large files; if it can’t handle yours, try another. EditPad Lite does big files.

This is, of course, somewhat extreme, so you could still keep trying to use Duplicati GUI in a usual way, perhaps trying smaller restores at a time if a resource limit of some sort seems like it might be the issue.