Need Help! Am I using the system wrong? Extremely slow backup with B2 cloud

I am using Duplicati along with B2 cloud to replace the backup solution we are currently using.
We currently use a single bucket in B2, with folders created in such a way that every endpoint/computer is a folder inside the bucket.
The issue is that we currently have about 3500 folders, and it takes more than 5-6 hours (in some cases, days) to back up a few KBs. The upload itself is fast, but it takes forever to complete the verification (the step that says “Verifying backend data”), and it ends up downloading details about all the folders in the bucket rather than just the folder that corresponds to my computer. Am I using the system wrong?
I have tried Duplicati versions 2.0.3.6, 2.0.4.5, and 2.0.4.19, and have had the same issue with all of them.
What can I do so that it doesn’t end up downloading details about the whole bucket every time it starts a backup? How can I make sure that it only downloads details of my folder?

Hello @divya_c_h and welcome to the forum!

To get a technicality out of the way, please read the B2 Files documentation section called “Folders (There Are No Folders)”, which perhaps you already knew, but it’s not as though Duplicati is jumping into the wrong folder. B2 has no folders at all.

Unfortunately, Duplicati doesn’t seem to take advantage of the list prefix filtering that B2 offers to simulate folders. Features get added to B2 over time, so I can’t say whether prefix was available when the List() code originally appeared, but the code doesn’t use it. That enhancement might avoid the issue.
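To illustrate what I mean, here is a minimal sketch of B2’s native API (assuming the v2 endpoints, with a hypothetical key, bucket, and folder as placeholders; this is not Duplicati code) showing the prefix parameter that b2_list_file_names accepts, so only one “folder” gets listed instead of the whole bucket:

    # Minimal sketch of B2's native v2 API; key/bucket/folder values are placeholders.
    import requests

    KEY_ID = "<application-key-id>"   # placeholder
    APP_KEY = "<application-key>"     # placeholder
    BUCKET_ID = "<bucket-id>"         # placeholder
    FOLDER = "my-computer/"           # the per-endpoint "folder" (really just a name prefix)

    # Step 1: authorize to get an API URL and token
    auth = requests.get(
        "https://api.backblazeb2.com/b2api/v2/b2_authorize_account",
        auth=(KEY_ID, APP_KEY),
    ).json()

    # Step 2: list only files whose names start with the folder prefix,
    # instead of paging through every file in the bucket
    listing = requests.post(
        auth["apiUrl"] + "/b2api/v2/b2_list_file_names",
        headers={"Authorization": auth["authorizationToken"]},
        json={"bucketId": BUCKET_ID, "prefix": FOLDER, "maxFileCount": 1000},
    ).json()

    for f in listing["files"]:
        print(f["fileName"], f["contentLength"])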

The Buckets documentation says you only get 100 per account, so that’s not a way out on its own, though you could use them to take some of the pain away by spreading the folders across many buckets.

I’m not sure there’s a short-term answer to this that isn’t overly dangerous (--no-backend-verification could be tested with, but you don’t want to leave it that way, because you’d be overly blind to problems on the backend).
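For a test only, that would look something like adding the advanced option --no-backend-verification=true to the job (under Advanced options in the GUI, or on the command line), then removing it again afterwards.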

Choosing sizes in Duplicati might offer minor relief: raising --dblock-size (called “Remote volume size” in the GUI) will reduce the number of files. Some people use this to avoid the 5000-file limit that OneDrive either has had or still has. There are drawbacks (explained in that article), so this isn’t a great solution either.
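For example, --dblock-size=200MB (or setting “Remote volume size” to 200 MB in the GUI) should produce roughly a quarter as many dblock files as the 50 MB default, at the cost of larger downloads during restores and verification.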

If you’re not super-attached to B2, each backend has its own code, so some other backend might avoid the issue.

For a long-term change, the way to get in the (long) line for code work is to open an Issue for the change. Arguably it’s a performance enhancement (it works correctly, just slowly), but one could also label it a bug…

Do you mean you have 3500 folders in your B2 bucket? That is pretty huge - you must be protecting lots of endpoints!

I have about 10 computers I back up and I use maybe three B2 buckets. I definitely have success backing up multiple computers to the same B2 bucket using different target “folders”, but I have nowhere near 3500.

Great catch! Maybe it is a trivial change to use the folder path configured in the job definition as the “prefix”.

My guess is that it would be pretty easy. Might need to get some quoting done properly, but that’s typical. Question to @divya_c_h would be whether this would be acceptable in a bleeding-edge canary, since there’s a beta trying to get out right now (with some of us trying to squeeze a few ready-to-go fixes in…).
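To show the idea (a conceptual sketch only; the actual backend is C# and this is not the PR code): instead of listing every file in the bucket and filtering client-side, the list call would pass the job’s configured folder path as the prefix, so B2 only returns that job’s files.

    # Conceptual sketch only, not Duplicati's C# backend code.
    # "list_file_names" stands in for a B2 list call; "folder" is the target
    # path configured in the backup job.

    def list_remote_files_old(list_file_names, folder):
        # Old behaviour: ask B2 for every file in the bucket, then keep only
        # the ones under this job's folder -- slow when the bucket is huge.
        return [f for f in list_file_names(prefix="") if f.startswith(folder)]

    def list_remote_files_new(list_file_names, folder):
        # New behaviour: let B2 do the filtering by passing the folder path
        # as the prefix, so only this job's files ever come back.
        return list(list_file_names(prefix=folder))

    # Tiny demo with a fake bucket listing:
    fake_bucket = ["pc-0001/dblock1.zip", "pc-0002/dblock1.zip", "pc-0001/dlist1.zip"]
    fake_list = lambda prefix="": (n for n in fake_bucket if n.startswith(prefix))
    print(list_remote_files_new(fake_list, "pc-0001/"))   # only pc-0001 files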

Ideal developer-tester might be somebody who backs up multiple computers to the same B2 bucket. :wink:

I might spend some time testing today… stay tuned…

Ok well I did a quick test and it seems to make a nice difference.

I set up a new test backup job in one of my larger B2 buckets (it contains 30,000 files), targeting this new backup job at a new “folder” in that bucket.

Did some test backups and was consistently seeing this timing in the log:

RemoteOperationList took 0:00:00:11.125

I noticed that it may do this twice in a backup run, perhaps when it decides to test one of the archive files for corruption (I forget how often it does that).

I made the trivial code change and am now seeing times like this:

RemoteOperationList took 0:00:00:00.265

I did several backups and restores and everything seems to be working fine still.

I’ll submit a pull request and hopefully someone else can test…

Pull #3803 submitted.

@divya_c_h if you are willing to help test, I can provide a compiled/updated B2 backend DLL for the version you are using. Let me know…

Yup, a little more than 3500 endpoints, which corresponds to 3500 “folders”.

Thank you. Will test this out.

Did you want me to send you a DLL? If so, let me know which Duplicati version.

The version I use is 2.0.4.5.

Duplicati.Library.Backend.Backblaze.dll-2.0.4.5.zip (13.0 KB)

Give this a try. Stop Duplicati on a test machine, rename the existing Duplicati.Library.Backend.Backblaze.dll file, and then put this newer one in its place.

If you get any errors or it doesn’t work, you can undo the changes.

Please report back and let us know!

@divya_c_h

Just curious if you tried the updated DLL for 2.0.4.5 and if it helped you.
Note that the change is now present in the 2.0.4.22 Canary release.