Upload to Backblaze B2 - no tomes available / Connection reset by peer

coofercat · August 8, 2023, 10:57am

I’ve got my Duplicati (docker - version 2.0.7.1_beta_2023-05-25) set up and running to back up my OMV data into Backblaze B2. That’s all fine, and the backup runs for several hours, but I’ve yet to get a successful first backup - it fails for a variety of reasons (which look like transient B2 issues to me).

I’ve got two backup jobs, the one I’m concentrating on is something like 230254 files (500.71 GB). After several attempts, I’ve got 76GB into B2, across 2900 files (with a volume size of 50MB).

First example:

System.AggregateException: One or more errors occurred. (Connection reset by peer (Connection reset by peer) (One or more errors occurred. (Connection reset by peer))) ---> System.AggregateException: Connection reset by peer (Connection reset by peer) (One or more errors occurred. (Connection reset by peer)) ---> System.Net.Sockets.SocketException: Connection reset by peer

I’ve looked through the stack trace and can’t see the hostname (I’m guessing it was a B2 server though). I’d imagine this was recoverable but there’s no mention of sleeping or doing retries.

The next example:

System.AggregateException: One or more errors occurred. (503 - service_unavailable: no tomes available (503 - service_unavailable: no tomes available) (One or more errors occurred. (503 - service_unavailable: no tomes available))) ---> System.AggregateException: 503 - service_unavailable: no tomes available (503 - service_unavailable: no tomes available) (One or more errors occurred. (503 - service_unavailable: no tomes available)) ---> System.Exception: 503 - service_unavailable: no tomes available

This one’s actually documented at Backblaze (Why You Are Getting a B2 503 or 500 Server Error and What to Do Next) - and is recoverable. They just say to retry until it works.

So my question is… does Duplicati’s B2 driver retry? If so, how can I configure it to sleep and retry a little longer before the backup fails?

(I can supply the full backtraces for these errors, if that would help?)

gpatel-fr · August 8, 2023, 12:33pm

Yes. The B2 auth handling retries up to 5 times with exponential backoff from 2 to 32 s; this is not configurable. After that, there is a generic retry capability (for all backends, not specific to B2) configured with number-of-retries (default 5), retry-delay (default 10s), and retry-with-exponential-backoff (default false; when true, the retry-delay value is doubled for each of the retries configured with number-of-retries).
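To get a feel for how much total waiting the generic retry settings allow, here is a minimal Python sketch. It is not Duplicati code: the function name `retry_delays` is made up for illustration, and it assumes the first retry waits retry-delay and each later retry doubles the previous wait when backoff is enabled, which is my reading of the option description above.

```python
def retry_delays(number_of_retries=5, retry_delay=10.0, exponential_backoff=False):
    """Return the assumed wait in seconds before each retry.

    Hypothetical helper: models the option description in this thread,
    not Duplicati's actual implementation.
    """
    delays = []
    delay = retry_delay
    for _ in range(number_of_retries):
        delays.append(delay)
        if exponential_backoff:
            # Assumption: the wait doubles after every retry.
            delay *= 2
    return delays


fixed = retry_delays()
backoff = retry_delays(exponential_backoff=True)
print("fixed:  ", fixed, "total", sum(fixed), "s")
print("backoff:", backoff, "total", sum(backoff), "s")
```

Under these assumptions the defaults give about 50 s of total waiting, while enabling backoff stretches that to about 310 s, which may be why it helps with B2 outages that last a few minutes.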

You can monitor what Duplicati is doing by using the live log (About / Show log) at the Retry level (it will show only the generic retries, not the B2-specific ones).

If the live log shows no retries, it may be a problem in the B2 driver.
If it retries but that’s not enough, try setting exponential backoff to true.

There is a possibility

coofercat · August 8, 2023, 12:55pm

Thanks for the great and detailed information. I see the retry-with-exponential-backoff in the job settings, so I’ll try that if this fails again.

I also see in the live log that it has had the ‘no tomes available’ problem along the way, yet has recovered and continued.

Your description of the retries looks like it actually retries a lot before it fails. I don’t know how long B2 transient issues last, but I could extend some of the times and counters to try to work around it. If I ruled the world I’d maybe like to make the B2 driver retry longer for some errors, but what Duplicati does right now looks sensible and usable enough.

Thanks for your help - I’ll have a play around and see what works for me.

coofercat · August 10, 2023, 3:58pm

Just to say… it failed again, so I added the exponential backoff - and now it’s run to completion (and the second job has started)

Ralf · October 31, 2023, 10:10am

Got the same problem since yesterday.
Worked flawlessly before.
retry-with-exponential-backoff solved it.

Maybe B2 is working on their infrastructure?