Back Up to Cloudflare R2 Storage Fails

Hello! I’m running Duplicati 2.0.6.3 on Ubuntu Server 22.04.1 (64-bit - Intel).

I tried to create a backup job to upload data to Cloudflare R2 Storage (R2 is a relatively new Cloudflare product - not to be confused with Backblaze B2).

Cloudflare R2 Storage is compatible with the S3 API.

Everything seemed to work nicely until Duplicati attempted to process the backup job. I received the following error:

One or more errors occurred. (The remote server returned an error: (501) Not Implemented. (The remote server returned an error: (501) Not Implemented.) (One or more errors occurred. (The remote server returned an error: (501) Not Implemented.)))

I configured a new backup and set the destination settings as follows:

Storage Type: S3 Compatible
Use SSL: True
Server: Custom Server URL - <REDACTED>.r2.cloudflarestorage.com (I also tried adding https:// before the FQDN)
Bucket Name: <blank>
Bucket Create Region: default
Storage Class: default
Folder Path: default
AWS Access ID: Access ID from R2
AWS Access Key: Access Key from R2
Client Library to Use: Amazon AWS SDK
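For reference, settings like these correspond roughly to a destination URL of the following shape (a sketch with placeholder values; Export As Command-line on a finished job shows the exact form):

    s3://my-bucket/folder?s3-server-name=<ACCOUNT_ID>.r2.cloudflarestorage.com&s3-client=aws&use-ssl=true&auth-username=<ACCESS_KEY_ID>&auth-password=<SECRET_ACCESS_KEY>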

I selected ‘Test connection’ and received the message “Connection worked!”

I finished the backup wizard by selecting the files I wanted to back up, then set the schedule accordingly.

I chose ‘Run Now’ to kick off the backup job immediately. Duplicati chugged along for a bit, and I received the error message I mentioned earlier.

I can collect some debug logs if that would be helpful. I looked in the usual places on the server’s filesystem to see if I could locate the logs, but I couldn’t find them.

Hello

If you want a log on the file system, you can create one by adding the advanced options log-file and log-file-log-level. For your problem, using ‘Verbose’ for log-file-log-level should probably be enough. If you want to attach it to a post on this forum, it should not be too big (especially because you will need to zip it to conform to this forum’s odd rules, which forbid the usual extensions for log files such as .log or .txt).
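For example (the log path here is just an illustration), as advanced options on the job’s Options screen, or equivalently on a command line:

    --log-file=/var/log/duplicati/backup.log
    --log-file-log-level=Verbose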

Somewhat, but with lots of missing functionality, maybe explaining those (501) Not Implemented errors.

Duplicati’s default logging is mostly some light logging in the job database (which gives the advantage of OS independence). One problem with failed backups is that they don’t produce a job log in the usual job logs, but the server log at About → Show log → Stored usually has something. Please look around the time the backup failed.

Your <job> → Show log → Remote looks awfully insistent on getting a dblock upload through. You can click on those entries to see the file hash value; a repeated hash under a different name is likely a retry. Usually the limit is 5, though the limit is configurable. Under normal conditions, one expects mixed dblock and dindex uploads.

About → Show log → Live → Retry would be enough to see retries, and a Verbose log file will show them as well.

What I’d like to see, if it can be seen somehow, is R2’s detailed explanation behind its Not Implemented, because without that it’s much harder to guess what configuration change might avoid the situation.

An easy one to try, though, is changing Client library to use from Amazon AWS SDK to Minio SDK. There are far fewer options in it, but as a shot in the dark, maybe it will just work better. Amazon’s SDK does (I think) have some low-level logging in it, but I don’t think anyone has tried running that. See:

Add support for cloudflare R2 #4673

It might also be possible to use Export As Command-line to get the destination URL, then test that with Duplicati.CommandLine.BackendTool.exe and Duplicati.CommandLine.BackendTester.exe to see what operations work (or don’t). Seemingly list is OK, because that’s what Test connection usually does.
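For instance (bucket, folder, and credentials below are hypothetical; take the real URL from Export As Command-line, and point the BackendTester at an empty folder, since it creates and deletes test files):

    mono Duplicati.CommandLine.BackendTool.exe LIST "s3://my-bucket/folder?s3-server-name=<ACCOUNT_ID>.r2.cloudflarestorage.com&use-ssl=true&auth-username=<ACCESS_KEY_ID>&auth-password=<SECRET_ACCESS_KEY>"
    mono Duplicati.CommandLine.BackendTester.exe "s3://my-bucket/empty-test-folder?s3-server-name=<ACCOUNT_ID>.r2.cloudflarestorage.com&use-ssl=true&auth-username=<ACCESS_KEY_ID>&auth-password=<SECRET_ACCESS_KEY>"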

I doubt it has anything to do with Duplicati using API calls that are not supported by Cloudflare, or vice versa. Cloudflare’s implementation might not have 100% feature parity, but I can’t see Duplicati needing much more than the basics of what the S3 API is capable of in the first place.

I was able to generate a debug log file. Everything is great until Duplicati attempts to begin the authentication process (line 574):

2022-12-28 21:04:54 -05 - [Retry-Duplicati.Library.Main.Operation.Backup.BackendUploader-RetryPut]: Operation Put with file duplicati-b9880880ea5f2465aab6cf527c7ff4784.dblock.zip.aes attempt 1 of 6 failed with message: STREAMING-AWS4-HMAC-SHA256-PAYLOAD not implemented

Interestingly enough, someone posted to the Cloudflare Communities with a similar issue but did not say which application they were attempting to use:

Unable to upload to R2 with .NET S3 SDK:
https://community.cloudflare.com/t/unable-to-upload-to-r2-with-net-s3-sdk/386262

I tried the MinIO SDK - no good. The MinIO SDK fails to connect to Cloudflare R2 entirely; the “Test Connection” option doesn’t seem to return a response, and I gave up after waiting a minute for it to complete. It may be something I’m doing wrong. I tried changing the SDK from AWS to MinIO on the same backup job, and I also tried creating a new backup job with a new bucket and the MinIO SDK selected. Same result in both cases - the connection test fails.

Using Test Connection with the AWS S3 SDK returned success almost instantly.

I’d be happy to share the debug log - I don’t see a way to attach it in Discourse.

Thanks for providing the STREAMING-AWS4-HMAC-SHA256-PAYLOAD not implemented message.
That’s pretty much what I meant, except they don’t admit to it on the main page about API calls…

PutObjectAsync() not working for R2 with AWS S3 .NET SDK on the Cloudflare Community leads to
R2 how to implement in .net project? #4683, which talks about a PR into the docs but gives an example:

DisablePayloadSigning = true // required otherwise it attempts to use streaming sigv4

Configure aws-sdk-net for R2 has a possible change bar (maybe just highlighted for attention?):

DisablePayloadSigning = true must be passed as Cloudflare R2 does not currently support the Streaming SigV4 implementation used by AWSSDK.S3.
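Putting those pieces together, here is a minimal sketch of what that Cloudflare guidance looks like in aws-sdk-net; the account ID, bucket, key names, and the ForcePathStyle choice are illustrative assumptions, not anything from this thread:

    using Amazon.S3;
    using Amazon.S3.Model;

    var s3 = new AmazonS3Client(
        "<ACCESS_KEY_ID>", "<SECRET_ACCESS_KEY>",
        new AmazonS3Config
        {
            // R2's account-scoped endpoint; path-style addressing is assumed here.
            ServiceURL = "https://<ACCOUNT_ID>.r2.cloudflarestorage.com",
            ForcePathStyle = true,
        });

    var put = new PutObjectRequest
    {
        BucketName = "my-bucket",
        Key = "duplicati-test.dblock.zip.aes",
        FilePath = "/tmp/duplicati-test.dblock.zip.aes",
        // R2 answers streaming SigV4 payload signing with 501 Not Implemented,
        // so skip it (valid only over HTTPS).
        DisablePayloadSigning = true,
    };
    await s3.PutObjectAsync(put); // inside an async method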

I’d hope that they’d say so if there were an easy workaround short of requiring application changes.
I’m not an S3 or Duplicati developer, but the lack of streaming support feels like a major blocker.

If you don’t mind slightly reduced functionality from non-streaming transfers, you might try rclone.
Duplicati’s Rclone Storage Type could be configured so that rclone handles the interaction with R2, as sketched below.
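A minimal remote definition, assuming rclone 1.59 or newer (which added the Cloudflare provider); all names and keys below are placeholders:

    # ~/.config/rclone/rclone.conf
    [r2]
    type = s3
    provider = Cloudflare
    access_key_id = <ACCESS_KEY_ID>
    secret_access_key = <SECRET_ACCESS_KEY>
    endpoint = https://<ACCOUNT_ID>.r2.cloudflarestorage.com

Duplicati’s Rclone destination would then reference the r2 remote plus a bucket and path on it.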

I will open an internal support ticket on this and bring it to the R2 Product Manager’s attention. It’s a nice benefit I get from working for the company that graciously employs me. :slight_smile:

I’ll share the details with him and find out what he thinks can be done. It seems like a reasonable request to be able to handle authorization in the manner Duplicati is attempting. The fact that there’s at least one other case of a similar issue should make it easier to get some eyes on it.

I am out of the office the rest of this week. I go back to work on 1/3. I’ll keep you posted as I hear more.

If you need me to intervene sooner, I can…I’m just trying to stay away from the work keyboard as much as possible on the last few days of two weeks off.

OK - I couldn’t resist temptation. I just opened an internal ticket. :smiley:

I’ll let you know what I find out.


Testing as shown in my first post might be good. The link you cited got the upload fixed but broke the download.
The AWS SDK for .NET is open source. I see references to what we’re talking about, but haven’t traced it fully.

Using Cloudflare R2 prompts STREAMING-AWS4-HMAC-SHA256-PAYLOAD not implemented #643
mentions the GitHub issue I mentioned, and uses a different workaround that appears to be a code change, concluding “R2 doesn’t support streaming uploads”. I don’t know what else uses that approach, though…

The project above is a “Swift SDK for AWS”, and presumably its developers are pretty familiar with S3 technology issues.
For whatever it’s worth, IDrive e2 published directions for Duplicati, so it apparently clears the hurdle that R2 hit.
Their compatibility hit a different hurdle, though, and caused some confusion which you can see in PRs.

The only open requests I know of are the GitHub issue I linked and this topic. A code change from Duplicati would maybe be very delayed just based on lack of resources, so I don’t expect any quickly released solution.

If it can somehow be done with the advanced options the AWS SDK lets one set, great, but I didn’t see a way.

OK - I spoke with the product team. Evidently, the issue is that we aren’t supporting streaming SigV4 at this time. The recommendation for the time being is to disable it, and the API will be able to connect without it.

They said that adding support for streaming SigV4 isn’t a very high priority, since the workaround is so simple: disable it and it works.

Is there a possibility of adding an advanced option to allow streaming SigV4 to be disabled on a given backup job?

OK, I think I see what you want. You want DisablePayloadSigning = true added somewhere in Duplicati’s S3 backend code, which looks like it works only on HTTPS and then weakens or removes the data integrity checking per:

https://github.com/aws/aws-sdk-net/blob/master/sdk/src/Core/Amazon.Runtime/Internal/IRequest.cs

WARNING: Setting DisablePayloadSigning to true disables the SigV4 payload signing data integrity check on this request.

which is the only warning on the page. The text suggests that there might still be an MD5 integrity check…

Was that the request? I question whether going down the big-warning path is a proper way to proceed; however, I think some users are going to the deprecated signature version 2 for providers that don’t do version 4…

I haven’t traced through the SDK, but I assume it uses UNSIGNED-PAYLOAD instead of STREAMING-AWS4-HMAC-SHA256-PAYLOAD, which sounds like a significant weakening of its check. I’m not a cryptographer.

The streaming SigV4 is mostly for scenarios where you may not want to compute a checksum when you make the PutObject request. Using regular (non-streaming) SigV4 + content-md5 has the same level of integrity checking as using streaming SigV4. The main difference is that to use content-md5, you potentially need to compute a checksum of a large file before initiating the upload.

More specifically - imagine that you want to upload a 1 TB file. If you use streaming SigV4, you can just stream the data via PutObject and be sure that integrity will be maintained chunk by chunk. Without streaming SigV4, you’d need to make one pass over that 1 TB file to compute an MD5, and then make a PutObject request with content-md5 to get an integrity check of the final upload. (This is a bit of a simplification, since you’d likely use multipart for such a large file.)
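In aws-sdk-net terms, that non-streaming path might look like this sketch (paths and names are placeholders; MD5Digest carries the base64 Content-MD5):

    using System;
    using System.IO;
    using System.Security.Cryptography;
    using Amazon.S3.Model;

    // One pass over the file to compute the MD5 before the upload starts.
    byte[] md5;
    using (var stream = File.OpenRead("/tmp/duplicati-test.dblock.zip.aes"))
        md5 = MD5.Create().ComputeHash(stream);

    var put = new PutObjectRequest
    {
        BucketName = "my-bucket",
        Key = "duplicati-test.dblock.zip.aes",
        FilePath = "/tmp/duplicati-test.dblock.zip.aes",
        DisablePayloadSigning = true,             // skip streaming SigV4
        MD5Digest = Convert.ToBase64String(md5),  // Content-MD5 integrity check
    };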

And yes - it would have to be HTTPS - Cloudflare doesn’t do anything that’s not SSL/TLS encrypted.

Regarding integrity, the big WARNING in the SDK seems to differ with “same level of integrity checking”; however, it does discuss, later in the warning paragraph, the different components that bring integrity. There’s a suggestion that unless DisableMd5Stream is set, MD5 handling is done for you by the SDK. Please let me know if you think that’s so. If so, we can just not set that one and at least get MD5, BUT

MD5 (Wikipedia) says

As of 2010, the CMU Software Engineering Institute considers MD5 “cryptographically broken and unsuitable for further use”

however, anybody who cares about the security of remote files has Duplicati encrypt (e.g. AES-256) before upload, and its file format has an integrity check in addition to the file’s SHA-256 being stored in the local database.

You enforce HTTPS, which is good, so integrity checking looks fairly robust, just not “same level” IMO… Basically, we’re talking about transport integrity checking. Do you know if MD5 is there unless disabled?

If so, then this means adding an SDK option, and NOT adding the option that disables its MD5 integrity. While you might think that this is easy, it still requires developers, and those are very scarce these days, which means making this as easy as possible, including how to test with an actual Cloudflare R2 account.

GitHub Issues is where to file the enhancement request, and I would suggest linking back to the results here. Some things are still TBD, and in my view there are much larger issues waiting in the queue.

Getting in line is fine, but nobody dictates to the volunteers which thing they should work on, and making a case for a change for one S3 provider based on a single outside request is difficult.

Maybe you know of some other S3 software or service with this limitation? That may help justify the change, and would also make it easier to search the forum and issues for other cases of this.

Thanks for your response. You’re great to work with!

Yes, as far as I am aware, the MD5 handling is still there even if not specified.

I know MD5 is not widely accepted from a cryptography standpoint - but it’s still VERY widely supported.

If need be - I can get you squared away with an R2 account - that shouldn’t be an issue. We’d just need to figure out how to sync up outside of the forum - that’s not something I can do here (for obvious reasons). Given you’re a moderator, I’ll assume you can get my contact info. Reach out to me via my personal contact info and I’ll shoot you over to my work account so we can get an account set up.

I work with developers daily - I know the rules. :slight_smile:

And I also understand the weighting of feature/enhancement requests. This is a nice-to-have - I’d love to see Duplicati claim support for Cloudflare R2. It would be a very nice call-out!

I will heed your advice and submit a request on GitHub Issues…probably over the weekend, if not early next week.

Thanks so much!

-JeffH

As you don’t seem to be very used to Discourse (that’s the forum software): it does allow private messages. Just click the user’s icon.

Trust Level Permissions Table (inc Moderator Roles) suggests emails need a special setting.
There have been a number of topics where people ask to reduce the exposure of email addresses.
Regardless, private messages are available on the profile page (click your name) via a Message button.
I’m not sure I want to be a go-between for you and whoever picks this up, though you may prefer that.

Users are always shopping for storage, and S3 is sort of a commodity on the surface, though
deeper down there are differences. I haven’t evaluated yours, but welcome to the S3-ish club.
In editing my prior post, I sort of invited you to name other things with the same limitation, which
may or may not be in your interest: it may help get the change done, while giving away ideas.

EDIT:

A PM is probably visible to the admins group but not to the moderators group (which I’m in).
Again, this becomes a question of what’s “good enough” to do the data transfer well enough…

Hi!
Any news on the R2 compatibility? I would love to try it out, but I don’t want to waste my time if it is not working :slight_smile:

Welcome to the forum @Simon_Blomsterlund

The s3-disable-chunk-encoding change (see the changelog excerpt below) possibly helps some providers for the reasons described in its pull request. It’s just my hope.

Nobody has tried it, as far as I know. Someone has to be the first, and then maybe we’ll know.

One problem is that this change is not yet in a Beta release (just Canary), and Canary builds are not recommended for production (they’re hot off the build, so fast adopters might get surprises).

2.0.7.100_canary_2023-12-27

--s3-disable-chunk-encoding added to the AWS backend (only useful for some providers)
Add option to disable chunked encoding for AWS S3, thanks @Jojo-1000

but if you’re willing to try Canary, you might as well try the latest, which is 2.0.7.101_canary_2024-03-08, seemingly working well except for a problem in the RPM build. Other builds appear good so far.

Maybe @JeffH2 can forecast the outcome, or has news of some changes within Cloudflare R2.

Canary got promoted to 2.0.8.1_beta_2024-05-07, and the s3-disable-chunk-encoding option does help.
Add support for cloudflare R2 #4673 has the complete recipe. Cloudflare R2 changes are still welcome.

I just installed Duplicati 2.0.8.1 on Ubuntu 22.04 LTS and tried to create a backup with Cloudflare R2 as the destination.

I click “Test Connection” and am presented with the following message:

The bucket name should start with your username, prepend automatically?

I selected No - Cloudflare does not require the username to be prepended; in fact, if I select Yes, the connection fails.

When I select No, the test is successful - I get the message:

Connection worked!

Per Add support for cloudflare R2 #4673, I added two options under “Advanced Options”

s3-disable-chunked-encoding
s3-ext-disablehostprefixinjection

I tried running the backup job and kept getting an error message - the same 501 Not Implemented error I was getting when I first started this thread in December 2022.

I went back through and reviewed the settings and realized the checkboxes next to the two advanced options were unchecked. I would have thought simply adding the options from the drop-down would have done the trick, but I guess not?

I checked both boxes, then saved the backup settings and tried to run the backup again… It worked!
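For anyone following along, the working setup in Export As Command-line form would look like this (the key point being that the boolean options must actually be set to true; listing them with their checkboxes unchecked leaves them off):

    --s3-disable-chunked-encoding=true
    --s3-ext-disablehostprefixinjection=true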

I’m still very new to Duplicati. I haven’t spent any time with it since I originally started this thread. Now that I know this works - I’ll be using Duplicati a lot more often!

Thanks to everyone that collaborated to get this working!

Actually - the backup started to run but eventually failed. I will post the details in the GitHub issue: Add support for cloudflare R2 · Issue #4673 · duplicati/duplicati · GitHub