I have a specific use case where I’d need to install Duplicati on clients’ machines and periodically back up files to my S3/Google Storage buckets (upload only, no recovery).
Reading through the manuals, I take it that both AWS S3 and Google Storage are supported, but only with public buckets or with access tokens/keys. Given that the buckets are private and the configuration will be stored on clients’ systems (assumed insecure), I can’t keep access token/key information in plain text.
What I can do, however, is fetch presigned upload URLs from an API using said clients’ credentials. I looked at --run-script-before-required and related options, but couldn’t find a way to request the URL that way. The manual has a long list of providers, which hinted at the possibility of creating a custom provider for my use case. The process would basically be (a rough sketch follows the list):
Fetch an upload URL with an HTTP GET request, using the credentials
Upload the file (rarely over 100 MB) to that URL
(Possibly) Send a backup confirmation back to the API server
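To make the idea concrete, here is a minimal sketch of those three steps as plain HttpClient calls. The API base URL, the bearer-token auth, and the JSON field names are all assumptions about my own (hypothetical) companion API, not anything Duplicati provides:

```csharp
// Sketch of the presigned-URL flow. Endpoint paths, auth scheme and JSON shape
// are placeholders for a companion API; only the HTTP mechanics are real.
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

class PresignedUploadSketch
{
    static readonly HttpClient Http = new HttpClient();

    static async Task UploadAsync(string apiBase, string clientToken, string localFile)
    {
        var fileName = Path.GetFileName(localFile);

        // 1. Fetch a presigned upload URL using the client's credentials.
        var urlRequest = new HttpRequestMessage(HttpMethod.Get,
            $"{apiBase}/upload-url?file={Uri.EscapeDataString(fileName)}");
        urlRequest.Headers.Authorization = new AuthenticationHeaderValue("Bearer", clientToken);
        var urlResponse = await Http.SendAsync(urlRequest);
        urlResponse.EnsureSuccessStatusCode();
        using var json = JsonDocument.Parse(await urlResponse.Content.ReadAsStringAsync());
        var presignedUrl = json.RootElement.GetProperty("url").GetString();

        // 2. Upload the file (rarely over 100 MB) straight to the presigned URL.
        using var stream = File.OpenRead(localFile);
        var put = await Http.PutAsync(presignedUrl, new StreamContent(stream));
        put.EnsureSuccessStatusCode();

        // 3. (Possibly) confirm the upload back to the same API.
        var confirm = new HttpRequestMessage(HttpMethod.Post, $"{apiBase}/confirm");
        confirm.Headers.Authorization = new AuthenticationHeaderValue("Bearer", clientToken);
        confirm.Content = new StringContent(JsonSerializer.Serialize(new { file = fileName }),
            Encoding.UTF8, "application/json");
        (await Http.SendAsync(confirm)).EnsureSuccessStatusCode();
    }
}
```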
I went over the source code and found the providers under duplicati/Duplicati/Library/Backend/, but I have a couple of questions.
Am I overlooking any other, simpler way to do this?
Is there any documentation on how to develop providers? Are there any providers that do simple HTTP uploads that I could use for reference?
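For reference, the real contract a provider implements is IBackend in Duplicati.Library.Interface, and its exact signatures have changed between versions, so I won’t reproduce it here. Purely to illustrate the shape of what a custom provider would have to supply, here is a stand-in sketch using my own made-up interface (not Duplicati’s):

```csharp
// Illustration only: IMinimalBackend is a made-up stand-in, not the real
// Duplicati.Library.Interface.IBackend. It just shows the operations a
// presigned-URL provider would have to map onto the companion API.
using System;
using System.Collections.Generic;

interface IMinimalBackend
{
    string ProtocolKey { get; }                      // e.g. "presigned" -> presigned://host/path
    void Put(string remoteName, string localFile);   // upload one remote volume
    void Get(string remoteName, string localFile);   // download one remote volume
    void Delete(string remoteName);                  // remove one remote volume
    IEnumerable<string> List();                      // enumerate remote volumes
}

class PresignedUrlBackend : IMinimalBackend
{
    public string ProtocolKey => "presigned";

    public void Put(string remoteName, string localFile)
    {
        // 1. GET a presigned upload URL from the companion API (client credentials).
        // 2. PUT localFile to that URL (see the HttpClient sketch earlier).
        // 3. POST an upload confirmation back to the API.
    }

    // Even for an "upload only" use case, Duplicati itself still lists remote
    // volumes for verification and may delete them when compacting, so these
    // would also need to go through the API rather than simply throwing.
    public void Get(string remoteName, string localFile) => throw new NotImplementedException();
    public void Delete(string remoteName) => throw new NotImplementedException();
    public IEnumerable<string> List() => throw new NotImplementedException();
}
```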
I consider public buckets to be a huge security risk, and hope nothing is advocating them for general use.
They’re not stored in quite plain text on Windows, but the web UI does let the user see backup configurations.
Are you talking about writing your own API and server, and hooking calls to it into Duplicati’s file uploads?
Sounds ambitious. If you’re a .NET developer, your skills are very much needed in many Duplicati areas.
Duplicati’s default remote volume size is 50 MB. That can be configured, but it’s still multiple files uploaded.
There is generally at least a dblock of source file data, a dindex that indexes the dblock, and a dlist listing the files, so a 100 MB source at the default volume size would typically already mean two dblock files plus their dindex files and a dlist. See How the backup process works
I don’t use S3 or Google Cloud Storage, but have you considered S3’s IAM, which sounds very extensive?
They are not plain text? That’s interesting. I’ll have another look.
Using IAM was the first thing that came to mind, as both S3 and Google Storage work with a very similar API (Storage is S3-compatible). Still, that would only solve the authentication problem; I would still need to call an API to get/update the IAM access. Either that, or at least a hacky companion application to update Duplicati’s configuration with the new IAM credentials and make the API calls. I figured properly adding the backend would be better.
Yes, exactly. We have a medium-volume backup system for sensitive text files (read: JSON and XML), one that we are extending to allow specific binary files. We would need some API calls upon starting and finishing an upload.
I’ll definitely try and contribute back if I add anything that’d be useful outside of our specific use-case!
--server-encryption-key
This option sets the encryption key used to scramble the local settings database. This option can also be set with the environment variable DUPLICATI_DB_KEY. Use the option --unencrypted-database to disable the database scrambling.
which makes the weak encryption slightly more secure (or at least moves it off the default key), or turns it off if desired. Which tool can open encrypted DB talks about the encryption used on Windows, but it’s not high quality.
Duplicati’s security focus is more on attacks against the remote backup, and less on attacks against its own system.
Supporting CLI use requires that the Duplicati administrator be able to see their own credentials, etc.
I’m not an IAM expert, and I’m even less an expert on IAM automation, but manual setup won’t scale well.
Still sounds ambitious, and I’m not sure how it would be worked into the general product code base.
Typically, new features go out in a Canary release in the hope that someone will try them. Server setup in conjunction with a Duplicati public test is something that nobody may be willing to do. So how would this be tested?
It’s not clear to me exactly how much sharing is in mind. Are you putting all the users in one bucket?
This avoids using tons of buckets, but I assume some administration and per-user config is needed.