Allow changing the archive passphrase?


#21

The tricky part about doing it for non-advanced users is that Duplicati has to carefully convert all the data in place, without somehow ending up in a broken state.

Duplicati doesn’t currently support restoring from or backing up to a mixed encryption backend, which means it wouldn’t be possible to use the backup until the conversion is complete. And then there’s the whole aspect of dealing with resuming the encryption, which also has to be carefully implemented.

A workaround for that would be to have the user specify a new directory to put all the converted data in, but not everyone has the capacity to double their backup size for a week (or four) while it’s being converted.

The script is simple from a programming perspective because it can ignore all these things, but it’s not user friendly.

I definitely agree that it’s a necessary feature to have both for enterprise and for regular people. I’m just not a fan of the “re-encrypting all the data” approach, as I argued in [feature request] Changing volume encryption password · Issue #2991 · duplicati/duplicati · GitHub

I think my thoughts on this back when I opened #2991 were to insert the key into each dlist file and then have those be password protected, so you minimize the amount of re-encryption work.


#22

After thinking about this for a while, I see that we need to consider the attacker scenario.

  1. User exposes the password (e.g. through password re-use)
  2. Machine/network is breached
  3. Destructive malware/ransomware

If we use the keyfile approach, as Duplicacy does, we can only really cover (1).

If the machine is breached, the attacker can easily recover the real passphrase, and changing the keyfile passphrase is not going to prevent anything.
Malware/ransomware can effectively kill the keyfile and make backups useless.
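To make the limitation concrete, here is a minimal Python sketch of that kind of keyfile wrapping (the file layout, helper names, and the choice of scrypt/AES-GCM are all assumptions for illustration, not Duplicacy’s or Duplicati’s actual format). Changing the passphrase only re-wraps the master key in the keyfile; the remote volumes are untouched, which is why it does not help once the master key itself has leaked:

```python
import os, json, base64
from cryptography.hazmat.primitives.kdf.scrypt import Scrypt
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def _kek(passphrase: str, salt: bytes) -> bytes:
    # Derive a key-encryption-key from the passphrase.
    return Scrypt(salt=salt, length=32, n=2**15, r=8, p=1).derive(passphrase.encode())

def _write_keyfile(path: str, passphrase: str, master_key: bytes) -> None:
    salt, nonce = os.urandom(16), os.urandom(12)
    wrapped = AESGCM(_kek(passphrase, salt)).encrypt(nonce, master_key, None)
    with open(path, "w") as f:
        json.dump({"salt": base64.b64encode(salt).decode(),
                   "nonce": base64.b64encode(nonce).decode(),
                   "wrapped": base64.b64encode(wrapped).decode()}, f)

def create_keyfile(path: str, passphrase: str) -> bytes:
    master_key = os.urandom(32)        # this key encrypts all backup volumes
    _write_keyfile(path, passphrase, master_key)
    return master_key

def change_passphrase(path: str, old: str, new: str) -> None:
    # Only the keyfile is rewritten; nothing on the remote changes.
    with open(path) as f:
        kf = {k: base64.b64decode(v) for k, v in json.load(f).items()}
    master_key = AESGCM(_kek(old, kf["salt"])).decrypt(kf["nonce"], kf["wrapped"], None)
    _write_keyfile(path, new, master_key)
```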

Yes, but is it “enough” to simply change the keyfile passphrase? I think it can give a false sense of security: if your real key is compromised and you simply change the passphrase on the keyfile, the attacker may already have the contents of the keyfile by the time you discover that the passphrase is compromised.

In this scheme, the dlist files would be encrypted with the user-supplied passphrase, and all others with a session key? This makes it easy to determine which key to apply (just look at the file extension). If we do not store the encryption key locally (should we?) then we need to fetch a single dlist file before being able to start a backup.
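A rough sketch of that split, assuming AES-GCM volumes with a nonce prefix and Duplicati-style file extensions purely for illustration (none of this is the actual Duplicati format):

```python
import os, json
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_volume(name: str, payload: bytes, passphrase_key: bytes, session_key: bytes) -> bytes:
    if name.endswith(".dlist.zip.aes"):
        # The dlist carries the session key and is the only file tied to the passphrase.
        body = json.dumps({"session_key": session_key.hex()}).encode() + b"\n" + payload
        key = passphrase_key
    else:
        # dblock/dindex volumes only ever see the session key.
        body, key = payload, session_key
    nonce = os.urandom(12)
    return nonce + AESGCM(key).encrypt(nonce, body, name.encode())

def decrypt_volume(name: str, blob: bytes, passphrase_key: bytes, session_key: bytes) -> bytes:
    # The file extension alone tells us which key applies.
    key = passphrase_key if name.endswith(".dlist.zip.aes") else session_key
    nonce, ct = blob[:12], blob[12:]
    return AESGCM(key).decrypt(nonce, ct, name.encode())
```

With that layout, a client that does not cache the key locally would download and decrypt one dlist with the passphrase to recover the session key before the backup can start.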

This would cover (1) and make it a bit more robust against (3). And I think this can be implemented without many issues, except for supporting existing backups where the user-supplied passphrase is used to encrypt everything.

I guess we could take the simple approach where we support a list of potential passphrases (it should be a short one, fewer than 5 items) and then just try each one until we succeed. Then we grab the passphrase from a dlist file (if it has one), add the user-supplied passphrase to the list, and just attempt to decrypt.
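A minimal sketch of that “try each candidate” logic (again assuming the invented nonce-prefix layout from above):

```python
from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def try_decrypt(blob: bytes, candidate_keys: list[bytes]) -> tuple[bytes, bytes]:
    """Return (plaintext, key_that_worked), or raise if no candidate fits."""
    nonce, ct = blob[:12], blob[12:]
    for key in candidate_keys[:5]:          # keep the list short, as suggested
        try:
            return AESGCM(key).decrypt(nonce, ct, None), key
        except InvalidTag:
            continue                        # wrong key, try the next one
    raise ValueError("none of the candidate passphrases decrypt this volume")
```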

When changing the passphrase we would need to re-encrypt all dlist files. The problem here is that we can be interrupted and end up with some files using one passphrase and others using the new one. It is a bit messy, but we could mark the database as being in “re-encrypt mode” to prevent other operations from happening. Then we can continue with the user supplying both the old and the new passphrase, and use the same list-based approach to detect whether a dlist has already been re-encrypted or not.
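Roughly, that resumable conversion could look like the sketch below; `backend`, `db`, and the re-encrypt flag are hypothetical stand-ins rather than existing Duplicati APIs:

```python
import os
from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def reencrypt_dlists(backend, db, old_key: bytes, new_key: bytes) -> None:
    db.set_flag("re-encrypt-mode", True)      # block other operations while converting
    for name in backend.list():
        if not name.endswith(".dlist.zip.aes"):
            continue                          # only dlist files carry the passphrase
        blob = backend.get(name)
        nonce, ct = blob[:12], blob[12:]
        try:
            AESGCM(new_key).decrypt(nonce, ct, None)
            continue                          # already converted; safe to skip on resume
        except InvalidTag:
            pass
        plain = AESGCM(old_key).decrypt(nonce, ct, None)
        new_nonce = os.urandom(12)
        backend.put(name, new_nonce + AESGCM(new_key).encrypt(new_nonce, plain, None))
    db.set_flag("re-encrypt-mode", False)
```

Because the check is “does the new key already open this file”, an interrupted run can simply be restarted and it will skip everything it has already converted.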

It still does not solve (2), which I think can only be solved with a re-encryption of all files. We could add new encryption keys to the dlist file, such that it knows how to decrypt old files but uses a different passphrase for encrypting new volumes. It is a bit dangerous, as we need to ensure that we only remove old keys when there are no old volumes depending on them.
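A tiny sketch of the bookkeeping needed before an old key may be dropped (the manifest layout is invented for the example):

```python
def prune_keys(manifest: dict) -> dict:
    """manifest = {"keys": {key_id: hex_key}, "volumes": {volume_name: key_id}}"""
    referenced = set(manifest["volumes"].values())
    newest = max(manifest["keys"])            # the currently active key is always kept
    manifest["keys"] = {kid: key for kid, key in manifest["keys"].items()
                        if kid in referenced or kid == newest}
    return manifest

# Example: key 1 is still referenced by a volume, so it survives alongside key 2.
m = {"keys": {1: "aa" * 32, 2: "bb" * 32}, "volumes": {"b0001.dblock.zip.aes": 1}}
prune_keys(m)
```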


#23

Good point. I think it’s a great start if we can make the described process robust to the point where it can function in a partial state with multiple keys. That way we can comfortably offer at least some protection.

Then I’d suggest a next step where we offer to re-encrypt all the data in an amortized fashion. So we re-encrypt X volumes on each backup run, with an option to force re-encrypting everything in one go.
Even if the “all in one go” approach is interrupted it will then just continue over the next backups to finish the re-encryption process.
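For example, an amortized pass could be bounded per run, sketched with the same invented volume layout as above (`backend` is a hypothetical stand-in):

```python
import os
from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def reencrypt_batch(backend, old_key: bytes, new_key: bytes, batch: int = 10) -> int:
    """Convert up to `batch` remote volumes this run; the rest wait for the next backup."""
    converted = 0
    for name in backend.list():
        if converted >= batch:
            break                                      # amortize the rest over future runs
        blob = backend.get(name)
        nonce, ct = blob[:12], blob[12:]
        try:
            AESGCM(new_key).decrypt(nonce, ct, None)   # already on the new key
            continue
        except InvalidTag:
            plain = AESGCM(old_key).decrypt(nonce, ct, None)
        new_nonce = os.urandom(12)
        backend.put(name, new_nonce + AESGCM(new_key).encrypt(new_nonce, plain, None))
        converted += 1
    return converted
```

Setting `batch` to something very large would give the “all in one go” behaviour, and an interrupted run just leaves more work for the following backups.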

I think we should store the key locally. It’s not much more secure to have it only on the remote, since the local database contains the info needed to download and decrypt the remote data anyway. The security implications are not any worse than the current situation (which won’t be resolved until we get OS password managers involved).