I have a few questions about the cryptographic aspects of Duplicati, which I couldn’t find answers to in the official documentation. I apologize if I overlooked relevant articles.
As clarified in a previous thread, Duplicati encrypts not the files themselves, but the .zip files containing the data chunks.
Does Duplicati use different encryption keys for the generated .zip archives (dlist.zip, dblock.zip, and dindex.zip)?
For simplicity, I will further refer to the keys in plural, regardless of the answer to the first question.
Where and how does Duplicati store the encryption keys?
Are the encryption keys themselves encrypted with a key-encryption key?
Does Duplicati derive the keys from the user-defined password using a key derivation function, or are the keys generated by a cryptographically secure pseudorandom number generator? What is used as a salt input?
Does Duplicati utilize a dedicated MAC key for the integrity of files?
As explained in the previous thread, the encryption is fully handled by the SharpAESCrypt library. It uses the AES Crypt file format (V2). This documentation might be interesting to you:
There seems to be a separate session key for each file, stored at the beginning of the encrypted data. There is also a SHA-256 hash for the data. Only the backup password is used as the basis for encryption, although there seems to be some kind of key derivation implemented in the library.
For the other questions, please look at the source code of the library. There does not seem to be much documentation about this.
SetupHelper might be a good place to start:
It’s embarrassing, but I have to admit that only now am I starting to understand what is meant by “fully handled” by the cryptographic library. So in short, Duplicati users input their password and everything else is fully handled either by SharpAESCrypt or GPG. Sorry for being so daft.
I’m going to stop there, and I might have already gone too far. I am not a cryptographer, and details
about the cryptographic design are probably best found in answers from the AES Crypt developer.
Beware, though, that they have a product containing cryptography, but Duplicati made a bigger one.
You will find product to product differences, even though the file format and crypto are likely similar.
I don’t know how far your key storage question was meant to go, but the passphrase story is complex due to the different ways to make a Duplicati backup. The server database is used by the Server, basically in the GUI.
CommandLine (like most of them) can have passphrase given as an option on terminal or script, like this:
Supply a passphrase that Duplicati will use to encrypt the backup volumes, making them unreadable without the passphrase. This variable can also be supplied through the environment variable PASSPHRASE
If a passphrase is needed and is not supplied, the user is prompted to type it (obscured by * for privacy), which means that where it is stored is up to the user, similarly to the above case where an environment variable is set – somehow – from whatever storage the user wants. Another way to set any option is with Scripts writing new options to their output. For CommandLine, parameters-file can store the passphrase.
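The lookup described above (an explicit option, then the PASSPHRASE environment variable, then an interactive prompt) can be sketched roughly like this. It is only an illustration in Python, not Duplicati's code: `resolve_passphrase` and its parameters are made-up names, and the assumption that the option wins over the environment variable is mine, not something stated in the thread.

```python
import getpass
import os
from typing import Callable, Mapping, Optional


def resolve_passphrase(
    option_value: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
    prompt: Callable[[str], str] = getpass.getpass,
) -> str:
    """Hypothetical helper: look for a passphrase in the places named above."""
    # 1. A value given directly as an option (terminal, script, or parameters file).
    if option_value:
        return option_value
    # 2. The PASSPHRASE environment variable (assumed here to rank below the option).
    env_value = environ.get("PASSPHRASE")
    if env_value:
        return env_value
    # 3. Nothing supplied: ask the user without echoing what they type.
    return prompt("Enter encryption passphrase: ")
```

In every branch the storage of the secret is still the user's problem: the sketch only shows where a value could come from, not where it is kept.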
Yes, exactly. The idea is to not have any self-rolled crypto in the project.
The SharpAESCrypt library is written by me, so you could argue it is kind-of hand-rolled, but it is based on the AESCrypt specs and file-format compatible. This was needed as AESCrypt did not have binaries for all platforms at that time.
Yes and no. There is a different file-encryption-key created for each encrypted volume, but it is derived from the same passphrase, as @Jojo-1000 points out.
As explained by @ts678 “it depends” on what interface you use. If you use the CLI, the user is responsible for providing the passphrase and can provide this in various ways.
While you can technically use the same for the GUI-based approach, most users will likely let Duplicati handle the passphrase. In this scenario, the passphrase is stored inside the Duplicati-server.sqlite database along with other credentials. Up until now, this database has been encrypted with RC4 encryption where supported (which is mostly just on Windows). The password to this database can be changed, but most likely very few people have changed it.
For the .NET8 releases we have updated to the newest SQLite binaries, which do not support RC4, so for some time, the database will not be encrypted, but encryption with a known password is very close to having no encryption in terms of security analysis.
The plan has been to offload the passphrase and other sensitive information into the OS keychain, but this has not been implemented so far.
That is the responsibility of the SharpAESCrypt library, but in short, a random IV is generated, the passphrase is used to derive an encryption key. The random IV and key are used to encrypt the session iv+key that is used for encrypting the file.
There is HMAC-SHA256 built in to the AESCrypt format.
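To make the layering above easier to follow, here is a minimal sketch of the same shape: a passphrase-derived key wraps a random per-file session IV and key, the session key encrypts the data, and HMAC-SHA256 protects both parts. It is not the AESCrypt format and not SharpAESCrypt's code. To stay within the Python standard library it swaps in PBKDF2 as the KDF and a toy SHA-256 counter-mode keystream in place of AES, and all function names are made up for illustration.

```python
import hashlib
import hmac
import os


def derive_key(passphrase: str, salt: bytes, iterations: int = 8192) -> bytes:
    # Stand-in KDF (PBKDF2-HMAC-SHA256); the real format defines its own derivation.
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"), salt, iterations)


def keystream_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    # Toy stream cipher (SHA-256 in counter mode). NOT AES; illustration only.
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + iv + counter).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def encrypt_volume(passphrase: str, plaintext: bytes) -> dict:
    outer_iv = os.urandom(16)                      # random IV, stored in the clear
    outer_key = derive_key(passphrase, outer_iv)   # key derived from the passphrase
    session_iv, session_key = os.urandom(16), os.urandom(32)  # fresh per volume
    wrapped = keystream_xor(outer_key, outer_iv, session_iv + session_key)
    ciphertext = keystream_xor(session_key, session_iv, plaintext)
    return {
        "outer_iv": outer_iv,
        "wrapped": wrapped,
        "wrapped_mac": hmac.new(outer_key, wrapped, hashlib.sha256).digest(),
        "ciphertext": ciphertext,
        "data_mac": hmac.new(session_key, ciphertext, hashlib.sha256).digest(),
    }


def decrypt_volume(passphrase: str, volume: dict) -> bytes:
    outer_key = derive_key(passphrase, volume["outer_iv"])
    expected = hmac.new(outer_key, volume["wrapped"], hashlib.sha256).digest()
    if not hmac.compare_digest(expected, volume["wrapped_mac"]):
        raise ValueError("wrong passphrase or damaged header")
    unwrapped = keystream_xor(outer_key, volume["outer_iv"], volume["wrapped"])
    session_iv, session_key = unwrapped[:16], unwrapped[16:]
    expected = hmac.new(session_key, volume["ciphertext"], hashlib.sha256).digest()
    if not hmac.compare_digest(expected, volume["data_mac"]):
        raise ValueError("encrypted data was modified")
    return keystream_xor(session_key, session_iv, volume["ciphertext"])
```

The point of the indirection is that every volume gets its own session key, yet nothing except the passphrase has to be remembered, because the wrapped session key travels inside the file itself.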
Sorry for digging into an old topic.
Last time I checked, in 2025, the KDF to derive the key from a user password was WEAK. I don’t really remember now what it was, but I remember my conclusion.
PBKDF2 with 500k iterations is currently considered “barely enough”. And for a backup, waiting 5-10 seconds for the KDF to complete is like nothing. It’s not an interactive service, so iterations could easily be raised to millions.
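For anyone who wants to see what such a delay looks like on their own hardware, a small sketch using Python's standard `hashlib.pbkdf2_hmac` is below. The helper names and the sample iteration counts are only illustrative, the timings depend entirely on the machine, and the estimate assumes the cost grows linearly with the iteration count.

```python
import hashlib
import os
import time


def time_pbkdf2(iterations: int, passphrase: bytes = b"example passphrase") -> float:
    """Return the seconds one PBKDF2-HMAC-SHA256 derivation takes on this machine."""
    salt = os.urandom(16)
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", passphrase, salt, iterations)
    return time.perf_counter() - start


def iterations_for_budget(target_seconds: float, probe_iterations: int = 100_000) -> int:
    """Estimate how many iterations fit in target_seconds, assuming linear scaling."""
    probe_seconds = time_pbkdf2(probe_iterations)
    return max(1, int(probe_iterations * target_seconds / probe_seconds))


if __name__ == "__main__":
    for count in (8_192, 500_000):
        print(f"{count:>9} iterations: {time_pbkdf2(count):.3f} s")
    print("iterations that fit in 5 s:", iterations_for_budget(5.0))
```

Keep in mind that the delay is paid once per derivation, so how it adds up over a whole backup run depends on how often the tool derives a key.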
That’s why I always say: use LONG (over 50 chars) passwords with duplicati backups. This protects even in the case of very very weak KDFs.
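A rough way to see why length helps more than iterations: every extra character adds its full share of entropy, while the KDF only adds log2(iterations) bits of work per guess. The sketch below assumes the characters are picked uniformly at random, which human-chosen passwords are not, so treat the numbers as an upper bound. The function names are made up for illustration.

```python
import math


def passphrase_entropy_bits(length: int, alphabet_size: int) -> float:
    """Entropy of a passphrase whose characters are chosen uniformly at random."""
    return length * math.log2(alphabet_size)


def kdf_work_factor_bits(iterations: int) -> float:
    """Extra brute-force cost, in bits, that an iterated KDF adds per guess."""
    return math.log2(iterations)


if __name__ == "__main__":
    # 8192 iterations add 13 bits; 500,000 iterations add roughly 19 bits.
    for iterations in (8_192, 500_000):
        print(iterations, "iterations:", round(kdf_work_factor_bits(iterations), 1), "bits")
    # 50 random lowercase letters carry about 235 bits on their own.
    print("50 lowercase letters:", round(passphrase_entropy_bits(50, 26), 1), "bits")
```

So going from 8192 to 500k iterations buys about as much as one or two extra random characters, which is why a long passphrase covers for a weak KDF.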
And one more thing: storing a symmetric key locally is a bad idea. Currently there is no other way, but a hybrid asymmetric system could be used to make non-incremental backups without storing the key locally.
Forgot to mention, this is also related to encryption, current attacks on privacy are severe. We all must take steps to counterbalance this, and one of those steps could be a mitigation of a scenario described below:
Most attacks these days look like this: an attack on the main device, if successful → keys extraction, for example for duplicati backups → subpoena to the storage company if possible → getting the backup → older data compromised + current data compromised from the primary attack.
To mitigate this, the encryption key should not be stored on the device. So asymmetric encryption would have to be used, but that causes problems with deduplication and incremental backups, so a new way of storing incremental data would have to be proposed and implemented.
Thanks for bringing this up again. I recently looked at the implementation again, and we are a bit tied to the AESCrypt spec for interoperability. The “stream format v2” mandates 8192 iterations, so that is what is used currently.
While it is not a problem if you have a sufficiently strong passphrase, it is trivial to bruteforce a weak passphrase by now.
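For context, here is roughly what a fixed 8192-round derivation looks like next to one with a configurable count. The first function follows my understanding of the AES Crypt v2 approach (repeated SHA-256 over the running digest plus the UTF-16LE password, seeded with the zero-padded IV); I have not checked it against SharpAESCrypt, so treat it as an assumption. The second uses PBKDF2-HMAC-SHA512 purely as an example of a tunable KDF, since the thread does not say which function the newer format uses. Both function names are made up.

```python
import hashlib


def v2_style_key(passphrase: str, iv: bytes, iterations: int = 8192) -> bytes:
    """Iterated SHA-256, as assumed for the v2 stream format (not verified)."""
    password_bytes = passphrase.encode("utf-16-le")
    digest = iv.ljust(32, b"\x00")  # the 16-byte IV padded with zeros to 32 bytes
    for _ in range(iterations):
        digest = hashlib.sha256(digest + password_bytes).digest()
    return digest


def tunable_key(passphrase: str, salt: bytes, iterations: int = 300_000) -> bytes:
    """Example of a KDF with a configurable cost (PBKDF2 chosen as an assumption)."""
    return hashlib.pbkdf2_hmac(
        "sha512", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )
```

With a fixed count, each password guess costs an attacker only 8192 cheap hash calls, and that cost cannot be raised without changing the format.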
For that reason, I have recently updated the library to support “stream format v3” which has configurable iterations. Since we have users on low-performance hardware (like RPI) the default is currently 300k:
Duplicati does not really need to read any data from a backup when running, so it can easily use asymmetric keys (like with GPG). The test only downloads and verifies hash+size.
You only need the private key when recreating the database or restoring data.
I am open to working on it, but we need an established format so we are sure that a tool that is not Duplicati can eventually decrypt the files.
Sooo, I accidentally stumbled upon a tool called… Duplicity. And it turns out, it does exactly that: asymmetrically encrypted incremental backups. So it already uses an internal format for that, which I imagine is completely different from Duplicati, but maybe instead of reinventing the wheel from scratch, it would be possible to borrow some logic from there. If both tools used the same format, then maybe it could become an officially recognized standard one day.
[edit]
Forgot to mention: any backup encrypted asymmetrically would have to use PQC crypto, which is available only in GPG 2.5.x and up, and this GPG version is not available by default in stable distros like Debian. So Linux versions would probably have to be bundled with a static gpg binary. Aaaand currently only kyber+brainpool is available, where brainpool is not the best curve out there, which would force double-wrapped keys for a better EC and kyber.
Also a hybrid approach asym+sym would be a good idea, as it doesn’t cost cpu cycles, because only the main key is wrapped in both asym and sym, but it protects from quantum attacks even further and allows for mixed access.
There are a lot of possibilities when it comes to multi-key setups, especially with GPG, which allows for keys in parallel and in series.