After reading the reports on integrity problems, I’d like to know how to actually check the integrity of an archive (client on a Mac). I don’t mean the integrity of the local database, but the integrity of the actual data. And secondly, if my backup is remote (over the internet) and I have the keys and all, could I run that integrity check at the location where the backup is actually stored, or is the only option to do a check on the client side?
I think you could do this with the --backup-test-samples=inf option. And yes, I would definitely run this test “local” to your backup data, if possible. You’d have to install the Duplicati client there and import your backup configuration. You should probably also copy the local database over, or maybe you could use this as a way to test rebuilding the local database as well.
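To sketch what that could look like on the command line (duplicati-cli stands for the Duplicati command line program, whose name and path differ per platform; the destination URL, database path and passphrase below are placeholders, and the option names should be double-checked against your Duplicati version):

    duplicati-cli test "file:///srv/backups/machine-a" all \
        --dbpath=/home/me/machine-a-copy.sqlite \
        --passphrase="the-backup-passphrase" \
        --full-remote-verification=true

Here "all" tests every volume instead of just a sample, and --full-remote-verification examines the content of the files rather than only their hashes.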
Where are your remote files? If they are in S3, you could spin up an EC2 VM in the same region to perform this test. Not only would this be faster than running the test from the machine that is “remote” from the backup data, but you’d also avoid S3 data egress costs. (Of course you’d have to pay for the EC2 instance while it is running, but the very small VMs are dirt cheap.)
Make sure you don’t run a backup from your main machine while the test is in progress.
I am running a minio backend myself (S3 compatible), and a couple of users (both on the LAN and via the internet) connect to the minio backend with their Duplicati frontends.
So, what I probably should do is:
- Export the configuration on the remote machine and copy it over
- Copy the database from the remote machine to a local machine in a local account. I have the setup, but not the original files; those are on the remote machine
- Run a Duplicati Server process using this database
- Import the configuration
- Prevent a backup from the original machine from happening
- Run the integrity check (command line, add flag to settings?)
But what do I do exactly at each step? Maybe someone can create an error-proof recipe?
It would be nice if Duplicati had UI support for this.
Is each of the users using a different directory, so that they aren’t intermixing files?
Yes, I think those steps are right. But you bring up a good point in item 2 - I wonder if Duplicati can test the back-end files without having access to the source files? A Duplicati dev would need to chime in.
Yes you can!
While researching for the How-To below I was able to:
- copy my database (only) from a wifi-only machine to a wired one
- use the “Command line” tool for an existing (but different) backup
- select test
- replace the defaulted destination with the “wifi only” one
- replace all “Commandline arguments” with all and a few other parameters
- replace the “Advanced options” --dbpath= and --passphrase= defaulted values with the path of the file I copied over and the passphrase for the “wifi” backup
- run the test in a lot less time than it would have taken over wifi
If your destinations are on a machine where the Duplicati GUI isn’t available, you should be able to use the command line version of test as well.
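For an S3-compatible destination like the minio setup mentioned earlier, that could look roughly like this (bucket, host, keys, paths and passphrase are all placeholders, and the exact S3 option names and URL format should be verified against your Duplicati version):

    duplicati-cli test "s3://backups/machine-a" all \
        --s3-server-name=minio.example.lan \
        --use-ssl=false \
        --aws-access-key-id=PLACEHOLDER \
        --aws-secret-access-key=PLACEHOLDER \
        --dbpath=/path/to/copied-database.sqlite \
        --passphrase="the-backup-passphrase"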
You might also want to check out the DuplicatiVerify.py and DuplicatiVerify.ps1 Python and PowerShell scripts in the ...\Duplicati 2\utility-scripts folder.
The Python script has a header comment of:
This file is a standalone python script that is tested with python 2.7
If Duplicati is producing a backup with the option --upload-verification-file,
it will produce a *-verification.json file on the backend, which can be verified
by this script. Simply run this script with the path to the backup
folder as input, and it will verify all *-verification.json files in the folder.
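So the check on the storage side should be as simple as something like this (assuming the script has been copied to the storage machine and Python 2.7 is available there):

    python DuplicatiVerify.py /path/to/backup/folder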
I vaguely recall there being a third party tool somebody wrote to remotely test as well, but I’m not finding reference to it at the moment…
Yes, they all have a different directory
Writing help/recipes for people who only work with this intermittently is a pain, I know, but a sentence like “use the ‘Command line’ tool for an existing (but different) backup” is not immediately clear, and neither is “replace the defaulted destination with the ‘wifi only’ one”. I have no idea what that means, and experimenting is not an option, scared as a user like me is of damaging something (e.g. am I changing the setup of the backup, and do I need to change things back afterwards?).
But the Python thing looks interesting. I just have to add --upload-verification-file to the existing backup, and then I can do integrity checking on the backend only, right?
Yes, that will produce a .json file with the hash of all remote files, making it easy to check that all files are present and unmodified. This guarantees that whatever Duplicati thinks should be on the remote storage is actually present there. The verification script can perform this check without having access to any of the data inside the volumes, so it is safe to run it on the remote storage without trusting that machine with your passphrase.
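To illustrate why no passphrase is needed: the verification file is essentially a list of remote file names with their hashes, so the check boils down to hashing each encrypted volume and comparing. A very rough sketch of that idea in Python follows; the JSON layout and field names here are assumptions for illustration only, so use the bundled DuplicatiVerify.py for the real check:

    import base64, hashlib, json, os, sys

    folder = sys.argv[1]  # path to the folder holding the backup volumes

    for name in os.listdir(folder):
        if not name.endswith("-verification.json"):
            continue
        with open(os.path.join(folder, name)) as f:
            entries = json.load(f)  # assumed: a list of {"Name": ..., "Hash": ...} records
        for entry in entries:
            with open(os.path.join(folder, entry["Name"]), "rb") as vol:
                digest = hashlib.sha256(vol.read()).digest()
            state = "OK " if base64.b64encode(digest).decode() == entry["Hash"] else "BAD"
            print(state + " " + entry["Name"])

Note that only the hashes of the encrypted volumes are compared; nothing is decrypted, which is why the passphrase never has to leave your machine.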
However, if you want to ensure that you can restore files (also guarding against errors inside Duplicati’s logic), you need to do a restore.
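For example, a restore test of everything into a scratch folder could look roughly like this (the URL, paths and passphrase are again placeholders; check the option names against your version):

    duplicati-cli restore "file:///srv/backups/machine-a" "*" \
        --restore-path=/tmp/duplicati-restore-test \
        --dbpath=/home/me/machine-a-copy.sqlite \
        --passphrase="the-backup-passphrase"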