Hi all,
I have Duplicati running in datacenter environments scattered across about 20 sites. I’ve been (and remain!) a fan of the project for about a year now, and I’ve been slowly rolling it out to more and more servers and workstations that need backups.
I’m also using all the notification tools available – duplicati-monitoring.com is wired into all of my backup jobs so I have one dashboard to look at, and dupReports runs at each location, with daily reports going to that site’s local network admin and management.
Here’s what I’m noticing from an enterprise administration point of view: the number of jobs that “fail” is abnormally large. For example, this morning’s Duplicati Monitoring report has the subject line “34 backup sets with errors, 15 with warnings, 31 OK” – so out of the 80 backup jobs I have set up, 34 fail outright, 15 succeed with warnings, and 31 are clean. Fewer than half of my backups complete without some kind of complaint. I’m using snapshot-policy=auto where feasible and snapshot-policy=required where absolutely necessary, and I’m excluding Windows system files, temp files, etc. where possible.
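For reference, a typical job of mine boils down to roughly this (the backend URL, paths, and reporting URL are placeholders, and I’ve trimmed the filter list):

```powershell
# Roughly what one of my backup jobs runs (URL and paths are placeholders)
& "C:\Program Files\Duplicati 2\Duplicati.CommandLine.exe" backup `
    "s3://my-bucket/server01" "C:\Data" `
    --dbpath="C:\Duplicati\data\SERVER01.sqlite" `
    --snapshot-policy=auto `
    --exclude-files-attributes="system,temporary" `
    --exclude="*.tmp" `
    --send-http-url="<reporting URL from duplicati-monitoring.com>"
```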
Drilling into this data a little deeper, I start looking at the reasons the backups are failing. Some are due to storage devices being offline – that’s fair, I can address those individually – but a lot of them are due to the local database disagreeing with what’s on the remote storage, or to “fileset version” errors. Examples:
Failed: Unexpected difference in fileset version 5: 4/7/2019 8:00:01 PM (database id: 161), found 261351 entries, but expected 261352
Failed: Found remote files reported as duplicates, either the backend module is broken or you need to manually remove the extra copies.
I’ve been scouring my email and can’t find a good example of it, but I also get a lot of errors where the local database and the remote storage don’t match up, meaning I end up trying a repair and ultimately deleting the local database and letting Duplicati recreate it from the remote data.
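When it gets to that point, the manual recovery I do is roughly this (paths and URL are placeholders again):

```powershell
# 1. Try a repair against the existing local database first
& "C:\Program Files\Duplicati 2\Duplicati.CommandLine.exe" repair `
    "s3://my-bucket/server01" --dbpath="C:\Duplicati\data\SERVER01.sqlite"

# 2. If that doesn't take, delete the local database; with no database
#    present, repair rebuilds it from the remote data
Remove-Item "C:\Duplicati\data\SERVER01.sqlite"
& "C:\Program Files\Duplicati 2\Duplicati.CommandLine.exe" repair `
    "s3://my-bucket/server01" --dbpath="C:\Duplicati\data\SERVER01.sqlite"
```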
Overall, the above are just examples. My big question is this: what can be done to make Duplicati a more stable backup solution? Right now I get emails from managers who don’t care what the detailed failure message says; they only care whether the backup notice says “fail” or “success”, and they get a little testy about “warning”. So is there anything that can be done on the Duplicati side to detect when a database doesn’t match and “auto-magically” rebuild it, something like the wrapper I sketch below? Is there any way to take some of these errors and resolve them automatically, so human intervention isn’t required?
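To make the “auto-magic” part concrete, this is the kind of wrapper I’ve been tempted to script myself – purely my own workaround sketch, not a built-in Duplicati feature, with placeholder URL and paths:

```powershell
$dup = "C:\Program Files\Duplicati 2\Duplicati.CommandLine.exe"
$url = "s3://my-bucket/server01"              # placeholder backend URL
$db  = "C:\Duplicati\data\SERVER01.sqlite"    # placeholder local database

# Run the backup and capture its output
$log = & $dup backup $url "C:\Data" "--dbpath=$db" --snapshot-policy=auto 2>&1

# On the known database-mismatch failure, rebuild the local database and retry once
if ($LASTEXITCODE -ne 0 -and ($log | Select-String "Unexpected difference in fileset")) {
    Remove-Item $db -ErrorAction SilentlyContinue
    & $dup repair $url "--dbpath=$db"
    & $dup backup $url "C:\Data" "--dbpath=$db" --snapshot-policy=auto
}
```

But I’d much rather see that kind of self-healing live in Duplicati itself than in a pile of per-site scripts.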
Thanks in advance!