Why the negative reviews?

I’m currently comparing backup software like Duplicati, Kopia, restic and Duplicacy. Duplicati is the one with the most negative reviews (e.g. on Reddit), although personally I find it pleasant to use: it has a user-friendly GUI, extensive cloud storage support, and a friendly support community. The most common criticism of Duplicati is corruption of backups. Some observations:

  • Backup corruption seems to happen mostly on Linux (including Docker); I’m seeing fewer complaints from people running Windows. This could of course be a statistical illusion or confirmation bias, I don’t know, but I do suspect it could have something to do with Duplicati being written in .NET and ported to Linux using Mono. I also can’t help but notice the large number of files in the installation (700-1100), which makes it less elegant than a single-binary solution (like restic or Duplicacy), because any of those files could, in theory, get damaged.
  • Another hypothesis is the architecture of the software itself, such as the choice to use local databases. My understanding is that this is not the case with other programs such as Kopia and restic, where the database is integrated into the repository. The devs say the local database helps save on cloud data transfer. Perhaps the database could be disabled for local backups, where data cost isn’t an issue?

Anyway, what do the devs think of this? Just curious, cause I’d very much like Duplicati to succeed. In fact, it’s really awesome we’re even getting such software for free. Thanks!

I have not heard of actual backup corruption. What usually happens is that the local database ends up in a broken state. This does not prevent restoring anything, but Duplicati refuses to make new backups.

The causes of these failures are hard to reproduce, but we have managed to get the most common ones fixed for the recent beta release.

With the latest beta there is no more Mono; everything is compiled for the platform and includes the runtime it was tested with. I have not heard of Mono causing corruption, but there have been reports of excessive memory and CPU usage.

Most of the files are small text files, including the html/js files for the UI. We could combine them during the build, but so far there have been no real requests for this.

I personally think the idea of having a single binary is neat, but it has little practical value. In most situations you are likely better off downloading from the website instead of copying, and for most systems using a package manager of some sort is preferred anyway.

Bit corruption could just as easily happen to a single large file as to any one of many smaller files.

I think that is a matter of preference and as such subjective. I do not have a deep enough understanding of the other systems to give a fair comparison, but I can explain the reason for the local database.

Fundamentally, Duplicati does not trust the storage to work as it should. For that reason it keeps track of what the storage is supposed to look like, and complains loudly when things are not as wanted. This is based on my experience with storage providers and storage devices, and an attempt to give errors before you need to restore.

The rest of the database is essentially a cache file, but using a database lets us do fast B+Tree lookups with limited memory needs.
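
To make the cache idea concrete, here is a minimal sketch (Python with SQLite, made-up table and column names, not Duplicati’s actual schema) of how a local database lets a backup answer “have I already uploaded this block?” with an indexed lookup instead of keeping the whole index in memory:

```python
# Minimal sketch of a local "cache" database for deduplication lookups.
# Table and column names are illustrative, not Duplicati's real schema.
import hashlib
import sqlite3

db = sqlite3.connect("local-cache.sqlite")
db.execute("""
    CREATE TABLE IF NOT EXISTS blocks (
        hash   TEXT PRIMARY KEY,   -- content hash of the block
        volume TEXT NOT NULL       -- remote volume the block was written to
    )
""")
# The PRIMARY KEY gives SQLite a tree index on the hash, so lookups stay
# fast even when the table is far larger than available memory.

def block_already_stored(data: bytes) -> bool:
    digest = hashlib.sha256(data).hexdigest()
    return db.execute("SELECT 1 FROM blocks WHERE hash = ?",
                      (digest,)).fetchone() is not None

def record_block(data: bytes, volume: str) -> None:
    digest = hashlib.sha256(data).hexdigest()
    db.execute("INSERT OR IGNORE INTO blocks (hash, volume) VALUES (?, ?)",
               (digest, volume))
    db.commit()
```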

For local backups, you still need to keep a check on the storage to make sure you do not lose files to defective disks, for instance.

If you just view the local database as a (structured) cache file, I don’t see it as so different from other solutions, but the distrust of storage capabilities is not something I have seen from others.
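
As a rough illustration of that distrust (hypothetical function and names, not Duplicati’s actual code), the check boils down to comparing the locally recorded view of the remote against what the remote actually lists:

```python
# Sketch of the "complain loudly" check: compare the locally recorded view
# of the remote against an actual listing. Names are hypothetical.
def verify_remote(expected: dict[str, int], actual: dict[str, int]) -> list[str]:
    """Both arguments map remote file name -> size in bytes."""
    problems = []
    for name, size in expected.items():
        if name not in actual:
            problems.append(f"missing remote file: {name}")
        elif actual[name] != size:
            problems.append(f"size mismatch for {name}: "
                            f"expected {size}, found {actual[name]}")
    for name in actual.keys() - expected.keys():
        problems.append(f"extra file not recorded locally: {name}")
    return problems

# A backup run would warn or refuse to continue if this returns anything,
# surfacing storage problems before a restore is ever needed.
```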

Thanks for sharing. I think on Reddit particularly, we have not been very active, so that might be some of it. It could also be the issues you mention with the Docker/Mono setup.

In any case, I think we are moving in the right direction and steadily increasing the stability of the core engine. You can see some of the work we have completed for the current release, and some of the work already completed for the next release.

I’m not a dev but I volunteer a lot on the forum (help wanted), so I have a few opinions on it.

I’d be happy to remind you of some, but I agree data loss is rare. Noise and work are less so.

The way I usually put it here is that Duplicati keeps careful records in the database of what it expects to see, self-tests quite a lot, and sometimes complains. It’s an early warning system, although there are things it could still do better to avoid surprises at, say, database recreates.

Anything the check complains about is probably seen as corruption and will need some work.
Difficulty varies. Sometimes an obvious path such as a Repair works. The next step might be a DB Recreate; however, this sometimes fails, and at that point I would probably call it corruption, although the Duplicati.CommandLine.RecoveryTool tends to forge ahead despite certain corruptions.

Recreate is pretty robust. Ideally it runs at an acceptable speed; less ideally (when it has to try hard to recover from a problem), it is slow. Sometimes it still can’t finish, though that is getting rare.

Beyond the more obvious buttons to push (or advice from the GUI), recovery sometimes does better with some expert advice, which is why the forum exists. The situation is improving, but it’s not perfect.

Unusual cases still occur. The https://usage-reporter.duplicati.com/ count is 65 million backups per year, which can be compared against (say) forum reports. Far from a flood, maybe down to a trickle.

I classify the problems roughly as:

  1. Local database corrupted, recreate possible, restore possible
  2. Local database corrupted, recreate fails, restore possible
  3. Remote data is corrupted, recreate fails, restore is not possible

Duplicati is designed to avoid ending up in category 3, so if there are any cases where Duplicati has produced damaged data, we need to prioritize those issues.

There have been cases of types 1 and 2, but we added fixes for those in the latest beta. There are some issues where the internal format contains additional data that gives warnings, but this does not affect correctness. If you are aware of open issues in category 1 or 2 that are not on the list of pending issues, please do add them or make me aware.

Thank you for the response! It’s awesome to hear developers take Duplicati seriously! I’ve stepped up using the program recently and I may return with some (minor) suggestions soon!

I am new to Duplicati. I signed up and logged in here because of a related problem I saw today. I’ve been saving backups to a 2 TB USB disk drive on Windows 11. ALL of the Duplicati backup files disappeared, gone, vanished. Here is what I think happened, and how you could help prevent it.

In my early experiments it created the expected backup files, but I have occasionally gotten “files missing in local database” messages which I have tried to Repair. The activity message at the top of the page can be misleading and never clearly says when a Repair is finished. Combined with the sometimes slow response, I cannot always tell what it is doing, if anything at all. That may have induced me to do something which interrupted an ongoing operation. I noticed the lost files only because I decided to try deleting the local version of an unimportant file to see how restoration worked. THERE WAS NO WARNING ABOUT THE LOST BACKUPS other than the Please Repair notices, which didn’t seem to work.

At this point I deleted my databases and configurations, then created new configurations. As before, when I ran it I got surprising “files missing in database, please repair” messages again, as if it didn’t remember all the file names in the folder I had just told it to include.

After I got the backups working again I also finally realized that when it said it was performing a backup (I forget the exact wording) it was actually restoring the local database, not the remote backup files. Only later did I notice the “No backup performed” notice on that configuration. Fortunately I have just begun and have not yet needed any backup I have attempted. All seems to be working now.

More informative messaging might have prevented a possible disaster had I actually needed any backup.

Thanks.
BTW, I dislike github.

I am not following 100% what has happened in this case, but I am guessing that you somehow have two backup jobs backing up into the same remote folder. This is not supported because the two backups do not cooperate, so one backup will notice “files missing in database”, meaning that files seemingly appeared in the destination folder without being written by that backup.

The repair feature tries to bring things back in order and will remove the “offending” files. When you do this, the other backup will then complain that “files are missing”. Repair can sometimes recreate the missing files, but often the source data needed is gone, so this will fail.

We have added a check to the canary builds that refuses to delete files if it can detect that a backup newer than expected has been made (you can override this behavior). This prevents at least the delete part, which starts the chain of broken things.
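
As a sketch of what such a safeguard can look like (assumed logic with made-up names, not the actual canary implementation), the idea is to refuse the destructive part of a repair when the unexpected remote files appear newer than the last backup this database knows about:

```python
# Sketch of a "refuse to delete" guard; assumed logic, not the real canary code.
from datetime import datetime

def safe_to_delete(extra_files: dict[str, datetime],
                   last_known_backup: datetime,
                   override: bool = False) -> bool:
    """extra_files maps remote file name -> its timestamp on the remote."""
    newer = [name for name, ts in extra_files.items() if ts > last_known_backup]
    if newer and not override:
        print("Refusing to delete; these files look like a newer backup made "
              "by another job:", ", ".join(newer))
        return False
    return True
```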

If you have a screenshot of the place where the message should be clearer, I would love to see it and a suggested text.

FYI, this was submitted as a Support topic, as linked above your post.
Yes, it was two backups in one folder, but the new docs don’t warn against that.
Even beyond docs, various possible code preventions were covered.