Is Duplicati 2 ready for production?

drwtsn32 · July 29, 2019, 5:12am

What steps are you following in your DR plan? And where are you getting stuck? Also what size is your backup set?

Maybe I can offer some pointers. I have done a DR test myself and it worked well for me but I may have taken extra steps.

Sami_Lehtinen · August 1, 2019, 11:19am

The simple answer is still absolutely not.

Version: 2.0.4.22_canary_2019-06-30

It is still extremely dangerously broken. This is the problem that totally ruins Duplicati: backup restore fails, even if all tests pass perfectly. This is the ultimate trap. Stupid users and administrators think their backups are ok, and only find out otherwise when there’s a need to restore data and bang, it won’t work.

It’s like having insurance that won’t ever cover anything.

But this was already discussed here:

Just a reminder that the situation is the same, and nothing has changed since.

oliver · August 1, 2019, 12:34pm

TL;DR: NO

I have been using Duplicati for several months and have observed that it often breaks out of nowhere.

I have daily backups, and the monitoring tool called duplicati-monitoring.com tells me that every 2 or 3 days a backup has a warning, the day after is okay, and then the cycle starts over.

I often found jobs in an unusable state: I could neither add files nor restore them. Others had dlist files missing or a broken database, and basically the backup was lost.

Sometimes backup duration is 10x or 20x what you would expect, without any obvious reason (CPU, memory and bandwidth OK).

It becomes VERY unreliable when backing up more than 1TB per job (at least in my experience) and takes ages to start such a job because the database grows quickly.

I still use it because it’s free and I could not find a better alternative. Sometimes everything is great, sometimes you want to flush it down the drain.

drwtsn32 · August 1, 2019, 3:27pm

I believe some of the unreliability can be attributed to the back end being used. Some seem to be inherently more reliable (S3, B2, etc) than others (OneDrive) for this type of application.

Setting that aside, there are still some bugs to be worked out. I personally have experienced the “unexpected difference in fileset” issue a couple times as well as “found inconsistency while validating” recently. I think they may be caused by bugs in the compaction or fileset pruning processes.

I also wish the database recreation process was more reliable - on my systems it still seems to need at least some dblock files, where I believe it should only need to read the dlist and dindex files. The fact that it has to get dblock files seems to be an indicator of some underlying bug.

Perhaps that is due to an auto compaction event which also may trigger a database vacuum operation, or something similar.

oliver · August 3, 2019, 2:55pm

Yes, the back end I am using is not super reliable. Nevertheless, Duplicati should also manage such situations and actually verify that files have been uploaded to the back end, or use smarter retry logic.
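
To make the idea concrete, here is a minimal sketch of an upload that only counts as done once the back end lists the file with the expected size, and that retries with exponential backoff otherwise. It is not how Duplicati works internally; `put`, `list_remote`, and all parameter names are hypothetical stand-ins for whatever a storage back end offers.

```python
import time
from typing import Callable, Iterable, Tuple


def upload_with_verify(
    name: str,
    data: bytes,
    put: Callable[[str, bytes], None],                      # hypothetical upload call
    list_remote: Callable[[], Iterable[Tuple[str, int]]],   # hypothetical (name, size) listing
    retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Upload `data` as `name`, then confirm the back end lists it with the right size.

    Returns the number of attempts used; raises RuntimeError if every attempt fails.
    """
    for attempt in range(1, retries + 1):
        try:
            put(name, data)
            # Verify: the file must appear in the listing with the expected size.
            if any(n == name and size == len(data) for n, size in list_remote()):
                return attempt
        except OSError:
            pass  # treat a transport error like a failed verification
        if attempt < retries:
            sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff: 1s, 2s, 4s, ...
    raise RuntimeError(f"upload of {name} could not be verified after {retries} attempts")
```

A size check is the cheapest possible verification; comparing a hash reported by the back end would be stronger where the storage service exposes one.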

baip · October 15, 2019, 3:04am

Has anyone used Duplicacy? I just started using it on one of my Linux servers to back up ~3TB of data. I really like its lock-free deduplication feature. Backup & restore work fine with some basic testing, but I’m curious about the experience of people who have used both Duplicati & Duplicacy longer and in more complex situations.

drwtsn32 · October 15, 2019, 3:34am

Yes, I have used it. The lock-free dedupe is probably its killer feature. Awesome design! Personally I don’t like how each backup set only protects a single source folder - doesn’t work well for my workflow. The workaround is to use symlinks, but that seems like a kludge to me.

Sami_Lehtinen · October 15, 2019, 6:57am

Why is that a killer feature? It suits only very specific operating environments. Every decision involves trade-offs. I didn’t find anything amazing about the lock-free approach here, except that it works well in cases where you’ve got a set of systems which share mostly the same data and the same encryption keys. In most of the cases I’ve seen, that is rare, because shared data usually doesn’t need to be backed up, and private data which isn’t shared naturally can’t be backed up with shared encryption keys.

Yet it would be naturally possible to use content specific encryption keys, as described. But it would of course require managing the content lists in a way where the information about non-shared data can’t leak. I guess it’s possible to transfer some of the keys from system to system to enable this kind of access. Then it would require the master list of data to be encrypted, so the chunks can use shared content specific keys. → Which would allow you to only restore data based on the master-list, even if you’ve got access to all chunks. Yet then you wouldn’t know which blocks are referenced and which ones aren’t, because you can’t have access to the master list(s).

Another drawback also seems obvious. When every chunk is in its own file, it might lead to situations where there are absolutely staggering numbers of files. Once again, depending on the situation, protocols and platforms, this might be a problem (overhead, malfunctions) or not.

Every application is designed for some use case, and there are very different use cases. Also, renaming files or moving them to another directory is something which some of the cloud storage services do not support. Yet adding an extra 0-byte file indicating fossilization of chunks also works.

ts678 · October 15, 2019, 1:04pm

The “killer feature” term might have originally been coined by the Duplicacy developer in their forum:

Duplicacy vs Duplicati

pointing back to the Duplicati forum:

Duplicati 2 vs. Duplicacy 2

Both articles are a little dated now. Duplicacy has now gone over to a web-based UI like the one Duplicati uses; however, it’s licensed and costs extra. The good/bad thing about that is it might be allowing the original Duplicacy developer to remain actively involved, whereas Duplicati might be in some transition mode.

Duplicati probably still wins on feature quantity, but has trouble with scaling and stability (i.e. it’s beta). Possibly its web UI is still better than Duplicacy’s new paid one (which I have not tested). Trade-offs… There have been Duplicati users leaving for Duplicacy. You can search this forum and theirs for notes.

drwtsn32 · October 15, 2019, 2:01pm

By “killer feature” I mean something unique to Duplicacy - their stand-out differentiator.

(Enterprise backups have global dedupe as well but Duplicacy and Duplicati and others are not targeting that market.)

I agree there are downsides to lock free dedupe… Didn’t mean to imply there are not.

Sami_Lehtinen · January 22, 2020, 11:04am

I just ran a massive backup restore batch job to verify backup integrity, and there were some errors, but not a single complete restore failure. That could be seen as huge progress compared to earlier tests, which almost always included some totally non-restorable backups. If anyone is really interested, I can give a bit more detail in private. But this is all I can say publicly.

Sami_Lehtinen · December 22, 2020, 6:00am

Repair fails; this seems to be a years-old issue.

Sami_Lehtinen · January 20, 2022, 12:04pm

Quick summary, for period test:
~83% backups good, worked like it’s supposed to and quickly.
~11% backups broken, but still restorable, very slow recovery step involved
~6% backups damaged beyond repair, unable to restore

This is a nightmarish result. Something is still inherently extremely broken, and the software is very dangerous to use. I guess it would be a good idea to change the software being used.

drwtsn32 · January 20, 2022, 4:51pm

Can you elaborate? Is this a database recreation step that is taking a long time?

While the software certainly has some room for improvement, your experience doesn’t really jibe with mine. I have very few issues, if any at all, on my 18 backup sets.

I know you had some corruption on the back end, and Duplicati can’t really deal with that as it doesn’t include parity in the back end files. You could mitigate that risk by using more durable storage.

Sami_Lehtinen · January 20, 2022, 5:31pm

Sure I can. When I run restore tests I log everything. When a restore returns code 2, it was successful, but during the restore it had to read all dblock files from the directory while trying to recover missing dblocks.

But ending up in this situation means that something is already seriously wrong; we were just lucky that the recovery was successful. It could have been worse if those blocks hadn’t been available from other files. As far as I understand, that’s the situation.

I can edit this message tomorrow and add the exact message and related key parts of the log of one of the failures, to be 100% clear. I don’t have it right now at hand.

And the totally failing restores, those end up with code 100. I’ll drop a few log snips here as well.
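
A restore test batch can be summarized by exit code along these lines. This is only a sketch: codes 2 and 100 are interpreted as described above, treating 0 as a clean restore is an assumption, and the function names are made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def classify(code: int) -> str:
    """Map a restore exit code to a coarse outcome label."""
    if code == 0:
        return "good"  # assumption: 0 means a clean restore
    if code == 2:
        return "restored with recovery"  # successful, but slow recovery step
    if code == 100:
        return "failed"  # restore did not complete
    return "other"


def summarize(codes: Iterable[int]) -> Dict[str, float]:
    """Return the percentage of runs per outcome label, rounded to one decimal."""
    codes = list(codes)
    counts = Counter(classify(c) for c in codes)
    return {label: round(100 * n / len(codes), 1) for label, n in counts.items()}
```

For example, with made-up codes, `summarize([0, 0, 0, 2, 100])` gives 60.0% good, 20.0% restored with recovery, and 20.0% failed.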

Btw. Those initial results were only from “small” backup sets, because I’ve saved the large ones for later. With the large ones, I assume that the failure rate is even higher, simply because the probability of failure grows with execution time and the amount of bytes transferred and stored. So I basically know that it’s going to be worse.

But I’ll know that in a week or maybe two tops. These backup sets are measured in terabytes instead of tens of gigabytes.

But just as a generic reminder, always always test your backups regularly. With full recovery. Otherwise when you need your backups, well, it’s highly likely that there’s nothing to restore.
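
One way to run such a full-recovery check independently of the backup tool is to hash every file in the source and in the restored copy and compare the results. A minimal sketch, assuming both trees are reachable as local directories and the source has not changed since the backup ran (the function names are made up for this example):

```python
import hashlib
from pathlib import Path
from typing import Dict, List, Tuple


def file_sha256(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files do not need to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_hashes(root: Path) -> Dict[str, str]:
    """Map each file's path relative to `root` to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def compare_trees(source: Path, restored: Path) -> Tuple[List[str], List[str]]:
    """Return (files missing from the restore, files whose content differs)."""
    src, dst = tree_hashes(source), tree_hashes(restored)
    missing = sorted(set(src) - set(dst))
    changed = sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k])
    return missing, changed
```

Two empty lists mean every source file came back byte-for-byte identical; anything else is worth investigating before the backup is actually needed.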

Also, one thing I’m going to do is install the latest version on all sources, because that could make a meaningful difference.

And if there’s something good about this, it’s that at least the full restore error reporting works. It would be worse if it said ok when everything isn’t actually ok. Also, the final hash verification is good: when Duplicati says a backup was successfully restored, the result has never been broken.

drwtsn32 · January 20, 2022, 5:50pm

I suspect you are doing “direct restore from backup files”? (Which of course is the best type of test for a DR situation, not relying on the local database.) When you do this Duplicati does have to build a temporary database, and my guess is you are hit by a bug where bad dindex files cause dblocks to have to be read in order to create that temp database. Same issue when you aren’t doing a restore test but instead just recreating the local database.

There is a fix for this (at least in my experience) - a way to regenerate those “bad” dindex files. But it requires a functioning database first. If you have the time and inclination it may be an interesting test. After the dindex files are regenerated, I am betting a direct restore from backup files will work much more quickly (at least the temp database creation phase).

Sami_Lehtinen · January 24, 2022, 7:48am

I guess this is the wrong thread for this discussion, but I’m just saying that these are age-old problems with Duplicati which are still unresolved.

Slow recovery, code: 2

Remote file referenced as duplicati-bf9fb2b38282d40bab8b6c31ffa1a685b.dblock.zip.aes by duplicati-i8c60e298171e421094efc4f16460ebcd.dindex.zip.aes, but not found in list, registering a missing remote file
Found 1 missing volumes; attempting to replace blocks from existing volumes

Totally fubared, code: 100

ErrorID: DatabaseIsBrokenConsiderPurge
Recreated database has missing blocks and 8 broken filelists. Consider using “list-broken-files” and “purge-broken-files” to purge broken data from the remote store and the database.
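
Log snippets like these can be picked out of a restore log automatically. The sketch below only matches the two messages quoted above; the regular expression is derived from that one sample line, so other Duplicati versions may word it differently, and the function name is made up.

```python
import re
from typing import Dict

# Pattern built from the single sample line quoted above (an assumption about its stability).
MISSING_REMOTE = re.compile(
    r"Remote file referenced as (?P<dblock>\S+) by (?P<dindex>\S+), but not found in list"
)


def scan_log(text: str) -> Dict[str, object]:
    """Extract missing dblock references and detect the broken-database error."""
    missing = [(m["dblock"], m["dindex"]) for m in MISSING_REMOTE.finditer(text)]
    return {
        "missing_volumes": missing,  # list of (dblock name, referencing dindex name)
        "database_broken": "DatabaseIsBrokenConsiderPurge" in text,
    }
```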

I’m sure there are better threads for this topic… So let’s not continue here…

Anyway, when backups are systemically and repeatedly corrupted, it’s not a good sign for any program. Especially not a backup program.

Edit: Linked to this thread: Backup valid, but still unrestorable? Issue persists, after three years.

I’ll confirm one sample case, and update that thread accordingly.

GPaul · January 27, 2022, 5:42pm

Hi to all!

A lot of time has passed since this thread was started.
I would like to ask you for a brief opinion:

Is Duplicati stable and trustworthy for you after such a long time?
Are there still any mysterious problems, the most terrifying being “Duplicati claims that everything is fine, and it suddenly turns out that the backup is damaged”?
Has anyone switched from Duplicati to another solution?

I know that it still has Beta status, but some posts sound as if Duplicati were in the pre-pre-alpha phase.

Personally, I need to use it on several Ubuntu machines, and I wonder whether it works better or worse in the Mono environment than on Windows.

The entire Duplicati project seems to have a great idea, a brilliant design basis (whitepaper), fantastic features … but not enough programming power (I am not talking about the competence of the creators, only about the number of programmers).
I’m sorry, but I have the impression and I hope I’m wrong.

Your opinions are important to me because I need a production solution. Currently, I am looking for a solution for Ubuntu which is Open Source and has a GUI + deduplication + incremental copies + encryption.
I am choosing between Duplicati, Vorta (GUI for Borg), and kopia.io.

Cheers!

drwtsn32 · January 28, 2022, 12:49am

Hello and welcome to the forum!

Many people use it and find it reliable, but some people do have issues. (The posts on this forum are of course by people looking for support with issues. You don’t really see people post very often that DON’T have issues.)

There are also pending known issues that still need fixing of course. Whether or not those affect you depends on your use case. For me the pending issues aren’t a problem.

You’re not wrong, it seems like we don’t have many people actively working on development. Volunteers are always welcome.

There are a lot of options available. I suggest you try several out and compare. That’s what I did when I stopped using my previous backup solution (CrashPlan).

Good luck on your journey!

ts678 · January 28, 2022, 1:54am

Look at the dates on the posts in this topic. Personally, I had problems before 2.0.5.1_beta_2020-01-18, but debug/fix efforts prevailed on many of them. There are some unfixed, maybe due to the developer shortage. Others are rare, and poorly understood because they’re rare. Reproducible test cases can help immensely.

Good practices for well-maintained backups has things you can do to attempt to keep things happy and fast.

For a sense of scale, https://usage-reporter.duplicati.com/ shows 4 million backups/month. Not all of those users post here.

Keep multiple backups of critical data, maybe by different programs. Reliability varies but none are perfect.
If you’d consider a licensed closed source GUI on top of an open source engine, some folks like Duplicacy. Comparisons are hard due to different user base sizes and uses, so personal evaluations become helpful.