I hope they solve these small recovery problems. I like the program a lot, and it’s a pity that development has stopped.
Just so nobody is confused, development hasn’t stopped, but the primary dev is very busy and as with many open source development projects the availability of other contributors varies greatly.
I agree there are items that need to be resolved and appreciate your, and everybody’s, patience as they are addressed.
perfect, good job
Today I decided to do a DR test - I wanted to simulate a total loss of my data center.
I got myself a brand new Azure VM, installed Duplicati and started following my DR plan. Its first step was to recover a few folders (we are talking tens of files and megabytes in size) that would facilitate the rest of the recovery process. It has been over 12 hours now, and I am still nowhere with that first restore due to UI bugs. I am giving up for the day and going to bed.
I think the next step in my DR planning will be to start looking for another solution…
Not sure what happened to Duplicati over the last year or so, but its overall quality has gone down significantly.
What steps are you following in your DR plan? And where are you getting stuck? Also what size is your backup set?
Maybe I can offer some pointers. I have done a DR test myself and it worked well for me but I may have taken extra steps.
The simple answer is still absolutely not.
It is still extremely, dangerously broken. This is what totally ruins Duplicati: a backup restore can fail even if all tests pass perfectly. This is the ultimate trap. Users and administrators think their backups are OK, and only find out otherwise when they need to restore data and, bang, it won’t work.
It’s like having insurance that never covers anything.
But this was already discussed here:
Just a reminder that the situation is the same, and nothing has changed since.
I have been using Duplicati for several months and have observed that it often breaks out of nowhere.
I have daily backups, and the monitoring tool duplicati-monitoring.com tells me that every 2 or 3 days a backup has a warning; the day after it is okay, and then the cycle starts over.
I often found jobs in an unusable state: could not add files nor restore them. Others had dlist files missing, database is broken and basically the backup is lost.
Sometimes backup duration is 10x or 20x what you would expect, without any obvious reason (CPU, memory and bandwidth are OK).
It becomes VERY unreliable when backing up more than 1TB per job (at least in my experience) and takes ages to start such a job because the database grows quickly.
I still use it because it’s free and I could not find a better alternative. Sometimes everything is great, sometimes you want to throw it down the drain.
I believe some of the unreliability can be attributed to the back end being used. Some seem to be inherently more reliable (S3, B2, etc) than others (OneDrive) for this type of application.
Setting that aside, there are still some bugs to be worked out. I personally have experienced the “unexpected difference in fileset” issue a couple times as well as “found inconsistency while validating” recently. I think they may be caused by bugs in the compaction or fileset pruning processes.
I also wish the database recreation process was more reliable - on my systems it still seems to need at least some dblock files, where I believe it should only need to read the dlist and dindex files. The fact that it has to get dblock files seems to be an indicator of some underlying bug.
Perhaps that is due to an auto compaction event which also may trigger a database vacuum operation, or something similar.
Yes, the back end I am using is not super reliable. Nevertheless, Duplicati should also manage such situations and actually verify that files have been uploaded to the back end, or use smarter retry systems.
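To make the "verify the upload and retry smarter" idea concrete, here is a minimal sketch. It is not Duplicati's code: `FlakyBackend`, `put`, `get` and `upload_verified` are all hypothetical names, and the back end is simulated in memory so that it silently drops the first few uploads.

```python
import hashlib
import time


class FlakyBackend:
    """Hypothetical in-memory back end that silently drops the first N uploads."""

    def __init__(self, failures_before_success=2):
        self.files = {}
        self.failures_left = failures_before_success

    def put(self, name, data):
        if self.failures_left > 0:
            self.failures_left -= 1
            return  # reports no error, but nothing was stored
        self.files[name] = data

    def get(self, name):
        return self.files.get(name)


def upload_verified(backend, name, data, max_attempts=5, base_delay=0.01):
    """Upload, read the file back, compare hashes; retry with exponential backoff.

    Returns the number of attempts that were needed.
    """
    expected = hashlib.sha256(data).hexdigest()
    for attempt in range(max_attempts):
        backend.put(name, data)
        stored = backend.get(name)
        if stored is not None and hashlib.sha256(stored).hexdigest() == expected:
            return attempt + 1
        # Wait longer after each failed attempt: base, 2x base, 4x base, ...
        time.sleep(base_delay * (2 ** attempt))
    raise IOError(f"could not verify upload of {name} after {max_attempts} attempts")
```

Reading the whole file back is the most expensive form of verification. A real tool would more likely compare a remote listing (name and size) or a hash reported by the storage service, if the service offers one.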
Has anyone used Duplicacy? I just started using it on one of my Linux servers to back up ~3TB of data. I really like its lock-free deduplication feature. Backup & restore work fine with some basic testing, but I’m curious about the experience of people who have used both Duplicati & Duplicacy longer and in more complex situations.
Yes, I have used it. The lock-free dedupe is probably its killer feature. Awesome design! Personally I don’t like how each backup set only protects a single source folder - it doesn’t work well for my workflow. The workaround is to use symlinks, but that seems like a kludge to me.
Why is that a killer feature? It suits only very specific operating environments. Every decision involves trade-offs. I didn’t find anything amazing about the lock-free approach here, except that it works well when you’ve got a set of systems which share mostly the same data and the same encryption keys. In most of the cases I’ve seen, that is rarely true, because shared data usually doesn’t need to be backed up, and private data which isn’t shared naturally can’t be backed up with shared encryption keys.
Yet it would be naturally possible to use content specific encryption keys, as described. But it would of course require managing the content lists in a way where the information about non-shared data can’t leak. I guess it’s possible to transfer some of the keys from system to system to enable this kind of access. Then it would require the master list of data to be encrypted, so the chunks can use shared content specific keys. -> Which would allow you to only restore data based on the master-list, even if you’ve got access to all chunks. Yet then you wouldn’t know which blocks are referenced and which ones aren’t, because you can’t have access to the master list(s).
Another drawback also seems obvious. When every chunk is in its own file, it might lead to situations where there are absolutely staggering numbers of files. Once again, depending on the situation, protocols and platforms, this might or might not be a problem (overhead, malfunctions).
Every application is designed for some use case, and there are very different use cases. Also renaming files or moving to another directory, is something which some of the cloud storage services do not support. Yet, adding 0 bytes extra file indicating fossilization of chunks also works.
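The zero-byte marker idea can be sketched as follows. This is only an illustration on a local directory, not how Duplicacy actually implements fossils; the `.fossil` suffix and all function names are made up for the example.

```python
import tempfile
from pathlib import Path

MARKER_SUFFIX = ".fossil"  # hypothetical naming convention


def mark_fossil(store: Path, chunk_name: str) -> None:
    """Flag a chunk as a fossil by creating an empty marker file next to it.

    Only a new-file upload is needed, so this also works on storage
    services that cannot rename or move existing files.
    """
    (store / (chunk_name + MARKER_SUFFIX)).touch()


def is_fossil(store: Path, chunk_name: str) -> bool:
    return (store / (chunk_name + MARKER_SUFFIX)).exists()


def live_chunks(store: Path) -> list[str]:
    """Names of chunks that exist and carry no fossil marker."""
    names = {p.name for p in store.iterdir()}
    return sorted(
        n for n in names
        if not n.endswith(MARKER_SUFFIX) and n + MARKER_SUFFIX not in names
    )


def demo() -> list[str]:
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp)
        for name in ("chunk-a", "chunk-b"):
            (store / name).write_bytes(b"payload")
        mark_fossil(store, "chunk-a")
        return live_chunks(store)
```

The trade-off is one extra object per fossilized chunk, which adds to the file-count concern above.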
The “killer feature” term might have originally been coined by the Duplicacy developer in their forum:
pointing back to the Duplicati forum:
Both articles are a little dated now. Duplicacy has now gone over to web-based UI like Duplicati uses, however it’s licensed and extra-cost. The good/bad thing about that is it might be allowing the original Duplicacy developer to remain actively involved, whereas Duplicati might be in some transition mode.
Duplicati probably still wins on feature quantity, but has trouble with scaling and stability (i.e. it’s beta). Possibly its web UI is still better than Duplicacy’s new paid one (which I have not tested). Trade-offs… There have been Duplicati users leaving for Duplicacy. You can search this forum and theirs for notes.
By “killer feature” I mean something unique to Duplicacy - their stand-out differentiator.
(Enterprise backups have global dedupe as well but Duplicacy and Duplicati and others are not targeting that market.)
I agree there are downsides to lock free dedupe… Didn’t mean to imply there are not.
I just ran a massive backup restore batch job to verify backup integrity. There were some errors, but not a single complete restore failure. That could be seen as huge progress compared to earlier tests, which almost always included some totally non-restorable backups. If anyone is really interested, I can give a bit more detail in private. But this is all I can say publicly.
Repair fails; this seems to be a years-old issue.
Quick summary for the periodic test:
~83% backups good, worked like it’s supposed to and quickly.
~11% backups broken, but still restorable, very slow recovery step involved
~6% backups damaged beyond repair, unable to restore
This is a nightmarish result. Something is still inherently, extremely broken and the software is very dangerous to use. I guess it would be a good idea to change the software being used.
Can you elaborate? Is this a database recreation step that is taking a long time?
While the software certainly has some room for improvement, your experience doesn’t really jibe with mine. I have very few issues, if any at all, on my 18 backup sets.
I know you had some corruption on the back end, and Duplicati can’t really deal with that as it doesn’t include parity in the back end files. You could mitigate that risk by using more durable storage.
Sure I can. When I run restore tests I log everything. When a restore returns code 2, it’s successful, but during the restore it has to read all dblock files from the directory, trying to recover missing dblocks.
But ending up in this situation means that something is already seriously wrong; we were just lucky that the recovery was successful. It could have been worse if those blocks had not been available from other files. As far as I understand, that’s the situation.
I can edit this message tomorrow and add the exact message and related key parts of the log of one of the failures, to be 100% clear. I don’t have it right now at hand.
And the totally failing restores, those end up with code 100. I’ll drop a few log snips here as well.
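For anyone who wants to tally a batch of restore tests the same way, here is a rough sketch. The meanings of codes 2 and 100 are taken from the description above; treating 0 as a clean restore is an assumption, and the function names are made up. The restore command itself is left to the caller rather than guessed at.

```python
import subprocess
import sys
from collections import Counter


def classify(returncode: int) -> str:
    """Map a restore exit code to an outcome label."""
    if returncode == 0:
        return "good"       # assumption: 0 means a clean restore
    if returncode == 2:
        return "recovered"  # per the post: restored, but only after a slow recovery step
    if returncode == 100:
        return "failed"     # per the post: restore failed completely
    return "unknown"


def run_and_classify(command: list[str]) -> str:
    """Run one restore command (supplied by the caller) and classify its exit code."""
    return classify(subprocess.run(command).returncode)


def summarize(outcomes: list[str]) -> dict[str, float]:
    """Percentage of each outcome, rounded to one decimal place."""
    counts = Counter(outcomes)
    total = len(outcomes)
    return {label: round(100 * n / total, 1) for label, n in counts.items()}
```

Keeping the full log of each run next to the exit code is still worthwhile, since the code alone does not say which files needed recovery.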
Btw, those initial results were only from “small” backup sets, because I’ve saved the large ones for later. With the large ones, I assume that the failure rate is even higher: just consider the probabilities, the execution time, and the amount of bytes transferred and stored. So I basically know that it’s going to be worse.
But I’ll know that in a week or maybe two tops. These backup sets are measured in terabytes instead of tens of gigabytes.
But just as a generic reminder: always, always test your backups regularly, with a full recovery. Otherwise, when you need your backups, it’s highly likely that there’s nothing to restore.
Another thing I’m going to do is install the latest version on all sources, because that could make a meaningful difference.
And if there’s something good about this, it’s that at least the full restore error reporting works. It would be worse if it said OK when everything isn’t actually OK. The final hash verification is also good. When Duplicati says a backup has been successfully restored, the result has never been broken.
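If you want an independent check on top of Duplicati’s own verification, something like the following sketch compares a restored tree against the original by SHA-256. It assumes you still have the source files available during the test; the function names are just for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash a file in 1 MiB blocks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()


def tree_hashes(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hash."""
    return {
        p.relative_to(root).as_posix(): file_sha256(p)
        for p in root.rglob("*")
        if p.is_file()
    }


def compare_trees(source: Path, restored: Path) -> dict[str, list[str]]:
    """Report files missing from, extra in, or different in the restored tree."""
    src, dst = tree_hashes(source), tree_hashes(restored)
    return {
        "missing": sorted(src.keys() - dst.keys()),
        "extra": sorted(dst.keys() - src.keys()),
        "changed": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
    }
```

An empty result in all three lists means the restored tree matches the source byte for byte, at least for regular files; metadata such as timestamps and permissions is not checked here.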
I suspect you are doing “direct restore from backup files”? (Which of course is the best type of test for a DR situation, not relying on the local database.) When you do this Duplicati does have to build a temporary database, and my guess is you are hit by a bug where bad dindex files cause dblocks to have to be read in order to create that temp database. Same issue when you aren’t doing a restore test but instead just recreating the local database.
There is a fix for this (at least in my experience) - a way to regenerate those “bad” dindex files. But it requires a functioning database first. If you have the time and inclination it may be an interesting test. After the dindex files are regenerated, I am betting a direct restore from backup files will work much more quickly (at least the temp database creation phase).