Does Duplicati just not work with more than a few dozen GB?

I’m finding Duplicati just plain doesn’t work well with datasets exceeding a few dozen gigabytes. OK, it technically works, but it takes literally days to create the initial backup for larger datasets.

By comparison, Macrium Reflect basically hits the limit of the transit medium (SATA, USB, Ethernet, whatever). It can easily hit gigabits per second with the right combination (e.g. NVMe to NVMe over USB-C).

Why is Duplicati so slow?

You can’t really compare block-level backup apps with something like Duplicati.
A fair comparison would be backing up the data with Macrium Reflect (potentially without compression) and then running dedupe on the target data (for example via the built-in Windows deduplication: Understanding Data Deduplication | Microsoft Learn). And do that after every backup.
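If anyone actually wants to run that comparison, enabling the built-in dedup on the backup target looks roughly like this (Windows Server only; the drive letter is just an example):

```powershell
# Data Deduplication is a Windows Server feature; install it once.
Install-WindowsFeature -Name FS-Data-Deduplication

# Enable dedup on the volume that holds the backup files (D: is just an example).
Enable-DedupVolume -Volume "D:" -UsageType Default

# Run an optimization job now instead of waiting for the background schedule.
Start-DedupJob -Volume "D:" -Type Optimization -Wait

# See how much space was actually reclaimed.
Get-DedupStatus -Volume "D:" | Format-List
```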

Duplicati will definitely be much faster:)

You’d think so, but Macrium File & Folder backup (not block level) is still substantially faster than Duplicati.

I will concede that Duplicati is not the fastest backup solution if your measure is raw transfer speed.

I use Duplicati to back up virtual machine images that are in the 500 GB range, and the backups only take about 1-2 hours to run every night. True, the initial backup takes a little longer, but definitely not days on local network storage. The block level deduplication allows me to keep a 30 day rolling set of backups in a much smaller space than most other backup products I am familiar with.

In one instance I am able to keep 30 nightly backups of a 676 GB set of VMs in about 950 GB.
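To put that in perspective: 30 full copies would be roughly 30 × 676 GB ≈ 20 TB of logical data, so storing it in about 950 GB works out to roughly a 21:1 effective ratio.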

For me where Duplicati really shines is for offsite backups. In my area it is uncommon to have an upload speed faster than 5-10 Mbps. All the raw transfer speed in the world won’t help you when it’s bottlenecked through a slow network link. In cases like that I suspect that Duplicati would be just as fast as Macrium, or perhaps even faster depending on how many duplicate blocks there were.
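For scale: at 5 Mbps, even 100 GB of changed data takes about 100 GB × 8 ÷ 5 Mbps ≈ 44 hours to upload, so shrinking what you send matters far more than how fast you can read it locally.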

I’m not familiar with Macrium products, but I suspect it’s not exactly an apples to apples comparison.

I have a similar complaint. I’m a long-time Macrium user and enjoyed the convenience of being able to browse my backups and restore en masse or restore specific files, without waiting minutes at a time between directory-level browses or taking many hours to restore from a local networked drive.

I’m very interested in using Duplicati as a Macrium replacement; the incremental backups and the space-saving features are the main draws for me. Unfortunately, I’m hitting a wall when browsing for restores due to some issue with “mono-sgen”. I’ll make a post in the forum and see if I can get some optimization tips.

If I’m barking up the wrong tree trying to use Duplicati as a Macrium replacement please tell me now, I can still see daylight from the bottom of this hole I’ve dug for myself!

I took a suggestion from someone in another thread and tried restic, and I’m much more impressed. It’s way faster than Duplicati, and the incrementals are tiny.

It’s CLI-only with no built-in GUI (I think there are web GUI add-ons available), but it’s completely self-contained with no dependencies, libraries, or installation required, and there are only about five basic commands you need to know to use it well.
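For anyone curious, those basic commands look roughly like this (repository and source paths are just examples; restic will prompt for the repository password):

```powershell
# Initialize a repository once.
restic init --repo D:\restic-repo

# Back up a folder; after the first run, only changed data is stored.
restic --repo D:\restic-repo backup C:\Users\me\Documents

# List snapshots, restore the latest one, and prune old snapshots.
restic --repo D:\restic-repo snapshots
restic --repo D:\restic-repo restore latest --target C:\Restore
restic --repo D:\restic-repo forget --keep-daily 7 --keep-weekly 4 --prune
```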

If you feel adventurous, I would suggest trying out the new canary builds. They are no longer based on Mono, and we increased the default block size to 1MiB, both of which contribute to faster execution.

Happy to hear you have found a solution. I was testing the performance of Duplicati and can see a steep drop-off after the local database reaches a certain size, which I hope to fix.

That is actually also the direction Duplicati is going. The latest versions have no external dependencies, and there is also a .zip package. You can see the canary builds here if you like.

And the Duplicati CLI is actually only 2 commands, backup and restore, if you prefer not having the UI.
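As a hedged sketch of what that looks like (the executable name depends on the build: Duplicati.CommandLine.exe in the older Windows packages, duplicati-cli in the newer ones; paths, passphrase, and the explicit block size are just examples):

```powershell
# Back up a folder to a local destination (the newer builds default to a 1MiB block size,
# so --blocksize is shown only for illustration).
Duplicati.CommandLine.exe backup "file://D:\DuplicatiBackups" "C:\Users\me\Documents" `
    --passphrase="example-passphrase" --blocksize=1MB

# Restore everything from the most recent version to a separate folder.
Duplicati.CommandLine.exe restore "file://D:\DuplicatiBackups" "*" `
    --passphrase="example-passphrase" --restore-path="C:\Restore"
```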

I’m skeptical that this is so. Got a good citation? My guess is it’s a file-filtered version of the usual image backup.

Image backups these days can do better than copying all sectors, whether used or free space. Generally they know about specific filesystems and know how to back up only the needed data, meaning on Windows they probably study the NTFS Master File Table and file cluster information.

I used to use Macrium Reflect Free for occasional image backups, for the full coverage that frequent Duplicati file backups didn’t give. Macrium to a USB drive ran fast, so much so that I think Macrium must be optimizing for physical hard drive head position to avoid seeking, which kills throughput.

Look at a drive benchmark, or benchmark your own. Here’s my hard drive on CrystalDiskMark:

[CrystalDiskMark screenshot: sequential vs. random read/write results]

So there’s an example of how much faster sequential access is, and trying to achieve that is worthwhile. However, normal programs are not filesystem-aware; they let the OS abstract all of that away.

I tend to agree for the image backup, and am wondering if file backup is just an image subset.

Windows Explorer shell integration appears very similar between the two backup types. If you know, did File and Folder .mrbak backups get the same speed as partition .mrimg images? That would be some evidence.

Reflect Free is gone, and I’m still looking for a good replacement for the same use. I never had a paid version, so I can’t speak to File and Folder, or differential, or incremental. Duplicati still does OK running my frequent smaller backups. I don’t like to do all-permissible-files backups on anything, because it’s an overnight job even on restic, and Duplicati is slower, though not enormously so.

I do like having an offsite backup of the smaller selected data, as it’s better disaster protection. Having multiple backups is a good idea too. I’m not sure how to get all I want in any single tool.

If your use case is on-site-only at Macrium speeds, you might be hard-pressed to find even an image backup program that can match that (based on reports from others who have tried).

I’m currently trying Paragon’s image backup, and it’s slower and not as nice, but file backup is slower still, inherently, because AFAIK file backup tools are all non-filesystem-aware and so suffer from random access.

Some of this is related to the SQLite cache size and can be configured around, but it’s a bit awkward.
CUSTOMSQLITEOPTIONS_DUPLICATI is the environment variable, and the forum has examples.
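For reference, a sketch of what that looks like on Windows (the cache value here is just an illustration, not a recommendation; see the forum threads for values people actually use):

```powershell
# Set before launching Duplicati; the value is passed to SQLite as PRAGMA settings.
# A negative cache_size is in KiB, so -200000 is roughly 200 MB of page cache.
$env:CUSTOMSQLITEOPTIONS_DUPLICATI = "cache_size=-200000"
```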

Off topic, but for imaging drives I can recommend TeraByte Image for Windows; it’s a really solid program, but not free.
