Big Comparison - Borg vs Restic vs Arq 5 vs Duplicacy vs Duplicati

For me, the obstacle has been that Duplicati does not compress files that are already compressed. Now that I've checked, restic actually does provide an option to avoid that as well (by listing file extensions). So I guess I will start preparing a switchover, then.
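(For anyone else weighing the same move: on the restic side the knob is the global --compression flag. This is just a sketch assuming restic 0.14 or later, with a placeholder repository path.)

```
# restic 0.14+ exposes a global --compression flag: auto (default), max, or off.
# "auto" tries to avoid spending effort on data that doesn't compress well.
restic -r /srv/backups/repo backup ~/Documents --compression auto
```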

You can specify file extensions for Duplicati as well:
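For example, something like this (a sketch; I'm assuming the relevant option is still --compression-extension-file, and the repository URL and file paths are placeholders):

```
# Duplicati ships a default_compressed_extensions.txt; files matching the listed
# extensions are stored without being recompressed. You can point it at your own list:
# (add --passphrase=... or --no-encryption=true as appropriate)
duplicati-cli backup "ftp://backup.example.com/duplicati" /home/user/Data \
  --compression-extension-file=/home/user/my_compressed_extensions.txt
```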

But in general, re-compressing compressed files is just burning CPU cycles. Of course there can be some special cases where it's worth it.

Sorry, I wasn't clear. When I used the word "obstacle", I didn't mean an obstacle to using Duplicati; I meant that my obstacle to moving away from Duplicati to restic was that I thought restic compressed everything. Now that I realize it doesn't always do that, I can make the move I have been considering for a long time.

When would it be helpful to recompress already compressed files?

In situations where the existing compression is poor. For example, small files in an archive without "solid" compression; this is a standard problem with ZIP files. Or when the ZIP was created in store-only mode, which basically means it isn't compressed at all, even though it is a ZIP archive. You can't know those things from the file extension alone.
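You can see this by listing an archive's entries; for instance with Info-ZIP's unzip (a sketch, the archive name is a placeholder), the Method column shows "Stored" for entries that aren't compressed at all:

```
# List entries with their compression method ("Stored" = no compression, "Defl:N" = deflate)
unzip -v photos.zip
```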

The study of repository damage types was impressive work, though with important limitations noted. Ultimately, a lot probably depends on the specific use case, so testing yours would be most interesting.

Some of Duplicati's known issues are in compact, whereas Restic's study began with prune corruptions. Possibly my theory about Duplicati holds: the less common operations are harder to investigate.

As always, easily reproducible test cases help greatly. If they're not very reproducible, then lots of logging is the next best thing. Maybe that will let someone get a handle on things, but we still need to find volunteers to change the code.


Here I am in 2024, having found your post from back in pandemic times :smiley:
Thanks for the post; I'm literally trying to compare various backup software that can do deduplication and encryption at the same time.

@Kelly_Trinh, can I please ask which one you settled on in the end?
I'm tossing up between duplicati, duplicacy and borg.

I rather like that borg allows you to 'mount' the archive directly on my Mac (without needing to restore all the files first), as mentioned in "How to back up data privately and securely to the cloud using Borg and Vorta" by Sun Knudsen, which covers using borg + Vorta on macOS.
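Something like this, as far as I understand it (a sketch; it assumes a FUSE layer such as macFUSE is installed, and the repo path and archive name are placeholders):

```
# Mount an archive read-only via FUSE, browse or copy out individual files, then unmount
borg mount /path/to/repo::macbook-2024-01-31 ~/borg-mnt
ls ~/borg-mnt
borg umount ~/borg-mnt
```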

But somehow, I'm having difficulty setting up a self-hosted borg server (instead of using BorgBase) right now…sigh
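For context, the setup I'm attempting looks roughly like this, going by my reading of the borg docs (a sketch; the host, user, paths and key are placeholders):

```
# On the server: restrict a dedicated SSH key to borg serve
# (a single line in /home/borg/.ssh/authorized_keys):
#   command="borg serve --restrict-to-path /srv/borg",restrict ssh-ed25519 AAAA... client-key

# On the client: initialise the repo over SSH and run a first backup
borg init --encryption=repokey ssh://borg@backup.example.com/srv/borg/mac-repo
borg create ssh://borg@backup.example.com/srv/borg/mac-repo::{hostname}-{now} ~/Documents
```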

Thanks

Welcome back @televisi

I currently run arq, duplicati, duplicacy and restic.

duplicati is more for legacy reasons (it is kinda cool to see 700+ daily revisions!). It has problems from time to time, but repairs seem to work. I had occasion to do a restore recently (a hard disk died) and duplicati was really, really slow.

duplicacy I keep because the backups are super quick. Restores are also quite intuitive. I have some problems with the backend, though.

restic is my main go-to. I made a lot of custom scripts for daily backups and other regular routine tasks. Repairs work reasonably well. Restores also have the mounting benefit and are quite quick.
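The daily script is roughly this shape (a simplified sketch, not the actual script; the repository, password file and source paths are placeholders):

```
#!/usr/bin/env bash
set -euo pipefail

export RESTIC_REPOSITORY="sftp:backup@nas.local:/srv/restic-repo"  # placeholder
export RESTIC_PASSWORD_FILE="$HOME/.config/restic/password"        # placeholder

# Daily backup of the working data set
restic backup "$HOME/Documents" "$HOME/Pictures" --exclude-caches

# Thin out old snapshots and drop the data they no longer reference
restic forget --keep-daily 7 --keep-weekly 4 --keep-monthly 6 --prune
```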

Borg I dropped because it seemed to have (fairly regular) issues with unrecoverable repo corruption. After trying a few different backends I couldn't solve it, so I used restic instead.


Thanks for your response @Kelly_Trinh, appreciate it!

Out of all of these backups, how do you ensure the integrity of the sync between your current files and the backup?

Let's say your files are stored on a local drive/server/RAID, and for whatever reason some of the files get deleted or corrupted. Obviously, when you run the backup, it'll sync the deleted/corrupted files to the backup too.

E.g.

  • Backup 1: 5 files backed up (File A, B, C, D, E)
  • Backup 2: additional 2 files added (File F, G)
  • Backup 3 (deduplication): File A is updated => thus, only File A is updated on the backup
  • Backup 4: RAID failure, and File B is corrupted => the corrupted file is synced to the backup

Let's say we run a backup prune and delete backups 1-3; does that mean the only copy of File B we have left is the corrupted one?

Is that the reason you have 700+ backup versions, just in case?

Thanks

For restic, I have also set up a periodic (monthly) check of the repo, so any corruption should be caught through that. Also, my retention policy keeps end-of-month snapshots for up to 6 months, so hopefully that is enough to avoid the issues you mention.
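The monthly job is essentially just the check (a sketch; the 10% sample is an arbitrary trade-off between thoroughness and runtime), and the retention itself is the --keep-monthly 6 in the forget line of the script above:

```
# Verify the repository structure and re-read a random sample of the pack data
restic check --read-data-subset=10%
```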

For the duplicati 700+ backups, it is just a pride thing to have an ongoing backup running for so long. Because the duplicati restore is so slow, it is a last resort for me among all my backup strategies.

Can you give more info about that, such as an absolute time versus your reference, for a comparable amount of data? Could it be related to the not exactly typical number of versions you are keeping?

I had a hard disk die on me last October.

My backup is about 650 GB, comprising mostly photos/videos and a small portion of office documents.

I was mostly getting around 6-10 MB/s restores from the others, but for duplicati it was in the range of 300-500 KB/s. Because it was so slow and hard to coordinate all the different restores, I took the approach of only using duplicati for files that all the others failed on.

It was quite a long process to restore everything, but basically I got a complete recovery from the other sources and never did use duplicati.

I don't know why it is so slow. I used to use duplicati for 'spot' recovery of files in the past, when I had about 500 backups, and it was about the same speed (there isn't a timer, so I'm just going on gut feeling, but a 10-15 MB file would still take tens of seconds to restore completely).

Thanks, I will try to run a test next weekend of a restore over a fast network to check what kind of speed I can get.

Test restore done: it took 25'34'' for 36.54 GB, so about 24 MB/s.

The no-local-blocks option was set (not that it mattered, since the restore was done on a different computer than the one used for the backup).

Note that I am not including the database recreation time (it took about 20 minutes for a backup of 1 TB; the backup has only one version and is in excellent shape, with no damage), so it's not a worst case, obviously.
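For anyone wanting to try something similar, the restore side might look roughly like this (a sketch; the URL, paths and passphrase handling are placeholders, and on Windows the binary is Duplicati.CommandLine.exe rather than duplicati-cli):

```
# Recreate the local database from the remote volumes, then restore everything
duplicati-cli repair "ftp://backup.example.com/duplicati" --passphrase="..."
duplicati-cli restore "ftp://backup.example.com/duplicati" "*" \
  --restore-path=/mnt/restore --no-local-blocks=true --passphrase="..."
```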

What backend do you use?

This test was done using FTP (plain, without encryption). As I said, I did not have ample time for this testing, so I selected all the fastest options (S3 is a bit slower than FTP, SFTP more so). For full disclosure, I also used the environment variables to get faster database handling (but no RAM disk, since it was done on a client Windows OS that has no built-in support for that).

Currently I think Restic is faster than Duplicati; could that be?

I was curious to know whether you've tried out this restic web UI: Backrest

My testing was done with command-line restic under Ubuntu 18.04 running on the Windows Subsystem for Linux.