Can I sync large image files (by name)?

Broadly speaking I’m trying to figure out how to do large offsite backups for safekeeping.

  • Say I have a VM image, or a truecrypt volume, or even a large backup file made by some 3rd party local backup software (for instance a full sector by sector drive backup).

  • These can be large, hundreds of gigabytes to maybe 1TB even. And they need to be synced as containers (not the files they contain).

  • But in some cases the files themselves might change (e.g. replacement with a newer version)

To solve this can I configure Duplicati to track these files by name and obviously only upload the differences to the online server? Is it even feasible to track huge files? Won’t this create huge overhead or huge metadata?

How else do people remotely sync their large backups?

While you can use Duplicati to do this there are a few things you should consider before doing so, including:

  • Duplicati is a backup tool, not a sync tool
  • VM disks are likely to have LOTS of churn (consider all the temp files your OS creates) so you’ll likely often have a lot of “changed blocks”
  • encrypted files / volumes can easily have block shifts, causing larger portions to appear to Duplicati as changes - so again likely lots of “changes” to be uploaded
  • any use of the VM or encrypted volume is likely to trigger metadata updates (like “last changed date”) so the ENTIRE file will need to be rehashed to find the changed blocks
  • obviously restoring large files will require large downloads, so keep that in mind when considering how quickly something can be restored
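
To make the “changed blocks” and “block shift” points concrete, here is a rough Python sketch of fixed-size block hashing. It is only an illustration and not Duplicati’s actual code: the function names and the tiny 4-byte block size are made up for the example, and real tools use far larger blocks. It shows that an in-place overwrite touches a single block, while inserting one byte shifts everything after it so that every later block looks new.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, tiny on purpose; real backup tools use much larger blocks


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Split data into fixed-size blocks and return the SHA-256 hex digest of each."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def new_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Return the indices of blocks in `new` whose hash was not seen anywhere in `old`.

    Note that the whole of `new` has to be read and hashed to answer this,
    even if only a few bytes actually changed.
    """
    known = set(block_hashes(old, block_size))
    return [i for i, h in enumerate(block_hashes(new, block_size)) if h not in known]


original = b"AAAABBBBCCCCDDDD"
in_place = b"AAAAXBBBCCCCDDDD"   # one byte overwritten, nothing moves
shifted = b"AAAAXBBBBCCCCDDDD"   # one byte inserted, everything after it moves

print(new_blocks(original, in_place))  # [1]
print(new_blocks(original, shifted))   # [1, 2, 3, 4]
```

In the second case only one byte was added, yet four blocks would have to be uploaded, which is the effect to watch for with containers that move data around internally.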

If you give this a try and find Duplicati is slow on the local database side, consider looking at Dupliciti which sacrifices some privacy for not having to keep a local database of what’s been backed up.

If you really only need a sync tool, check out rsync or Syncthing as they both also work at the block level.

Whichever route you go, good luck!