I have been using Duplicati for some years to back up critical files daily, including one 7 GB database, from my Win10 desktop computer to various clouds. Now I have set up a Win10 VM in Google Cloud Platform, and I am restoring the cloud backup to the VM at this moment. I want to do this restore daily going forward, thereby keeping the VM's file system in sync with the desktop's.
Is there any reason that won’t work?
Will Duplicati write only the changed sectors, or will it restore the whole of every file every day? The largest file, the 7 GB database, changes very little from day to day, but I can see the restore time is going to be lengthy.
Will it work? Not as well as a real sync tool; for example, an overwrite-restore won't propagate deletions the way sync tools do.
It might actually be able to do an incremental update on existing files, though. Test while watching
About → Show log → Live → Verbose to see whether it takes advantage of blocks that are already OK.
As a test, I made a backup that produced 4 dblock files, renamed the source file to prevent local block use,
did a Direct restore to a folder without the source file, and saw the live log show 4 dblock downloads.
Fine so far. I then renamed the source file back, changed its first byte, and backed up, getting one tiny dblock with the revised block.
Hiding the source file again, a second Direct restore downloaded only the new dblock, not all of them.
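The behavior above matches Duplicati's block-based model: files are split into fixed-size blocks that are deduplicated by hash, so a one-byte change dirties only one block. Here is a minimal sketch of the idea (the 100 KiB block size and SHA-256 hashing mirror Duplicati's defaults, but the code is only an illustration, not Duplicati's implementation):

```python
import hashlib

BLOCK_SIZE = 100 * 1024  # Duplicati's default --blocksize is 100 KiB

def block_hashes(data: bytes):
    """Split data into fixed-size blocks and hash each one."""
    return [hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
            for i in range(0, len(data), BLOCK_SIZE)]

old = bytes(3 * BLOCK_SIZE)   # a 300 KiB file of zeros
new = b"X" + old[1:]          # change only the first byte

changed = [i for i, (a, b) in enumerate(zip(block_hashes(old), block_hashes(new)))
           if a != b]
print(changed)  # only block 0 differs, so only one small dblock is needed
```

That is why the second backup produced one tiny dblock, and why the restore could skip downloading the others.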
It was easier for me to test than to read the code, but if someone wants to read the code, it might be:
Live log shows file processing (read it from bottom to top):
Jul 26, 2021 4:12 PM: 1 remote files are required to restore
Jul 26, 2021 4:12 PM: Target file is not patched any local data: C:\tmp\duplicati-188.8.131.52_beta_2021-06-17.zip
In the line above, the concept of patching with local data means looking in the source file for blocks, saving the extra work of downloading dblock files. no-local-blocks is an Advanced option to stop that:
--no-local-blocks = false
Duplicati will attempt to use data from source files to minimize the amount of downloaded data. Use this option to skip this optimization and only use remote data.
Setting the above option to true makes for a more reliable restore test, as the restore is then forced to use only remote data.
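The local-blocks optimization boils down to a hash lookup before each download. A rough sketch of the logic (the names `fetch_block`, `local_index`, and `download` are made up for illustration, not Duplicati's API):

```python
import hashlib

downloads = []  # record which block hashes had to come from remote storage

def download(block_hash):
    """Stand-in for fetching a dblock volume from the backend."""
    downloads.append(block_hash)
    return b"<remote data>"

def fetch_block(block_hash, local_index):
    """With --no-local-blocks=false, try source-file blocks first; else download."""
    if block_hash in local_index:
        return local_index[block_hash]   # block found locally, no download needed
    return download(block_hash)          # block missing locally, pull from remote

block = b"database page contents"
h = hashlib.sha256(block).hexdigest()

# Case 1: a source file still holds the block, so nothing is downloaded.
assert fetch_block(h, {h: block}) == block and downloads == []

# Case 2: with --no-local-blocks=true the local index is effectively empty.
fetch_block(h, {})
assert downloads == [h]
```

Setting the option to true corresponds to always taking the download path, which is what makes the restore test trustworthy.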
There's no equivalent option to stop it from using data already in the restore target, whether that data is partly or entirely correct.
In fact, if all the files are already fine, you'll get a confusing warning that no files were restored, because there was no need to.
Agreed. It's missing sync features, but if Duplicati really patches only the changed blocks of the 7 GB DB, it may be worth the trouble. If doing a Direct restore, a partial temporary database is built on each run.
If Duplicati and its DB are remote, the remote DB could be copied down, but that gets even more unnatural.
Thanks for the comments. I did the initial restore of the Duplicati cloud backup of the source machine, which is 15 GB, to the target VM. I started by repairing the database and then did the full restore. It took a few hours and one restart, but then everything worked fine on the VM.
I then did a test by making a small change to the subject DB on the target VM, which is 8 GB, and then restoring from the Duplicati cloud backup of the source machine. Again I started by repairing the database on the target VM. Then I did a restore of just the 8 GB subject DB, which would take hours if restored entirely, but it completed in 18 minutes. Most of the restore time was spent scanning and then verifying the files.
So it works as ts678 reported: only changed blocks are restored. For my purposes that achieves the goal. Of course, I am backing up a single tree of data files from the source machine, not taking an image that includes the OS. The file system is an archive, so deletions don't occur.
It hardly matters that Duplicati was not designed for syncing files, if it works. If Windows had a decent rsync I wouldn't bother with Duplicati at all. The Duplicati restore apparently can't be scheduled, but
I suppose you’ve looked at cwRsync. rsync.net has commentary here (and their own product…).
Yes. It does need to look over the whole file at both ends, although I think rsync is also read-heavy.
Very sophisticated software sometimes uses Changed Block Tracking (CBT) instead of scanning.
I talked about that here, and mentioned Veeam and UrBackup, but I don’t know how they restore…
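For contrast, CBT replaces the full scan with a dirty map maintained at write time. A toy sketch of the idea (the `CBTracker` class is invented for illustration; real CBT lives in a filesystem or hypervisor driver, not in application code):

```python
# Minimal sketch of Changed Block Tracking (CBT); purely illustrative.
class CBTracker:
    def __init__(self, nblocks):
        self.dirty = [False] * nblocks

    def write(self, block_no):
        self.dirty[block_no] = True       # mark each block as it is written

    def changed_blocks(self):
        return [i for i, d in enumerate(self.dirty) if d]

    def reset(self):                      # called after a successful backup
        self.dirty = [False] * len(self.dirty)

t = CBTracker(8)
t.write(2)
t.write(5)
print(t.changed_blocks())  # a backup reads only blocks 2 and 5, no full scan
```

The backup tool never has to read unchanged data, which is why CBT-based products can back up huge volumes quickly.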
On the backup side, you probably want at least a crash-consistent one, and you might want more.
If your database is VSS-aware, several backup programs (including Duplicati) can work with VSS.
Because you mentioned rsync, perhaps consistency is OK already (maybe you shut DB down?).
Wikipedia's "Comparison of file synchronization software" article shows some block-level syncs in its Commercial section.
Wikipedia doesn’t have that listed for free software, but Syncthing has some sort of delta support.
The manual says very little, but you can see people in the forum discussing its benefits and limits.
If folder and other issues can be solved, Dropbox and some OneDrive configurations do differential (delta) sync.
That ended mid-comment, but maybe you were going to note Task Scheduler can run CLI restore.
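That's the idea: Duplicati's GUI can't schedule a restore, but Windows Task Scheduler can run a script that invokes Duplicati.CommandLine.exe. A hedged sketch follows; the install path, storage URL, and target folder are placeholder assumptions you'd replace, and you should verify the restore options against your Duplicati version before relying on them:

```python
import subprocess

# Placeholder values (assumptions, not real configuration):
DUPLICATI = r"C:\Program Files\Duplicati 2\Duplicati.CommandLine.exe"
STORAGE_URL = "gcs://my-bucket/backup"   # hypothetical Google Cloud Storage URL

def build_restore_command(target_dir):
    """Assemble the daily restore command for Task Scheduler to run."""
    return [
        DUPLICATI, "restore", STORAGE_URL, "*",
        "--restore-path=" + target_dir,   # where to place the restored files
        "--overwrite=true",               # replace files already in the target
    ]

cmd = build_restore_command(r"C:\data")
print(cmd[1])  # "restore"
# subprocess.run(cmd, check=True)  # uncomment on the VM where Duplicati is installed
```

Pointing a daily Task Scheduler task at a script like this would give the scheduled restore the GUI doesn't offer.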