Backup the Backup - recursion

I found the earlier thread How to backup the backup?, but I wanted to poke at it from a different angle.

Most of the discussion in that thread covers doing a backup and then a sync, but what’s the argument for/against using Duplicati to back up Duplicati?

“Why would you do that, rather than just one Duplicati with multiple destinations?”

So, suppose I have computers Dog, Cat, Hamster and Horse. All four back up across the local network to Horse-USB. Dog, Cat, and Hamster have only one daily backup set, nice and simple.

Horse, as well as having the backup set “Horse to Horse-USB” also has “Horse-USB to Wasabi”.

It seems administratively simpler to me – especially if I add complications like multiple backup sets from the individual machines. (E.g. Dog has Dog-Critical, Dog-Important, and Dog-Nice-to-have, each with different cycles. They all go to Horse-USB, which gets backed up by Horse to Wasabi without any additional overhead. Otherwise I have to manage Dog-Critical to Horse, Dog-Critical to Wasabi, Dog-Important to Horse, Dog-Important to Wasabi, etc.)

Thanks.

Yes, I like the idea of backing up everything to one spot (your USB) and then going to Wasabi from there. BUT - I would not use Duplicati for that second hop to Wasabi. You will gain nothing (no added deduplication or compression) and you will lose on simplicity: if you lost everything at your house - computers and first target USB - you’d have to restore twice with Duplicati to get your actual data files.

Instead I would use something like rclone to sync the USB to Wasabi. rclone is (in my opinion) the best tool for synchronizing to/from/between cloud storage.
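As a rough sketch – assuming an rclone remote for Wasabi has already been created with rclone config and named wasabi:, and with the bucket name and local drive letter as placeholders – the whole second hop can be one command:

  # one-way mirror of the local USB backup folder up to the Wasabi bucket
  # keep --dry-run on the first run to preview what would be copied or deleted
  rclone sync X:\Backups wasabi:my-backup-bucket --progress --dry-run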

This will ensure you have an off-site copy while still keeping restores quick. If you lost everything at your house, you could restore with Duplicati by pointing it directly at Wasabi, or you could rclone everything from Wasabi down to a new local USB drive first and then run the Duplicati restore from there.
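For the two-step version, the same placeholder names apply; rclone copy pulls everything back down without deleting anything at the destination:

  # pull the whole backup tree from Wasabi onto a fresh local drive,
  # then point Duplicati’s restore at that local copy
  rclone copy wasabi:my-backup-bucket Y:\Backups --progress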

Remember to store each backup set from each PC in a unique folder on your USB drive. Then sync it using rclone to Wasabi with the same structure.

I do almost this exact thing, but I use a NAS at my house instead of a USB drive. My backups are stored in:

\\NAS\Backups\<pcname>-<backupsetname>

or you could do

\\NAS\Backups\<pcname>\<backupsetname>

or whatever else is logical to you. I use Synology Cloud Sync (similar to rclone) to keep the entire \\NAS\Backups area in sync with B2 storage.

Hope this helps…welcome to the forum!

Thanks for the validation.

I hadn’t thought of the possibility of a direct restore; I had just been assuming a two-step process if it came to that. The surviving pros for using Duplicati for the second hop would be:

  • having rewindable history of the local backup set (though I’m not sure I can articulate the real-world benefit of that)
  • only one set of processes to administer, monitor, etc.

The latter is… attractive to me, vs. also having to keep an eye on rsync/awscli/Cyberduck.

(
Yes – I figured out the folder structure thing pretty quickly: it’s all \\horse\USB10A\dog-important etc.

Yep – I also actually have a NAS-RAID that I have been using so far, but my NAS is soooo old and crufty, with a 2TB max drive size (6TB net total), that I’m leaning toward the 10TB USB as the local target (easy to grab and go, if it came to that).

Some amount of time this weekend will go to seeing if the GL.iNet AR-300M that I have somewhere can actually put the 10TB USB drive onto the network as a baby NAS, rather than sharing it from \\Horse.
)

You already have a backup history built into the first backup stage. rclone sync would not need to retain any backup versions itself; it would simply replicate the same backup versions that are stored in your primary Duplicati backup.

I suppose that is true. I’d still advise against it. Duplicati would be slower at syncing up to Wasabi because it would analyze all the data for deduplication purposes, a pointless endeavor in this case. Your sync to Wasabi would be much faster with rclone.

Figured out what was bugging me. Using backup rather than sync protects against less-than-total corruption of the intermediate stage. Suppose, for example, that a fat-finger error deletes a substantial chunk of the intermediate backup and it goes unnoticed for at least one sync cycle.

Using backup for the local-to-cloud hop (or, tbf, some other journaling scheme) would allow a rewind recovery of the local repo.

There may be some ways to mitigate that risk:

  • protect the files on the USB drive so that only the backup application can write to them
  • configure rclone to sync on a delay, so you’d find out about accidentally deleted USB files before the rclone sync runs
  • configure a lifecycle policy on the Wasabi bucket to retain deleted or changed items for a certain amount of time (an rclone-side approximation is sketched below)
  • etc.
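On that last point, rclone also has a --backup-dir flag that gives a similar safety net from the client side: anything a sync run would delete or overwrite on the remote is moved into a holding area instead of being removed. A minimal sketch, with the remote name, bucket names, and local path all as placeholders:

  # files this sync would delete or overwrite on Wasabi are moved into
  # a separate holding bucket instead of being removed for good
  rclone sync X:\Backups wasabi:my-backup-bucket --backup-dir wasabi:my-backup-holding --progress

The holding bucket would still need to be emptied occasionally, either by hand or with a Wasabi lifecycle rule.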

I’m just trying to discourage using Duplicati to back up Duplicati backups a second time :slight_smile:

I get that, and all your alternatives are reasonable, but I think we’re assigning costs differently. When I was a young 'un, bytes and clock cycles were much more expensive than developer time. Now, computrons are nearly free – and devops time and meat cycles are much more expensive.

lol meat cycles … haven’t heard that one before.

Test out your idea and see if you are comfortable with the restore workflow. There is no technical reason it would not work.