Backup of backups

Nice thinking, you’ve got lots of good questions there.

  • Yes, it would mean restoring the backup and then running a backup of the restored files to another location.
  • Duplicati 2 doesn’t do “incremental” backups anymore; every backup is effectively a differential task.
  • If you keep the temporary directory content in place between runs, you can use the local blocks option to make the restore faster, afaik. I haven’t tested that, but that’s how it’s supposed to work. Also, restoring a backup needs the encryption keys, and it’ll do a database rebuild. Depending on the backup size, that task can be extremely slow.
  • But everything is a trade-off, so it’s up to your own preferences: what kinds of situations you’re protecting the data against, how your system’s CPU, disk I/O, storage capacity and network I/O are balanced, and so on.
  • There’s no option to back up just the latest backup version. But if you want that, just take a backup of the backup files and you’re effectively doing exactly that. It’s good to remember that in normal conditions the data of the “latest version” is actually scattered across all the files, unless you’ve got some very specific use case where that doesn’t happen. This is what makes the backing up very efficient (in terms of disk space and network bandwidth), but it can make the restore a really slow process. Personally I consider this a good trade-off in most use cases.
  • Well thought, again. Sure, this is something you’ll have to consider case by case. One thing you can do between different security zones: the “backup server” which executes the backups doesn’t need to be the source of the data being backed up, nor the destination the data is backed up to. This allows locking down the server holding the encryption keys to a much higher security level than the file server, which is the source of the files and which many users have credentials / network access to.
  • No, really, it’s not hard to deal with the files created by Duplicati. This is exactly why Duplicati is so awesome: it’s very efficient to sync or back up (i.e. with or without version history) the files created by Duplicati to secondary locations, as sketched right after this list. This is exactly the benefit we get from the de-duplication and the static container files Duplicati uses. The drawback, of course, is that restore and compaction are slower, and the de-duplication process itself is resource-hungry (CPU, RAM, I/O on the server creating the backup).
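
To make the “sync the backup files” idea concrete, here’s a minimal Python sketch of mirroring a Duplicati destination folder to a secondary location. The paths are made-up examples, and in practice you’d typically use rsync, rclone or a second Duplicati job instead; the point is just that because the dblock/dindex/dlist container files are static once written, each sync run only has to transfer the files added since the last run.

```python
# Minimal sketch of a "backup of the backup": mirror the Duplicati destination
# folder to a secondary location. Paths are hypothetical examples.
import shutil
from pathlib import Path

SRC = Path("/srv/duplicati-destination")   # where Duplicati writes dblock/dindex/dlist files
DST = Path("/mnt/offsite/duplicati-copy")  # secondary copy of the backup files

def mirror(src: Path, dst: Path) -> None:
    dst.mkdir(parents=True, exist_ok=True)
    src_files = {p.name: p for p in src.iterdir() if p.is_file()}
    dst_names = {p.name for p in dst.iterdir() if p.is_file()}

    # Copy only files that don't exist at the destination yet. The container
    # files never change after they are written, so this is cheap.
    for name, path in src_files.items():
        if name not in dst_names:
            shutil.copy2(path, dst / name)

    # Optionally delete files that Duplicati has removed (e.g. after compaction).
    # Skip this step if you want the secondary copy to keep "version history".
    for name in dst_names - src_files.keys():
        (dst / name).unlink()

if __name__ == "__main__":
    mirror(SRC, DST)
```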

Sure, yes, in this specific case. But I see the need to restore the backup of the backups as something that should be an extremely rare occurrence; it’s only necessary when something else has gone seriously wrong. Again, it’s about balancing trade-offs. I’ve done it once, after one of the RAID drives suffered a literal head crash (I took it apart to see it) and the operating system corrupted the file system on the remaining disk. Then I restored the backups from the backup of the backups.

Yet this caused a secondary problem. The Duplicati repair task wasn’t up to the situation where the remote data storage got rolled back by “one version”. A well-working repair task should be able to handle that. Sure, it means one version is lost, and if anything was being compacted, that data might have been relocated into new files in the storage directory, and naturally some old files might be missing as well. But in logical terms it should only mean losing one version, plus some probably time- and bandwidth-consuming reconstruction work. It shouldn’t mean the whole backup set gets broken, as happened in this case. Sure, the old backups were restorable, but… you couldn’t get the system to continue backing up to those sets anymore due to the mismatch between the remote storage and the local database. Hopefully with newer versions the repair & database rebuild can handle that kind of situation.

Technically it doesn’t matter “which server” does the compaction process. It’s all about running the compaction without having the database, with the storage location getting a few new files and having some files deleted in the process. After that, the real question is getting the database on server A synced after that compaction, so that nothing gets messed up. But there’s a different thread about all this stuff here.
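
To illustrate the division of labour, here’s a rough Python sketch, assuming the standard Duplicati 2 command-line tool (duplicati-cli) with its compact and repair commands. The storage URL, passphrase handling and database paths are placeholders, and whether a plain repair on server A is enough to absorb a compaction that was run elsewhere is exactly the open question; the sketch only shows the mechanics of the two steps, which would normally run on different hosts.

```python
# Sketch only: the two steps below would normally run on two different hosts.
# Assumes the standard Duplicati 2 CLI ("duplicati-cli"); the URL, passphrase
# and database paths are placeholders.
import subprocess

STORAGE_URL = "s3://example-bucket/duplicati"  # hypothetical backend URL
PASSPHRASE = "example-passphrase"              # read from a secrets store in practice

def run(args: list[str]) -> None:
    print("+", " ".join(args))
    subprocess.run(args, check=True)

def compact_on_storage_host() -> None:
    # Server B, close to the storage: run the compaction. Duplicati uses a
    # local database on this host for the operation.
    run([
        "duplicati-cli", "compact", STORAGE_URL,
        f"--passphrase={PASSPHRASE}",
        "--dbpath=/var/lib/duplicati/serverB.sqlite",
    ])

def resync_backup_server_database() -> None:
    # Server A, the machine that normally runs the backups: try to bring its
    # database back in line with the remote files changed by the compaction.
    run([
        "duplicati-cli", "repair", STORAGE_URL,
        f"--passphrase={PASSPHRASE}",
        "--dbpath=/var/lib/duplicati/serverA.sqlite",
    ])

if __name__ == "__main__":
    compact_on_storage_host()
    resync_backup_server_database()
```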

In the current networked world these things can be combined in so many different ways that talking about a specific server is just kind of confusing. It’s easy to forget to properly generalize when you’re talking about your own specific use case. (This of course happens to everyone.)