Understanding of backup job

teesaja · December 3, 2022, 5:46pm

Hi,

When I tell Duplicati to keep the last 4 backups, what does this mean?
I think Duplicati only does incremental backups, right?

So there is one big backup, the first one. But once a 5th backup is made, the first one gets deleted, and then the other 4 backups are broken? How does this work?

Thanks

xblitz · December 3, 2022, 8:00pm

Welcome to the forum
It means that the last four versions are kept.

Now suppose the following scenario, where a capital letter means a full version and a lower-case letter means a differential version.
A
bA
cbA ← c is obtained by merging b to A and c to A+b
dcbA
edcB
fedC ← e is obtained by merging C to d and C+d to e

Of course the oldest version is a full version of the backup data, but it is obtained by merging the oldest version (which will be removed) with the second-oldest version.

JimboJones · December 4, 2022, 12:18am

Correct, Duplicati only does incremental backups. So the first backup is the base and the next backup (and all further ones) are increments of the base.

The base backup data will persist up to the retention period, and the increments will be added and/or deleted as needed. There isn’t any easy way to know which files or versions are contained in a particular .dblock file. As mentioned by @xblitz, the backup files merge based on certain criteria, such as whether the data still exists or how much data is in the file. It’s Duplicati’s job to keep track of it all; you just tell it how many versions you want to keep.

In your case, on day five (presuming it’s a daily backup), data in the oldest version that no longer exists in the source will be removed from the backup set files. Duplicati prunes the non-existent data by itself as it goes along. If you deleted something by accident, it will be removed from the backup set after the backup has run for the 4th time.

It depends on your needs, but 4 versions seems a bit low. Keep in mind that deduplication and compression can do some wonderful things when it comes to saving space. I regularly back up 116GB of data and keep 30 versions of it, and it only uses 102GB to store it all. Results will vary depending on your data types, but for myself I could probably go to 50 versions without exceeding the size of the original data.

ts678 · December 4, 2022, 4:18am

Not in the traditional way you may be thinking of. There is no full backup followed by an incremental chain.
“Block-based storage engine” explains how Duplicati 1.3 had that plan, but now Duplicati is block based.
“Backup type is incremental” is Duplicati’s author getting in on discussion of that, and other posts follow.

The idea of a “deduplicated full backup” is probably the most technically accurate. A file is stored as a set of blocks, some of which may already be in the backup (for example, if the file was in a previous backup run). Blocks which are already in the backup are referenced without re-upload. Blocks which are not present are uploaded.

The initial backup is big because all blocks upload. Blocks are only considered waste if no backup version needs them. The retention setting reduces that need by deleting backups. Eventually compact reclaims the waste.
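
To make the block idea concrete, here is a minimal Python sketch. It is only a toy model under stated assumptions, not Duplicati's actual code or file format: the class `ToyBackupStore`, its method names, and the 4-byte block size are all hypothetical, chosen just to show dedup on upload, retention, and compact.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical tiny block size, for the demo only


def split_blocks(data: bytes, size: int = BLOCK_SIZE):
    """Cut a file's bytes into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class ToyBackupStore:
    """Toy model of a block-based, deduplicated backup (not Duplicati's real design)."""

    def __init__(self):
        self.blocks = {}    # block hash -> block bytes (stands in for remote storage)
        self.versions = []  # one dict per backup run: path -> list of block hashes

    def backup(self, files: dict) -> int:
        """Record a new version and return how many bytes had to be uploaded."""
        uploaded = 0
        version = {}
        for path, data in files.items():
            hashes = []
            for block in split_blocks(data):
                digest = hashlib.sha256(block).hexdigest()
                if digest not in self.blocks:  # only unseen blocks are uploaded
                    self.blocks[digest] = block
                    uploaded += len(block)
                hashes.append(digest)          # known blocks are just referenced
            version[path] = hashes
        self.versions.append(version)
        return uploaded

    def apply_retention(self, keep: int) -> None:
        """Keep only the newest `keep` versions (assumes keep >= 1)."""
        self.versions = self.versions[-keep:]

    def wasted_blocks(self) -> set:
        """Blocks that no remaining version references."""
        needed = {d for v in self.versions for hs in v.values() for d in hs}
        return set(self.blocks) - needed

    def compact(self) -> int:
        """Delete wasted blocks and return how many were reclaimed."""
        waste = self.wasted_blocks()
        for digest in waste:
            del self.blocks[digest]
        return len(waste)


store = ToyBackupStore()
first = store.backup({"a.txt": b"AAAABBBBCCCC"})   # all three blocks are new
second = store.backup({"a.txt": b"AAAABBBBDDDD"})  # only the changed block is new
store.apply_retention(keep=1)                      # drops the first version
reclaimed = store.compact()                        # the CCCC block is now waste
print(first, second, reclaimed)
```

Deleting the oldest version here removes only its list of block references. The blocks that the remaining version still points at stay in storage, which is why the newer backups do not break.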

Later backups typically upload less because most files are mostly the same, so only new blocks upload. This might be more compact than incremental backups that back up a whole file if any part of it changes.
You can look in the job log’s Complete log for BytesUploaded to see how fast your source is changing.
Quite possibly you can save a lot of versions (if there’s some value) without using up a lot more storage.

Nope. All the blocks needed for all files in all versions will be saved because someone might want some version of some source file restored, at which time the file is reconstructed block by block from backup data.
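
A rough Python sketch of what "reconstructed block by block" means, again as a toy model rather than Duplicati's real restore code. The `blocks` and `versions` tables and the `restore` function are hypothetical names, and the data is made up for illustration.

```python
import hashlib


def digest(block: bytes) -> str:
    """Hash used as the block's identity in this toy model."""
    return hashlib.sha256(block).hexdigest()


# Hypothetical block storage: hash -> bytes. Each distinct block is stored once.
blocks = {digest(b): b for b in (b"AAAA", b"BBBB", b"CCCC", b"DDDD")}

# Hypothetical version list: each version maps a path to an ordered list of hashes.
versions = [
    {"a.txt": [digest(b"AAAA"), digest(b"BBBB"), digest(b"CCCC")]},  # older version
    {"a.txt": [digest(b"AAAA"), digest(b"BBBB"), digest(b"DDDD")]},  # newer version
]


def restore(version_index: int, path: str) -> bytes:
    """Rebuild one file from one version by concatenating its blocks in order."""
    return b"".join(blocks[d] for d in versions[version_index][path])


print(restore(0, "a.txt"))
print(restore(1, "a.txt"))
```

Both versions share the AAAA and BBBB blocks, yet each version can be restored on its own, since every version carries a complete list of the blocks it needs.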

teesaja · December 4, 2022, 10:04am

Wow, thank you for so much good information!