Confirmation about backup retention

I think I already know the answer to this, but want to be absolutely sure.

Let’s say I have two files in a simple backup set:

  • test1.txt
  • test2.txt

test1.txt gets modified every day, whereas test2.txt already exists and is never modified.

I set Duplicati to nightly backups, and delete backups older than 180 days.

Because test1.txt gets edited every day, it will get backed up nightly.

test2.txt will only be included in the initial backup.

My question is - I assume that after 180 days, Duplicati will throw away any versions of test1.txt older than 180 days but it will also keep test2.txt, as there is only one version? i.e. it won’t delete test2.txt because it is older than 180 days?

Probably obvious, but I can’t find anything in the manual answering this explicitly, and obviously I can’t afford to just risk it.

I believe you are correct - the newest version of a file, no matter how old, will not be automatically removed from a backup. Note that if the source file is deleted, THEN all versions will eventually be deleted (unless the “keep deleted files” setting is enabled).

If you don’t care about details you can stop reading now. :slight_smile:

In your example it’s possible an archive containing parts of the SINGLE instance of test2.txt may also include “old” parts of test1.txt. If so, and enough other parts of that archive are also old, then Duplicati will download the archive and re-compress it WITH the test2.txt blocks but WITHOUT the “aged out” blocks of test1.txt et al.

If enough re-compressed archives shrink far enough, they may all be downloaded and re-compressed together into a single “full size” archive.
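If it helps, here’s a rough Python sketch of that compaction logic as I understand it (purely illustrative, not Duplicati’s actual code; I believe the wasted-space threshold defaults to 25% via the --threshold option, but treat all the names here as made up):

```python
# Illustrative sketch of volume compaction, not Duplicati's actual code.
# A "volume" is a remote archive holding many data blocks; a block is
# "live" while at least one remaining snapshot still references it.

def wasted_fraction(volume_blocks, live_blocks):
    """Fraction of a volume's blocks that no remaining snapshot references."""
    dead = [b for b in volume_blocks if b not in live_blocks]
    return len(dead) / len(volume_blocks)

def compact(volumes, live_blocks, threshold=0.25):
    """Repack volumes whose wasted fraction exceeds the threshold."""
    kept, salvaged = [], []
    for blocks in volumes:
        if wasted_fraction(blocks, live_blocks) > threshold:
            # e.g. a volume mixing test2.txt blocks with aged-out
            # test1.txt blocks: download it, keep only the live blocks
            salvaged.extend(b for b in blocks if b in live_blocks)
        else:
            kept.append(blocks)
    if salvaged:
        kept.append(salvaged)  # live blocks re-uploaded as one new volume
    return kept

# Volume 1 is mostly dead blocks, volume 2 is fully live:
print(compact([["a", "b", "c", "d"], ["e", "f"]], live_blocks={"a", "e", "f"}))
# -> [['e', 'f'], ['a']]
```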

Hopefully that answers your question!

For the question, I think @JonMikelV has the right answer.

To clarify further, Duplicati works with “backup sets” or “snapshots”: a snapshot is “a collection of files and folders as they looked at a specific point in time”.

Each backup makes a “snapshot” of your files. This snapshot includes all files currently on disk. For your example, each snapshot will include both test1.txt and test2.txt.

When you delete older versions, you delete snapshots. For test1.txt that means you will lose the ability to restore the older versions of that file. For test2.txt you will not lose anything.

If you delete test2.txt from disk, you will be able to recover it until 180 days have passed and the last snapshot containing the file is deleted.
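To make that concrete, here’s a toy model in Python (purely illustrative - this is not how Duplicati actually stores anything):

```python
# Toy model: every nightly snapshot records every selected file,
# modified or not; pruning removes whole snapshots, never single files.

snapshots = {  # day -> {filename: content version}
    day: {"test1.txt": f"edit-{day}", "test2.txt": "original"}
    for day in range(1, 301)  # 300 nightly backups
}

RETENTION_DAYS, today = 180, 300
snapshots = {day: files for day, files in snapshots.items()
             if today - day < RETENTION_DAYS}

# Old test1.txt versions are gone with their snapshots, but every
# remaining snapshot still contains test2.txt, so it is never lost:
assert all("test2.txt" in files for files in snapshots.values())
```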

Thanks heaps for the replies. They answered my question, but the point about “keep deleted files” is also something I overlooked. I will include that in my settings, as it’s a very important requirement which I assumed would be on by default. Apart from physical disk failure, the other likely reason for a restore is accidental deletion.

Yep - that’s exactly why I always turn that on (I wish I could find the command line parameter for it though…)

I would also have assumed that this would be on by default. Any specific reason why it isn’t @kenkendk?

I think I saw a file in the source tree that specifies compressed file types by extension - I’d think these are (or at least should be) excluded from compression entirely.

Duplicati works in snapshots, and for this reason there is no option to “keep deleted files”, as in “if a file is deleted it stays forever in the backup”. It is trivial to add this function (just don’t run the delete at the end) but I am not sure what it provides?

If the user overwrites the file with random junk instead of deleting it, you will still lose the file. If the user puts a zero-byte file in, and then deletes it after a backup, you still don’t have the file.

The “right” way is to make sure you have backups as far back as you need to recover. This way it does not matter how you mangle the file, you get the copies all the way back.

Is there a use case I have missed?

Was this for another topic? Files with known compressed extensions are not compressed, they are put in with --zip-compression-method=store aka “copy”.
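For illustration, the same idea looks like this with Python’s zipfile module (the extension list here is a made-up sample, not Duplicati’s actual list):

```python
import os
import zipfile

# Sample of already-compressed formats; recompressing these wastes CPU
# for almost no size gain, so they are stored as-is ("copy").
ALREADY_COMPRESSED = {".zip", ".gz", ".7z", ".jpg", ".png", ".mp4"}

def add_to_volume(volume: zipfile.ZipFile, path: str) -> None:
    ext = os.path.splitext(path)[1].lower()
    method = zipfile.ZIP_STORED if ext in ALREADY_COMPRESSED else zipfile.ZIP_DEFLATED
    volume.write(path, compress_type=method)

# Demo with two throwaway files:
for name in ("photo.jpg", "notes.txt"):
    with open(name, "wb") as f:
        f.write(b"example data")

with zipfile.ZipFile("backup-volume.zip", "w") as vol:
    add_to_volume(vol, "photo.jpg")   # stored
    add_to_volume(vol, "notes.txt")   # deflated
```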

That’s how it almost works, but it’s a bit tricky to state that the newest version of a file will never be removed.
As @kenkendk explains, Duplicati does not delete files automatically, it deletes snapshots. A snapshot is a representation of “how your source files looked” at the moment of making the backup. This snapshot represents all files and folders selected for backup, including the unmodified ones.

If you choose the retention policy “Keep 180 days”, both files (test1.txt and test2.txt) will be available for restore. 180 days after the initial backup, the first version of test1.txt will be deleted, so any modification to test1.txt made more than 180 days ago will be lost, making it impossible to restore test2.txt 180 days after deletion.

If you delete a file, it will be missing from the most recent backup, but you will still be able to restore test2.txt from all earlier backups. 180 days later, none of the remaining snapshots will contain a reference to test2.txt, so all blocks used by test2.txt will be deleted from the volumes at the backend.
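In block terms, that cleanup amounts to something like this sketch (again, not Duplicati’s actual code):

```python
# Sketch: a block on the backend survives only while some remaining
# snapshot references it; orphaned blocks are removed from the volumes.

def deletable_blocks(snapshots, backend_blocks):
    referenced = set()
    for files in snapshots.values():       # snapshot -> {filename: [block ids]}
        for block_ids in files.values():
            referenced.update(block_ids)
    return backend_blocks - referenced

# Once the last snapshot containing test2.txt has aged out, its blocks
# are no longer referenced anywhere and can be deleted:
remaining = {"day-290": {"test1.txt": ["b7"]}}  # test2.txt gone from all snapshots
print(deletable_blocks(remaining, backend_blocks={"b7", "b1", "b2"}))
# -> {'b1', 'b2'}  (the blocks that held test2.txt)
```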

Short version: if you delete a file from the source, you can restore it until the moment of deletion falls outside your retention window.

@JonMikelV: Where can I find this “keep deleted files” setting? Duplicati.CommandLine.exe help advanced doesn’t reveal any information about this setting and I can’t find it in the list of advanced options in the Web UI. I’m curious what it does and how it works.


Hmm… I think I posted in another discussion where the question about compression level was brought up… Or I could have mixed up windows.

Yeah - I think I was mis-remembering an answer I got from kenkendk (where he also said it would be trivial to add) as “yep, that feature is in there”. Sorry about that.

So I’ll amend my previous comment to be “And that’s why I pretty much always set unlimited backups.” :slight_smile:

Can I confirm, did you mean to put test2.txt in there and not test1.txt? This would mean that test2.txt, which already exists but never modified, would be removed from the backup after 180 days even though it still exists on source?

The file will be deleted from the backup 180 days after deleting it from the source. As long as a file exists in the source, it will be available for restore. From the moment you delete the file, you still have 180 days to restore it before it is removed from the backend (assuming you chose “Keep 180 days” for that backup). For example, if test2.txt is deleted from the source on day 200, the last snapshot containing it is the one from day 199, and that snapshot (and with it the file’s data) is removed around day 379.


So a keep deleted files setting does not exist?

Not as such, no. However, the same result can be achieved by setting an unlimited backup retention. Granted, this has the side effect of ALSO keeping potentially unwanted file versions, so it comes at the cost of more destination storage.

I know there is at least one person working on adding the ability to reduce backup density over historical time. And I think the way they are implementing it might actually end up POTENTIALLY removing deleted files even if unlimited backups are enabled, so an explicit “do not remove deleted files from the backup” option might be worth considering.
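To illustrate the density idea, such a thinning scheme might look roughly like this (a pure sketch with made-up bucket sizes, not the actual implementation being worked on):

```python
from datetime import date, timedelta

# Made-up thinning rules: keep every backup for a week, one per week
# for a month, one per month beyond that.
RULES = [(timedelta(days=7),  timedelta(days=1)),
         (timedelta(days=30), timedelta(days=7)),
         (timedelta.max,      timedelta(days=30))]

def thin(backup_dates, today):
    """Return the backups to keep, newest first, thinning out with age."""
    kept, last_kept = [], None
    for d in sorted(backup_dates, reverse=True):
        age = today - d
        interval = next(i for limit, i in RULES if age < limit)
        if last_kept is None or last_kept - d >= interval:
            kept.append(d)
            last_kept = d
    return kept

# 120 nightly backups thinned down to a handful of survivors:
nights = [date(2018, 1, 1) + timedelta(days=n) for n in range(120)]
print(len(thin(nights, today=date(2018, 4, 30))))
```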
