Retention rule question

I have set the following custom retention rule: “1W:1D,4W:1W,12M:1M,U:3M”.

Does this mean that for an unlimited time after one year it will try (and fail) to keep three backups per month, or (as I hope) does it mean that for an unlimited time after one year it will retain one backup every three months?

Thanks!

U:3M means “for an unlimited period of time, keep one backup every 3 months”, so it works exactly the way you want.
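
If it helps to see the whole string unpacked, here’s a quick toy illustration of how I read the syntax (comma-separated timeframe:interval pairs, with U meaning unlimited). This is just a sketch of my reading, not Duplicati’s code, and the week/month lengths are rough approximations:

```python
# Toy reading of the custom retention syntax - not Duplicati's actual code.
# Each comma-separated pair is "timeframe:interval"; U means unlimited.
# Month/year lengths here are rough assumptions for illustration only.
SECONDS = {"s": 1, "m": 60, "h": 3600, "D": 86400, "W": 7 * 86400,
           "M": 30 * 86400, "Y": 365 * 86400}

def to_seconds(token):
    return None if token == "U" else int(token[:-1]) * SECONDS[token[-1]]

def parse_retention(policy):
    return [tuple(to_seconds(t) for t in pair.split(":"))
            for pair in policy.split(",")]

for frame, interval in parse_retention("1W:1D,4W:1W,12M:1M,U:3M"):
    frame_txt = "for an unlimited time" if frame is None else f"for versions up to {frame}s old"
    keep_txt = "keep everything" if interval is None else f"keep one per {interval // 86400} day(s)"
    print(f"{frame_txt}: {keep_txt}")
```

The last line printed is your U:3M case: one version kept per ~3 months, forever.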

To follow up a bit more on sylerner’s question…

I think I have my head wrapped around the retention system but would like confirmation :slight_smile:
One thing that’s not clear is if all retention durations start from day 1 or from the last policy change.
Not fully implemented yet, but I have two jobs:
one to a local RAID-1 box (runs every 2 hours)
one to Backblaze B2 (probably 2x per day)
The idea is the local backup has more recent versions and is the only backup I will need unless everything goes to the dogs, in which case I can restore from Backblaze.

The local drive retention policy is: 1W:U,4W:1D,6M:1W,1Y:1M,U:6M
Which I believe means…
1W:U For 1 week keep everything - “U = unlimited”
4W:1D For the next 4 weeks (files older than 1 week, younger than 5 weeks) keep 1 per day [or is that: files older than 1 week, younger than 4 weeks?]
6M:1W For the next 6 months (files older than 5 weeks) keep 1 per week
1Y:1M For the next 1 year (files older than 6 months) keep 1 per month
U:6M Forever after ~1.5 years, keep one per 6 months

The current B2 plan is: 1W:1D,4W:1W,6M:1M,U:6M
1W:1D For 1 week keep 1 file per day
4W:1W For the next 4 weeks (files older than 1 week, younger than 4 weeks) keep 1 per week
6M:1M For the next 6 months (files older than 5 weeks, younger than 6 months) keep 1 per month
U:6M Forever after 1 year, keep one per 6 months

  • How is my understanding?
  • Out of curiosity, I guess everything is based on the local client time and date?
  • Do I understand correctly that you can, at a later date, change the retention policy and all the existing files will slowly get cleared/kept to match that new timing?
  • I feel like I read somewhere that regardless of age and settings (unless ‘allow-full-removal’ is set) Duplicati will always keep a minimum of one backup, but I cannot find where I saw it. Was that just related to Smart Backup Retention?
    So looking at a simple retention policy that only specifies 1W:1D: what happens on the first day of the second week and after that?

Thanks DP

You can see my attempt to explain this here, including pointers to some earlier more authoritative threads.

My main correction to your writeup is that thinning is based on the age of each version, not the date of a policy change, and the timeframes don’t run in sequence. Ages start at 0 and the rules can be given in any order, but early-to-late is easier on humans. Because it’s all based on age, local versus UTC time doesn’t come into it, and leap years likely don’t either (internal time is continuous).
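
For example (assuming I have the “any order” part right), these two strings should end up behaving the same, because each timeframe is just an age measured back from now:

```
1W:U,4W:1D,6M:1W,1Y:1M,U:6M
U:6M,1Y:1M,1W:U,6M:1W,4W:1D
```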

Originally 1W:1D didn’t do anything special at 1W, forcing one to set retention days as well. Now it implicitly deletes anything older. If you want those versions to hang around longer, you can put that in the retention string, even using :U; however, such a change obviously won’t bring back any version already deleted under a previous setting.

allow-full-removal refers to filesets; a fileset is also known as a version, and is basically a point-in-time view of your files.

If I got that wrong, maybe someone will correct me. It’s confusing. :slightly_frowning_face:

Yep - for something that at first glance seems like “it’s so simple!” retention really gets quite complicated. :slight_smile:

The link provided by @ts678 (and the links from there) leads to good discussions, but they can get a bit long. :wink:

The things I try to keep in mind are:

  • timeframes are broken down into smaller increments (as seconds) that don’t carry calendar implications (for example, does a “month” mean the 1st to the 30th, or 28, 29, 30, or 31 days… does a “week” mean Mon - Sun, Sun - Mon, or just 7 days… does a “day” mean midnight-to-midnight or 24 hours from now… and what do all those things mean in non-Gregorian calendars or countries where the “weekend” isn’t Sat - Sun?)
  • durations are ALL based on local “now”, so “1D” is treated like 86,400 seconds from NOW and “1W” is treated like 604,800 seconds from NOW. In other words, they don’t “stack”
  • each plan block (in your case 1W, 4W, 6M, etc.) can be thought of as a bucket and once a backup version is placed into a bucket then it is no longer seen by other buckets
  • the OLDEST backup in a block is what is kept (at least I think that’s right). So if you have a 1D:1H rule and are doing top-of-the-hour backups, then the OLDEST backup in each 60 min chunk going back in time from NOW is kept and all the newer ones (in each chunk) are flagged for removal. (There’s a rough sketch of this bucket-and-slot idea right after this list.)
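
To make that bucket idea concrete, here’s a very rough sketch of how I picture it working. This is definitely not Duplicati’s actual code - the rule ordering, slot alignment, and tie-breaking here are my own assumptions:

```python
# Very rough sketch of the bucket model described above - NOT Duplicati's code.
# A rule is (timeframe_seconds, interval_seconds); None stands for "U".
# Ages are measured back from "now"; each bucket only sees versions that a
# shorter bucket hasn't already claimed; the oldest version in each
# interval-sized slot is kept, the rest are flagged for removal.
from datetime import datetime, timedelta

def keep(versions, rules, now=None):
    now = now or datetime.now()
    rules = sorted(rules, key=lambda r: float("inf") if r[0] is None else r[0])
    remaining = sorted(versions)                       # oldest first
    kept = []
    for frame, interval in rules:
        horizon = None if frame is None else now - timedelta(seconds=frame)
        claimed = [v for v in remaining if horizon is None or v >= horizon]
        remaining = [v for v in remaining if v not in claimed]
        if interval is None:                           # e.g. 1W:U keeps everything
            kept.extend(claimed)
            continue
        slots = {}
        for v in claimed:                              # oldest first wins each slot
            slots.setdefault(int((now - v).total_seconds() // interval), v)
        kept.extend(slots.values())
    return sorted(kept)

# Example: 3 days of hourly backups against "1D:U,U:6h" (values in seconds).
hourly = [datetime(2019, 3, 1) + timedelta(hours=h) for h in range(72)]
survivors = keep(hourly, [(86400, None), (None, 6 * 3600)],
                 now=datetime(2019, 3, 4))
print(len(survivors))   # the last day survives in full, older ones ~6h apart
```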

The local drive retention policy is: 1W:U,4W:1D,6M:1W,1Y:1M,U:6M
Which I believe means…

  • 1W:U Keep all backups less than 7 days (aka 604,800 seconds) old
  • 4W:1D For all backups less than 28 days (aka 2,419,200 seconds) old that aren’t already in a bucket (see 1W:U), keep the OLDEST backup in each 24 hour (aka 86,400 second) block starting from now (NOT midnight as one might assume)
  • etc.

I recall seeing that as well, and I believe it was trying to say that as long as the file exists in your Source, the retention policy will make sure at least one version exists in your backups. HOWEVER - once the file is no longer in your Source, it could eventually be fully “thinned” out of your backups.

This is true even if you end your policy with something like U:U, since by the time a version gets to that rule, the previous rules have likely thinned things down to something like 1 backup every 6 months. So if you had a file that only “lived” between those every-6-months backups, it would eventually be fully thinned out even though it had been backed up while it was still a not-very-old file.

Hopefully that made some sense and didn’t just confuse you more. :crazy_face:

1D:1H says: for the past day, thin versions to 1H apart. This is just spacing control, done in creation order, i.e. thinning runs from oldest to newest. If you want to test it faster, you can use h for hours, m for minutes, and s for seconds.
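
In code terms, I picture the spacing control for a single rule like 1D:1H roughly like this (a minimal sketch, not the real implementation):

```python
# Rough sketch of spacing control for one rule such as 1D:1H - not real code.
# Walk the versions inside the rule's timeframe from oldest to newest, keep one,
# then drop anything closer than the interval to the last version kept.
from datetime import timedelta

def thin(versions, interval=timedelta(hours=1)):
    kept = []
    for v in sorted(versions):             # creation order: oldest to newest
        if not kept or v - kept[-1] >= interval:
            kept.append(v)
    return kept
```

With top-of-the-hour backups that works out to one kept version per hour; swapping the interval down to minutes or seconds is what makes quick testing easy.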

Thanks for catching that. I’ve updated my post to be less wrong, but your wording is probably better. :slight_smile:

Yea, I’ll say - I had no issues with old-fashioned incremental and differential backups, but this just keeps making my brain leak :hushed:

I think I can wrap my head around that - so if you ran that job 50 times in one day you would potentially end up with about 23 versions of any file, presuming files were constantly being edited, saved, and closed so they could be backed up. And I say 23 because, based on some of the other discussions, I think something might fall off the 1 day edge. Kind of funny that while this is all just math, it still gets kind of fuzzy at the edges…

That I think puts a pretty good nail in it, except… something that was kept during, say, the 4W thinning would eventually get passed along to the next bucket and possibly be trimmed out there.

Yikes! I think I understand that - no different really from creating and deleting a file between traditional backups, but I have to say it never occurred to me.
I think the natural expectation is that if you have a fairly continuous backup, you could always dig in and find at least one version of any file, regardless of whether or when it was deleted from the source.
That of course goes back to backups that, once created, were never touched.

I think I need to rethink my local copy. Generally I don’t delete anything unless it really has no reason to exist so the likelihood of actually needing a file that fits that short lived category is ummm, thin. But since local space is not really an issue, I can easily open things up to reduce the hole size.

Many thanks for the continuing explanations!