Hey @D312. You can’t disable this behavior, and it’s in place for a reason. The only suggestion I can make, if you want to stop this functionality, is to disable the Duplicati scheduler and use an operating system scheduler to run your backups.
@samw has a good suggestion there.
You might also be able to use a --run-script-before-required process to check the time and prevent the job from running if it’s not near/at the scheduled run time.
Either that, or use the Windows Task Scheduler or cron on *NIX-type operating systems.
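To make the run-script idea concrete, here is a minimal sketch of what such a pre-run check could look like. The scheduled time, the window size, and the idea that a non-zero exit code stops the job are all assumptions for illustration; check Duplicati’s run-script documentation for the exact exit-code semantics before relying on this.

```python
from datetime import datetime, time, timedelta

# Hypothetical pre-run check for --run-script-before-required.
# Both values below are assumptions, not Duplicati defaults.
SCHEDULED = time(hour=2, minute=0)   # assumed scheduled start: 02:00
WINDOW = timedelta(minutes=10)       # allow runs within 10 minutes of it

def within_window(now, scheduled=SCHEDULED, window=WINDOW):
    """True if `now` falls within `window` of today's scheduled time."""
    target = now.replace(hour=scheduled.hour, minute=scheduled.minute,
                         second=0, microsecond=0)
    return abs(now - target) <= window
```

The actual script would end with something like `sys.exit(0 if within_window(datetime.now()) else 1)`, so that a start outside the window makes the pre-run check fail and the backup is not run.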
I also would like to have this option, because sometimes when I start my computer I don’t want some backups to run until their scheduled time. I guess I could use the built-in scheduler in Windows, but it would be easier to just have a command line argument like --disable-run-missed-backup (naming is hard).
Since what was asked about isn’t (yet) a feature I went ahead and changed the topic category to Features. Perhaps that will bring in a few more requests for the feature.
It certainly can be - though personally I’d go for something like
--skip-missed-schedule-tasks. (Hopefully that would make it clear that if one could schedule something like a restore it would ALSO be skipped.)
Cool, let’s hope more people want this.
Anyhow, I was thinking about maybe trying to implement this myself. But how should one determine missed scheduled tasks? Maybe if the difference between now and the scheduled start is more than, let’s say, 2 minutes, then consider that a missed task?
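A minimal sketch of that 2-minute idea, assuming the only inputs are the scheduled start time and the current time; the grace value is just the number suggested above, not anything Duplicati defines:

```python
from datetime import datetime, timedelta

# Grace period before a scheduled run counts as "missed" (assumed value).
GRACE = timedelta(minutes=2)

def is_missed(scheduled_start: datetime, now: datetime,
              grace: timedelta = GRACE) -> bool:
    """True when `now` is more than `grace` past the scheduled start."""
    return now - scheduled_start > grace
```

One caveat with any fixed grace period: it would have to be larger than the longest delay a job can spend queued behind another running job, or jobs scheduled close together would be wrongly treated as missed.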
Please don’t do this, because it will probably break my own (and probably others’) scheduled backups.
On several systems, I have multiple backup jobs that are scheduled one minute after each other. The result is that all backups are executed right after each other. The proposed code change would skip most of my daily backup tasks.
A possible solution could be adding an extra command line option, --skip-missed-scheduled-backups, that is set to false by default.
Yes, an extra CLI option would be required as well for those that want to use this feature. I was just thinking about how a missed scheduled task should be determined.
Sorry I missed this when it came up in April. I would support this option. For large backup sets where something has gone wrong and the system is off-line for a while, it’s a PITA to have to wait for Duplicati to run through several missed backups before being able to do anything with it. Having this as “False/Off” by default is fine, just so long as it is available somewhere.
I have created a basic GitHub enhancement request for something that maybe will do what people want. Feel free to take a look and comment or, better yet, add a bounty.
The solution for this would be to make the “skip missed jobs” option smarter, instead of not having it.
For instance, the option could have a threshold for the amount of time that needs to elapse before a job is considered “missed”; I’d expect roughly >90% of the backup jobs we want Duplicati to skip would be more than an hour past due, and >80% of them would be more than 2 hours past due, so finding a “comfortable” number of minutes to populate by default here should be very, very easy (there’s almost no penalty for going ‘big’ on this - the result would be identical to the current default mode of operation).
Another option, or at least a new back-end feature that could help this and other queueing issues, would be to have Duplicati keep track of “queued” versus “missed” jobs.
- “Queued”: job A was running when job B’s execution time arrived, so job B is “queued”.
- “Missed”: job C’s execution time has already elapsed when Duplicati first starts up for the day, so Duplicati treats job C as simply “missed”.
It seems Duplicati treats both of these scenarios identically at the moment, which is understandable, but it seems that adding a tiny bit more complexity here would allow a bunch of other issues to potentially be resolved more gracefully.
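To illustrate the distinction being proposed, here is a tiny classifier sketch. The enum names and the classification rule are my own illustration of the “queued” vs. “missed” idea, not anything in Duplicati’s code:

```python
from datetime import datetime
from enum import Enum

class LateJob(Enum):
    QUEUED = "queued"   # came due while the scheduler was already running
    MISSED = "missed"   # came due before the scheduler started up

def classify(scheduled: datetime, scheduler_started: datetime) -> LateJob:
    """A job due before the scheduler started is 'missed';
    one that came due while it was running is merely 'queued'."""
    return LateJob.MISSED if scheduled < scheduler_started else LateJob.QUEUED
```

With that split, a skip option could apply only to MISSED jobs while QUEUED jobs still run back to back, which would also address the multi-job-per-minute setups described earlier in the thread.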
I’m not finding it right now but I think I suggested a similar “buffer” idea related to dealing with not-always-available destinations - something like “retry every X minutes UNLESS a successful backup happened less than Y minutes ago”.
So in this case something like --job-buffer-mins=X could mean “don’t run this job if it successfully finished less than X minutes ago”. Of course, if a job run is “skipped” because of such a buffer, that should definitely be logged as the reason for not running the job…
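A sketch of that hypothetical --job-buffer-mins check; the option name and semantics are the proposal above, not an existing Duplicati flag:

```python
from datetime import datetime, timedelta

def should_run(last_success: datetime, now: datetime,
               buffer_mins: int) -> bool:
    """False when the last successful run finished under
    `buffer_mins` minutes ago (i.e. the run should be skipped)."""
    return now - last_success >= timedelta(minutes=buffer_mins)
```

A real implementation would also log the skip with the buffer as the stated reason, as suggested above.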
Another thought I’ve had on this (from the perspective of “I want to be able to use my computer when I turn it on after missing a scheduled backup”) is that the backup reattempt could be delayed when a missed backup is detected (e.g., when it sees that a backup missed its scheduled time, it waits 30 minutes before starting it).
The only thing I’d be worried about on this is that it could push things back long enough to create a cascade of jobs running later and later.
I’m curious whether you use the CPU (been around a while) or disk IO settings (only newer 2.0.3.x canary?) and how much they are NOT helping your computer to be usable when a job is running…
For me, I think Duplicati needs to be smarter in its scheduling.
Specifically, I know a bunch of backups were missed. However, if a backup completed within the backup frequency window, don’t try to do another backup!
I have a problem where I have a backup scheduled to run every 2 hours. My previous backup might have completed, say, 30 minutes ago. However, because some backups before that previous one were missed for that data set, Duplicati will immediately try to re-run the backup that completed 30 minutes ago, despite the next one not being scheduled for another hour or more, and the last backup completing well within my scheduled frequency.
The same behavior happens if I restart Duplicati - even though the last backup was less than the backup frequency ago, it will actually kick off ALL backups.
Now maybe this is a bug with having sub-1-day backups, but this should not matter. This is moronic behavior from the scheduler. Just because I missed a previous backup, if the last backup was less than the backup frequency ago, just wait until the next scheduled time.
I’m going to try and re-jigger your scenario just to make sure I’m understanding it correctly.
- Backup D (daily) didn’t run last night because the power was off, so when the computer is turned on it starts
- Because Backup D is still running, backup H (hourly) starts late causing it to end slightly AFTER its next scheduled run time
- Backup H immediately starts up again because the previous run caused THIS run to be late
Running missed backups is not a bug - it’s by design, and the scheduler was written that way specifically to be on the safe side.
Consider the opposite of your situation where the system only has a weekly or monthly backup. If that scheduled time is missed and we don’t run the job “as soon as possible” the system could end up with 2 weeks or 2 MONTHS between backups.
It sounds like what you might want is a simple (perhaps job specific) checkbox, something like “Do not run job if scheduled start time is missed.”
@JonMikelV Actually, I don’t mind it running missed backups. But only if it has been longer than the backup frequency since the last successful backup.
In other words, say I had a daily backup, and for some reason my computer was down for 5 days.
I DO want it to run a backup as soon as it starts, because the last backup was 5 days ago. What I DON’T want is for it to run 5 backups in a row. As long as the last successful backup is less than a day old, there is no reason not to wait for the scheduled time.
What I was seeing is it would repeatedly run the same backups again, even though the last successful backup was within the backup frequency and a new backup was scheduled for not long from now anyway.
In other words, run at most ONE missed backup, which does not appear to be the behavior I have observed.
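The “at most one missed backup” behavior asked for here can be sketched in a few lines; this is an illustration of the request, not how Duplicati currently works:

```python
from datetime import datetime, timedelta

def catchup_runs(last_success: datetime, now: datetime,
                 frequency: timedelta) -> int:
    """Collapse any number of missed runs into at most ONE catch-up run,
    and skip even that one when the last successful backup is still
    within the configured frequency."""
    return 1 if now - last_success >= frequency else 0
```

For the 5-days-offline daily backup above this yields exactly one immediate run; for a 2-hour schedule whose last backup finished 30 minutes ago it yields none, and the job simply waits for its next scheduled time.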
I should also note that most of the missed backups in my scenario happened because I was performing the ‘initial backup’ of some other large data sets on the same host. Those initial runs took a long time, and since my backup frequency is 2 hours, an 8- or 12-hour backup is going to cause 4 or 6 missed backups of the other data sets, because Duplicati only runs a single backup at a time… which I also think should be an option in multi-threaded land :P
The way the queue works it should not add a job that already is in the queue.
Assuming things are working as they should, the worst case scenario would be:
- job A is active & running past the next start time so
- job A is added to the queue (which contains only non-running items)
- job A is on the schedule for a future run
With frequently scheduled jobs that tend to run long, this makes it look like the job is just running in a loop.
On the bottom of the About -> System info page there’s some JSON formatted text that includes what’s currently in the queue. If you can paste that here showing a job listed more than once, that would be great.
Of course that’s how it’s supposed to work now; that doesn’t mean there isn’t potential for future changes.
I too would like control over rerunning missed backups. Other backup systems that I use, such as Bareos, support this. I agree that the default behavior of Duplicati should not change, in that missed backups should be run immediately. However, it would be really nice to have the option to disable this for those of us that have specific reasons.
In most cases I want the default of rerunning missed jobs immediately. However, I have a system that runs backups daily and may be offline when the last backup should have run. In this particular case I just want the system to wait until the next scheduled time to start the backup; otherwise Duplicati will be busy for much of the day, since a single backup takes 2.5 hours.