I have scheduled my nightly backups in a fixed order. I need the order to be maintained, because the “last” backup has to run last: it has a run-script-after hook that shuts down the Duplicati container and unmounts the backup drive.
The jobs are ordered by their scheduled times in the GUI. This works like a charm in normal operation. However, when Duplicati misses a backup run (because the host was shut down for some days), the backup run order seems to be random. That means the “last” backup runs too early and shuts down the container before the other jobs are done.
Are there any ideas how to improve that? I am open to a better “shutdown trigger”; I just need a way to find out whether all scheduled jobs are finished.
As a short summary to give an idea:
0:30: cron on the host system: the backup destination drive (an external USB drive) is switched on and mounted, and the Duplicati Docker container is started
0:31 - 0:35: several “real” backup jobs are scheduled in Duplicati
(they may take longer, which is no problem because Duplicati does not run them in parallel anyhow)
0:40: the “last” backup job is scheduled (it may be started later, which is OK as long as it comes last). After completion it triggers a script on the host that shuts down the Duplicati Docker container and unmounts / switches off the backup destination drive (see the host-side sketch below).
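Here is that host-side sketch. The mount point, container name, and script paths are assumptions made for the example (not taken from the actual setup), and the mechanism by which the “last” job reaches the host to trigger the stop script is left out.

# /etc/cron.d/duplicati-window (assumed name): open the backup window at 0:30
30 0 * * * root /usr/local/bin/backup-window-start.sh

# /usr/local/bin/backup-window-start.sh (assumed path)
#!/bin/sh
mount /mnt/backup        # external USB backup drive, fstab entry assumed
docker start duplicati   # container name assumed

# /usr/local/bin/backup-window-stop.sh
# (what the "last" job's run-script-after ultimately triggers on the host)
#!/bin/sh
docker stop duplicati
umount /mnt/backup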
The approaches I have already discarded:
Put everything into one backup job: I want different retention policies for different source folders. AFAIK it is not possible to have different retention settings in the same backup job.
Trigger the jobs via CLI: I like the GUI for defining the jobs and seeing the logs.
Does this really mean that several permanently daily-scheduled jobs run after the cron start, but system downtime causes missed schedules, and some missed jobs get in first?
That was an earlier theory; however, it’s not certain. So you’re sure yours is looking random?
They are not mutually exclusive. Duplicati 2.0 could start GUI jobs from the CLI using:
ServerUtil is for Duplicati 2.1, and has more in it than the documentation mentions:
Commands:
pause <duration> Pauses the server []
resume Resumes the server
list-backups List all backups
run <backup> Runs a backup
login Logs in to the server
change-password <new-password> Changes the server password
logout Logs out of the server
import <file> <passphrase> Import a backup configuration
export <backups> Export a backup configuration
health Checks the server health endpoint
issue-forever-token Issues a long-lived access token
status Gets the server status
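As a minimal usage example (the job name “Documents” is a placeholder; depending on the version you may have to log in first, or pass the job’s ID instead of its name):

duplicati-server-util list-backups       # list the jobs defined in the GUI
duplicati-server-util run "Documents"    # queue the job named "Documents"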
One approach (if the “missed” run order is truly random, which maybe a dev can resolve) would be to have cron start Duplicati as it does currently, and either run the first backup or let it be permanently scheduled. The first job has a run-script-after that runs the second, and so on. Force a chain.
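A sketch of what one link of such a chain could look like, assuming the run-script-after scripts run inside the container, that duplicati-server-util is available and already authenticated there, and that the job names are placeholders:

#!/bin/sh
# run-script-after of "job 1" (sketch): queue the next job in the chain.
duplicati-server-util run "job 2"
# "job 2" would queue "job 3" the same way, and the final job's script
# would instead trigger the shutdown/unmount on the host.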
Does this really mean that several permanently daily-scheduled jobs run after the cron start, but system downtime causes missed schedules, and some missed jobs get in first?
All jobs are missed during the downtime.
So you’re sure yours is looking random?
Let’s say they are not run in order of the schedule times.
I would expect the job with the most “overtime” to be run first, which does not happen.
Of course, but I’m trying to understand your configuration. Is it a daily schedule, meaning when cron starts Duplicati at :30, those missed jobs manage to run before :31 - :35 jobs?
OK, so the old theory of running in creation order (top-to-bottom on screen) might still be correct, and if so you can control the run order if you’re willing to export/import the last job so that it sits at the bottom of the screen.
New Duplicati Canary test versions have a sorting feature that will throw people off if they have found that screen order is the run order of missed jobs, but that theory may or may not be true.
The sort order is display only; it has no effect on the scheduled order.
The scheduler system in Duplicati is not really ready for anything other than simple requests.
If you need full control, you can use duplicati-server-util to start the backups when you need them to run. Configure all backups without a schedule, and then externally (via cron or similar) invoke duplicati-server-util to run the backups. The scheduler in Duplicati is a queue, so you will get stable scheduling if you call them back-to-back.
There is a --wait option, but it is currently a bit broken, so you need to use the list-backups command to see if any backups are running when you wait for the shutdown.
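A rough sketch of that external approach, assuming the jobs have no schedule, duplicati-server-util is on the PATH inside the container, authentication has already been handled, and the job and container names are placeholders. Since --wait is unreliable, the final shutdown step would still need to confirm (via list-backups, or the last job’s own run-script-after) that nothing is running:

#!/bin/sh
# Called by cron after the drive is mounted and the container is started.
# Queue the jobs back-to-back; the server works through its queue one at a time.
docker exec duplicati duplicati-server-util run "documents"
docker exec duplicati duplicati-server-util run "photos"
docker exec duplicati duplicati-server-util run "last-unmount"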
If this means a FIFO (first in first out) queue in order of ServerUtil run, I’m not seeing that.
Make a job called test A that waits a minute and backs up A.txt. Make a similar job for test B.
Run a third, somewhat long job to allow time to queue both test jobs, requested as B then A.
The result, per the logs, is that when the long ID 1 job finishes, test A (ID 5) runs, then test B (ID 6) runs.
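For reference, one way to build such a “waits a minute” test job is a run-script-before that simply sleeps; this is an assumption about how the test jobs could be set up, not a record of the actual setup:

#!/bin/sh
# run-script-before for the test jobs (sketch): delay the backup by a minute
sleep 60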
If “stable scheduling” means something other than “run” order, what order is queue run in?
The results here could be explained as BackupId order, but I don’t know if that’s by design.
If it’s by design (as opposed to an SQLite implementation choice), might it apply to the missed job runs as well?
I only looked at the queue itself, which is very much FIFO based.
But, it turns out that the API to run a backup explicitly asks to override the queue, so it will insert the new task at the start of the queue.
I think the reason for this was that if the user requests to run a backup task, it should be more important than the scheduled ones. It does, however, prevent reliable scheduling from an external tool, and ServerUtil uses the API without any special considerations.
I have a fix ready for this, where ServerUtil will be able to ask for the next spot, but by default it will just append to the queue.