Backup timeout?

Hi All,

I’m using Duplicati - 2.0.9.111_canary_2024-11-14 (on Win11) with 6 backup jobs.

By chance I discovered that one of the backup jobs had been stuck for over 3 days (waiting for an upload to finish). In the meantime no backup was created, and either I was not notified about it or I overlooked it. The icon in the taskbar was still green…

My questions/suggestions:

  • Is a timeout set automatically for backup jobs?
  • Wouldn’t it make sense to run the backup processes independently of each other? That way Backup-B could still start even if Backup-A is stuck…
  • If a scheduled backup job could not be started, a permanent (or system-modal) notification should be shown that cannot simply be clicked away.

Regards,
Joe

Hi Joe, thanks for posting your thoughts!

There are some backends that do not fully honor the timeout settings. Which backend was stuck here?

That is a general problem with detecting omissions: Duplicati can only report that a backup has run, not that it failed to run. This is one of the motivations for creating the Duplicati Console.

There is no overall timeout, because there is no way to pick a sensible value; the backup size could change from 1 GiB to 1 TiB between runs, for example.

The current logic is that the backends are responsible for handling stalls by detecting periods with no transfer progress. This strategy is not fully implemented yet; it will later be extended so that stalls are monitored and handled by the upload manager.
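
Roughly, that kind of stall detection amounts to something like this sketch (the names and the abort hook are hypothetical, not Duplicati’s actual code): every progress callback resets a timestamp, and a background thread aborts the transfer if no bytes have moved within the window.

```python
import threading
import time

class StallDetector:
    """Aborts a transfer when no progress is reported within `window` seconds."""

    def __init__(self, abort, window=300.0):
        self._abort = abort              # hypothetical hook that cancels the transfer
        self._window = window
        self._last_progress = time.monotonic()
        self._done = threading.Event()
        threading.Thread(target=self._watch, daemon=True).start()

    def on_progress(self, bytes_moved):
        # Called by the backend whenever bytes move; any progress resets the clock.
        if bytes_moved > 0:
            self._last_progress = time.monotonic()

    def finish(self):
        # Called when the transfer completes normally.
        self._done.set()

    def _watch(self):
        # Poll every 10s; if the window elapses with no progress, treat as stalled.
        while not self._done.wait(timeout=10.0):
            if time.monotonic() - self._last_progress > self._window:
                self._abort()
                return
```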

The only thing that prevents that is that the UI is not designed to handle multiple running backups. For that reason, the server component queues operations and runs only one backup at a time.
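
Conceptually, the server behaves like a single worker draining an operation queue; a minimal Python sketch of the idea (illustrative only, not Duplicati’s actual implementation):

```python
import queue
import threading

# Operations can be queued at any time, but a single worker
# thread runs them one at a time.
operations = queue.Queue()

def worker():
    while True:
        run_operation = operations.get()   # blocks until something is queued
        run_operation()                    # only one operation runs at a time
        operations.task_done()

threading.Thread(target=worker, daemon=True).start()

# Example: both jobs are accepted immediately, but run sequentially.
operations.put(lambda: print("running Backup-A"))
operations.put(lambda: print("running Backup-B"))
operations.join()
```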

We do plan to introduce separated processes for each operation, and with that in place it would be trivial to run multiple backups in parallel.

The issue here is that the backup has “started” as far as Duplicati is concerned (because it has been queued). Since we do not know how long a backup should take, it is not an error for a backup to sit in the queue for days.

That said, I understand the problem, and I think the fix is to ensure that backups properly handle upload stalls, which points back to fixing the backend.

We could add a feature that sends a report of some kind if a backup has been in the queue for more than some predefined time, but there is a good chance it would produce false positives. If you would like that feature, feel free to file it on GitHub.

Thank you for the fast reply.

Although I agree that the server can also tell whether or not a job is stuck, a watchdog process running in parallel on the client could do it too, and could be a solution here.

Let’s say we have (at least) the following two processes:

  • backup (B)
  • watchdog (W)

The solution could look like this:

  • When a backup job starts, B sends a “job x started” message to W. Until W receives the “job x ended” message, W considers the job to be still running.
  • B regularly reports the transferred amount of job x to W (a T message).
  • W can easily recognize a stalled job on the client side if a T message is missing, or if the transferred amount is 0 in, say, 3 consecutive T messages.

This would work independently of the size of the backup and wouldn’t need the server’s cooperation.
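
A minimal Python sketch of W’s side of that protocol (the handler names, the T-message interval, and the threshold of 3 zero-progress messages are just assumptions for illustration):

```python
import time

STALL_ZERO_LIMIT = 3      # consecutive T messages without new bytes
T_INTERVAL = 60.0         # assumed seconds between T messages

class Watchdog:
    """W: tracks running jobs from B's messages and flags stalled ones."""

    def __init__(self):
        self.jobs = {}    # job id -> {"transferred", "zero_count", "last_msg"}

    def on_started(self, job_id):
        # "job x started": the job counts as running until "job x ended" arrives.
        self.jobs[job_id] = {"transferred": 0, "zero_count": 0,
                             "last_msg": time.monotonic()}

    def on_transfer(self, job_id, transferred_total):
        # T message: B reports the total transferred amount of job x.
        job = self.jobs[job_id]
        if transferred_total <= job["transferred"]:
            job["zero_count"] += 1          # no new bytes since the last T message
        else:
            job["zero_count"] = 0
            job["transferred"] = transferred_total
        job["last_msg"] = time.monotonic()

    def on_ended(self, job_id):
        self.jobs.pop(job_id, None)

    def stalled_jobs(self):
        now = time.monotonic()
        return [jid for jid, job in self.jobs.items()
                if job["zero_count"] >= STALL_ZERO_LIMIT      # 3 zero-progress T messages
                or now - job["last_msg"] > 2 * T_INTERVAL]    # T message missing
```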

Super!

See above.

Regards,

That would work, but it does require that the server emits progress in some form. My short-term solution is to check upload/download progress to detect stalled transfers, which is very similar to what you describe but happens in-process.

We could add the same logic for the overall backup progress, checking that either the number of processed bytes or the number of transferred bytes increases every 5 minutes or so. Some backends do not support reporting transfer progress, so those would need special handling.
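
As a sketch of what that overall check could look like (all four callables are hypothetical hooks, not Duplicati’s real API):

```python
import time

CHECK_INTERVAL = 300.0  # check overall progress every 5 minutes

def monitor_backup(get_processed_bytes, get_transferred_bytes, is_running, on_stall):
    # Hypothetical hooks: two byte counters, a liveness check, and a stall handler.
    last = (get_processed_bytes(), get_transferred_bytes())
    while is_running():
        time.sleep(CHECK_INTERVAL)
        current = (get_processed_bytes(), get_transferred_bytes())
        if current == last:   # neither counter increased: treat the backup as stalled
            on_stall()
            return
        last = current
```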