How to recover from aborted/stuck/crashed backup run?

It’s good to hear that the huge unexplained gap isn’t common. I’m not enough of an expert developer to guess where your hours of delay came from, but if it comes back, you now have new things you can check.

The forum is not a good issue tracker, but you can certainly check whether it’s already in GitHub Issues. If not, you can open one.
The way PRAGMA optimize is documented, it sounds like it’s very fast when it doesn’t actually need to run ANALYZE. Running it at each database open might therefore be cheap, although there’s still a question of how much good optimize actually does.
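As a rough illustration of that idea, here’s a minimal sketch of running PRAGMA optimize right after opening the database. It uses Microsoft.Data.Sqlite and a made-up file name, so it isn’t necessarily how (or where) the real code would do it:

```csharp
using Microsoft.Data.Sqlite;

// Hypothetical: run PRAGMA optimize immediately after opening the database.
// SQLite documents this as cheap when no ANALYZE turns out to be needed.
using var connection = new SqliteConnection("Data Source=backup.sqlite");
connection.Open();

using var cmd = connection.CreateCommand();
cmd.CommandText = "PRAGMA optimize;";
cmd.ExecuteNonQuery();
```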

The change makes some messages that used to go only into the hard-to-find server log also go to the usual job log; however, I think job log entries can still be lost in a transaction rollback. The exact commit points are not documented.
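Here’s a hypothetical sketch of how that loss could happen if log rows are written inside the same transaction as the backup work (the table and column names are made up; I haven’t checked where the real code writes its log entries):

```csharp
using Microsoft.Data.Sqlite;

// Hypothetical: a log row written inside the working transaction disappears
// if that transaction is later rolled back.
using var connection = new SqliteConnection("Data Source=backup.sqlite");
connection.Open();
using var tx = connection.BeginTransaction();

var log = connection.CreateCommand();
log.Transaction = tx;
log.CommandText = "INSERT INTO JobLog (Message) VALUES ('halfway done');";
log.ExecuteNonQuery();

// ... backup work fails somewhere after this point ...
tx.Rollback();   // the log row above is discarded along with the failed work
```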

SQLite only allows one writer at a time, I think, although I’m not sure when it’s enforced. Maybe at connect?
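For what it’s worth, my understanding (worth double-checking) is that the write lock is taken when a connection actually starts writing, not at connect time. A hypothetical sketch, with a made-up Example table:

```csharp
using System;
using Microsoft.Data.Sqlite;

// Hypothetical: both connections open without conflict; the lock only bites
// when a second connection tries to write while the first holds a write.
using var a = new SqliteConnection("Data Source=backup.sqlite");
using var b = new SqliteConnection("Data Source=backup.sqlite");
a.Open();
b.Open();                      // no conflict at connect time

using var txA = a.BeginTransaction();
var writeA = a.CreateCommand();
writeA.Transaction = txA;
writeA.CommandText = "INSERT INTO Example (Value) VALUES (1);";
writeA.ExecuteNonQuery();      // connection A now holds the write lock

var writeB = b.CreateCommand();
writeB.CommandText = "INSERT INTO Example (Value) VALUES (2);";
try
{
    // Connection B waits for its busy timeout, then fails if A still holds the lock.
    writeB.ExecuteNonQuery();
}
catch (SqliteException ex)
{
    Console.WriteLine(ex.Message);   // typically "database is locked"
}

txA.Commit();
```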

Also, wouldn’t a commit apply to everything pending in that transaction, including other changes which aren’t ready to commit yet?

This is something that currently worries me. There’s lots of concurrency, and each part commits whenever it feels the need. What happens when a good time for part A is a bad time for part B? I’m still looking at how the design handles that; there’s a sketch of the concern below.
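Purely as a hypothetical sketch (made-up Work table, one shared connection, which may or may not match the real design), this is the shape of the worry: part A’s commit also commits whatever part B has written so far, ready or not.

```csharp
using Microsoft.Data.Sqlite;

// Hypothetical: parts A and B share one connection and transaction.
using var connection = new SqliteConnection("Data Source=backup.sqlite");
connection.Open();
using var tx = connection.BeginTransaction();

var partB = connection.CreateCommand();
partB.Transaction = tx;
partB.CommandText = "INSERT INTO Work (Step) VALUES ('B: half finished');";
partB.ExecuteNonQuery();       // part B is not done yet

var partA = connection.CreateCommand();
partA.Transaction = tx;
partA.CommandText = "INSERT INTO Work (Step) VALUES ('A: finished');";
partA.ExecuteNonQuery();

tx.Commit();                   // a good time for A, but B's partial work lands on disk too
```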

The status bar is already too crowded for what it tries to show, so it has to cycle through a variable display sequence. Posting specific internal details such as SQL seems far too much to show, and unlikely to be comprehensible.

This is where I thought the details could be hidden. Your idea of a combination approach might be workable, but it would be up to some developer (and they are scarce) to figure out what could actually be implemented…

shows some early thinking, although the bar of “a moment” or “a second” seems far too sensitive for large backups. The idea of showing “what it’s currently doing” is also dated, since 2.0.3.6_canary_2018-04-23 went concurrent (there isn’t a single current activity anymore).

I’m still wishing there were a design document on concurrency, with a transaction design that fits in properly.

Lacking that, I tried expanding my thinking from the channel pipeline topic into some pipeline charts. The main one might be:

FileEnumerationProcess.cs → SourcePaths → MetadataPreProcess.cs → ProcessedFiles → FilePreFilterProcess.cs → AcceptedChangedFile → FileBlockProcessor.cs → StreamBlock → StreamBlockSplitter.cs → OutputBlocks → DataBlockProcessor.cs → BackendRequest

and several of those stages (not yet charted) are doing database actions and committing as they see fit.
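To make the chart above a bit more concrete, here’s a hypothetical sketch in the same spirit, using System.Threading.Channels rather than the project’s actual channel library, with only the first two stages and made-up database comments:

```csharp
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical sketch (not the real concurrent backup code) of stages handing
// work to each other over named channels, while some stages also touch the
// database and commit on their own schedule.
class PipelineSketch
{
    static async Task Main()
    {
        var sourcePaths    = Channel.CreateUnbounded<string>();
        var processedFiles = Channel.CreateUnbounded<string>();

        // FileEnumerationProcess: produce paths into SourcePaths.
        var enumerate = Task.Run(async () =>
        {
            foreach (var path in new[] { "/data/a.txt", "/data/b.txt" })
                await sourcePaths.Writer.WriteAsync(path);
            sourcePaths.Writer.Complete();
        });

        // MetadataPreProcess: read SourcePaths, write ProcessedFiles, and
        // (in the real program) do database work, committing as seen fit.
        var metadata = Task.Run(async () =>
        {
            await foreach (var path in sourcePaths.Reader.ReadAllAsync())
            {
                // ... look up or insert metadata here ...
                await processedFiles.Writer.WriteAsync(path);
            }
            processedFiles.Writer.Complete();
        });

        // Later stages (FilePreFilterProcess, FileBlockProcessor, ...) would
        // continue the chain in the same style.
        await foreach (var file in processedFiles.Reader.ReadAllAsync())
            System.Console.WriteLine($"downstream stage got {file}");

        await Task.WhenAll(enumerate, metadata);
    }
}
```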