Initial backup should be able to be repaired

How do I back up 0.5 TB without interruptions? I'm failing at about 30% each time.

My usual technique is to start with a very small subset of the files you eventually intend to include in the backup set, and let it back up fully… then select more, let that run, then more. I have about 800GB backed up using this technique successfully.
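For anyone who wants to automate that, here is a minimal sketch of the grow-the-set-gradually approach, assuming duplicati-cli is on PATH; the destination URL, folder names, and passphrase are placeholders (a real remote such as B2 would need its own credential options):

```python
#!/usr/bin/env python3
"""Rough sketch of the "grow the backup set gradually" technique.

Assumptions: duplicati-cli is on PATH; the destination URL, passphrase,
and source folders below are placeholders, not from the original post.
"""
import subprocess

DEST = "file:///mnt/backup-target"      # placeholder destination URL
PASSPHRASE = "example-passphrase"       # placeholder

# Order the source folders roughly from smallest to largest, so each run
# only adds a modest amount of new data on top of what already uploaded.
stages = [
    ["/home/user/Documents"],
    ["/home/user/Documents", "/home/user/Pictures"],
    ["/home/user/Documents", "/home/user/Pictures", "/home/user/Videos"],
]

for sources in stages:
    # Each run re-scans everything, but only uploads blocks not already present.
    cmd = ["duplicati-cli", "backup", DEST, *sources,
           f"--passphrase={PASSPHRASE}"]
    subprocess.run(cmd, check=True)
```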

Thanks! Your method is working well; I finally got some backups running. But it's not good for new users.

I agree. But Duplicati will need to get much better at doing a few things first:

  • shutting down a backup job when requested
  • cleaning up after itself afterwards
  • forcing a backup job to end without breaking stuff

In my experience, these are major tripping points and I’ve unfortunately learned that I need to avoid them altogether.

That’s true; that claim should probably be removed from the front page:

E.g. interrupted backups can be resumed and Duplicati tests the content of backups regularly

I think there’s an additional point here about recreating a database against a backup with no dlists in it yet.

Canceling backups after the first completion usually works fine for me, but if no complete backup exists yet, Duplicati throws an error where perhaps it should just throw a warning and then continue backing up with the existing data.

Is this behavior and its messages written up anywhere? I assume the issue isn’t on cancel but on restart. Ordinarily an interrupted backup is supposed to pick up where it left off, but maybe the initial backup is different…

How does Duplicati Resume After Interruption?
Irritating error message

These two say what’s supposed to happen, and the one below proposes why it does not:

Warning Message after Backup describes a suspected bug with improper parameters fed in by the code.

Despite the synthetic filelist failure, resume does often work. People sometimes confuse the file scanning on restart with a full re-upload…

This is all somewhat past the original topic, which asked for the initial backup to be repairable but didn’t discuss why a repair was necessary or why the interruption happened. Ideally, interruption should only come from things like having to take a laptop and go, and the backup should just pick up where it left off when the next run starts.

It might be a good idea to explore the use case I ran into recently.

My original (~800GB) backup job went haywire after I accidentally included a 200GB uncompressed video file, and I only noticed it halfway through. I had to do a forced shutdown, which worked except that Duplicati kept the local database locked until I shut it down and restarted it. After that it started reporting mismatch errors, as I excruciatingly detailed in my other recent thread.

After that I decided to break up my big backup job into smaller components (pictures, music, videos, documents, etc.). During the initial run of the Videos backup (~200GB at first), my connection speed to B2 plummeted, for reasons perhaps unrelated to Duplicati. I wanted to stop the process and try something else, so I tried “stop after current upload”, which never worked (as reported elsewhere). So then I had to do a forced stop again (“stop immediately”), which not only messes everything up but also leaves all the temporary files sitting in the temp directory. So basically I had to re-do the first 50GB or so of that backup job, after pruning the total backup set down to a single file and letting it run through once, then gradually adding back the other pieces.

Resume after “Stop now” has been working reasonably well here in current testing, and seemingly picks up where it left off. It does not try to create a synthetic filelist, but that might be because no new file completed. Basically, the first backup was a one-byte file, then I added an ISO of over 1 GB to the source area. The destination is local disk, but throttled both up and down to 100 KBytes/second, so roughly an 800 Kbit/s line.
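In case anyone wants to reproduce a similar throttled test, here is a rough sketch of the setup, assuming duplicati-cli is on PATH and that the throttle-upload/throttle-download options accept “100KB”-style size strings; all paths are placeholders:

```python
#!/usr/bin/env python3
"""Sketch of the throttled local-disk test setup described above.

Assumptions: duplicati-cli is on PATH; destination/source paths are
placeholders; the throttle options accept "100KB"-style size strings.
"""
import subprocess

cmd = [
    "duplicati-cli", "backup",
    "file:///mnt/backup-target",      # local-disk destination (placeholder)
    "/home/user/test-source",         # source folder holding the 1 GB+ ISO (placeholder)
    "--throttle-upload=100KB",        # slow enough to interrupt mid-dblock
    "--throttle-download=100KB",
    "--no-encryption=true",           # keep the test simple
]
subprocess.run(cmd, check=True)
```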

Oddities included:

“Stop now” acts in the UI like it stops, BUT the upload keeps happening. The job DB is held open until the upload is done, meaning an attempt to run the job again before the invisible transfer finishes will complain. Waiting until it is done solves that. To confirm whether a file is held open, one can search for the file’s name in Sysinternals Process Explorer.
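If Process Explorer isn’t available, a rough equivalent of that handle search can be sketched with the psutil package; the database path below is a placeholder, and elevated privileges may be needed to see other processes’ open files:

```python
#!/usr/bin/env python3
"""Rough equivalent of the Process Explorer handle search: find which
process still has a given Duplicati job database open.

Assumptions: the psutil package is installed; DB_PATH is a placeholder.
"""
import psutil

DB_PATH = r"C:\Users\user\AppData\Local\Duplicati\ABCDEFGHIJ.sqlite"  # placeholder

for proc in psutil.process_iter(["pid", "name"]):
    try:
        for f in proc.open_files():
            # Case-insensitive compare, since Windows paths may differ in case.
            if f.path.lower() == DB_PATH.lower():
                print(f"{proc.info['name']} (pid {proc.info['pid']}) has the database open")
    except (psutil.AccessDenied, psutil.NoSuchProcess):
        continue
```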

Annoyingly, there’s nothing even in a Profiling-level log to reveal that the upload is still invisibly continuing… Windows File Explorer shows it, though it sometimes needs some coaxing, such as a right-click and the Properties tab. For additional amusement, one can run Sysinternals Process Monitor and watch Duplicati copying the file.

It leaves, by default, four 50 MB dblock files (names beginning with dup-) in Windows’ temporary file area. The fix discussed in Disk full in /tmp, dup-xxxx files not getting cleaned up might resolve that. If anyone tries the next canary, please test.
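As a side note, those leftovers are easy to list (or clean up) with a small script. Only the dup- prefix comes from the observation above; the one-day age threshold and the dry-run behaviour are arbitrary choices here:

```python
#!/usr/bin/env python3
"""Sketch for spotting (and optionally deleting) leftover Duplicati temp
files after a forced stop. Only the "dup-" prefix comes from the thread;
the age threshold and dry-run default are arbitrary choices.
"""
import tempfile
import time
from pathlib import Path

temp_dir = Path(tempfile.gettempdir())
cutoff = time.time() - 24 * 3600          # only treat files older than a day as stale

for f in temp_dir.glob("dup-*"):
    stale = f.stat().st_mtime < cutoff
    size_mb = f.stat().st_size / (1024 * 1024)
    print(f"{f}  {size_mb:.1f} MB  stale={stale}")
    # Uncomment to actually remove stale files; make sure no backup is running first.
    # if stale:
    #     f.unlink()
```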

A “Stop after upload” got ignored as widely noted, but a subsequent “Stop now” in the same run took effect.

Correct behaviors:

It seems to resume on the same file the next time, clean up the interrupted remote dblock (a wasted upload), and continue on to the next dblock (which, when done, is a different size, so it wouldn’t simply be a restart of the interrupted upload).

This is a backup I’m interrupting, during the uploading dblock and dindex files phase for the large added file.

Watching the database in a viewer, there are always four queued-in-temp dblock files in the Uploading state, rotating through temp as they complete. The interrupted file stays in Uploading even after completion. When the backup resumes, Duplicati checks remote file states against its database as part of its clean-up work.
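For anyone who would rather skip the GUI viewer, here is a sketch of the same check with sqlite3. It assumes the job database has a Remotevolume table with Name, Type, State, and Size columns (please verify against your own database), and it should be run read-only while no backup is active:

```python
#!/usr/bin/env python3
"""Sketch: list remote volumes stuck in the Uploading state.

Assumptions: the job database path is a placeholder, and the Remotevolume
table with Name/Type/State/Size columns matches your database's schema.
"""
import sqlite3

DB_PATH = "/path/to/duplicati-job-database.sqlite"   # placeholder

conn = sqlite3.connect(f"file:{DB_PATH}?mode=ro", uri=True)   # read-only open
try:
    rows = conn.execute(
        "SELECT Name, Type, State, Size FROM Remotevolume "
        "WHERE State = 'Uploading' ORDER BY Name"
    ).fetchall()
finally:
    conn.close()

for name, vtype, state, size in rows:
    print(f"{name}  {vtype}  {state}  {size or 0} bytes")
```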

I’m testing the 2.0.4.5 beta and may need a more complex (but not impossibly so) test, if anyone has one… Ideally, a good, as-simple-as-possible test case (once discovered) should be filed as a bug in Issues. Support posts on the forum flow through so fast that they’re not a good place to queue failure cases for repair.

I’d also still like to hear whether the original poster @EugeneBos has any comments on this wandering topic.