Database recreate logs claim completion, but subsequent backups fail claiming the opposite

Environment

Windows 10 device running Canary (currently 2.0.4.19_canary_2019-06-17) backing up to an SMB share on FreeBSD 12

Goal

Fast, reliable, nightly incremental workstation backups to my fileserver

Background

I had a backup job I was running through 2018 which started complaining about missing files and failing to run. I troubleshot briefly with unsucessful DB repairs, then gave up and created a new backup job. The new backup recently started failing. Database repairs did not fix the issue, so I started a DB recreate, logging to a flatfile:

### START OF RECREATE LOG
2019-06-18 02:52:57 +02 - [Information-Duplicati.Library.Main.Controller-StartingOperation]: The operation Repair has started
2019-06-18 02:52:58 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Started:  ()
2019-06-18 02:53:05 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Completed:  (626 bytes)
2019-06-18 02:53:29 +02 - [Information-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-RebuildStarted]: Rebuild database started, downloading 17 filelists
2019-06-18 02:53:29 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Started: duplicati-20180424T212056Z.dlist.zip.aes (7.12 MB)

###    44 HOURS OF RECREATE LOGS GO HERE (SNIP!)                      ###
###    (successful get-started/get-completed cycle every 6-8 minutes) ###

2019-06-19 23:33:15 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Started: duplicati-b397b1459dd634eeeadad75fdac5485a1.dblock.zip.aes (249.97 MB)
2019-06-19 23:33:15 +02 - [Warning-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-FailedRebuildingWithFile]: Failed to use information from duplicati-b562363822688452c9cd251a89bce211b.dblock.zip.aes to rebuild database: Could not find file 'B:\IT\backups\duplicatia\duplicati-b562363822688452c9cd251a89bce211b.dblock.zip.aes'.
System.IO.FileNotFoundException: Could not find file 'B:\IT\backups\duplicatia\duplicati-b562363822688452c9cd251a89bce211b.dblock.zip.aes'.
File name: 'B:\IT\backups\duplicatia\duplicati-b562363822688452c9cd251a89bce211b.dblock.zip.aes'
   at Duplicati.Library.Main.AsyncDownloader.AsyncDownloaderEnumerator.AsyncDownloadedFile.get_TempFile()
   at Duplicati.Library.Main.Operation.RecreateDatabaseHandler.DoRun(LocalDatabase dbparent, Boolean updating, IFilter filter, NumberedFilterFilelistDelegate filelistfilter, BlockVolumePostProcessor blockprocessor)
2019-06-19 23:33:15 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Retrying: duplicati-b397b1459dd634eeeadad75fdac5485a1.dblock.zip.aes (249.97 MB)
2019-06-19 23:33:25 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Started: duplicati-b397b1459dd634eeeadad75fdac5485a1.dblock.zip.aes (249.97 MB)
2019-06-19 23:33:25 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Retrying: duplicati-b397b1459dd634eeeadad75fdac5485a1.dblock.zip.aes (249.97 MB)
2019-06-19 23:33:35 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Started: duplicati-b397b1459dd634eeeadad75fdac5485a1.dblock.zip.aes (249.97 MB)
2019-06-19 23:33:35 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Retrying: duplicati-b397b1459dd634eeeadad75fdac5485a1.dblock.zip.aes (249.97 MB)
2019-06-19 23:33:46 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Started: duplicati-b397b1459dd634eeeadad75fdac5485a1.dblock.zip.aes (249.97 MB)
2019-06-19 23:33:46 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Retrying: duplicati-b397b1459dd634eeeadad75fdac5485a1.dblock.zip.aes (249.97 MB)
2019-06-19 23:33:56 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Started: duplicati-b397b1459dd634eeeadad75fdac5485a1.dblock.zip.aes (249.97 MB)
2019-06-19 23:33:56 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Failed: duplicati-b397b1459dd634eeeadad75fdac5485a1.dblock.zip.aes (249.97 MB)
2019-06-19 23:33:56 +02 - [Warning-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-FailedRebuildingWithFile]: Failed to use information from duplicati-b397b1459dd634eeeadad75fdac5485a1.dblock.zip.aes to rebuild database: Could not find file 'B:\IT\backups\duplicatia\duplicati-b397b1459dd634eeeadad75fdac5485a1.dblock.zip.aes'.
System.IO.FileNotFoundException: Could not find file 'B:\IT\backups\duplicatia\duplicati-b397b1459dd634eeeadad75fdac5485a1.dblock.zip.aes'.
File name: 'B:\IT\backups\duplicatia\duplicati-b397b1459dd634eeeadad75fdac5485a1.dblock.zip.aes'
   at Duplicati.Library.Main.AsyncDownloader.AsyncDownloaderEnumerator.AsyncDownloadedFile.get_TempFile()
   at Duplicati.Library.Main.Operation.RecreateDatabaseHandler.DoRun(LocalDatabase dbparent, Boolean updating, IFilter filter, NumberedFilterFilelistDelegate filelistfilter, BlockVolumePostProcessor blockprocessor)
2019-06-19 23:34:16 +02 - [Warning-Duplicati.Library.Main.Database.LocalRecreateDatabase-MissingVolumesDetected]: Found 2 missing volumes; attempting to replace blocks from existing volumes
2019-06-19 23:34:23 +02 - [Information-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-RecreateCompletedCheckingDatabase]: Recreate completed, verifying the database consistency
2019-06-19 23:39:41 +02 - [Information-Duplicati.Library.Main.Controller-StartingOperation]: The operation Backup has started

The last two logs, from my reading, seem to show that

  • The recreate completed (although nothing further is provided verifying consistency)
  • The backup job started shortly after completion of the recreation (due to a missed scheduled job, i reckon). This job threw an error, which additional attempts to run the job also throw :
     "The database was attempted repaired, but the repair did not complete.This database may be incomplete and the backup process cannot continue. You may delete the local database and attempt to repair it again.

Questions

  1. Is it worthwhile to attempt the recreate again, rather than just start a new backup job?
  2. Are there any settings (duplicati, client OS, fileserver OS) I can leverage to make the backups more robust towards acheiving the goal above?
  3. Are there any version gotchas that I have missed that could be contributing to these failures?

Kind regards,

Hank

Hello @hangondoopi and welcom to the forum!

Nice writeup.

Have you looked on the SMB server to confirm that the mentioned files are really missing?
B:\IT\backups\duplicatia\duplicati-b562363822688452c9cd251a89bce211b.dblock.zip.aes
B:\IT\backups\duplicatia\duplicati-b397b1459dd634eeeadad75fdac5485a1.dblock.zip.aes

If they’re actually missing, you can run the affected command, e.g. from Commandline, to find out what source files would be affected by that file being missing. I think a list-broken-files and purge-broken-files could make things consistent again, however truly missing dblocks mean that some file data isn’t there.

Inventory of files that are going to be corrupted is a recovery example from (intentional) dblock damage.

If, on the other hand, dblock files are there (but not seen, for some reason), Recreate again might work.

Robustness in getting uploads done can sometimes be helped with a larger –number-of-retries, but I’m unsure whether it would fit here where Duplicati thinks it’s writing local files. It depends on SMB erroring.

There have been suggestions that backup-interrupting things like reboots and sleeps may cause issues, however I don’t think there’s been a systematic study, and I know in some cases it’s awkward to achieve.

Keeping a log file running at all times (if you can spare the space) would help answer questions about the prior history of files, when things go wrong. It’s actually in the database as well, but you’d need a browser such as DB Browser for SQLite (a.k.a. sqlitebrowser) to see RemoteOperation and RemoteVolume table.

Breakages in versions (especially in canary) can happen, but it’s unknown when your two dblocks got lost. There’s a file listing done by default before and after the backup, which seems like it “should” spot issues. Opening the database (assuming you saved a DB that showed the original problem) might be very useful.

There is a bit of nervousness about reliability and good error handling in the new parallel uploader. It was dropping index files in 2.0.4.16 canary. This was fixed in 2.0.4.17 canary. I do get nervous looking at logs, seeing things like dlist files going up before the associated dblock file goes up. That didn’t happen before.

EDIT:
Although I’m MORE worried, if Duplicati is actually dropping files, your Recreate message might be here:

and I guess there are different RecreateCompleted stages. Yours did say it needed to do some verifying.
The reason backup thinks repair did not complete appears to be somehow it didn’t get as far as line 523. Someone who’s familiar with the code might be able to work out how. I don’t know the area all that well…

EDIT 2:

purge-broken-files (probably best run after a list-broken-files) will clear the “repair did not complete” flag, although that’s just a symptom of other things in the chain that seem like they shouldn’t have happened: