Things that make you go, "Uh-oh."

Maybe it’s just me, but this seems really wrong.

david@david-desktop-mate:/bkups/duplicati$ ls | grep dblock | wc -l
32355
david@david-desktop-mate:/bkups/duplicati$ ls | grep dindex | wc -l
32356
david@david-desktop-mate:/bkups/duplicati$ ls | grep dlist | wc -l
19
david@david-desktop-mate:/bkups/duplicati$ 

Any suggestions on how to correct this without starting over?

Not that starting over would be the WORST thing - I could finally fix the blocksize for this monster

ETA: All my other backups match

Did you try Repair? Do you have log-file logs? Has there been a DB recreate (if you know)? Was it slow?

Maybe you have a dindex file that’s irrelevant, redundant, or wrong, so it should be deleted. But which one is it?

DB recreate might spot it and complain, but it can be slow, and there are other ways to look for an oddity.

A nice tool is DB Browser for SQLite, which might be in the Linux system package repo as sqlitebrowser.

The dindex file internally holds the name of the dblock that it indexes, but you’d have to open it up to see it.
Database repair ideally opens only the dlist and dindex files, and it raises an error if a dindex points to a missing dblock.
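
If you’d rather script that peek, here’s a minimal Python sketch, assuming the dindex has already been decrypted (or the backup is unencrypted) and using an example filename:

import zipfile

# example filename - point this at a real (decrypted) dindex archive
dindex_path = "duplicati-ib3189424484248dfa5bbdce8caa8fcda.dindex.zip"

# a dindex is a zip whose vol/<name> entries name the dblock file(s) it indexes
with zipfile.ZipFile(dindex_path) as z:
    for name in z.namelist():
        if name.startswith("vol/"):
            print("indexes:", name[len("vol/"):])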

Or if you prefer a different way to get a lighter look at the database from a Linux shell, the upload-verification-file option writes duplicati-verification.json, which can give it. What’s missing there is the IndexBlockLink information to find what’s unused.
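
For a rough count from that file, here’s a small Python sketch; it deliberately avoids assuming the exact JSON layout and just scans the raw text for the three file types:

import re
from collections import Counter

# count how many dlist/dblock/dindex names appear in the verification file
with open("duplicati-verification.json", encoding="utf-8") as f:
    text = f.read()

print(Counter(re.findall(r"\.(dlist|dblock|dindex)\.", text)))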

I’m not sure if an unused dindex is guaranteed to be the one to delete though. It seems a good guess, but what happens on a DB recreate if two dindex files both claim to index the same dblock is unknown to me.

A third option is to delete all the dindex files and press the Repair button to see what Duplicati builds.
Ordinarily it can replace missing dlist and dindex files, but its records might be a little incorrect now.

It’s not clear humans can do better, but at least it’s some learning, especially if logs can be checked.
Ideally we would want to identify how things got out of line, so that something can be done to stop it.

Interesting, you prompted me to check my backup sets. I have a few where the dindex count doesn’t match the dblock count, as well.

I just tested this on one of my backups with a mismatch (100 dblocks and 102 dindexes): I deleted the dindex files and let Repair regenerate them. Duplicati rebuilt 102 dindexes, so there’s still a mismatch.

Thanks for the test (which may knock out option 3). Are you or @Kahomono up for a database look?
Posting a database bug report for others (maybe me) to examine is another option, but there’s less learning in it than the DIY approach.

Yeah I might be able to generate a database bug report.

For what it’s worth, I did a full database recreate, which processed all 102 dindex files without error. I then repeated the earlier test just for kicks, where I deleted the dindex files and regenerated them. No change.

I don’t know if people organize their backup data like I do (all backups stored in the same main location, with backup sets in individual subfolders). But if so, here’s a simple PowerShell script to show the file counts:

param($BackupName = '*')   # optional name filter; default matches every backup subfolder

$Path = '\\NAS\Backups\Duplicati'

# each subfolder under $Path holds one backup set
$Sets = Get-ChildItem -Path $Path -Directory -Filter $BackupName
ForEach ($Set in $Sets)
{
    $Files = Get-ChildItem -Path $Set.FullName

    # count the remote files of each Duplicati type by name
    [PSCustomObject]@{
        BackupName = $Set.Name
        dlist = ($Files | Where-Object { $_.Name -match 'dlist' }).Count
        dblock = ($Files | Where-Object { $_.Name -match 'dblock' }).Count
        dindex = ($Files | Where-Object { $_.Name -match 'dindex' }).Count
    }
}

Output looks like this:
[screenshot: a table of BackupName, dlist, dblock, and dindex counts for each backup set]

I’m looking at the database in DB Browser. The “IndexBlockLink” table has 100 entries.

In the “Remotevolume” table, when I filter by dindex I see 102 entries. Interestingly, 2 of them have a size of 541 bytes, which from memory is an empty dindex. I need to decrypt and open them to be sure.

Edit to add: confirmed, those two dindexes only have a “manifest” file inside and no “list” or “vol” folders.
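
In case it helps, a quick Python sketch to flag those in bulk, assuming the dindex files have been decrypted into the current folder:

import glob
import zipfile

# flag dindex archives holding only a manifest and no vol/ entries
for path in sorted(glob.glob("*.dindex.zip")):
    with zipfile.ZipFile(path) as z:
        names = z.namelist()
        if not any(n.startswith("vol/") for n in names):
            print("manifest-only:", path)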

Nice remembering. Since your NAS might be Linux, you might be able to fairly easily look inside them
to see how many manifest and vol entries you have relative to the number of input files. Ideally they all match…

$ unzip -l "*dindex.zip"
Archive:  duplicati-ib3189424484248dfa5bbdce8caa8fcda.dindex.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
      148  2021-02-08 17:27   manifest
      425  2021-02-08 17:27   vol/duplicati-b1c651688cf094802bb08d0593ab0fc81.dblock.zip
---------                     -------
      573                     2 files

...

but knowing the magic sizes is easier. I also just tried making some SQL based on source examples:

SELECT ID FROM Remotevolume WHERE Type = 'Index' AND State = 'Verified' AND ID NOT IN ( SELECT IndexVolumeID FROM IndexBlockLink )

run in the Execute SQL tab might lead you to the “extra” dindex. If that SQL is wrong, I can try again…
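
Or roughly the same idea from Python against the job’s local database (the path below is just a placeholder; the real path is shown on the job’s Database page):

import sqlite3

con = sqlite3.connect("/path/to/duplicati-job.sqlite")  # placeholder path
rows = con.execute("""
    SELECT ID, Name, Size FROM Remotevolume
    WHERE Type = 'Index' AND State = 'Verified'
      AND ID NOT IN (SELECT IndexVolumeID FROM IndexBlockLink)
""").fetchall()
for r in rows:
    print(r)   # dindex volumes with no IndexBlockLink row
con.close()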

I wonder how the bad files looked before they were recreated? Did you happen to save the old dindex?

EDIT:

Not looking at the code, there might be something that sees a dindex is missing (based on Remotevolume),
looks to see what it should contain, but comes up empty because it’s not hooked into IndexBlockLink.

The Block table records, for each block, the VolumeID of its dblock in the Remotevolume table. Code could do a reverse lookup, provided IndexBlockLink can map the IndexVolumeID to a BlockVolumeID, but maybe that entry is missing?
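
For the reverse direction, a rough sketch (same caveat about the placeholder path, and assuming I have the schema right) that lists dblock volumes no IndexBlockLink row points at:

import sqlite3

con = sqlite3.connect("/path/to/duplicati-job.sqlite")  # placeholder path
rows = con.execute("""
    SELECT ID, Name FROM Remotevolume
    WHERE Type = 'Blocks'
      AND ID NOT IN (SELECT BlockVolumeID FROM IndexBlockLink)
""").fetchall()
for r in rows:
    print(r)   # dblock volumes that no dindex claims to index
con.close()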

Yes, I kept a copy of the back-end files before my experiments (and also a copy of the database). I’ll try to do more digging later today and see if the dindex files had that same 541-byte file size before my testing.

Remember when we were time-testing large backups and performance fell RIGHT off the cliff at 500KB blocksize?

This is (the live version of) that same backup, and it was made at the current sad default 100KB.

I think I have put off starting this one over long enough.

I have six sets here and this was the only mismatch. Unfortunately it’s also the live version of the 8TB monster we were testing for blocksize effects last fall. I think it needs a full re-build.

One of my older backups has 188 dindex files (8 extra).
When I run the test command with all and --full-remote-verification=true, I don’t get any errors other than these Extra: entries.

I’m guessing that it’s only a problem when you have extra dblock files without a dindex. Should I reset this backup anyway?

This doesn’t seem to pan out, as the original dindex files weren’t 514 bytes; they were a bit larger. But when I deleted them and had Duplicati recreate them, it made 514-byte files (in my case). Not sure how consistent that number is.

Nice job, this worked and pointed to the same 2 files.

That’s all I had time to test today. Will try to dig deeper tomorrow…

I was using my previous SQL query again today, in Python, after doing kill testing on Duplicati, which was unfortunately accumulating some extra dindex files whose “vol” entry pointed to a dblock that was missing. That caused Recreate to error when I finally ran it. I put in a test to try to catch the problem earlier, but I’m pretty sure what it caught was the issue where an error during compact forgets the dindex file deletion, producing a Missing file error on the next run. Because I’m trying to keep the test rolling, I run a Repair on that error, and apparently what gets uploaded is this sort of mostly-empty, just-a-manifest file. That makes sense, as the file was set up for deletion and no longer has an entry in IndexBlockLink to find the dblock filename it’s supposed to index.

Running backup at 2022-05-07 18:12:57.166290
Timed out after 6 seconds
Statistics: started: 14 timeout: 2 missing: 0 extra: 0 marked: 0

Running backup at 2022-05-07 18:13:08.472690
Exit code 100
stderr was:

ErrorID: MissingRemoteFiles
Found 2 files that are missing from the remote storage, please run repair

Running repair at 2022-05-07 18:13:11.702707
  Listing remote folder ...
  Uploading file (233 bytes) ...
  Uploading file (233 bytes) ...
Statistics: started: 15 timeout: 2 missing: 1 extra: 0 marked: 0

Running backup at 2022-05-07 18:13:22.303555
Exit code 0
duplicati-ie7c5f92748ff4b5ea45e1f01f026fa52.dindex.zip
duplicati-i3c359a2d644c45f484c7e7f5d689dc52.dindex.zip
Statistics: started: 16 timeout: 2 missing: 1 extra: 0 marked: 0

These two files don’t bother the Recreate, so I might have to work harder to filter out the noise they add.

EDIT to add profiling log to confirm interrupted compact:

2022-05-07 18:13:01 -04 - [Information-Duplicati.Library.Main.Database.LocalDeleteDatabase-CompactReason]: Compacting because there is 25.52% wasted space and the limit is 25%
...
2022-05-07 18:13:02 -04 - [Profiling-Timer.Begin-Duplicati.Library.Main.BackendManager-RemoteOperationDelete]: Starting - RemoteOperationDelete
2022-05-07 18:13:02 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Started: duplicati-ie7c5f92748ff4b5ea45e1f01f026fa52.dindex.zip (18.09 KB)
2022-05-07 18:13:02 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-ie7c5f92748ff4b5ea45e1f01f026fa52.dindex.zip (18.09 KB)
2022-05-07 18:13:02 -04 - [Profiling-Timer.Finished-Duplicati.Library.Main.BackendManager-RemoteOperationDelete]: RemoteOperationDelete took 0:00:00:00.000
2022-05-07 18:13:02 -04 - [Profiling-Timer.Begin-Duplicati.Library.Main.BackendManager-RemoteOperationDelete]: Starting - RemoteOperationDelete
2022-05-07 18:13:02 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Started: duplicati-be4bfcbdea2cb43319f4025b73db8e660.dblock.zip (4.91 MB)
2022-05-07 18:13:02 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-be4bfcbdea2cb43319f4025b73db8e660.dblock.zip (4.91 MB)
2022-05-07 18:13:02 -04 - [Profiling-Timer.Finished-Duplicati.Library.Main.BackendManager-RemoteOperationDelete]: RemoteOperationDelete took 0:00:00:00.001
2022-05-07 18:13:02 -04 - [Profiling-Timer.Begin-Duplicati.Library.Main.BackendManager-RemoteOperationDelete]: Starting - RemoteOperationDelete
2022-05-07 18:13:02 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Started: duplicati-i3c359a2d644c45f484c7e7f5d689dc52.dindex.zip (18.09 KB)
2022-05-07 18:13:02 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-i3c359a2d644c45f484c7e7f5d689dc52.dindex.zip (18.09 KB)
2022-05-07 18:13:02 -04 - [Profiling-Timer.Finished-Duplicati.Library.Main.BackendManager-RemoteOperationDelete]: RemoteOperationDelete took 0:00:00:00.000
2022-05-07 18:13:03 -04 - [Profiling-Timer.Begin-Duplicati.Library.Main.BackendManager-RemoteOperationGet]: Starting - RemoteOperationGet
2022-05-07 18:13:03 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Started: duplicati-bceb3198dc64c43fdaf4ce7d7c4c8f3e6.dblock.zip (5.09 MB)
2022-05-07 18:13:03 -04 - [Profiling-Duplicati.Library.Main.BackendManager-DownloadSpeed]: Downloaded 5.09 MB in 00:00:00.0292879, 173.73 MB/s
2022-05-07 18:13:03 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-bceb3198dc64c43fdaf4ce7d7c4c8f3e6.dblock.zip (5.09 MB)
2022-05-07 18:13:03 -04 - [Profiling-Timer.Finished-Duplicati.Library.Main.BackendManager-RemoteOperationGet]: RemoteOperationGet took 0:00:00:00.030
2022-05-07 18:13:09 -04 - [Information-Duplicati.Library.Main.Controller-StartingOperation]: The operation Backup has started