Repair is downloading dblock files


#1

Hi,
Does anyone have any idea why dblock files are downloaded during a repair?
I would assume the dlist & dindex files are enough to reconstruct the DB?

BTW: it took >24 hrs to process one dblock file (25 MB) into the DB, and there are 116 in total to process… I’ll see if I can speed this up on another, more powerful device; at this speed it will take 4 months…

Regards,
Wim


#2

It is most likely because there is some block/list that is not found in the dindex files. Since there is no map to figure out where the missing information is, it will just download the dblock files in random order until it finds whatever is missing.
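To picture the fallback: here is a rough sketch in C# (my own illustration, not Duplicati’s actual code, with made-up block and volume names) of how a single block that no dindex file accounts for forces the recreate to scan dblock volumes until it turns up:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class RecreateFallbackSketch
{
    static void Main()
    {
        // Hypothetical data: block hashes the dlist files say are needed.
        var requiredBlocks = new HashSet<string> { "A", "B", "C" };

        // Hypothetical dindex contents: block hash -> dblock volume holding it.
        // "C" is missing here, e.g. because a dindex file is incomplete.
        var indexedBlocks = new Dictionary<string, string>
        {
            ["A"] = "dblock-1",
            ["B"] = "dblock-1",
        };

        // Hypothetical contents of the dblock volumes on the backend.
        var dblockContents = new Dictionary<string, string[]>
        {
            ["dblock-1"] = new[] { "A", "B" },
            ["dblock-2"] = new[] { "C" },
        };

        var missing = new HashSet<string>(requiredBlocks.Except(indexedBlocks.Keys));
        Console.WriteLine("Blocks not found in any dindex: " + string.Join(", ", missing));

        // With no map for the missing blocks, every dblock volume is a candidate,
        // so they are downloaded (in no particular order) until nothing is missing.
        foreach (var volume in dblockContents.Keys.OrderBy(_ => Guid.NewGuid()))
        {
            if (missing.Count == 0)
                break;
            Console.WriteLine("Downloading and scanning " + volume + "...");
            missing.ExceptWith(dblockContents[volume]);
        }

        Console.WriteLine(missing.Count == 0
            ? "All blocks located."
            : "Still missing blocks after scanning all volumes.");
    }
}
```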


#3

Thanks,
Strange though, as there was nothing wrong with the database; I just wanted to try a repair…

Anyway, I switched to doing the repair on my laptop; let’s see if it goes any faster than on my poor DS216j, which has been suffering a lot these last days…


#4

OK, on the laptop it goes a zillion times faster than on my DS216j :slight_smile:


#5

I wonder if the performance issue is due to memory, CPU, IO, or other issues…


#6

Most likely CPU. The DS216 is using a low-power ARM processor.


#7

Hi @kenkendk,

I think the compacting process creates incomplete dindex files. I still need to verify it, but I noticed that a DB recreate after moving from 25 MB to 1 GB volumes had to download all dblock files.

I’m now testing whether deleting the dindex files and doing a repair creates correct dindex files. Unfortunately this seems to be a slow process. I’ll let it run overnight to see how it progresses, but one dindex file takes over an hour to be recreated.

This could also explain part of the long db repair times mentioned on the forum.

I’ll create a ticket on GitHub once I have verified this behavior.


#8

I have not yet been able to simulate this… A compact followed by a repair works fine in a simulation.
But on my production backup, I’m pretty sure the compacting is what caused the missing info in the dindex files.

Or: there was already info missing in the dindex files (created with older releases) prior to compacting, and that got spread across all the files in the compacted backup.

Anyway, the repair is still ongoing on my production set.


#9

Repair on one of the production sets finished successfully. All dindex files were recreated, at least some of them with a different size than the originals. I have not checked the differences in content yet.

Did a recreate of the DB, which went fine. No dblock files had to be downloaded, whereas initially that was the case.

Another job is still creating its dindex files. >300 files need to be created at approx. 1h30–2h30 per file. That means… a lot of time (roughly 450–750 hours, i.e. three weeks to a month)…


#10

Screenshot comparison: left = new, right = orig dindex files.

Seems like the list folder does not exist in the orig files…


#11

Yes, I have a suspicion about that as well, but I have not managed to set up a test that produces the error.

Strange, because the list folder is where the indirection blocks are stored. If these are missing, then the full download will start.
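For anyone wondering what those indirection blocks look like, here is a minimal sketch in C# (my own illustration based on how I understand the format, not Duplicati’s code): a blocklist is itself a block whose payload is the concatenated hashes of a file’s data blocks, and the dindex “list” folder is supposed to store those payloads under the blocklist hash.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

class BlocklistSketch
{
    static void Main()
    {
        // Hypothetical file content split into blocks.
        var blocks = new[] { "first block", "second block", "third block" }
            .Select(s => Encoding.UTF8.GetBytes(s))
            .ToArray();

        using (var sha = SHA256.Create())
        {
            // Hash of each data block; these blocks live in the dblock files
            // and are listed per volume in the dindex files.
            var blockHashes = blocks.Select(b => sha.ComputeHash(b)).ToArray();

            // The blocklist is itself just a block whose payload is the raw
            // block hashes concatenated; its own hash is what a multi-block
            // file entry points at, and the dindex "list" folder stores the
            // blocklist payload under that hash.
            var blocklist = blockHashes.SelectMany(h => h).ToArray();
            var blocklistHash = sha.ComputeHash(blocklist);

            Console.WriteLine("Block hashes:");
            foreach (var h in blockHashes)
                Console.WriteLine("  " + Convert.ToBase64String(h));
            Console.WriteLine("Blocklist hash: " + Convert.ToBase64String(blocklistHash));
        }
    }
}
```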

But since they are missing in the “orig” files, that should cause problems with recreates from the original data?

Unless… they are added in error to the “new” files, such that the restore looks for extra stuff that it does not need? I know that the recreate algorithm blindly follows the data, causing it to fail when it rebuilds the database.

I have started the rewrite of the algorithm, but I need a day or two with nothing to do before I can complete it.


#12

I did not successfully simulate this behaviour either. It could be that I was running the compact with an old version of Duplicati (2.0.2.1_beta_2017-08-01), but I’m not sure. I think I blindly followed some how-to when installing Duplicati on a VPS to run the compact.

Maybe some clarification on the screenshot…
Orig is from after the compacting process.
New is the dindex created after removal of all dindex files and running a repair.


#13

This one is still ongoing BTW… 21.988% completed.


#14

One way to get this is to back up an empty file, which makes only a metadata block in dlist, dindex, and dblock, yet recreate adds a row to the Block table with Hash 47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=, Size 0, and VolumeID -1. It also downloads all the dblocks (in the test case of a new backup, only one) because the countMissingInformation query keeps thinking it’s not done; the query still returns that row after the recreate.

Backing up a 1-byte file made a data block, and a recreate didn’t show any dblock BackendEvent downloads. This explains some of my test-backup dblock downloads. I’m not sure whether it explains issues in “compact”.
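For reference, the hash above is simply the SHA-256 of zero bytes, Base64-encoded, i.e. the hash of an empty block, which a few lines of C# confirm:

```csharp
using System;
using System.Security.Cryptography;

class EmptyBlockHash
{
    static void Main()
    {
        // SHA-256 of zero bytes, Base64-encoded.
        using (var sha = SHA256.Create())
        {
            Console.WriteLine(Convert.ToBase64String(sha.ComputeHash(new byte[0])));
            // Prints: 47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=
        }
    }
}
```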

https://github.com/duplicati/duplicati/blob/79d839546b238be79967bf8532267bfab2b08aa6/Duplicati/Library/Main/Database/LocalRecreateDatabase.cs#L551

I’ll say that I didn’t get ALL the SQL I expected in the Profiling log, but it did look like third-pass log messages:

2018-12-04 15:19:06 -05 - [Verbose-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-ProcessingAllBlocklistVolumes]: Processing all of the 1 volumes for blocklists: duplicati-b028adca88d0446569e1a841f9f8dfd19.dblock.zip
2018-12-04 15:19:06 -05 - [Verbose-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-ProcessingAllBlocklistVolumes]: Processing all of the 1 volumes for blocklists