Repair is downloading dblock files


#1

Hi,
Does anyone have any idea why dblock files are downloaded during a repair?
I would assume the dlist & dindex files are enough to reconstruct the DB?

BTW: it took >24 hrs to process one dblock file (25 MB) into the DB, and there are 116 in total to process… I’ll see if I can speed this up on another, more powerful device; at this speed it will take 4 months…

Regards,
Wim


#2

It is most likely because there is some block/list that is not found in the dindex files. Since there is no map to figure out where the missing information is, it will just download the dblock files in random order until it finds whatever is missing.
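To picture the fallback: here is a rough sketch in C# (my own illustration, not Duplicati’s actual code, with made-up block and volume names) of how a single block that no dindex file accounts for forces the recreate to scan dblock volumes until it turns up:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class RecreateFallbackSketch
{
    static void Main()
    {
        // Hypothetical data: block hashes the dlist files say are needed.
        var requiredBlocks = new HashSet<string> { "A", "B", "C" };

        // Hypothetical dindex contents: block hash -> dblock volume holding it.
        // "C" is missing here, e.g. because a dindex file is incomplete.
        var indexedBlocks = new Dictionary<string, string>
        {
            ["A"] = "dblock-1",
            ["B"] = "dblock-1",
        };

        // Hypothetical contents of the dblock volumes on the backend.
        var dblockContents = new Dictionary<string, string[]>
        {
            ["dblock-1"] = new[] { "A", "B" },
            ["dblock-2"] = new[] { "C" },
        };

        var missing = new HashSet<string>(requiredBlocks.Except(indexedBlocks.Keys));
        Console.WriteLine("Blocks not found in any dindex: " + string.Join(", ", missing));

        // With no map for the missing blocks, every dblock volume is a candidate,
        // so they are downloaded (in no particular order) until nothing is missing.
        foreach (var volume in dblockContents.Keys.OrderBy(_ => Guid.NewGuid()))
        {
            if (missing.Count == 0)
                break;
            Console.WriteLine("Downloading and scanning " + volume + "...");
            missing.ExceptWith(dblockContents[volume]);
        }

        Console.WriteLine(missing.Count == 0
            ? "All blocks located."
            : "Still missing blocks after scanning all volumes.");
    }
}
```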


#3

Thanks,
Strange though, as there was nothing wrong with the database; I just wanted to try a repair…

Anyway, I switched to doing the repair on my laptop; let’s see if it goes any faster than on my poor DS216j, which has been suffering a lot these last days…


#4

OK, on the laptop it goes a zillion times faster than on my DS216j :slight_smile:


#5

I wonder if the performance issue is due to memory, CPU, IO, or other issues…


#6

Most likely CPU. The DS216 is using a low-power ARM processor.


#7

Hi @kenkendk,

I think the compacting process creates incomplete dindex files. I still need to verify it, but I noticed that a DB recreate after moving from 25 MB to 1 GB volumes had to download all dblock files.

I’m now testing whether deleting the dindex files and doing a repair creates correct dindex files. Unfortunately this seems to be a slow process. I’ll let it run overnight to see how it progresses, but one dindex file takes over an hour to be recreated.

This could also explain part of the long db repair times mentioned on the forum.

I’ll create a ticket on GitHub once I have verified this behavior.


#8

I have not yet been able to simulate this… A compact followed by a repair works fine in a simulation.
But on my production backup, I’m pretty sure the compacting is what caused the missing info in the dindex files.

Or: there was already info missing in the dindex files (created with older releases) prior to compacting, and that got spread across all the files in the compacted backup.

Anyway, the repair is still ongoing on my production set.


#9

Repair on one of the production sets finished successfully. All dindex files were recreated, at least some of them with a different size than the originals. I have not checked the differences in content yet.

Did a recreate of the DB, which went fine. No dblock files had to be downloaded, whereas initially that was the case.

Another job is still creating its dindex files. >300 files need to be created at approx. 1h30–2h30 per file. That means… a lot of time (roughly 450–750 hours, i.e. three weeks to a month)…


#10

Screenshot comparison: left = new, right = orig dindex files.

Seems like the list folder does not exist in the orig files…


#11

Yes, I have a suspicion about that as well, but I have not managed to set up a test that produces the error.

Strange, because the list folder is where the indirection blocks are stored. If these are missing, then the full download will start.
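For anyone wondering what those indirection blocks look like, here is a minimal sketch in C# (my own illustration based on how I understand the format, not Duplicati’s code): a blocklist is itself a block whose payload is the concatenated hashes of a file’s data blocks, and the dindex “list” folder is supposed to store those payloads under the blocklist hash.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

class BlocklistSketch
{
    static void Main()
    {
        // Hypothetical file content split into blocks.
        var blocks = new[] { "first block", "second block", "third block" }
            .Select(s => Encoding.UTF8.GetBytes(s))
            .ToArray();

        using (var sha = SHA256.Create())
        {
            // Hash of each data block; these blocks live in the dblock files
            // and are listed per volume in the dindex files.
            var blockHashes = blocks.Select(b => sha.ComputeHash(b)).ToArray();

            // The blocklist is itself just a block whose payload is the raw
            // block hashes concatenated; its own hash is what a multi-block
            // file entry points at, and the dindex "list" folder stores the
            // blocklist payload under that hash.
            var blocklist = blockHashes.SelectMany(h => h).ToArray();
            var blocklistHash = sha.ComputeHash(blocklist);

            Console.WriteLine("Block hashes:");
            foreach (var h in blockHashes)
                Console.WriteLine("  " + Convert.ToBase64String(h));
            Console.WriteLine("Blocklist hash: " + Convert.ToBase64String(blocklistHash));
        }
    }
}
```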

But since they are missing in the “orig” files, that should cause problems with recreates from the original data?

Unless… they are added in error to the “new” files, such that the restore looks for extra stuff that it does not need? I know that the recreate algorithm blindly follows the data, causing it to fail when it rebuilds the database.

I have started the rewrite of the algorithm, but I need a day or two with nothing to do before I can complete it.


#12

I did not successfully simulate this behaviour either. It could be that I was running the compact with an old version of Duplicati (2.0.2.1_beta_2017-08-01), but I’m not sure. I think I blindly followed some how-to when installing Duplicati on a VPS to run the compact.

Maybe some clarification on the screenshot…
Orig is from after the compacting process.
New is the dindex created after removal of all dindex files and running a repair.


#13

This one is still ongoing BTW… 21.988% completed.


#14

One way to get this is to back up an empty file, which makes only a metadata block in dlist, dindex, and dblock, yet recreate adds a row to the Block table with Hash 47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=, Size 0, and VolumeID -1. It also downloads all the dblocks (in the test case of a new backup, only one) because the countMissingInformation query keeps thinking it’s not done; the query still returns that row after the recreate.

Backing up a 1-byte file made a data block, and a recreate didn’t show any dblock BackendEvent downloads. This explains some of my test-backup dblock downloads. I’m not sure whether it explains issues in “compact”.
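For reference, the hash above is simply the SHA-256 of zero bytes, Base64-encoded, i.e. the hash of an empty block, which a few lines of C# confirm:

```csharp
using System;
using System.Security.Cryptography;

class EmptyBlockHash
{
    static void Main()
    {
        // SHA-256 of zero bytes, Base64-encoded.
        using (var sha = SHA256.Create())
        {
            Console.WriteLine(Convert.ToBase64String(sha.ComputeHash(new byte[0])));
            // Prints: 47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=
        }
    }
}
```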

https://github.com/duplicati/duplicati/blob/79d839546b238be79967bf8532267bfab2b08aa6/Duplicati/Library/Main/Database/LocalRecreateDatabase.cs#L551

I’ll say that I didn’t get ALL the SQL I expected in the Profiling log, but it did look like third-pass log messages:

2018-12-04 15:19:06 -05 - [Verbose-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-ProcessingAllBlocklistVolumes]: Processing all of the 1 volumes for blocklists: duplicati-b028adca88d0446569e1a841f9f8dfd19.dblock.zip
2018-12-04 15:19:06 -05 - [Verbose-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-ProcessingAllBlocklistVolumes]: Processing all of the 1 volumes for blocklists