Recreating database logic/understanding/issue/slow


#1

Hi,

So I am also having issues with the recreate database feature. I triggered the recreate as part of my assessment of the software. It is taking forever (5+ days, and I do not think it is close to finishing).

From the log it looks like it is downloading all the dblock files. I hope it is not.

Is the logic this process applies documented anywhere? I couldn’t find it.

I am trying to make sense of the files in the storage folder. I am guessing that the dlist file is created when a backup session completes, and that it contains a list of dblock/dindex files or similar.

The dindex and dblock files seem to come in pairs, though they do not share the same hex hash in their names (I think they are paired because the timestamps are the same). Some images to show what I am talking about.

The recreate process is downloading dblock files that appear to have a matching dindex file. See the images.

That’s what is really puzzling me. Per the other threads I have read, the dblock files should only be downloaded if the corresponding dindex file is missing. Well… I have 1938 dblock and 1938 dindex files in my storage. I do not think any of my dindex files are missing. Why is this downloading the dblock files?

Please advise. Thanks.

Technical details:
I am running Duplicati - 2.0.4.15_canary_2019-02-06 on Windows 10 (x64). My target storage is Minio S3 (hosted remotely). I have been heavily testing it for over a month now.

I went ahead and created a backup selection of about 500 GB and over 100k files. For the initial backups I used a block size of 150 MB; later I changed the block size to 256 MB.

My storage folder has 3881 files: dindex and dblock files (in pairs), plus some dlist files (I only have 5). The counts are consistent: 1938 + 1938 + 5 = 3881.


#2

BTW: I just tested recovering from another computer that has local access to the remote storage. That computer is able to read the file information (the “recreate” step when you restore directly from a path) in maybe 20 minutes. That is the point where you get to choose what to restore.

Monitoring the logs and the operating system, it does not look like it is accessing the .dblock files, only the .dlist files.

After that I restored one folder just to test. It did a “partial temporary database” recreate. It took maybe 1 hour to complete; it read a bunch of dindex files and, it seems, only a few dblock ones (I guess the ones where the files actually were).

So maybe the issue is related to the S3 access? It seems to be ignoring the dindex files during the rebuild.


#3

Another possibility is that the partial temporary database for a single version doesn’t run into whatever the full database recreate ran into that makes it download dblocks. Possibly if you tried all the versions with a direct restore, you could find one that will download dblock files. You could rule out S3 by testing a similar “direct restore” of the same version over S3 instead of local file access.

I’ve got an incomplete theory on what causes dblock downloads, but seeing whether it fits your recreate will need you to look at your database with an SQLite viewer (or post a link to a database bug report). For example:

[images: SQLite viewer screenshots of the local database, showing rows with a VolumeID of -1]

The theory is that a -1 VolumeID sometimes happens with empty files, and causes a wasted full search. At other times, though, empty files are stored in a remote dblock volume just as usual (only shorter).

The recreate code sees the -1 VolumeID, decides information is missing, and just keeps on fetching all the dblocks.
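If you would rather query for that than scroll through a viewer, here is a minimal sketch. It assumes the local database has a Block table with a VolumeID column (my reading of recent 2.0.4.x schemas, so treat the table and column names as assumptions; the path is hypothetical):

    import sqlite3

    # Hypothetical path to the job's local database; adjust to your install.
    DB_PATH = r"C:\Users\me\AppData\Local\Duplicati\XXXXXXXXXX.sqlite"

    con = sqlite3.connect(DB_PATH)
    # Assumption: blocks never tied to a remote volume carry VolumeID = -1.
    rows = con.execute(
        'SELECT "ID", "Hash", "Size" FROM "Block" WHERE "VolumeID" = -1'
    ).fetchall()
    for row in rows:
        print(row)
    print(f"{len(rows)} block(s) with VolumeID = -1")
    con.close()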


#4

How the backup process works has some documentation, including for the filelist.json that’s inside the dlist file.

Unfortunately I don’t think the documentation gets as far down as the details of recreate. If you can read source (which also has a few comments), the links I point to might help. It looks to me like it gathers dlists, then dindexes, then (as said earlier) goes to fetch dblock files if data is still missing. Testing with a new backup having an empty source file caused the dlist to have this entry in filelist.json, but not in the dindex or dblock:

{"type":"File","path":"C:\\emptytests\\length0.txt","hash":"47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=","size":0,"time":"20190305T000452Z","metahash":"TZm2AFXRe8Y4ja/tCPQ7NU/QSv9mqhpnUcht89kWldU=","metasize":137}
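As a sanity check that this entry really describes an empty file: that hash is just the SHA-256 of zero bytes, base64-encoded, which you can confirm yourself:

    import base64
    import hashlib

    # SHA-256 of an empty byte string, base64-encoded, matches the hash above.
    digest = hashlib.sha256(b"").digest()
    print(base64.b64encode(digest).decode())
    # -> 47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=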

I tested a new backup of a 1-byte file, then added a 0-byte file and did another backup. I used a Database Delete, then used Commandline to run a repair, changing the Commandline arguments to --console-log-level=information and (on the next line) --version=(1 for the first backup or 0 for the second). The options can also be added at the screen-bottom Add advanced option. Remember to delete the database as you change versions. You’ll see (if you get the same results) that the version with only the 1-byte file ends nicely after the dindex files are read, but the second backup (known as 0 because 0 is newest) continues to read all dblock files.
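For example, to repeat the repair against the older version, the Commandline arguments box would hold (one option per line):

    --console-log-level=information
    --version=1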

The recreated database also has the VolumeID of -1, so possibly that’s at least one way to get that oddity. Getting deep into terminology, the 0-length file ended at the Blockset table, without BlocksetEntry or Block table rows. Earlier I had described one that got to the Block table with an empty block. There are 3 ways to show this.
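To see which shape your own database took, a query along these lines could help. Again, the table names (Blockset, BlocksetEntry) are my assumption based on recent versions, and the path is hypothetical:

    import sqlite3

    DB_PATH = r"C:\path\to\backup.sqlite"  # hypothetical; adjust

    con = sqlite3.connect(DB_PATH)
    # Assumption: a 0-length file yields a Blockset row with Length = 0
    # and no matching BlocksetEntry rows at all.
    rows = con.execute(
        'SELECT b."ID", b."Length" FROM "Blockset" b '
        'LEFT JOIN "BlocksetEntry" e ON e."BlocksetID" = b."ID" '
        'WHERE b."Length" = 0 AND e."BlocksetID" IS NULL'
    ).fetchall()
    print(f"{len(rows)} empty blockset(s) without BlocksetEntry rows")
    con.close()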

If you prefer to test using a full database Recreate (instead of specifying --version), live logging as you did originally should be fine. You could also examine your source area to see if you even have any empty files, because there might be some other ways (not involving empty files) to get all the dblock files downloaded.

Still, a test case is better than none, so I hope this aids development. I might soon file an issue; however, I’d certainly encourage you to do it if you can replicate something similar to the above test results.


#5

Thanks. I’ll check it out. I can read code, but I do not know if I have the time/energy to go into full debug mode.

Honestly, I cannot imagine why the recreate is such a brute-force algorithm (or at least it seems to be). I would think there would be a way to do a “light recreate” that just takes the dindex/dlist files and goes from there. Downloading the whole backup to restart the backup client is infeasible unless you’re backing up just a few files.

That’s another reason I am looking for documentation. I am shocked that there are no options for the recreate (i.e., it is either “let me download the 500 GB of historical backups” or “stop, delete all your settings and backups, and restart from 0”).

Needing to download the whole thing to continue in case of issues is a major drawback.


#6

Sounds like this might be a request to be able to opt out of the extreme measures, at the risk of maybe less-than-maximum results, but in a tolerable time. There is probably a rewrite of repair/recreate underway now, so who knows what’s due to come? From another point of view, a repair or other method can sometimes avoid the need for a full recreate, and a partial recreate is a lighter option if it does come down to a recreate…

That’s not what it always does. There are three passes (presumably increasingly painful) with stop-points between them.
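As a rough, self-contained sketch of the flow described earlier in this thread (dlists, then dindexes, then dblocks only for whatever is still missing), with everything here made up for illustration and nothing taken from Duplicati’s actual code:

    # Toy model: each dlist says which block hashes a version needs; each
    # dindex says which hashes a given dblock volume holds.
    dlists = [{"version": 0, "needed_blocks": {"h1", "h2", "h3"}}]
    dindexes = {"dblock-A": {"h1", "h2"}}  # pretend the dindex for dblock-B is missing
    dblocks = {"dblock-A": {"h1", "h2"}, "dblock-B": {"h3"}}

    # Pass 1: dlists tell us which blocks we need.
    needed = set()
    for dlist in dlists:
        needed |= dlist["needed_blocks"]

    # Pass 2: dindexes tell us where blocks live, without touching dblocks.
    located = {}
    for volume, hashes in dindexes.items():
        for h in hashes:
            located[h] = volume

    # Pass 3 (the slow one): only if blocks are still unlocated, download and
    # scan dblock volumes one by one until everything is accounted for.
    missing = needed - located.keys()
    for volume, hashes in dblocks.items():
        if not missing:
            break
        print(f"downloading {volume} to look for {missing}")
        for h in hashes & missing:
            located[h] = volume
        missing = needed - located.keys()

    print("all blocks located:", not missing)

Note how a single unlocated block sends pass 3 through volumes that already have perfectly good dindex files, which matches the behavior reported above.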

On top of that, there are other repair tools. Unfortunately, repair has its specific task, as do the others such as purge, list-broken-files, purge-broken-files, and repair in its non-recreate behavior. The lack of detailed documentation on what to use does make things worse, but having any manual at all is only about a year old, and it is possibly one of the many things that could improve with more volunteer help.

What about a manual? explains the original author’s point that troubleshooting is covered by the forum, and I sometimes agree (until computers are better than people at self-healing their ailments) but sometimes don’t. There’s a definite volume problem from a large user base having even very occasional issues per user. There’s also a scaling problem (as with code) of not being able to replicate expertise as fast as desired.

There are options for “recreate” in a loose sense, but the other one isn’t called “recreate”, and there’s no single button to restart from 0 (though I’d have thought you’d want the old settings with a different destination).

The decision on which way to go now uses experiment and human discussion, but sadly often only after someone finds that a full database recreate (in the tight sense of reading the backend) is taking too long. It’s sometimes asked how long it will take, and answers are hard because they vary too much with the situation, and simplistic formulas based on download speed and total destination file size don’t help because you’re not supposed to have to download everything, and one can’t know what’s missing without some looking. Still, to your point, it might someday be possible to do estimates, based on factors, for informed options. There are currently plenty of other hot issues to handle, so I can’t forecast when this idea might happen. Possibly you can write a request in forum Features or GitHub issues so that it’s at less risk of being lost.


#7

Hi ts678,

Just to be clear: I am testing the solution, so I am trying to emulate what I would consider a proper scenario where I need to restore files.

So my hypothetical test is: the computer had a total HDD failure. I bought a new HDD and need to restore some key files, then everything, and then continue as normal. The computer had the failure while running a backup job. I do not consider any of this a long stretch; it is a very possible situation.

To simulate this, I created a big (but real) backup job for my computer. It is about 500 GB. I let it complete a few times.

Then I started the job, and in the middle of it I just cancelled. I went ahead and deleted the .sqlite database and asked to recreate it.

As mentioned before, in my remote storage everything looks fine (dblock and dindex files are in pairs, and there is one dlist per completed job).

That’s where Duplicati’s behavior is very hard to understand. I do not know the internal format of the files, but I assume that the dindex files say what the dblock files contain. So it is hard to understand why any dblock file is needed at all. As mentioned before, that shouldn’t be forced; the system should allow a rebuild based on the dindex files only. Forcing it in this case just makes it unfeasible (I cannot wait an undetermined amount of time, even on an uninterrupted, high-bandwidth, unmetered connection).
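One way to test that assumption locally: unencrypted dindex files are just zip archives, and my understanding (an assumption, not documented fact) is that each contains a vol/ folder with one entry per dblock volume it describes. A sketch to list which dblock each dindex covers:

    import glob
    import zipfile

    # Assumption: unencrypted dindex files; encrypted ones end in .aes and
    # would need decrypting first. The vol/ layout is my reading of the
    # format, so adjust if your files look different.
    for path in sorted(glob.glob(r"X:\storage\*.dindex.zip")):
        with zipfile.ZipFile(path) as z:
            covered = [n[len("vol/"):] for n in z.namelist() if n.startswith("vol/")]
        print(path, "->", covered)

If every dblock file shows up under exactly one dindex, the pairing really is complete, and the downloads are even harder to explain.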

I killed my test. It ran for a week and it continued to download (old) dblock files. I just restored the .sqlite database manually (which wouldn’t be possible in my hypothetical scenario).

Thanks.


#8

I have also experienced problems with Recreate Database. I was using Duplicati to back up to a USB-connected WD 4 TB drive. After several hours the computer locked up, forcing me to do a hard reboot. When I restarted the backup, Duplicati warned me about problems, so I did a database delete and repair.

Duplicati ran for 10 days recreating the database and did not look close to finishing, so I aborted it. The backup source is 357 GB. I can do a total backup from scratch in less than 3 days.

Recreating the database should not take longer than a full backup.

I am running 2.0.4.5_beta_2018-11-28 on Windows 7. I am backing up to a WD drive connected through USB3.


#9

Wow. That is even worse considering you are using local storage.