Need help resuming local database repair

Recently, I found out that my local database was corrupted, and I needed to recreate it from remote storage. The repair had been running for about a week when it appeared to stop making progress. Since I wasn't seeing any new log messages, I restarted the Duplicati service.

Performance problems are documented in some related GitHub issues.

According to the documentation for the repair command:

If no local database is found or the database is empty, the database is re-created with data from the storage. If the database is in place but the remote storage is corrupt, the remote storage gets repaired with local data (if available).

Obviously, I want to avoid restarting the repair from scratch. Is there any way to resume the database repair? Is there any way to do the repair manually, or to download the relevant dblocks and attempt a manual repair?

Welcome to the forum @dyc3

Any messages or details on symptoms? Did you save a copy? If not, then you’re stuck with recreating it.

What log and at what log level, and what were you seeing before? What was the last thing that you saw?

I doubt it. Per the quote, the "recreate" side of repair starts from scratch. The "repair existing" side is probably designed for smaller tweaks than starting from scratch. It may depend on how far you got (see below).

Without any information, I can't say. Some things can be fixed, but a full recreate may mean millions of dblocks.

Did the log show it downloading dblocks? Seeing that requires verbose level or higher. Ideally it only downloads dlist and dindex files. You can rough-guess where it was if you watched the progress bar; beyond 90% is an exhaustive search.

How big is the backup, and is there enough room on local drives for a copy? That might speed future tries.

There was an error saying that the local database couldn't be opened. I don't have a copy of it, so yeah, recreating is my only option.

I was on the profiling log level under the "Live" tab, and left it for 2 days with no new messages.

As far as I can tell, my remote is intact. Size is ~2 TB. I’ve ordered a couple drives, so I should be able to put all of it on one of those if needed.

I've done a little bit of debugging, and the database inserts take 2-3 minutes. I've mentioned this on one of the related issues on GitHub. I've looked at the related code briefly, but I can't really figure out what part is creating those massive queries.

From what I'm hearing, there's no way to have Duplicati resume the recreate process? I have a partially recreated database from the previous attempt that is readable.

I'm not sure if the live log will run forever without limit. A log file may be more reliable, but a profiling-level log gets big. Regardless, I guess we don't know anything about where it ended (e.g. was it close to done or very far off?).

If you wish to try, DB Browser for SQLite could inspect the database, maybe comparing the known (by inspection) number of dindex files at the remote against how many got into the IndexBlockLink table.
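
If a script is easier than clicking around in DB Browser, a minimal sketch like this pulls that one count (the path is a placeholder for your Local database path, and it opens the file read-only), for comparison against however many dindex files the destination listing shows:

```python
# Minimal sketch: count IndexBlockLink rows in the partially recreated database,
# for comparison against the number of dindex files visible at the destination.
import sqlite3

DB_PATH = r"/path/to/partial-recreate.sqlite"  # placeholder; use your own Local database path

con = sqlite3.connect(f"file:{DB_PATH}?mode=ro", uri=True)  # open read-only, just in case
try:
    (count,) = con.execute("SELECT COUNT(*) FROM IndexBlockLink").fetchone()
    print(f"IndexBlockLink rows: {count:,}")
finally:
    con.close()
```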

The default 100 KB blocksize is too small for a backup that big, but there's no auto-tune or configuration aid.
2 MB would have meant far fewer blocks to track, and some DB operations grow too slow with many blocks.
Unfortunately, one can't change the blocksize on an existing backup; too much code requires it to stay constant.
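
To put rough numbers on it, here's a back-of-envelope sketch (it ignores deduplication and compression, and treats the units as 1024-based):

```python
# Back-of-envelope: how many blocks a ~2 TB source means at the default
# 100 KB blocksize versus 2 MB. Ignores deduplication and compression.
source_bytes = 2 * 1024 ** 4  # ~2 TB

for label, blocksize in (("100 KB", 100 * 1024), ("2 MB", 2 * 1024 ** 2)):
    blocks = source_bytes // blocksize
    print(f"{label:>6} blocksize -> roughly {blocks:,} blocks to track")
```

Twenty-odd million blocks to track versus about one million is the scale difference in question.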

Nobody's figured out the scaling-with-size slowdown, but I've been trying to get an experiment going here.
You're probably not in a measuring mood at the moment, but there were some tools for watching the files.

I think not, but I don't think anybody here can give an absolute answer. There's certainly no specific option for it.

So if you like, you can browse it, or if it's in place now, post a link to a database bug report along with the destination file counts, so that maybe someone else can figure out whether it got anywhere near the end of the recreate.

I suppose you can give whatever you've got a bit of a sanity check with the Verify files button, which will download a few files and integrity-check the database. To maximize message catching, set up a live log at Warning level or higher, and also see if any info is in the job log or the server log at About → Show log → Stored.

So, in case it's at all helpful: I've gone through something similar, and my 1.4 TB rebuild took two weeks, so it will eventually complete. The thing I would definitely suggest is trying to rebuild your database on an SSD, not a mechanical drive. That wasn't an option for me in that instance, but in other Duplicati-related issues it has improved performance by a factor of 20 or more.


Thanks for the suggestion @conrad

Do you happen to know if both the database and the Temp folder were on an SSD?
By default, for a Duplicati non-service install, both of them are in the user profile's local AppData.

tempdir shows how to move Temp. I don't know if Recreate builds the database there temporarily or writes
directly to the usual DB location shown at the Database screen's Local database path while recreating.
If it does go directly to the usual path, that path can be moved to some faster location…

Theoretically a RAM disk might be even faster, but it may be too small, and it's kind of volatile.

Measuring can find the bottleneck, e.g. Task Manager can watch how busy the drives are.
CPU is best looked at per logical processor (right-click the graph), since SQLite is single-threaded.
The thread Backups suddenly taking much longer to complete got quite deep into system monitoring.
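
If Task Manager isn't handy (say, on a NAS), a small script can watch roughly the same things. Here's a sketch using the third-party psutil package (pip install psutil); the formatting is just illustrative:

```python
# Sketch of a tiny monitor: per-logical-processor CPU usage and disk throughput
# once per second. One pegged core plus a busy drive is the usual signature of
# single-threaded SQLite work.
import psutil

prev = psutil.disk_io_counters()
while True:
    per_cpu = psutil.cpu_percent(interval=1.0, percpu=True)  # blocks ~1 second
    now = psutil.disk_io_counters()
    read_mb = (now.read_bytes - prev.read_bytes) / 1e6
    write_mb = (now.write_bytes - prev.write_bytes) / 1e6
    prev = now
    print("CPU %:", " ".join(f"{p:5.1f}" for p in per_cpu),
          f"| disk read {read_mb:6.1f} MB/s, write {write_mb:6.1f} MB/s")
```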

The above is simply looking at performance (which matters), and not so much at why it stops;
however, if somehow a file download hung, pre-downloading the files might avoid that problem.
Logs and error messages would of course also be helpful in figuring out where it fell apart.

Yes, they would have been: I mainly run Duplicati on an all-mechanical-drive NAS, but have on occasion verified or repaired its database from an all-SSD PC.

Myself, I would absolutely highlight the note in your tempdir link that the --tempdir switch on its own will not affect SQLite, so always set the TMPDIR environment variable as well.
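
A quick pre-flight check before kicking off a long recreate might look like this sketch (it only shows what the environment and a generic program resolve as the temp directory, not what Duplicati itself will do):

```python
# Print the relevant environment variables and the resolved temp directory,
# so you can confirm they point at the fast drive before starting.
import os
import tempfile

for var in ("TMPDIR", "TMP", "TEMP"):
    print(f"{var:6s} = {os.environ.get(var, '<not set>')}")
print("resolved temp dir:", tempfile.gettempdir())
```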

I'm not sure whether there's something about this in the setup/introduction docs now, but it would be great to ensure that there is. I suspect I'm in the same camp, with a too-big backup locked into a too-small blocksize, and even with a fairly relaxed retention policy I still find that every backup that does a compact takes twelve times longer than the rest.

Neither the manual's search nor a Google search finds one. I don't recall one, but the docs change sometimes.
There is a need for volunteers in docs (and everywhere…), but pull requests and issues are accepted. :wink:

Some people object to relying on docs (which might not be read), but I think it’s better than not mentioning.

There is only very rough data on performance behavior in large backups, and how blocksize affects this…
Performance testing by a volunteer with enough equipment would help. Otherwise it’s still a bit of a guess.

Another path is if a volunteer with good database expertise joins the team to try to iron out the scaling pain.

Suggest better values for dblock-size and block size #2466 was a thought from the original author 5 years ago.

I see that the issue above was going to try to measure the source size. This already happens on a backup; however, after the first backup is run, it's too late to change. One issue with measurement is that a first backup is sometimes a partial one, because it's good to establish at least one version before adding more folders…

There IS a manual article about Choosing sizes in Duplicati, which is probably similar to the issue's article here but speaks in rather general terms. Still, this might be enhanced and pointed to in the installation directions.