Recreating database logic/understanding/issue/slow

ts678 · August 6, 2019, 1:19pm

Specific cases of “why is it downloading” can’t be explained without a DB bug report and a lot of work, but roughly estimating whether the Recreate is in the 70%-80%, 80%-90%, or 90%-100% pass will give some insight into why the downloads had to be done. The question of how the need for your dblock downloads began goes much deeper.
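If it helps when reading the progress, here is a tiny sketch of that bucketing. The function name and the labels are made up for illustration, and treating everything below 70% as "before the dblock passes" is my assumption, not something taken from the Duplicati source.

```python
def recreate_pass(progress_percent: float) -> str:
    """Map a Recreate progress percentage to the dblock pass range it falls in.

    Hypothetical helper: the three ranges come from the post above,
    the handling of values below 70% is an assumption.
    """
    if not 0 <= progress_percent <= 100:
        raise ValueError("progress must be between 0 and 100")
    if progress_percent < 70:
        return "before the dblock passes"
    if progress_percent < 80:
        return "pass 1 of 3 (70%-80%)"
    if progress_percent < 90:
        return "pass 2 of 3 (80%-90%)"
    return "pass 3 of 3 (90%-100%)"
```

Which side of a boundary an exact value such as 80% lands on is arbitrary here.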

You can look at history of the issue in this topic by pressing Control-F and searching for the word empty.

Searching for the word passes will find discussion of the three levels of dblock search, if any is needed.

To cover it further, v2.0.4.18-2.0.4.18_canary_2019-05-12 has a fix announced in the forum as follows:

Ignoring empty remote files on restore, which speeds up recovery, thanks @pectojin

Empty source file can make Recreate download all dblock files fruitlessly with huge delay #3747
is the GitHub issue on this, and probably describes the experience of many (one usually has empty files).

Check block size on recreate #3758 is the fix for this probably widespread but very specific problem case.

The fix is unfortunately not in a beta, but is in beta candidate v2.0.4.21-2.0.4.21_experimental_2019-06-28 which was not suitable for a beta due to an FTP problem. Instead, Release: 2.0.4.23 (beta) 2019-07-14 is basically 2.0.4.5 plus a warning that had to be done. Click that release announcement for more about that.

It would probably be worth trying 2.0.4.21 experimental on the separate machine for the restore test, but if installed on an existing backup system, it will upgrade databases and make it difficult to revert to 2.0.4.23.
Downgrading / reverting to a lower version covers that. Though it’s DB-centric, even systems without a DB have potential downgrade issues from design changes. At least databases group the issues in a few spots.

Backing up the DB in a different job that runs after a source file backup would be a fine safeguard in case the fixes so far (such as those mentioned) still leave you in dblock downloads (the last 10% may be especially lengthy).

Keeping more than one version of the DB would be best, because the DB self-checking at the start of backups and other operations sometimes finds problems introduced somewhere in a prior backup. In that case the most recent DB backup would have the problem, whereas the one before it might be good. However, old DBs also tend to remove newer backup files from the remote if one runs repair, because the old DB has never seen those files. A fix for that issue is being discussed, and repair/recreate is being entirely (and very slowly) redesigned anyway, so I don’t know what the future holds.
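As a rough sketch of keeping several DB versions, something like the following could run after the source file backup finishes. This is not a Duplicati feature: the function name, the file naming scheme, and the default of three copies are all hypothetical, and it assumes nothing is using the database file while it is copied.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def backup_database(db_path: Path, dest_dir: Path, keep: int = 3,
                    stamp: Optional[str] = None) -> Path:
    """Copy db_path into dest_dir under a timestamped name, keeping the newest `keep` copies.

    Hypothetical helper. Run it only when the database is idle,
    because copying a database that is in use can give a bad copy.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Fixed-width timestamp so that sorting by name is sorting by age.
    stamp = stamp or time.strftime("%Y%m%dT%H%M%S")
    target = dest_dir / f"{db_path.stem}-{stamp}{db_path.suffix}"
    shutil.copy2(db_path, target)
    copies = sorted(dest_dir.glob(f"{db_path.stem}-*{db_path.suffix}"))
    for old in copies[:-keep]:
        old.unlink()  # drop everything older than the newest `keep` copies
    return target
```

Keeping at least two or three copies matches the point above: if the newest copy already contains the problem, the one before it might not.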

I’d note that some of the backup checking is between the DB and the remote, e.g. is everything still there, with expected content that hasn’t been corrupted on upload or on remote? Things do corrupt sometimes.
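To make the idea concrete, here is a minimal sketch of that kind of check, comparing a table of expected files against a folder standing in for the remote. It is not Duplicati's actual verification code: the function name, the shape of the `expected` records, the file names, and the use of a hex SHA-256 are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def check_remote(expected: dict, remote_dir: Path) -> list:
    """Compare expected {name: (size, sha256_hex)} records with files in remote_dir.

    Hypothetical helper. Returns a list of problem descriptions,
    which is empty when everything matches.
    """
    problems = []
    for name, (size, sha256_hex) in sorted(expected.items()):
        path = remote_dir / name
        if not path.is_file():
            problems.append(f"missing: {name}")
            continue
        data = path.read_bytes()
        if len(data) != size:
            problems.append(f"size mismatch: {name}")
        elif hashlib.sha256(data).hexdigest() != sha256_hex:
            problems.append(f"hash mismatch: {name}")
    # Files the records have never seen are reported too.
    for path in sorted(remote_dir.iterdir()):
        if path.name not in expected:
            problems.append(f"unexpected: {path.name}")
    return problems
```

The "unexpected" case is the one an old DB would hit for newer backup files, so a report like this is safer than deleting anything automatically.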

Keeping duplicate records has advantages over single-copy records, but it does lead to messages about unexpected differences. It also requires reconstruction of the duplicate records (i.e. the DB) if they’re lost. The flip side of that is that lost remote dlist and dindex files can be recreated from the database’s records.

Local DB is a tradeoff IMO, with pros and cons and (for the time being) beta bugs that need to be shaken out…

Another tradeoff IMO is the slicing and tracking that any block-based deduplicating backup has to do, but direct copying of source files (which some people do feel more comfortable with) is just hugely inefficient.
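For anyone wondering what the slicing and tracking amounts to, here is a toy sketch of block-based deduplication. It is not Duplicati's storage format: fixed-size blocks, SHA-256 as the block identifier, the in-memory dict, and the tiny 4-byte block size are all assumptions chosen to keep the example short.

```python
import hashlib


def store_file(data: bytes, block_store: dict, block_size: int = 4) -> list:
    """Split data into fixed-size blocks, store each distinct block once,
    and return the list of block hashes needed to rebuild the file."""
    recipe = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        block_store.setdefault(digest, block)  # an already-known block is not stored again
        recipe.append(digest)
    return recipe


def restore_file(recipe: list, block_store: dict) -> bytes:
    """Rebuild a file by concatenating its blocks in recipe order."""
    return b"".join(block_store[digest] for digest in recipe)


store = {}
first = store_file(b"AAAABBBBCCCC", store)
second = store_file(b"AAAABBBBDDDD", store)  # shares two blocks with the first file
```

The two files share their first two blocks, so only the differing block is added for the second one. The recipes are the tracking part: lose them and the stored blocks can't be put back together without a lot of searching.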

Be super-sure never to have two machines backing up to the same destination, doing repairs, etc. Each will form the remote into what it thinks is right, and they will step all over each other. Direct restore is OK.

For best certainty on a test restore from a machine to itself, add --no-local-blocks to its backup configuration.

I’m not clear on all the machines being used, but I think a Linux backup needs a similar UNIX system for the restore.