Verifying Backend Data

The “no-backend-verification” option doesn’t seem to affect the “Verifying backend data …” behavior, either.

That query is a consistency check that verifies that the database is in a sane state and that no blocks are missing or dangling.

There is currently no option to disable this verification step.
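
For illustration only (this is not the exact query Duplicati runs), a “dangling block” check against the local database could look something like the sketch below. It only uses tables from the local database (Block, BlocksetEntry), and the database path is a placeholder for your job’s --dbpath.

# Illustration only: count BlocksetEntry rows whose BlockID no longer
# matches any row in the Block table ("dangling" references).
sqlite3 /path/to/your-backup.sqlite \
  'SELECT COUNT(*) FROM "BlocksetEntry" B
     LEFT JOIN "Block" C ON "C"."ID" = "B"."BlockID"
    WHERE "C"."ID" IS NULL;'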


Another victim here. I had an unexpected reboot while a backup run was finishing up, and now I am stuck in “Verifying backend data” for over an hour.

The chatter here does not give me optimism that it will finish soon. (Ever?)

The sole Profiling log entry is:

Dec 14, 2017 4:38 AM: Server has started and is listening on 0.0.0.0, port 8200

4:38 AM being the time of the reboot. Here is my export command line:

mono "/usr/lib/duplicati/Duplicati.CommandLine.exe" 
backup "file:///var/duplicati/" 
"/var/log/" "/home/david/" 
--snapshot-policy="Auto" 
--full-result="true" 
--backup-name="DCF_Desktop_To_Var" 
--dbpath="/home/david/.config/Duplicati/YKLEULKMPB.sqlite"
--encryption-module="aes" --compression-module="zip" --dblock-size="64MB" 
--keep-time="3M" --passphrase="<redacted>" --send-http-url="<redacted>" 
--disable-module="console-password-input" 
--exclude="/home/david/.pcloud/" 
--exclude="/home/david/.cache/" 

line breaks added for readability.

Update: after almost two hours, it finished with a warning:

Warnings: [ Expected there to be a temporary fileset for synthetic filelist 
(79, duplicati-ib0bb3aea22da49ddab20ecd986681f63.dindex.zip.aes), 
but none was found? ] 

I believe the temporary fileset warning is to be expected after a power-interruption event and not something you need to worry about, as it really is a temp file.


I’m seeing this as well with my largest backup set. With verbose logging enabled, it seems to be stuck at

2018-02-04 12:07:47Z - Profiling: Starting - ExecuteReader: SELECT "A"."Hash", "C"."Hash"
FROM (
    SELECT "BlocklistHash"."BlocksetID", "Block"."Hash", *
    FROM  "BlocklistHash","Block"
    WHERE  "BlocklistHash"."Hash" = "Block"."Hash" AND "Block"."VolumeID" = ?) A,
"BlocksetEntry" B,
"Block" C
WHERE "B"."BlocksetID" = "A"."BlocksetID" AND
    "B"."Index" >= ("A"."Index" * 3200) AND
    "B"."Index" < (("A"."Index" + 1) * 3200) AND
    "C"."ID" = "B"."BlockID" 
ORDER BY "A"."BlocksetID", "B"."Index"

I think the problem here is I didn’t change the default 50MB chunk size and this is a 3.5TB backup set composed of files that average 10GB each. The result is that my sqlite DB for this set is 8.2GB. I’ve since increased the chunk size to 500MB, but that doesn’t help with the existing 3.5TB of backup data.

The chunk (dblock / “Upload volume size”) is the approximate maximum size of the files saved to your destination - it has little effect on your local database size. I suspect you’re thinking of the --blocksize parameter, which specifies the size of each chunk a file is broken into. With the default of 100KB, one of your 10GB source files would end up being broken into about 104,858 individual blocks, each with its own hash that gets stored in the database.
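
(The math behind that ~104,858 figure, if you want to check it for other file or block sizes; plain shell arithmetic, nothing Duplicati-specific:)

# a 10GB file at the default 100KB --blocksize: 10*1024*1024 KB / 100 KB
echo $(( 10 * 1024 * 1024 / 100 ))   # -> 104857 (rounded to ~104,858 above)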

So this is probably the reason your verification SQL is on the slow side. I don’t know if that SQL has been targeted for performance improvements yet, but if it hasn’t already then it might be a while before it gets looked at.

Unfortunately, --blocksize (similar to encryption passphrase) is something you can’t change on an existing backup in Duplicati. So assuming this is the actual source of your performance issues, I’m afraid your options are pretty much:

  • live with it and hope a performance improvement gets rolled out sooner rather than later (you could make it sooner by coding it yourself or supporting a bounty for it)
  • start over with a fresh backup using a larger --blocksize (you might want to do some tests to see how different sizes affect performance)
  • disable backend verification by enabling --no-backend-verification (which should skip the process that includes the slow step above, HOWEVER it has the side effect of no longer making sure your backend files are all good - meaning if things go missing or get corrupted at the destination you won’t automatically be notified about it); a command-line sketch of these last two options follows below
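
A rough command-line sketch of those last two options (the destination URL, source path, and the 1MB value are placeholders; trim or extend the options to match your own job, and remember --blocksize only takes effect on a brand-new backup):

mono "/usr/lib/duplicati/Duplicati.CommandLine.exe" backup \
    "file:///path/to/destination/" "/path/to/source/" \
    --blocksize="1MB" \
    --no-backend-verification="true"
# ...plus whatever other options your job already uses (--dbpath, --passphrase, etc.)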

Thanks for the detailed response. It is “Upload volume size” that I increased to 500MB.

If I understand the docs correctly, the tradeoff of a larger blocksize is that if, say, 1 byte of a 10GB file changes, the blocksize value is the smallest amount of data that must be uploaded to back up that change? Given that, in my case where I’m backing up large files that essentially never change, setting a much larger blocksize is probably a reasonable approach.

Is there an upper limit to blocksize I should be wary of? Is 100MB okay? 250MB?

I figure at this point I’ll just re-backup all that data with a reasonable blocksize; if nothing else, VACUUM on an 8.5GB sqlite DB is not exactly fast :smiley:

Are there other parameters that I should be aware of for large backup sets or when dealing with large files?

We’re getting a little bit off topic here, but I’d recommend looking at this post:

And here Kenkendk says in THEORY a block size of up to 2GB should be supported (though there’s no mention of whether or not that’s a good idea). :wink:

Overall I’d say a jump from a --blocksize of 100KB to 100MB should work just fine, but might be a bit drastic. At the default 100KB you’re looking at 10.7 million block-hash rows per 1 TB of source data.

Shifting to a more modest 1MB block size would take that down to just over 1 million block-hash rows. That won’t necessarily return a 10x smaller sqlite file and 10x faster performance, but should show a fair bit of improvement…
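
(If you’re curious where your own database sits, a quick count of the block rows is easy to do; the path is a placeholder for your job’s --dbpath:)

# total block-hash rows tracked in the local Duplicati database
sqlite3 /path/to/your-backup.sqlite 'SELECT COUNT(*) FROM "Block";'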

I’m jumping in here with a largely similar problem. I’ve got a 1.1T backup stalled at “Verifying backend data”. I’ve been through the repair/retry loop a few times.

Profiling gives me a last query of

SELECT "A"."Hash", "C"."Hash"
FROM (
    SELECT "BlocklistHash"."BlocksetID", "Block"."Hash", *
    FROM  "BlocklistHash","Block"
    WHERE  "BlocklistHash"."Hash" = "Block"."Hash" AND "Block"."VolumeID" = ?) A,
"BlocksetEntry" B,
"Block" C
WHERE "B"."BlocksetID" = "A"."BlocksetID" AND
    "B"."Index" >= ("A"."Index" * 3200) AND
    "B"."Index" < (("A"."Index" + 1) * 3200) AND
    "C"."ID" = "B"."BlockID"
ORDER BY "A"."BlocksetID", "B"."Index"

The profiling log stops here, a couple of minutes after the “verifying backend” status is posted.

Looking up in the thread shows that this isn’t really new information. However, I don’t think anyone has tried this: I opened the local database in DB Browser and issued the offending query, which completed in a bit over 3 minutes, returning 0 rows.
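
For anyone who wants to repeat that test without DB Browser, something like the following should work with the sqlite3 command-line tool. The database path and the VolumeID value (123) are placeholders; the “?” in the logged query is a parameter, so substitute a real VolumeID from your own Block table.

sqlite3 /path/to/your-backup.sqlite <<'SQL'
-- same query as logged above, with the "?" parameter replaced by a made-up
-- VolumeID (123); pick a real one via: SELECT DISTINCT "VolumeID" FROM "Block";
SELECT "A"."Hash", "C"."Hash"
FROM (
    SELECT "BlocklistHash"."BlocksetID", "Block"."Hash", *
    FROM "BlocklistHash","Block"
    WHERE "BlocklistHash"."Hash" = "Block"."Hash" AND "Block"."VolumeID" = 123) A,
"BlocksetEntry" B,
"Block" C
WHERE "B"."BlocksetID" = "A"."BlocksetID" AND
    "B"."Index" >= ("A"."Index" * 3200) AND
    "B"."Index" < (("A"."Index" + 1) * 3200) AND
    "C"."ID" = "B"."BlockID"
ORDER BY "A"."BlocksetID", "B"."Index";
SQL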

Also notable: after that final log record showing the “offending” query was posted, no new log entries appeared even after running for days. We know the query finishes in about 3 minutes, so it’s not stuck inside the query itself; that was asked and answered. Therefore, it’s likely that we’re in some sort of tight loop after this “sanity check” explained by @kenkendk earlier.

The error doesn’t play nice with CPU scheduling, as it’s consuming 100% of a core. Memory impact is light and not growing. I can’t break the process by any means other than killing mono or restarting the machine. While stuck, the GUI does respond to edits, etc. It just seems to affect the backup engine.

I hope this provides enough clues to someone familiar with the code to figure this out. It seems to be affecting a number of users.

I am happy to provide more information as time allows.

I could be completely off on this, but assuming I’m reading the code right, if no records are returned from the query then the GetBlocklists method in which it’s being called ends up not returning a Tuple at all.

From what I can tell this method is called from both RecreateMissingIndexFiles and RunRepairRemote, both of which use the result of GetBlocklists in a foreach loop. The RecreateMissingIndexFiles loop is NOT in its own try / catch block, however the caller is in a “Fatal error” try / catch, so from what I can tell any actual error should be getting handled.

Might it be possible to inject a few diagnostic profiling log postings into the canary build to try to figure out where this is going wrong? I’m not a C# guy and don’t know how the logs are being processed. Just seems like a good idea from the outside.

One more possible bit of info. I’m not sure whether this is a standard message or built from current operations, but I got this when pressing the “X” on the status bar.

[screenshot of the stop confirmation dialog]

Duplicati seems to be indicating that it’s uploading something. No evidence of it, though.

That is a standard message whether an upload is going or not.

What it’s really asking is “Should I ABORT what I’m doing RIGHT NOW” (which will generate a System.Threading.ThreadAbortException error message) vs “Should I kindly stop at my next opportunity” (generally no error message is generated, but it might be a few minutes before it ACTUALLY stops).

Yeah, I thought that was going to be the answer.

I’ve tried answering the question both ways and it doesn’t stop. As I said, it’ll run till the power goes out.

Where in the project is the source code for the GetBlockLists function and its calling functions? I’m just scanning from github and haven’t installed the IDE to make searching easier.

If you search for GetBlockLists (upper left of screen) at the Duplicati GitHub page you should find that the tuple version is in Duplicati/Library/Main/Database/LocalDatabase.cs.
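
(Or, if you prefer searching a local checkout instead of the GitHub web search, a generic clone-and-grep works too:)

git clone https://github.com/duplicati/duplicati.git
cd duplicati
grep -rn "GetBlocklists" Duplicati/Library/Main/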

Thanks! I had no idea I could search for functions in github. Cool!


Hi everyone,

Add my name to the list of people experiencing this problem.

My backup set is ~3.5TB. I’ve had the initial backup running for 6 days, but after a reboot I am stuck in Verifying Backup Data, which has now been running for about 12 hours.

I am desperate to get this fixed soon, as I am trying to get all of this backed up within the same Comcast billing cycle. If this issue causes the backup to head into May, it’ll cost me another $200 in overage charges :frowning:

Is there absolutely anything I can do to help diagnose this issue? My symptoms are the same as everyone else’s: one full CPU core pegged and it doesn’t respond to any stop commands. I am running inside a Docker container, and I can see that it’s doing only a very small amount of disk and network I/O.
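
(One easy way to watch those per-container numbers, in case it helps anyone else reproduce this; “duplicati” is just a placeholder for your container name:)

# live CPU %, memory, network I/O and block I/O for the container
docker stats duplicati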

Edit: Update: After about 16 hours, the backup has continued, but it’s not going well. The backup is only sending up chunks once every 5 minutes or so, with huge periods of inactivity between them, as you can see in this screenshot:

The CPU is pegged the whole time, so it does seem to be doing something. For the first week or so of this backup, I was getting a constant 3MB/s upload, which is what I have the cap set to, so it does seem like something changed.

Hi @OverloadUT, welcome to the forum!

This is still part of your initial backup, right?

If so, I suspect your initial 3MB/s upload speed (which I assume was fairly constant / frequent) was because pretty much every block found needed to be backed up.

But as more and more blocks are sent, it becomes more and more likely that new blocks are duplicates of existing content so you could now be spending more time identifying duplicate blocks than uploading new ones.

If that’s what’s going on, then there’s not much you can do to speed things up. However, if it’s just a general performance issue related to compression then you could try using a lower compression setting (of course that means more files and thus more bandwidth and destination requirements).

At this point it’s probably too late to try, but if you have the local storage you could “seed” your entire backup to a local drive, then manually upload it to your destination during whatever billing periods make sense. Then point your backup job from the local destination to the cloud one and let it continue with deltas only from there.
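
Just to sketch what that could look like in practice (all paths, the remote host, and the copy tool are placeholders; this is one way to do it, not the only way):

# 1. Seed the backup to a local folder first (same command-line style as the
#    export shown earlier in this thread, with most options trimmed for brevity):
mono "/usr/lib/duplicati/Duplicati.CommandLine.exe" backup \
    "file:///mnt/seed/duplicati/" "/path/to/source/" \
    --dblock-size="500MB"
# 2. Copy the finished dblock/dindex/dlist files to the real destination during
#    whichever billing period makes sense (rsync shown; rclone etc. work too):
rsync -av /mnt/seed/duplicati/ remote-host:/backups/duplicati/
# 3. Re-point the job's destination URL at the remote location; future runs
#    should only upload new or changed data from there.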

Oh - and you didn’t mention what version of Duplicati you are using, but a new Canary was just released with a lot of performance improvements. Of course, they’re mostly designed around concurrent processing, and since your CPU is already fully utilized it may not make a difference…
