Test command produces "Extra" errors

Duplicati version 2.0.6.3
Windows 10

I use Duplicati to back up to two local external USB drives. Each drive has three separate backup configurations, because I use different retention rules for different sets of files, so I have six configs in total.

One of my external drives recently failed and I replaced it. I could have just started backing up to the new drive, but I wanted to preserve the history (old versions) on the new drive if I could. Here is what I did:

  1. Copied the entire back-end Duplicati folder for each backup config from the old, still-working drive (call it drive S) to my new drive (call it drive T).

  2. Used WinMerge to do a bit-for-bit comparison of these files on drives S and T to confirm that they are identical.

  3. In the Duplicati GUI, for the three configs on drive T, I did Advanced > Database > Recreate to rebuild the database for each config.

  4. In the Duplicati GUI, for the three configs on drive S and the three configs on drive T, I did Advanced > Commandline and configured a “test” command with the argument “all” and “full-remote-verification” checked.

For all three of the drive S configurations and two of the configurations on drive T, this worked fine; the test reports no errors. But for the third drive T configuration, I get “Extra” errors. For example:

duplicati-i647896ef3b4347c093c19ccc6578f06a.dindex.zip.aes: 611 errors
Extra: +2BS4TYqLHBK5XBrbcZgf1D3YPaxhxIpVD3KJd/GQJM=
Extra: +39M6k6qC1X8BZR7C7a1qhs2RKo5adQgDNy7QbNlEA0=
Extra: +3VYSxx32ldoTO4F6hQJiWxG1ehwMTcFCYAsrkAt7HA=
Extra: +C7AxZjSrWQ26GoVKDZ2aq3NjaLf5Oy/vpjJPa0WliY=
Extra: +E4E+vyz+Z/wrXmm8Q2LPWF7vqV2wVN+kV1qpCEGrxY=
Extra: +IsV5Cf2RP17rJpsILDwWoU6pINbH7g4q4p39xAF6F4=
Extra: +i0Q4jwhe/vATlnApoB2uk6al5DJaWNGaSnkTIAR9Y0=
Extra: +kv5NRAN7MOcQMv/PgnNUGzi7FNY1VyBz7auYQj6P+c=
Extra: +oiUawFK9is8BuTDKW49KbuDL1K4EPnx6nU6wStBDxY=
Extra: +xHRCXnxuL0Ws241CYRstSQEqvrHQLJkQ8MJZ67A1sg=
… and 601 more

Note that I do not get these errors running test on drive S, but I do get them running test on drive T, even though the Duplicati files are identical between the two drives.

What could cause this, and how can I fix it?

Thanks,

Welcome to the forum @dm413

Maybe “all” excludes files in certain unusual states that are recorded only in the old database?
To really see that, database bug reports from the two would probably need to be compared.
DB Browser for SQLite can be used if you want to do it yourself with some guidance.

If you’ve already gone ahead and begun backups, you can set --upload-verification-file.
This will create a duplicati-verification.json file in the destination folder. You can look up
the file with the “Extra”s in it and see whether its state (and maybe more) is the same between drives.
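If it helps, here is a minimal sketch of looking up that record with Python. I am assuming the verification file is plain JSON holding a list of remote volume entries with at least Name and State fields; check the actual file, since the exact layout may differ:

    # Hedged sketch: look up one remote file in duplicati-verification.json.
    # Assumes the file is a JSON list of entries with "Name" and "State" keys;
    # verify against your actual file, the field names here are a guess.
    import json
    import sys

    path = sys.argv[1]   # e.g. the duplicati-verification.json on drive T
    name = sys.argv[2]   # the dindex file name from the error output

    with open(path, encoding="utf-8") as fh:
        entries = json.load(fh)

    for entry in entries:
        if entry.get("Name") == name:
            print(entry)   # compare this record between drive S and drive T

Running it against the file on each drive and comparing the two records would show whether they disagree.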

A good place to see how many files were examined would have been at the end of a test run:

Running commandline entry
Finished!

            
  Listing remote folder ...
  Downloading file (643 bytes) ...
  Downloading file (643 bytes) ...
  Downloading file (567 bytes) ...
  Downloading file (607 bytes) ...
  Downloading file (509 bytes) ...
  Downloading file (677 bytes) ...
Examined 6 files and found no errors
Return code: 0

but one can also look at the job logs and take a guess about the activity from the Complete log section:

“KnownFileCount”: 6
“FilesDownloaded”: 6,

If it looks like it actually tested the file in the no-error run, then database investigation may be needed.

FWIW, I did a test restore of both the S and T backups to different locations. I used --no-local-blocks to make sure I was retrieving files from the external drives. I then compared the restored files, and they were identical. So both backups are there and restore the same files. Of course this only compared the current files, not old versions.

The end of the test report for drive S (the drive without any “extra” errors) said:

Downloading file (249.90 MB) …
Downloading file (249.96 MB) …
Examined 341 files and found no errors
Return code: 0

There are in fact 341 files in my back-end folder on drives S and T – nice to see they agree.

I created database bug reports for the two drives. I opened the resulting sqlite files in DBeaver, but I don’t know what to look for. The Block table for the T drive has more rows than the one for the S drive (1,044,419 vs. 1,031,401). Can you tell me what to look for? Or, if I upload them to Dropbox and provide a link, is there anyone who would look at them? They are about 230 MB each, zipped. Can I provide that link via PM or email rather than posting it in a public forum?
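In case it’s useful, here is a rough sketch of how I think the two Block tables could be diffed with Python’s sqlite3, working on copies of the databases (the table and column names are what I see in DBeaver, and the file names are placeholders, so treat this as illustrative):

    # Hedged sketch: list block hashes that are in the T database's Block table
    # but not in the S database's. Run against copies of the databases.
    import sqlite3

    con = sqlite3.connect("T-copy.sqlite")          # placeholder file names
    con.execute("ATTACH DATABASE 'S-copy.sqlite' AS s")

    rows = con.execute("""
        SELECT Hash FROM main.Block
        EXCEPT
        SELECT Hash FROM s.Block
    """).fetchall()

    print(len(rows), "hashes only in the T database")
    for (h,) in rows[:10]:
        print(h)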

Thanks for your help,

This worries me a bit. How’s your ability to follow C# code? I think it’s here called from here.

Complete or a sample?

How deep do you want to go? You found the Block table, which is great, so maybe you’re comfortable.

A user-level overview of the backup tech is here, the database diagram is here, and a textual description is here.

What I had in mind for an easy start was to just check the State of that dindex in the Remotevolume table.
You can filter the Type field to see if Files and Blocks come in the same amounts (maybe all in State Verified).
There was a debugging attempt once due to uneven external file counts, and the DB browser portion of it began here.
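If a query is easier than clicking through filters, something like this would summarize it (a sketch against a copy of the database, assuming the usual table and column names; the file name pattern is just the prefix from your error output):

    # Hedged sketch: summarize Remotevolume by Type and State, then show the
    # record for the complaining dindex. Assumes the standard schema names.
    import sqlite3

    con = sqlite3.connect("copy-of-database.sqlite")   # placeholder name

    for row in con.execute(
            "SELECT Type, State, COUNT(*) FROM Remotevolume GROUP BY Type, State"):
        print(row)

    print(con.execute(
        "SELECT Name, Type, State FROM Remotevolume WHERE Name LIKE ?",
        ("duplicati-i647896ef%",)).fetchall())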

I don’t have a complaining dindex file handy, but I would guess that the output is the hashes in the Block table.
A dindex file is associated with a dblock file via the IndexBlockLink table, and it lists the hashes that the dblock contains.

The question is whether you got some sort of bogus dindex that can be ignored, or whether blocks are really missing.
Blocks form blocksets, which form files (among other things), so one wants all the blocks to assemble.
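Roughly, the hashes the database expects that dindex to list could be pulled like this (again a sketch assuming the standard schema; anything the dindex file lists beyond these would presumably show up as “Extra”):

    # Hedged sketch: for the complaining dindex volume, follow IndexBlockLink
    # to the dblock it describes and list the block hashes the database
    # expects in that dblock. Assumes the standard schema names.
    import sqlite3

    con = sqlite3.connect("copy-of-database.sqlite")   # placeholder name
    expected = con.execute("""
        SELECT b.Hash
        FROM Remotevolume i
        JOIN IndexBlockLink l ON l.IndexVolumeID = i.ID
        JOIN Block b          ON b.VolumeID = l.BlockVolumeID
        WHERE i.Name LIKE 'duplicati-i647896ef%'
    """).fetchall()
    print(len(expected), "hashes expected in the linked dblock")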

You can PM me. May other Duplicati staff look at it too, if need be?

The test restore was complete, 45 GB.

The Remotevolume table shows 341 rows (the same as the number of files in the back end), and all have State Verified. There are 5 with Type File, 168 with Type Block, and 168 with Type Index. This is the same for both backups (S and T).

What I find confusing is that it’s the backup where I rebuilt the database (T) that has the extra errors. The original backup (S) is fine. I would have guessed it would be the other way around.

The DeletedBlock table for drive S has 18,362 rows. The DeletedBlock table for drive T is empty. Is this significant?

I sent you a PM with a link.

Empty might be due to its definition:

because the wasted-space computation sounds like the after-backup, after-version-delete computation that decides whether or not to compact to reduce wasted space. I don’t know the code well enough to say whether that is why the Block tables differ, but it’s a clue to keep in mind.

BTW, I’m not sure the rebuild was needed. The database is independent of the destination folder; it records just file names. Still, you might have found something interesting, and you’ve been helpful in supplying some materials to look into.

I don’t know. Wasted space compaction would affect the back-end files, I would think, and that hasn’t happened here. The S and T back ends are still identical.

That’s interesting. I think I had to do something, because my database before the rebuild was for the backend on the old drive T that crashed, so that database would be wrong for the backend after copying it from drive S to the new drive T. But it sounds like I could have copied the database file (in %LocalAppData%/Duplicati) that corresponds to the drive S back end onto (replacing) the file that corresponds to the drive T back end. Do I understand you correctly? I could try that and then run a Test command again if you’d like.

This seems to imply that the rebuild is at fault and the database didn’t get correctly generated. That’s a little scary. And it’s not a consistent failure, because I had two other backup configs where I did the same thing, and they worked correctly with no “Extra” errors after the rebuild.

Also, this backup configuration has a retention policy of “keep all backups”. How can it have any deleted blocks at all, ever? Clearly I don’t understand something.

First let me make sure I understand you. You lost a drive, but you want to seed its replacement with whatever history remains on the surviving drive, even if it’s not the exact history (backup dates, etc.) of the broken drive.

To do this, you clone all the destination files, but instead of cloning the database that manages those files, you recreate the database. The database that managed the broken drive is, of course, no longer useful…

I’m trying to think of why a clone of the surviving drive’s database couldn’t manage the clone of the surviving drive’s files, and I’m having a hard time coming up with a reason. One bonus (maybe) is that you will see the original logs until they age off.

If all that makes sense, feel free to give it a try.

Another possibility is that the mysterious (to me) DeletedBlock information makes a difference to the Compare:

(which I find hard to read – consider this highly tentative – but you can see DeletedBlock in all that SQL)

While source-searching for DeletedBlock, it looks like (as expected) version deletions put entries in the table.
What’s different with yours compared to my backup is that my block hashes are either there or in the Block table, not both.
Your database has at least some (maybe all) of the hashes named in the error message in both tables…
Before becoming alarmed, remember that I don’t understand the table. I’m just documenting a difference.
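For the record, roughly what I checked, as a sketch against a copy of the database (standard schema names assumed):

    # Hedged sketch: count hashes that appear in both Block and DeletedBlock.
    # In my own backup, each hash is in one table or the other, not both.
    import sqlite3

    con = sqlite3.connect("copy-of-database.sqlite")   # placeholder name
    (both,) = con.execute("""
        SELECT COUNT(*) FROM (
            SELECT Hash FROM Block
            INTERSECT
            SELECT Hash FROM DeletedBlock
        )
    """).fetchone()
    print(both, "hashes present in both Block and DeletedBlock")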

There were some purge commands run after the first backup. Those sound like they might delete blocks, produce wasted space for compact to clean up, etc., but I haven’t found actual source code to prove that.

I suppose you could look at the databases that tested OK. What’s in DeletedBlock and Operation tables?
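Something like this would dump the operation history and the DeletedBlock count for comparison across the databases (again a sketch with assumed schema names; Timestamp appears to be a Unix time):

    # Hedged sketch: list the Operation history and count DeletedBlock rows.
    # Assumes Operation has ID, Description, Timestamp (Unix seconds).
    import sqlite3

    con = sqlite3.connect("copy-of-database.sqlite")   # placeholder name

    for row in con.execute(
            "SELECT ID, Description, datetime(Timestamp, 'unixepoch') FROM Operation"):
        print(row)

    print(con.execute("SELECT COUNT(*) FROM DeletedBlock").fetchone()[0],
          "rows in DeletedBlock")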

Actually, I think the code I saw earlier was it. When I tested purge, it moved blocks from the Block table into the DeletedBlock table. It rewrote the dlist file (as expected, because there were fewer files after the purge), but incremented the timestamp in the file name by one second (not too surprising – it does this on upload retries too).

Basically the purge deleted the old version and created a purged version which is technically a new version.

Running test on all files with --full-remote-verification was fine, and (unfortunately for trying to catch the bug) it was also OK after a database Recreate, which filled the Block table but (not surprisingly) not DeletedBlock.

Before I go see whether having a compact run will cause trouble: do you have the --no-auto-compact option set or not?

I think you’re on to something.

I do not have --no-auto-compact set.

I had forgotten about the Purges until you mentioned them. As I recall, when I first set up this backup configuration, I did a backup and then realized there were some large files in the backup that I didn’t need backed up. I modified the config to exclude them, and then purged the files from the backup. I don’t think I did any purges with the other backup configs, but I haven’t yet created bug reports to verify that.

I am testing using a copy of my S drive database. Test with --full-remote-verification reports no errors. I am currently doing a full restore to verify that it works as expected.


The bug reports are just sanitized (hopefully) versions for more public sharing. For your own use, you could work on a copy of the database, or it’s probably safe to open the original database read-only to look around. Mine is open that way a lot. I do need to refresh to stay current, and to let go of it if I ever want Duplicati to delete the DB.

Oh, that’s good to know, saves me some time. I’m paranoid, so I made a copy of the database and then opened that in read-only mode.

I confirmed that the other backups do not have any purges in their Operation tables. The S drive databases (original backups) show only Backup operations. The T drive databases (rebuilt) show just three operations: Recreate, Repair, and Test.

So it’s only the backup database with purges in it that has a problem.

Nice progress on assembling clues in the search for a path to a reproducible problem. Help is more than welcome. I’ve been using a contrived test setup with small files (it’s easy to recognize blocks from a file when they’re only 1 byte long; the blocks that are about 137 bytes are metadata such as file/folder timestamps).

I’ve also tried compact, and haven’t found the right sequence of steps yet. Duplicati also ages out old data, such as the RemoteOperation table, after 30 days. I looked for the origins of your bad dindex, but the record was gone.

I did a complete restore from the new drive T, using a clone of the S database. I have --no-local-blocks set in settings. This restored all files, and those files match exactly when compared with previous restores. Again, this doesn’t test old versions of files, but it tests most of the files in the backup, and the test command (with --full-remote-verification) does not report any errors.

I think this means I can use this database going forward. Do you agree?

I have also been thinking about making a smaller reproducible problem. My thought was to try:

  1. Create a backup config that backs up a folder with two files of, say, 1 MB each (or even 200 KB, as long as it’s bigger than the block size). See the sketch after this list for how I’d generate them.
  2. Do a backup.
  3. Modify the config to exclude one of the files.
  4. Purge that file from the backup.
  5. Do a test and confirm there are no errors.
  6. Rebuild the database.
  7. Do a test and see whether I get “Extra” errors.
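To make the blocks easy to track, I plan to generate the two source files with random data, something like this (sizes are arbitrary, just larger than the default 100 KB blocksize; folder and file names are placeholders):

    # Hedged sketch: create two 1 MB files of random data so that every
    # 100 KB block (the default blocksize) is unique in the backup.
    import os

    os.makedirs("repro-source", exist_ok=True)
    for name in ("keep.bin", "purge-me.bin"):
        with open(os.path.join("repro-source", name), "wb") as fh:
            fh.write(os.urandom(1024 * 1024))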

I will try this. Does this sound useful? Do you have any suggestions for the procedure?

It passes all the tests (including some unusually thorough but good restore testing), so I don’t know why not.

An interesting test might be to see if deleting the DeletedBlock table content brings back the test “Extra” complaints. This shouldn’t change anything on the destination, and after that test you can put the “quiet” database back again.

That’s exactly what leads to fixes. It needs to be something a developer can set up easily and go debugging.

You’re kind of retracing what I did, except that I did a purge, and then instead of an exclude I deleted the source file to keep it from coming back in on the next backup. I did tests or recreates here and there. Exact steps matter, and your results could easily be different. I would say try to approximate what you did on the actual backups, unless you happen to find the magic formula with the contrived test. Great logs can also be hugely valuable, though they can also be huge; but if a reproducible case is found, someone else can gather those during the debugging.

Thanks so much for helping chase this.

Good idea. I tried this, and it does produce “Extra” errors in the test.

For the record, I:

  1. Opened the database in DBeaver
  2. Ran DELETE FROM DeletedBlock on the database.
  3. In Duplicati, ran Test all with --full-remote-verification on the same database.

However, I also checked the DeletedBlock tables in my backups that don’t produce errors. On drive S, they contain records (not because of a purge, but because those backups have a retention policy). On drive T, with the rebuilt database, the DeletedBlock tables do not contain any rows, but I do not get Extra errors when testing them.

Unfortunately, I have not been able to reproduce the problem. I ran the test I outlined above and did not see any “extra” errors after the rebuild. I tried variations of that sequence, without seeing any errors.

I think I have run out of things to try. After cloning the “good” database I no longer get the “extra” errors, and I am able to restore all files, so my immediate problem is solved. But it is concerning that a database rebuild causes errors to appear.

FYI

test all with full-remote-verification shows “Extra” hashes from error in compact #4693

is my explanation for this issue in my own backup, where it went unnoticed until I started looking for it after each backup.
Tracking it down fairly well took quite a bit more work. The bottom line is that it looks like benign noise.

Even though it appears harmless, it has the ability to survive a Recreate because it’s in the backup itself.
You might find that these errors come and go, because future compact operations can do that.


I just personally seriously dislike that the “Extra” entries are classified as errors. It should just be a notification: unnecessary extra blocks in a file – totally normal, technically meaningless, and will be ignored.