Problems with repair


#1

Hi,

I’ve just finished a very long upload to S3. At the end of it, it turned out that one or two volume files had problems - no doubt due to our slightly unreliable internet connection. IIRC duplicati reported something about invalid headers for those files. Foolishly, I thought the best thing to do would be to delete them manually from S3 and then rerun. Obviously this caused an error in duplicati and prompted me to repair to fix the “missing remote file”. I tried this, but the repair failed and just reported the missing remote file again. So then I progressed to running recreate, but this fails with exactly the same error. I’m puzzled because I thought the whole point of repair/recreate was to fix these problems! What do I do now? I don’t have to have to upload all 180GB over our shakey Wimax connection again!

Here’s the result of the recreate operation:

MainOperation: Repair
RecreateDatabaseResults:
MainOperation: Repair
ParsedResult: Error
EndTime: 09/04/2018 08:05:08 (1523253908)
BeginTime: 09/04/2018 07:39:43 (1523252383)
Duration: 00:25:25.1435990
Messages: [
    2018-04-09 07:39:43 +02 - [Information-Duplicati.Library.Main.Controller-StartingOperation]: The operation Repair has started,
    2018-04-09 07:39:43 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Started:  (),
    2018-04-09 07:39:48 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Completed:  (6.12 KB),
    2018-04-09 07:40:40 +02 - [Information-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-RebuildStarted]: Rebuild database started, downloading 3 filelists,
    2018-04-09 07:40:40 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Started: duplicati-20180106T133944Z.dlist.zip.aes (765 bytes),
...
]
Warnings: []
Errors: [
    2018-04-09 08:00:53 +02 - [Error-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-MissingFileDetected]: Remote file referenced as duplicati-bec141752409545c1b6bd36f65f734722.dblock.zip.aes, but not found in list, registering a missing remote file
]
BackendStatistics:
    RemoteCalls: 3136
    BytesUploaded: 0
    BytesDownloaded: 125242963
    FilesUploaded: 0
    FilesDownloaded: 3135
    FilesDeleted: 0
    FoldersCreated: 0
    RetryAttempts: 0
    UnknownFileSize: 0
    UnknownFileCount: 0
    KnownFileCount: 0
    KnownFileSize: 0
    LastBackupDate: 01/01/0001 01:00:00 (-62135596800)
    BackupListCount: 0
    TotalQuotaSpace: 0
    FreeQuotaSpace: 0
    AssignedQuotaSpace: 0
    ReportedQuotaError: False
    ReportedQuotaWarning: False
    ParsedResult: Error
    Messages: [
        2018-04-09 07:39:43 +02 - [Information-Duplicati.Library.Main.Controller-StartingOperation]: The operation Repair has started,
        2018-04-09 07:39:43 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Started:  (),
        2018-04-09 07:39:48 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Completed:  (6.12 KB),
        2018-04-09 07:40:40 +02 - [Information-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-RebuildStarted]: Rebuild database started, downloading 3 filelists,
        2018-04-09 07:40:40 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Started: duplicati-20180106T133944Z.dlist.zip.aes (765 bytes),
...
    ]
    Warnings: []
    Errors: [
        2018-04-09 08:00:53 +02 - [Error-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-MissingFileDetected]: Remote file referenced as duplicati-bec141752409545c1b6bd36f65f734722.dblock.zip.aes, but not found in list, registering a missing remote file
    ]
ParsedResult: Error
EndTime: 09/04/2018 08:05:09 (1523253909)
BeginTime: 09/04/2018 07:39:43 (1523252383)
Duration: 00:25:25.8491800
Messages: [
2018-04-09 07:39:43 +02 - [Information-Duplicati.Library.Main.Controller-StartingOperation]: The operation Repair has started,
2018-04-09 07:39:43 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Started:  (),
2018-04-09 07:39:48 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Completed:  (6.12 KB),
2018-04-09 07:40:40 +02 - [Information-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-RebuildStarted]: Rebuild database started, downloading 3 filelists,
2018-04-09 07:40:40 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Started: duplicati-20180106T133944Z.dlist.zip.aes (765 bytes),
...
]
Warnings: []
Errors: [
    2018-04-09 08:00:53 +02 - [Error-Duplicati.Library.Main.Operation.RecreateDatabaseHandler-    MissingFileDetected]: Remote file referenced as duplicati-bec141752409545c1b6bd36f65f734722.dblock.zip.aes, but not found in list, registering a missing remote file
]

#2

Do you recall with what error the repair failed?

There have been a few reports lately of database recreates NOT actually solving a missing file issue, but I’m pretty sure all the successful database REPAIRS have handled it correctly.

Did you try a repair again after the recreate finished? Hopefully that will do the trick. (And yes, we realize this isn’t the best user experience - once we can pin down the exact cause of the issue we’ll hopefully be able to fix it.)


#3

Hi,

Unhelpfully the repair just fails with an “error” but records nothing in the log - whether or not I run the recreate first. However when I turn on verbose logging to a file, I get this from the repair operation:

2018-04-11 08:30:58 +02 - [Information-Duplicati.Library.Main.Controller-StartingOperation]: The operation Repair has started
2018-04-11 08:30:58 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Started:  ()
2018-04-11 08:31:05 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Completed:  (6.12 KB)
2018-04-11 08:31:05 +02 - [Error-Duplicati.Library.Main.Operation.RepairHandler-CleanupMissingFileError]: Failed to perform cleanup for missing file: duplicati-bec141752409545c1b6bd36f65f734722.dblock.zip.aes, message: Unexpected empty block volume: duplicati-bec141752409545c1b6bd36f65f734722.dblock.zip.aes
System.Exception: Unexpected empty block volume: duplicati-bec141752409545c1b6bd36f65f734722.dblock.zip.aes
  at Duplicati.Library.Main.Database.LocalRepairDatabase+MissingBlockList..ctor (System.String volumename, System.Data.IDbConnection connection, System.Data.IDbTransaction transaction) [0x000ac] in <3ff9b0af6fdf48d2af885dde3389fb14>:0 
  at Duplicati.Library.Main.Database.LocalRepairDatabase.CreateBlockList (System.String volumename, System.Data.IDbTransaction transaction) [0x00000] in <3ff9b0af6fdf48d2af885dde3389fb14>:0 
  at Duplicati.Library.Main.Operation.RepairHandler.RunRepairRemote () [0x00bb9] in <3ff9b0af6fdf48d2af885dde3389fb14>:0

Any ideas how I sort this out?


#4

Unfortunately, I haven’t run into this error before…and while I see it in the code, but it’s not an area I’ve worked on before. Perhaps @kenkendk will have a suggestion…


#5

This happens because the file in question appears to have no content, which should not happen (meaning, Duplicati should not create such a file).

Does the file exist on S3?
If yes, what size is it? What happens if you remove it from S3 (please keep a copy elsewhere)?
If no, I think the problem is that Duplicati somehow “knows” that the file should be there even though it is not. When it then tries to figure out what contents are in that file, it fails (should be fixed in code then).


#6

Hi @kenkendk,

The file did exist, and was causing errors - I think something about a bad header but I’m afraid I don’t remember exactly. On inspection it turned out that the file was significantly smaller than any of the other volumes - possibly zero bytes, but again I’m afraid I don’t remember exactly. I decided to just delete it in the hope that duplicati would work out what would need re-uploading afterwards. Since then I’ve been getting the errors mentioned in this thread. Right now, the file does not exist on S3 according to the AWS web interface. I don’t have S3 versioning turned on so I can’t see any reason why S3 should report it as present even when it has been deleted. The files are all stored with the “infrequent access” policy if that makes a difference.

Let me know if there’s anything else I can do to help debug this.

Cheers,


#7

Sorry to bump @kenkendk, but is there any chance of some advice on what to do next on this? I have a non-functional backup right now that will take a month to recreate if I have to delete and start again. Should I raise a bug for this or would you prefer it handled on here? I’m just keen that it doesn’t end up getting lost because right now I’m totally stuck :frowning:

Thanks,


#8

@kenkendk @JonMikelV Hey guys, any suggestions about what I should do next here? Should I be filing a bug? I’ve just tried with the latest canary and I still can’t fix the problem with any combination of repair/recreate.

Is there some SQL I could run against the DB to purge the relevant record(s) for the missing file to get my backup working again?

Cheers,


#9

Is the duplicati-bec141752409545c1b6bd36f65f734722.dblock.zip.aes still causing problems or are there move files involved?

If the issue now is that duplicati-bec141752409545c1b6bd36f65f734722.dblock.zip.aes is missing, a database recreate (or maybe repair) might resolve it.

If not, then you could try purging the files associated with the deleted dblock file from the backup so Duplicati won’t be trying to use it anymore…


#10

Hi,

It’s still that same file that is causing the problem. I’ve tried every combination of recreate/repair I can think of and I always get the same problem (as outlined upthread). I’m not sure what you mean by “purging the files associated with the deleted dblock file from the backup” - can you elaborate on what this entails and how I’d go about doing it?

Thanks,


#11

Ok, so I think I may have solved this (for the time being). I broke out sqlite3 and actually started digging in the local database. It turns out that the Remote volume duplicati-bec141752409545c1b6bd36f65f734722.dblock.zip.aes had no blocks associated with it, and consequently no files either. So I took a copy of the sqlite file for my backup and just deleted the relevant entries from Remotevolume and IndexBlockLink.

My backup now runs to completion, which is good, but I have some residual concerns here:

  1. Obviously it’s possible to get into this state without doing anything too crazy, and the app seems incapable of recovering from it. It would seem like a good idea to put some cleanup code in to cope with these empty volumes if they somehow happen, rather than having it break the whole backup
  2. I’m worried that the next time I have to recreate/repair I may end up with the problem again. I’m guessing that the volume is somehow referenced in the encrypted file/block lists stored on the remote storage (there was an entry in IndexBlockLink for the empty volume), and I don’t understand enough about how duplicati works to know how/when these remote indices get updated. I guess addressing point (1) would probably be a good way to ensure that this doesn’t become a problem in the future.

Let me know if there’s anything else I can do to help address any of this.

Cheers,


#12

Doooh. We should detect and remove such files when recreating the database, and possibly also during the repair step. To prevent it, we need to figure out when Duplicati would create such files.

The dindex files are created at the same time as the dblock files, and are just a map of what is inside the dblock file. Having an empty dblock is a strange situation, but my best guess is that it happens as part of the compact operation, where it somehow ends up with the last file being empty, and instead of just ignoring it, it uploads the file, causing trouble.

If you can figure out when duplicati-bec141752409545c1b6bd36f65f734722.dblock.zip.aes was created (look at the OperationID in Remotevolume and then look in the Operation table to see what operation created it) we can perhaps zoom in on where it happens.


#13

Hi @kenkendk,

Thanks for the response. As I said in my previous message, I’ve found a temporary solution by manually editing the SQLite database to remove the offending record, but I remember checking on the operation that created it - which was a “Recreate”. Unfortunately, given that I’ve done many rounds of recreation to try and fix this, we’ve lost the information about the original creation of the bad volume, but what is clear is that it is being regenerated during each recreate. Obviously the ideal thing here would be to prevent the original problem from happening, but unless there are backups of the previous SQLite databases from before the various recreates I’ve done, I’m not sure we’re going to be able to get the information about what happened there. If I see the same behaviour on a different backup I now know more about where to look so I’ll try and get some more debug info on that.

In the meantime, it would seem like a good idea to put some sort of check in to prevent the recreate operation re-making these empty volume records. I presume (from very limited knowledge) that there must be something in one of the files on S3 (presumably the dlist files?) that references the missing dblock file which is causing the remotevolume record to be recreated even though there’s no file corresponding to it? I’m going to try and do another recreate now to see if the problem comes back and I’ll report back.

Cheers,


#14

Ok, that indicates that the empty data is stored in the dindex files.

Yes. We should both remove them on recreate, but also add a check so that we can catch it when they are created created and prevent it.


#15

Hi,

I can confirm that after running another recreate, the problem recurs, so there’s clearly still something awry with the data on S3. For the time being I can remove the offending record from the DB manually again, but obviously for the future it would be good to have something that can fix the remote data in this case so that future recreate operations work correctly.

Cheers,


#16

If you delete (or better yet, move to another folder) the “bad” fileset (or at least dlist and the empty dblock files) then did a repair wouldn’t that recreate things without the empty file?

Of course I’m no expert and this is just MY laymans theory, so if you plan to try this before @kenkendk comments on the idea be sure to back up the dlist, your local database, and anything else you might be worried about losing. :slight_smile:


#17

Hi,

My concern about doing this is that I’m not sure how many files each dlist references and I really don’t want to risk having duplicati try to re-upload a meaningful proportion of my backup since it took many weeks to get all the data up to S3 in the first place. Right now I have a functional workaround by hacking the local db, and can happily wait until a new version arrives that cleans the problem up more correctly. Leaving it as it is also has the secondary benefit of providing a real testcase when such a new version does arrive…

Cheers,


#18

Sounds like some pretty good reasons to wait for an official solution. :wink: