Without changing any settings (still at 50MB), last night’s scheduled backup failed:
System.InvalidOperationException: Previous volume not finished, call FinishVolume before starting a new volume
at Duplicati.Library.Main.Volumes.IndexVolumeWriter.StartVolume (System.String filename) [0x00010] in <ae134c5a9abb455eb7f06c134d211773>:0
at Duplicati.Library.Main.Operation.BackupHandler.UpdateIndexVolume () [0x00043] in <ae134c5a9abb455eb7f06c134d211773>:0
at Duplicati.Library.Main.Operation.BackupHandler.FinalizeRemoteVolumes (Duplicati.Library.Main.BackendManager backend) [0x0016b] in <ae134c5a9abb455eb7f06c134d211773>:0
at Duplicati.Library.Main.Operation.BackupHandler.Run (System.String[] sources, Duplicati.Library.Utility.IFilter filter) [0x008e0] in <ae134c5a9abb455eb7f06c134d211773>:0
at Duplicati.Library.Main.Controller+<>c__DisplayClass17_0.<Backup>b__0 (Duplicati.Library.Main.BackupResults result) [0x00036] in <ae134c5a9abb455eb7f06c134d211773>:0
at Duplicati.Library.Main.Controller.RunAction[T] (T result, System.String[]& paths, Duplicati.Library.Utility.IFilter& filter, System.Action`1[T] method) [0x0014b] in <ae134c5a9abb455eb7f06c134d211773>:0
at Duplicati.Library.Main.Controller.Backup (System.String[] inputsources, Duplicati.Library.Utility.IFilter filter) [0x00068] in <ae134c5a9abb455eb7f06c134d211773>:0
at Duplicati.Server.Runner.Run (Duplicati.Server.Runner+IRunnerData data, System.Boolean fromQueue) [0x00494] in <670b0e4a9ae144208688fcb192d20146>:0
At this point it’s running every night, and has failed four of the past five days. Is there additional logging or something I can turn on to diagnose this, so it can get fixed for real?
Probably the easiest way to approach this is to generate a log file (for your version I think you’d use --log-file=<path> and --log-level=profiling) then attach it to a post.
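If you run the job from the command line, that would look roughly like the sketch below (the target URL, source path, and log path are placeholders; for a scheduled job the same two options can instead be added as advanced options on the job):

duplicati-cli backup "b2://bucket/folder" /path/to/source \
    --log-file=/tmp/duplicati-profiling.log \
    --log-level=profiling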
The profiling logs helped. The error (not in the logs, but in the generated email body):
C: Failed: Found inconsistency in the following files while validating database:
C: /private/var/folders/zz/zyxvpxvq6csfxvn_n00000s0000068/T/ProductMetadata.plist, actual size 1120007, dbsize 0, blocksetid: 490127
C: .. Run repair to fix it.
The query which seems to generate this, from the logs:
SELECT "CalcLen", "Length", "A"."BlocksetID", "File"."Path" FROM (
SELECT
"A"."ID" AS "BlocksetID",
IFNULL("B"."CalcLen", 0) AS "CalcLen",
"A"."Length"
FROM
"Blockset" A
LEFT OUTER JOIN
(
SELECT
"BlocksetEntry"."BlocksetID",
SUM("Block"."Size") AS "CalcLen"
FROM
"BlocksetEntry"
LEFT OUTER JOIN
"Block"
ON
"Block"."ID" = "BlocksetEntry"."BlockID"
GROUP BY "BlocksetEntry"."BlocksetID"
) B
ON
"A"."ID" = "B"."BlocksetID"
) A, "File" WHERE "A"."BlocksetID" = "File"."BlocksetID" AND "A"."CalcLen" != "A"."Length" took 00:01:10.704
So I started poking into the sqlite db:
sqlite> select * from File where Path = '/private/var/folders/zz/zyxvpxvq6csfxvn_n00000s0000068/T/ProductMetadata.plist';
192852|/private/var/folders/zz/zyxvpxvq6csfxvn_n00000s0000068/T/ProductMetadata.plist|118275|30435
570675|/private/var/folders/zz/zyxvpxvq6csfxvn_n00000s0000068/T/ProductMetadata.plist|118275|97969
572932|/private/var/folders/zz/zyxvpxvq6csfxvn_n00000s0000068/T/ProductMetadata.plist|490127|98805
sqlite> select * from Blockset where ID in (118275, 490127);
118275|1120009|mM3nZjV9aw/nk2nAZkqn/aMT1kzzY6Z9rHpwK5O0u6Q=
490127|1120007|ucXqV2XWpYqD7bGTnI99H80oPIlqG9HS8EpPpCq3E+A=
sqlite> select * from BlocksetEntry where BlocksetID in (118275, 490127);
118275|0|211318
118275|1|211319
118275|2|211320
118275|3|211321
118275|4|211322
118275|5|211323
118275|6|211324
118275|7|211325
118275|8|211326
118275|9|211327
118275|10|211328
Oh look. There appear to be no blockset entries for BlocksetID 490127.
I ran repair, as suggested. Afterwards, there are still no blockset entries for this BlocksetID. So the DB is corrupt, and repair doesn’t repair it.
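A query along these lines (column names inferred from the rows above) should list any other blocksets that have no entries at all; I’m not sure whether zero-length blocksets are supposed to have entries, so those are filtered out just in case:

sqlite> SELECT "Blockset"."ID", "Blockset"."Length"
   ...> FROM "Blockset"
   ...> LEFT OUTER JOIN "BlocksetEntry" ON "BlocksetEntry"."BlocksetID" = "Blockset"."ID"
   ...> WHERE "BlocksetEntry"."BlocksetID" IS NULL AND "Blockset"."Length" > 0;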
That error is removed in the latest canary build, AFAIK.
If you have the original database (not the recreated one), can you perhaps see when the empty blockset is created?
It could happen if there was some weird hash collision, I guess, but more likely something else has happened.
If you just want to recover, you should be able to use purge to remove the file and then run the backup again, but I would really like it if we could zoom in and find the cause of the empty set.
I do have the original database. But I’m not sure how to see when the empty blockset was created. The Blockset table doesn’t seem to have any kind of timestamp column. What should I look for?
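The closest thing I can think of (assuming FilesetEntry and Fileset link file entries to backup runs, and that Fileset."Timestamp" is a unix timestamp) would be to look at which backup versions reference the broken entry; the earliest hit would roughly date the empty blockset:

sqlite> SELECT "Fileset"."ID", datetime("Fileset"."Timestamp", 'unixepoch')
   ...> FROM "File"
   ...> JOIN "FilesetEntry" ON "FilesetEntry"."FileID" = "File"."ID"
   ...> JOIN "Fileset" ON "Fileset"."ID" = "FilesetEntry"."FilesetID"
   ...> WHERE "File"."BlocksetID" = 490127
   ...> ORDER BY "Fileset"."Timestamp";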
I tried to use purge to remove the broken file via the CLI, but now I get this:
/Applications/Duplicati.app/Contents/MacOS/duplicati-cli purge 'b2://...' /private/var/folders/zz/zyxvpxvq6csfxvn_n00000s0000068/T/ProductMetadata.plist --dry-run --dbpath=/var/root/.config/Duplicati/WCTNTGAABX.sqlite
Enter encryption passphrase:
No certificates found, you can install some with one of these commands:
cert-sync /etc/ssl/certs/ca-certificates.crt #for Debian based systems
cert-sync /etc/pki/tls/certs/ca-bundle.crt #for RedHat derivatives
Read more: http://www.mono-project.com/docs/about-mono/releases/3.12.0/#cert-sync
Starting purge operation
Unable to start the purge process as there are 527 orphan file(s)
Update "2.0.3.3_beta_2018-04-02" detected
I searched the forum and found some very old reports about orphan files and the cleanup command, but there is no such command in my CLI. I tried adding --auto-cleanup=true to my command line, but that didn’t help.
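For reference, if “orphan” here means File rows that no FilesetEntry references (an assumption on my part), a query like this would count them, and dropping the COUNT would list the paths:

sqlite> SELECT COUNT(*)
   ...> FROM "File"
   ...> LEFT OUTER JOIN "FilesetEntry" ON "FilesetEntry"."FileID" = "File"."ID"
   ...> WHERE "FilesetEntry"."FileID" IS NULL;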
I don’t know if it’s related, but I notice the output says there’s an update available. When I run duplicati-cli help, it ends with Version: - 2.0.2.1_beta_2017-08-01, which is surprising since the web UI says You are currently running Duplicati - 2.0.3.3_beta_2018-04-02. I ran duplicati-cli test --auto-update=true, which fetched the update. And coincidentally, this happened:
So there seems to be a problem with the upstream files, too. This all feels a lot like a backup was interrupted at some point and Duplicati did not recover well at all.
Since I figured I was likely to have to start all over again anyway, I took a look at the code. I think the sanity checks in the repair logic were keeping it from actually repairing my damaged database, so I downloaded the dlist file above, deleted it from B2, moved my sqlite file out of the way, and then ran repair. I did it from the CLI, which gives a little more detailed status output. This forced a much longer repair process than I’ve observed before, different from what the UI “recreate” process seems to do:
Rebuild database started, downloading 3 filelists
Downloading file (8.48 MB) ...
Downloading file (40.60 MB) ...
Downloading file (40.61 MB) ...
Mismatching number of blocklist hashes detected on blockset 488284. Expected 0 blocklist hashes, but found 1
Mismatching number of blocklist hashes detected on blockset 488453. Expected 0 blocklist hashes, but found 1
Filelists restored, downloading 707 index files
Downloading file (339.56 KB) ...
Downloading file (255.95 KB) ...
Downloading file (171.62 KB) ...
Downloading file (140.06 KB) ...
...
Recreate completed, verifying the database consistency
Recreate completed, and consistency checks completed, marking database as complete
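Roughly the sequence, for anyone following along (the target URL is elided as above; the dlist itself was deleted from B2 outside of Duplicati, after keeping a local copy):

# move the local database out of the way so repair has to rebuild it
mv /var/root/.config/Duplicati/WCTNTGAABX.sqlite /var/root/.config/Duplicati/WCTNTGAABX.sqlite.bak
# with no local database present, repair performs the full rebuild shown above
/Applications/Duplicati.app/Contents/MacOS/duplicati-cli repair 'b2://...' --dbpath=/var/root/.config/Duplicati/WCTNTGAABX.sqlite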
After this, I let the normal nightly backup run, and it completed successfully (with a few warnings) for the first time in almost a month. I’m hopeful it will complete tonight as well.
My take-away is that the tools need a way to force various kinds of checks and recovery, since different operations run different subsets of the consistency checks at different times.
As one very concrete example, I was in a catch-22 where I couldn’t purge due to orphans, but I couldn’t clean up the orphans because of a validation failure due to the file I needed to purge. PreBackupVerify is responsible for cleaning up orphans, but the failure at VerifyConsistencyAsync meant I had no way to get there, which is why I forced the system to recreate the db from scratch.
Failed: Previous volume not finished, call FinishVolume before starting a new volume
Details: System.InvalidOperationException: Previous volume not finished, call FinishVolume before starting a new volume
at Duplicati.Library.Main.Volumes.IndexVolumeWriter.StartVolume (System.String filename) [0x00010] in <ae134c5a9abb455eb7f06c134d211773>:0
at Duplicati.Library.Main.Operation.BackupHandler.UpdateIndexVolume () [0x00043] in <ae134c5a9abb455eb7f06c134d211773>:0
at Duplicati.Library.Main.Operation.BackupHandler.FinalizeRemoteVolumes (Duplicati.Library.Main.BackendManager backend) [0x0016b] in <ae134c5a9abb455eb7f06c134d211773>:0
at Duplicati.Library.Main.Operation.BackupHandler.Run (System.String[] sources, Duplicati.Library.Utility.IFilter filter) [0x008e0] in <ae134c5a9abb455eb7f06c134d211773>:0
at Duplicati.Library.Main.Controller+<>c__DisplayClass17_0.<Backup>b__0 (Duplicati.Library.Main.BackupResults result) [0x00036] in <ae134c5a9abb455eb7f06c134d211773>:0
at Duplicati.Library.Main.Controller.RunAction[T] (T result, System.String[]& paths, Duplicati.Library.Utility.IFilter& filter, System.Action`1[T] method) [0x00072] in <ae134c5a9abb455eb7f06c134d211773>:0
I deleted the backup set, deleted the databases, deleted the files on B2, restarted Duplicati, created a new backup by hand, and the backup failed with the “Previous volume…” error the very first time through. So we can be sure, at least, that the issue isn’t corruption or inconsistency from a previous backup.
I believe this is a bit of a catch-all error, where something else goes bad but that error isn’t caught, which then causes THIS to go bad, where the error is caught.
Are you still using the 50MB upload volume size? It shouldn’t be needed (I know some people have gone up to 1GB), but it seemed to resolve the issue for you last time…
In the past, it sounded like one of the sources of this error was a “dinosaur” file - that should have been fixed long before the 2.0.3.3 beta, though it’s possible this specific bug snuck back in…
I’m observing the same “call FinishVolume” issue here: after a downgrade (including database schema changes) from canary to beta, every backup produces these four errors (or similar):
30 Jun 2018 06:14: Message
Expected there to be a temporary fileset for synthetic filelist (106, duplicati-b00913e73e805478e8ae9e019c14f67c7.dblock.zip.aes), but none was found?
30 Jun 2018 06:13: Message
scheduling missing file for deletion, currently listed as Uploading: duplicati-ie8bc4ccd0031487aa7807fe78e42c81d.dindex.zip.aes
30 Jun 2018 06:13: Message
scheduling missing file for deletion, currently listed as Uploading: duplicati-b2772ab46f2ea4fe49c87efc604c4f236.dblock.zip.aes
30 Jun 2018 06:13: Message
removing file listed as Temporary: duplicati-20180629T132605Z.dlist.zip.aes
While it’s running I frequently can’t view the logs because “the database is locked” (or similar).
Oddly, it also claims that the backup has succeeded until quite some time after it has finished. So for example on today’s run, while it was still going, Duplicati said that the last successful backup was yesterday’s (which failed); while it was tidying up, I think it said today’s was the last successful backup; but it’s now saying that the last successful run was in May. Is there anything I could do to help dig into the cause of this?
I’d suggest you first try a restore and see if any of the “failed” backups are actually available for restoring. I suspect what’s happening is that the backup part of the job runs just fine, but then something catastrophic happens during a later step like retention or compacting.
Beyond that, since this involves a database downgrade I’m going to see if maybe @Pectojin has any thoughts.
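In the meantime, a quick spot-check restore from the command line would look roughly like this (the URL, file pattern, and restore path are all placeholders; by default the most recent version is restored):

duplicati-cli restore 'b2://...' '*/some-file.txt' --restore-path=/tmp/duplicati-restore-test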
Only the most recent failed backup ever seems to be available to restore, but there is always very little material contained within it. Therein lies a clue: I had moved my server directory away from a too-small partition a while back, as a possible solution to problems I was encountering, and the only contents of these ephemeral failed backups were a couple of files from the Duplicati server directory. As it turns out, I’d foolishly moved the server directory to a partition that was getting backed up. Backing up the 9GB backup database, with all its churn (and perhaps also file locks?), might be the problem. Retrying with a revised configuration now.
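If moving the data directory isn’t an option, an exclude filter should achieve the same thing; roughly like this (the path is an example, and I believe the same filter can be set in the job’s filter settings in the UI):

# example path only: keep the Duplicati server/database directory out of the source
duplicati-cli backup "b2://bucket/folder" /path/to/source --exclude="/path/to/duplicati-data-dir/"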
@conrad Glad to hear that you sorted it out. Is this the beta build? I would like to believe that the problem is fixed in the canary, but if you see the problem there, I need to do more digging…
Hi Ken, no, that’s the April beta still, so not canary & nothing to worry about.
I did try running canary on this backup, but it seemed to have extreme database performance issues with my 9GB database. My backups on canary back in March were taking ~6h, and on April’s beta they’re now ~13h, but canary (would’ve been the April-era build, with the version 5 database) failed to complete a single backup in three weeks: it was backing up new files (mainly photos around 10MB) at a rate of only a few per day.