Scheduled backup hangs on Put to S3 - "disk I/O error"

Hello everyone, I need some help with the disk I/O error below.

My scheduled backups have recently started to get stuck (I waited 12+ hours, then force-closed the service to terminate them). The web page shows “204664 files (2.69 TB) to go at 140.57 KB/s”, “Current action: Backup_ProcessingFiles”, and “Current file: C:\Users\***\AppData\Local\Duplicati\VUVBXNELZV.sqlite-journal”. Resource Monitor shows no network activity, and the only disk activity from Duplicati is on the SQLite DB file associated with this backup task. That disk activity slowly decreases and eventually ceases, but the backup task keeps running with the message above, the KB/s figure slowly ticking down. The other weird part is that the backup occasionally succeeds.

Setup:

  • Backup from local computer to Amazon S3
  • Duplicati 2.0.8.1_beta_2024-05-07 is installed on the boot SSD (Windows 11 23H2), the temp folder is on a separate SSD, and most data is on an HDD.
  • The scheduled backup had been running successfully for many months; this error started appearing very recently. The only changes I can identify are that I upgraded to a faster internet connection (from 30 Mbps to 1 Gbps up) and changed the backup schedule from weekly to daily. I have not yet gotten public IPv6 to work correctly; the rest is fine.

Here’s what I have tried:

  • Ran chkdsk /f on all drives - no errors (did not run a surface test)
  • Checked SMART diagnostics for all drives - no errors, no uncorrectable/reallocated sectors on the HDD
  • Turned on verbose logs - the only error is below
  • Opened the Duplicati job database (VUVBXNELZV.sqlite) in a SQLite browser and ran PRAGMA integrity_check; - no errors (see the sketch after this list)
  • Tried to locate the file “duplicati-bcc7d5a88d8e9412cbd01576c7eaf738a.dblock.zip.aes” on any disk - not found (dir /s /a)
  • Downloaded and reinstalled Duplicati (did not remove the old install)
  • Searched these forums for “disk I/O error”
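
For reference, the same integrity check can be run from the sqlite3 command-line shell (a sketch only, assuming sqlite3.exe is on PATH, the Duplicati service is stopped, and with <you> as a placeholder for the redacted user name):

```
:: Full corruption scan of the job database; prints "ok" when clean.
sqlite3 "C:\Users\<you>\AppData\Local\Duplicati\VUVBXNELZV.sqlite" "PRAGMA integrity_check;"

:: Faster variant that skips some of the checks.
sqlite3 "C:\Users\<you>\AppData\Local\Duplicati\VUVBXNELZV.sqlite" "PRAGMA quick_check;"
```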

```
2024-10-22 06:57:53 -04 - [Retry-Duplicati.Library.Main.Operation.Backup.BackendUploader-RetryPut]: Operation Put with file duplicati-bcc7d5a88d8e9412cbd01576c7eaf738a.dblock.zip.aes attempt 1 of 6 failed with message: disk I/O error
disk I/O error
code = IoErr (10), message = System.Data.SQLite.SQLiteException (0x800007FF): disk I/O error
disk I/O error
   at System.Data.SQLite.SQLite3.Reset(SQLiteStatement stmt)
   at System.Data.SQLite.SQLite3.Step(SQLiteStatement stmt)
   at System.Data.SQLite.SQLiteDataReader.NextResult()
   at System.Data.SQLite.SQLiteDataReader..ctor(SQLiteCommand cmd, CommandBehavior behave)
   at System.Data.SQLite.SQLiteCommand.ExecuteReader(CommandBehavior behavior)
   at System.Data.SQLite.SQLiteCommand.ExecuteNonQuery(CommandBehavior behavior)
   at System.Data.SQLite.SQLiteTransaction.Commit()
   at Duplicati.Library.Main.Operation.Common.DatabaseCommon.<>c__DisplayClass7_0.<CommitTransactionAsync>b__0()
   at Duplicati.Library.Main.Operation.Common.SingleRunner.<>c__DisplayClass3_0.<RunOnMain>b__0()
   at Duplicati.Library.Main.Operation.Common.SingleRunner.<DoRunOnMain>d__21.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Duplicati.Library.Main.Operation.Backup.BackendUploader.d__24.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Duplicati.Library.Main.Operation.Backup.BackendUploader.<>c__DisplayClass20_0.<b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at Duplicati.Library.Main.Operation.Backup.BackendUploader.d__21.MoveNext()
```

I found a similar post:

The gist here is that another process is locking the file, causing errors for Duplicati.
Your error message has code 0x800007FF but the mentioned one has code 0x80004005, so it may be different.

I found one SO thread mentioning the same error code but no solution.

Could it be that the disk is full, so the write fails?

The journal file is where SQLite keeps the data needed to roll back pending transactions, and it can get quite large.
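
You can confirm which journal file to expect with a quick query (a sketch, assuming the sqlite3 shell, with <you> standing in for the redacted user name; "delete" is SQLite's default rollback-journal mode):

```
:: In the default "delete" mode, SQLite writes rollback data to
:: <db>.sqlite-journal while a transaction is open and removes it on commit.
sqlite3 "C:\Users\<you>\AppData\Local\Duplicati\VUVBXNELZV.sqlite" "PRAGMA journal_mode;"
```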

That file won't be found locally: it has another name in your temp folder, but it should exist on the remote destination (assuming a later PUT succeeds).

Thanks for the quick response. I found that thread as well; I do not have CrashPlan, and I can't think of any other process that might be locking the file. I've also tried running the job immediately after a reboot, while preventing the computer from sleeping, with the same result. The backup runs for a while before the error is thrown, so if locking is the problem, it happens sometime during the backup run.

The local disks are definitely not full, and I’ve watched to make sure the available space was not substantially changing while the backup was in progress. The target is S3 so there should be no space issues there (I think).

Next step I'm going to try: a manual compact/optimize of the SQLite database (outside of Duplicati), then overwrite Duplicati's copy with the compacted version. :crossed_fingers:
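
Roughly this, assuming the sqlite3 shell, a stopped Duplicati service, and hypothetical paths (<you> and D:\scratch):

```
:: Work on a copy so the original database stays safe.
copy "C:\Users\<you>\AppData\Local\Duplicati\VUVBXNELZV.sqlite" "D:\scratch\VUVBXNELZV.sqlite"

:: PRAGMA optimize refreshes query-planner statistics; VACUUM rewrites
:: the database into a compact, defragmented file.
sqlite3 "D:\scratch\VUVBXNELZV.sqlite" "PRAGMA optimize; VACUUM;"
```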

Any further suggestions are welcome!

2.0.8.1_beta_2024-05-07 - slow backup is a similar issue, caused by backing up one's own DB journal. The suggestion there is: don't, although I still wonder if we should revive the pull request to block it.

Excluding things by clicking them from green to red in the tree view is probably the simplest way.
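
If you prefer a filter over clicking, something like this should work from the command line (a sketch only, assuming the usual wildcard filter syntax; <target-url> and <you> are placeholders):

```
:: Exclude every Duplicati job-database journal from the backup set.
Duplicati.CommandLine.exe backup <target-url> "C:\Users\<you>" ^
  --exclude="*\AppData\Local\Duplicati\*.sqlite-journal"
```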

Probably some sort of timing condition that may or may not trigger, making this hard to examine. Great job on the investigation so far, BTW. To do more (if willing), you can look at the journal activity with Sysinternals Process Monitor to see if it's being read for backup at the same time it's being written because of some database change. Different processing threads might interfere with each other.

If you find files there beginning with dup-, one of them might be what was unreadable for upload. Possibly this is a separate problem, but I don’t ever like seeing systems saying disk I/O error.
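
A quick way to list them (a sketch, assuming Duplicati uses the same %TEMP% as your shell; adjust the path if you set tempdir):

```
:: List Duplicati's temporary volume files, oldest first.
dir /od "%TEMP%\dup-*"
```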

tempdir explains how you can point Duplicati at some other temporary folder, to test whether that works successfully.

A more “out-there” suggestion: perhaps this is not related to SQLite at all, but to an internal deadlock. Issue #3205 tracks it, and the remedy seems to be setting --disable-piped-streaming.
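
If you want to test both workarounds in one run, the options would look roughly like this (a sketch; <target-url>, <you>, and D:\DuplicatiTemp are stand-ins):

```
:: Redirect temporary files and disable piped streaming for a test run.
Duplicati.CommandLine.exe backup <target-url> "C:\Users\<you>" ^
  --tempdir="D:\DuplicatiTemp" --disable-piped-streaming=true
```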

Ok here are some updates from last time:

  1. I had previously changed the Duplicati tempdir (not the system temp environment variable) as part of troubleshooting - no change
  2. There are definitely files beginning with dup- in the temp folder. I was able to open all of them in Notepad++ and scroll to the bottom with no errors
  3. Manual compact of the SQLite database and overwrite of the existing file - no change
  4. Excluding all files matching the expression C:\Users\***\AppData\Local\Duplicati\*.sqlite-journal - SUCCESS! Scheduled backups ran normally last night. I will continue to monitor and post an update if anything changes. :+1: for reviving the pull request.

Adding to the weirdness, the .sqlite-journal file was being backed up properly for many months with no errors. Not sure why this changed recently.

On a related note, I would suggest that Duplicati terminate the job with an error when this disk I/O error is encountered and does not resolve after retries. Right now it seems to go into a state where it's not doing anything (no disk activity, no network activity, nothing written to the verbose log) yet still shows “backup in progress”.

Thank you to @ts678 @kenkendk for the excellent help!

Excellent!

That file should be excluded, as it is temporary storage for in-flight updates to the database, and it has no value once the transaction commits (which also deletes the file).

I have added a PR that automatically excludes this file.