Duplicati lock up

I think it’s hardcoded, just to say don’t expect an instant stop (as some people do) when button is used.
There’s been a good point made that the “Stop now” button is misleading unless you read the fine print.
Stop now doesn’t stop immediately (and I’m not sure that Issue was actually filed suggesting new text).

Wonderful find. I’m not a Duplicati dev, but this may be an excellent thing to file as Issue or pull request.
That will get it a little closer to the developer team, although the right expert might need to be brought in.
Duplicati is always looking for help (more things to do than people to do them), so thank you for helping.

If you think the lowest-level SQL statements might help, turn on profile-all-database-queries, but currently it sounds like that might be overkill. Regardless, good luck to you and @Xavron in debugging.

With the debug level set to Profiling, this is where it got to when it failed:

I have highlighted the last line that was displayed when profiling was set to Verbose… so a little more information came through.

Based on this additional information I have narrowed it down further. The issue is occurring in the RunAction method in controller.cs on line 411. The last debug statement I receive is from this method on line 429, the next debug statement should be from sending an email via the OnOperationComplete call on line 438, so somewhere in-between there it fails. I can also say that m_options.ConcurrencyMaxThreads used on line 431 is 0 according to my configuration.

The lines 429-438 represent where the problem can be seen, but the cause might be elsewhere, e.g. an invalid object being passed in and causing an issue between these lines.

I can’t find any system issues in the Windows event logs around this time.

These lines are surrounded by a try-catch handler but no exception is seen. This could be because:

  1. the code is running in a thread and the exception is triggered but logging for the thread doesn’t work;
  2. the thread is killed silently without error; or
  3. an exception does occur, but due to the complexity of the catch block an exception occurs in the block which is not subsequently logged which consequently kills the thread.

Suggest placing a log statement at the beginning of the catch block to lessen the likelihood of item 3.

I think this is a job for the debugger, but I’ll need to set up an environment for Duplicati to run and hopefully I’ll be able to reproduce the issue in that environment. It’ll take a while for me to get to it.

Fantastic effort so far. Much appreciated. I don’t know what to suggest meanwhile for a workaround.
There are a lot of configurable options, and I don’t know if getting closer or further from default helps.

If it helps, I think rclone can do Google Drive, but it’d be nice to see a Duplicati bug tracked down too.
On this run, though, it looks like it got past Google Drive (what happened before), so is bug generic?

1 Like

It’s looking very much like a generic bug.

I’m running it again, but this time with email turned off as it might be a very rare email bug. If it’s still working in a few days then I’ll be very suspicious of the OnOperationComplete method. I’m snowed under at the moment so I won’t get a debugger onto it for a while.

There’s another product called Free File Sync which I can use for a while for backup. It’s another very good product but for file syncing rather than backup. I’m not just looking at “free” products as I’ve already donated to Free File Sync and I fully intend on donating to Duplicati once I get it working. I tested 17 products during a weekend a couple of weeks ago and some of the commercial products really aren’t very good. Duplicati is an extremely well conceived product and is exactly what I’m looking for with respect to how it handles cloud backups. It’s also one of the fastest products on the market with respect to Google Drive.

Thank you to you and Xavron for your help. I’ll keep you up-to-date with my investigations/progress.

Xavron, I’m happy to help with your bug (assuming it’s not the same one) if I can be of assistance to you.

1 Like

@ts678… would you be able to help me out with something?..

I have turned off email notifications, and at the moment Duplicati is working nicely, so I thought I’d have a look at OnOperationComplete as I thought that’s where the emails were being sent. I can’t see the email functionality in there so I’m not sure why I thought that’s where it was, I need to look further, regardless it led me to look at nearby logging statements.

In the following code (ignore the highlighted line it’s just the middle of the code block) the start of an operation is logged on line 426. It logs as ‘(@“The operation {0} has started”, operationname)’… and I can see plenty of entries for “The operation Backup has started” in the logs.

The end of an operation is logged on line 440. It logs as ‘(@“The operation {0} has completed”, operationname)’. The problem is, I cannot find, in any of my logs any of these completion messages regardless of whether the backup seemingly worked correctly or not. I think all backups are having a problem and this is demonstrated by missing completion messages.

Could you please search your profiling logs for “The operation” and tell me if any of them ever end in “has completed”?

Thank you.

I’m not sure I follow. I posted a live log screenshot and a profiling log sample of that above.

I’m away from my logs now, but Google and similar can probably find some other samples.

“The operation backup has completed” “profiling” site:forum.duplicati.com

Note that you don’t need to log at profiling level to see these, so a wider search finds more.

EDIT:

An old (I hope still relevant) sample of profiling log sendmail detail, then backup end is here.

Thank you. Yes, I agree, you only need logging set to “Information” for this message… but you mentioned earlier that you run logs with profiling on.

I have found that my logs never display the message “The operation Backup has completed”, regardless of whether the backup operation “completes” and displays the next scheduled time or whether it locks up.

I think this is a large piece to the puzzle… essentially the backup operations on my server never 100% work… it just appears that they do until you look closely at the logs.

It’s good to see that your logs do display this message.

Are you using Duplicati - 2.0.6.1_beta_2021-05-03 or one of the other variants?

It’s interesting to note that now that I have turned off email notifications that my backups have run without a lockup for the longest period yet. I’m not saying that the bug is in the email functionality, rather that having the email functionality turned on exacerbates the lockup.

Most recently on 2.0.5.114 Canary whose code should be identical to 2.0.6.1 Beta.

I posted an edit to last reply citing log with email detail, email info, and backup info.

It sounded like you had a pretty solid problem. Maybe email transcript gives clues?

Thank you.

In looking at it further the missing “The operation Backup has completed” log entries might be a misdirect. The entries are definitely missing from my profiling logs (and the earlier verbose logs) that I’ve captured to file… strangely through, I turned on Live logs in the Web UI and they happily display “The operation Backup has completed”. So there’s definitely a bug in there, but it mightn’t be showcasing the locking issue… it might just be a file stream flushing problem.

FYI… now that I have email notifications turned off Duplicati continues to work.

I just had a look at the email transcript. Unfortunately it doesn’t provide any clues. The problem referred to in the transcript was that Duplicati wasn’t able to send emails due to configuration/parameter errors. In my case Duplicati happily sends emails, in fact because I was sending emails after every backup (and backups were every 15 minutes) I could easily receive 60 emails from Duplicati before a problem occurred.

Was that one of the rare (less than one in 60?) failures? Where did it end on way to:

although earlier I guess you wondered if you got all output. Did you get “Email sent”?
Sorry I don’t have profiling logs today to compare logs more closely with your logs…

Attempting to look at source on GitHub, maybe OnOperationComplete runs OnFinish
which runs SendMessage. After that it says (missing?) “Operation backup complete”.

send-mail command could be used in a loop if you’d like to see if it can have trouble.

1 Like

Sorry, did I miss something in that thread that you sent me… I’ll go back and look a little later when I have time?

Yes, it is a very rare failure.

The code does not produce the output “Email Sent”.

You are correct… OnOperationComplete runs OnFinish which calls SendMessage… that’s where I joined the dots earlier :grinning:

I’ll have another look at that thread you sent through tonight.

I got all the default logs I could out of this and feel like I have to add some to Duplicati source and install that.

There’s enough jumps, loops, and variables involved that its easier to add logs than it is to eyeball it. But will quickly try eyeballing it one more time before that.

Added more logs. Unfortunately, when I went to add logs to DoPut() it would no longer sync and requested a required db repair before it allowed it. Ran the db repair and so far its continuing to sync fine. (Noticed manual sync was failing to complete every other time at that time but no more.)

However, it looks like there are multiple problem points involved here.

For me it was getting stuck often at the following with the top being the last log entry:

Backend event: Put - Failed: duplicati-be18777c522d74254bc730f3409a93b67.dblock.zip.aes (49.93 MB)

Operation Put with file duplicati-be18777c522d74254bc730f3409a93b67.dblock.zip.aes attempt 6 of 6 failed with message: disk I/O error disk I/O error

Additional logs show the loops are never exiting when this happens. Further, DoPut() appears to never exit as no additional logs were being written.

Either that or an Exception without any further logs and with a catch received a big jump out of all the loops but unless I missed something this one doesn’t appear to be the case. And it would have to be not caught by Exception.

So I have to wait now until the problem surfaces again. It is interesting that a db repair has improved that situation.

Still narrowing down but looks close now.

Now at BackupHandler RunAsync() duplicati/BackupHandler.cs at master · duplicati/duplicati · GitHub

Ran into a different end of logs stuck UI that triangulates to here.

But it goes from BackendManager class directly to BackendUploader class and that’s what’s found in the above function.

So next step is to add logs to RunAsync() and wait again.

Just got this. Not much time on analyzing but at least with

Operation Put with file duplicati-ib4e6856fa8a6408e9c406330441f0f2c.dindex.zip.aes attempt 6 of 6 failed with message: disk I/O error disk I/O error

It is exiting from DoWithRetry() with an Exception thrown:

if (retryCount == m_options.NumberOfRetries)
    throw;

Finally {} block runs and for loop exits

if (cancelToken.IsCancellationRequested)
    return false;

The sync UI never exits despite that and 6 out of 6 fails. Nothing else shows up in the current logs after that.

From there I will need further logs and/or locate code that will stop the sync. Depending on where that’s done at.

Additional. This tracks back to:

case DuplicatiOperation.Backup: in /Server/Runner where it starts but the log for it exiting is never reached. Exception log there also doesn’t appear to show up.

Third issue here to track from me. So at least three or four causing the end result so far.

This backup is still “running” and the UI speed is only slightly dropping but its very far past the normal completion time and doesn’t appear to really be doing anything.

Current end of log with UI stuck far past normal backup times...
CommitAddBlockToOutputFlush took 0:00:00:01.640
Starting - CommitAddBlockToOutputFlush

DoPut() ends: Backend event: Put - Completed: duplicati-i85dd6ff4d4944abeb0e50d8edf8cb677.dindex.zip.aes (30.62 KB)

CommitAfterUpload: CommitAfterUpload took 0:00:00:00.046…

DoWithRetry() returns from UploadFileAsync() with finally {} block executing

File changed, adding dir, etc until CommitAddBlockToOutputFlush then
CommitAddBlockToOutputFlush took and that’s it.

I will not be adding more to this thread until I find something that solves these.

Sorry Xavron, I’m snowed under at the moment (more than usual) with tight timelines… I’ll get back to this thread when I can but it will be at least a few days.

No problem :wink:

In the mean time, I narrowed down one problem point to what looks like the final issue for that one anyway. Testing a test fix and don’t really want to post anything until I can confirm a fix works and the problem is as suspected.

Don’t know how it will relate to others. Could only hope it solves all of them but probably won’t. Could cause another problem lol.

Edit - scratch that. It just happened of course after I post no longer being able to reproduce it lol.

This is first occurrence during a backup and where its last seen in my logs (last entry for 31 minutes): duplicati/DataBlockProcessor.cs at master · duplicati/duplicati · GitHub

Preceded by a bunch of loops through the while loop there then CommitAddBlockToOutputFlush & CommitAddBlockToOutputFlush took and then hits that Exception.

31 minutes later it does a few of these: Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess-IncludingPath]: Including path as no filters matched:

The transfer rate continues to drop for hours with no other UI movement.

Also, the code execution according to the logs goes from:
var newBlock = await database.AddBlockAsync(b.HashKey, b.Size, blockvolume.VolumeID);
b.TaskCompletion.TrySetResult(newBlock);

And then right to

catch (Exception ex)

This is also going back to another issue I noticed. The throwing of an Exception is going right back to duplicati/BackupHandler.cs at master · duplicati/duplicati · GitHub

The Task.WhenAll() may not be dealing with Exceptions in a useful way for that.

Either way, the Duplicati gets into a broken state right here.