Duplicati lock up

No problem. There definitely could be multiple things going on.

I was looking for that which you found also. Some of the searches don’t go so well :slight_smile:

Also, there could be other places where the code is coming from as command line looks to have some of its own things going. Gets a bit messy to trace without better understanding.

I think I’ve found the general area where my issue is occurring…

The backup is run back the BackupHandler which is called from line 128 here:

The backup handler finishes because it sets the state to “Backup_Complete” causing the “Backup complete!” message to be displayed in the progress bar.

After the backup is handler runs then lines 131 to 133 are called… issue could be somewhere in there.

Alternatively, the RunAction method which calls 126 to 133 is called from here (ignoring all the possible overloaded methods to get here):

Further down in this method on line 438 “OnOperationComplete” is called which sends the completion email that is never sent. So the bug occurs somewhere between these two points (or in OnOperationComplete).

I’ll up my debugging to Profiling as there are quite a lot of profiling messages in this area… hopefully I don’t kill the server by running it out of disk space :thinking:

While I had my back turned, reading through code, it happened again at the same spot. It completes the get on dblock.zip and then just stops. Hmmm, I need to look for dying threads, but more importantly I need to go to bed… it’s 3am on the East cost of Aus at the moment! :yawning_face:

1 Like

Try out the Profiling debug level at some stage, it provides far more detail. It will help narrow down where the application is falling over for you (if you have the disk space for it).

1 Like

Ah so you’re going by that. Enabled it yesterday. The problem for me only happens about once or twice a week as far as I’ve seen so far. The timing for that isn’t as good and I don’t always have access with others using it.

You’re like me or I’m like you. Gone through entire nights solving problems but I try not to do that anymore. Its fun though.

1 Like

I think it’s hardcoded, just to say don’t expect an instant stop (as some people do) when button is used.
There’s been a good point made that the “Stop now” button is misleading unless you read the fine print.
Stop now doesn’t stop immediately (and I’m not sure that Issue was actually filed suggesting new text).

Wonderful find. I’m not a Duplicati dev, but this may be an excellent thing to file as Issue or pull request.
That will get it a little closer to the developer team, although the right expert might need to be brought in.
Duplicati is always looking for help (more things to do than people to do them), so thank you for helping.

If you think the lowest-level SQL statements might help, turn on profile-all-database-queries, but currently it sounds like that might be overkill. Regardless, good luck to you and @Xavron in debugging.

With the debug level set to Profiling, this is where it got to when it failed:

I have highlighted the last line that was displayed when profiling was set to Verbose… so a little more information came through.

Based on this additional information I have narrowed it down further. The issue is occurring in the RunAction method in controller.cs on line 411. The last debug statement I receive is from this method on line 429, the next debug statement should be from sending an email via the OnOperationComplete call on line 438, so somewhere in-between there it fails. I can also say that m_options.ConcurrencyMaxThreads used on line 431 is 0 according to my configuration.

The lines 429-438 represent where the problem can be seen, but the cause might be elsewhere, e.g. an invalid object being passed in and causing an issue between these lines.

I can’t find any system issues in the Windows event logs around this time.

These lines are surrounded by a try-catch handler but no exception is seen. This could be because:

  1. the code is running in a thread and the exception is triggered but logging for the thread doesn’t work;
  2. the thread is killed silently without error; or
  3. an exception does occur, but due to the complexity of the catch block an exception occurs in the block which is not subsequently logged which consequently kills the thread.

Suggest placing a log statement at the beginning of the catch block to lessen the likelihood of item 3.

I think this is a job for the debugger, but I’ll need to set up an environment for Duplicati to run and hopefully I’ll be able to reproduce the issue in that environment. It’ll take a while for me to get to it.

Fantastic effort so far. Much appreciated. I don’t know what to suggest meanwhile for a workaround.
There are a lot of configurable options, and I don’t know if getting closer or further from default helps.

If it helps, I think rclone can do Google Drive, but it’d be nice to see a Duplicati bug tracked down too.
On this run, though, it looks like it got past Google Drive (what happened before), so is bug generic?

1 Like

It’s looking very much like a generic bug.

I’m running it again, but this time with email turned off as it might be a very rare email bug. If it’s still working in a few days then I’ll be very suspicious of the OnOperationComplete method. I’m snowed under at the moment so I won’t get a debugger onto it for a while.

There’s another product called Free File Sync which I can use for a while for backup. It’s another very good product but for file syncing rather than backup. I’m not just looking at “free” products as I’ve already donated to Free File Sync and I fully intend on donating to Duplicati once I get it working. I tested 17 products during a weekend a couple of weeks ago and some of the commercial products really aren’t very good. Duplicati is an extremely well conceived product and is exactly what I’m looking for with respect to how it handles cloud backups. It’s also one of the fastest products on the market with respect to Google Drive.

Thank you to you and Xavron for your help. I’ll keep you up-to-date with my investigations/progress.

Xavron, I’m happy to help with your bug (assuming it’s not the same one) if I can be of assistance to you.

1 Like

@ts678… would you be able to help me out with something?..

I have turned off email notifications, and at the moment Duplicati is working nicely, so I thought I’d have a look at OnOperationComplete as I thought that’s where the emails were being sent. I can’t see the email functionality in there so I’m not sure why I thought that’s where it was, I need to look further, regardless it led me to look at nearby logging statements.

In the following code (ignore the highlighted line it’s just the middle of the code block) the start of an operation is logged on line 426. It logs as ‘(@“The operation {0} has started”, operationname)’… and I can see plenty of entries for “The operation Backup has started” in the logs.

The end of an operation is logged on line 440. It logs as ‘(@“The operation {0} has completed”, operationname)’. The problem is, I cannot find, in any of my logs any of these completion messages regardless of whether the backup seemingly worked correctly or not. I think all backups are having a problem and this is demonstrated by missing completion messages.

Could you please search your profiling logs for “The operation” and tell me if any of them ever end in “has completed”?

Thank you.

I’m not sure I follow. I posted a live log screenshot and a profiling log sample of that above.

I’m away from my logs now, but Google and similar can probably find some other samples.

“The operation backup has completed” “profiling” site:forum.duplicati.com

Note that you don’t need to log at profiling level to see these, so a wider search finds more.

EDIT:

An old (I hope still relevant) sample of profiling log sendmail detail, then backup end is here.

Thank you. Yes, I agree, you only need logging set to “Information” for this message… but you mentioned earlier that you run logs with profiling on.

I have found that my logs never display the message “The operation Backup has completed”, regardless of whether the backup operation “completes” and displays the next scheduled time or whether it locks up.

I think this is a large piece to the puzzle… essentially the backup operations on my server never 100% work… it just appears that they do until you look closely at the logs.

It’s good to see that your logs do display this message.

Are you using Duplicati - 2.0.6.1_beta_2021-05-03 or one of the other variants?

It’s interesting to note that now that I have turned off email notifications that my backups have run without a lockup for the longest period yet. I’m not saying that the bug is in the email functionality, rather that having the email functionality turned on exacerbates the lockup.

Most recently on 2.0.5.114 Canary whose code should be identical to 2.0.6.1 Beta.

I posted an edit to last reply citing log with email detail, email info, and backup info.

It sounded like you had a pretty solid problem. Maybe email transcript gives clues?

Thank you.

In looking at it further the missing “The operation Backup has completed” log entries might be a misdirect. The entries are definitely missing from my profiling logs (and the earlier verbose logs) that I’ve captured to file… strangely through, I turned on Live logs in the Web UI and they happily display “The operation Backup has completed”. So there’s definitely a bug in there, but it mightn’t be showcasing the locking issue… it might just be a file stream flushing problem.

FYI… now that I have email notifications turned off Duplicati continues to work.

I just had a look at the email transcript. Unfortunately it doesn’t provide any clues. The problem referred to in the transcript was that Duplicati wasn’t able to send emails due to configuration/parameter errors. In my case Duplicati happily sends emails, in fact because I was sending emails after every backup (and backups were every 15 minutes) I could easily receive 60 emails from Duplicati before a problem occurred.

Was that one of the rare (less than one in 60?) failures? Where did it end on way to:

although earlier I guess you wondered if you got all output. Did you get “Email sent”?
Sorry I don’t have profiling logs today to compare logs more closely with your logs…

Attempting to look at source on GitHub, maybe OnOperationComplete runs OnFinish
which runs SendMessage. After that it says (missing?) “Operation backup complete”.

send-mail command could be used in a loop if you’d like to see if it can have trouble.

1 Like

Sorry, did I miss something in that thread that you sent me… I’ll go back and look a little later when I have time?

Yes, it is a very rare failure.

The code does not produce the output “Email Sent”.

You are correct… OnOperationComplete runs OnFinish which calls SendMessage… that’s where I joined the dots earlier :grinning:

I’ll have another look at that thread you sent through tonight.

I got all the default logs I could out of this and feel like I have to add some to Duplicati source and install that.

There’s enough jumps, loops, and variables involved that its easier to add logs than it is to eyeball it. But will quickly try eyeballing it one more time before that.

Added more logs. Unfortunately, when I went to add logs to DoPut() it would no longer sync and requested a required db repair before it allowed it. Ran the db repair and so far its continuing to sync fine. (Noticed manual sync was failing to complete every other time at that time but no more.)

However, it looks like there are multiple problem points involved here.

For me it was getting stuck often at the following with the top being the last log entry:

Backend event: Put - Failed: duplicati-be18777c522d74254bc730f3409a93b67.dblock.zip.aes (49.93 MB)

Operation Put with file duplicati-be18777c522d74254bc730f3409a93b67.dblock.zip.aes attempt 6 of 6 failed with message: disk I/O error disk I/O error

Additional logs show the loops are never exiting when this happens. Further, DoPut() appears to never exit as no additional logs were being written.

Either that or an Exception without any further logs and with a catch received a big jump out of all the loops but unless I missed something this one doesn’t appear to be the case. And it would have to be not caught by Exception.

So I have to wait now until the problem surfaces again. It is interesting that a db repair has improved that situation.

Still narrowing down but looks close now.

Now at BackupHandler RunAsync() duplicati/BackupHandler.cs at master · duplicati/duplicati · GitHub

Ran into a different end of logs stuck UI that triangulates to here.

But it goes from BackendManager class directly to BackendUploader class and that’s what’s found in the above function.

So next step is to add logs to RunAsync() and wait again.