Thank you. Yes, I agree, you only need logging set to “Information” for this message… but you mentioned earlier that you run logs with profiling on.
I have found that my logs never display the message “The operation Backup has completed”, regardless of whether the backup operation “completes” and displays the next scheduled time or whether it locks up.
I think this is a large piece to the puzzle… essentially the backup operations on my server never 100% work… it just appears that they do until you look closely at the logs.
It’s good to see that your logs do display this message.
Are you using Duplicati - 188.8.131.52_beta_2021-05-03 or one of the other variants?
It’s interesting to note that now that I have turned off email notifications that my backups have run without a lockup for the longest period yet. I’m not saying that the bug is in the email functionality, rather that having the email functionality turned on exacerbates the lockup.
In looking at it further the missing “The operation Backup has completed” log entries might be a misdirect. The entries are definitely missing from my profiling logs (and the earlier verbose logs) that I’ve captured to file… strangely through, I turned on Live logs in the Web UI and they happily display “The operation Backup has completed”. So there’s definitely a bug in there, but it mightn’t be showcasing the locking issue… it might just be a file stream flushing problem.
FYI… now that I have email notifications turned off Duplicati continues to work.
I just had a look at the email transcript. Unfortunately it doesn’t provide any clues. The problem referred to in the transcript was that Duplicati wasn’t able to send emails due to configuration/parameter errors. In my case Duplicati happily sends emails, in fact because I was sending emails after every backup (and backups were every 15 minutes) I could easily receive 60 emails from Duplicati before a problem occurred.
Added more logs. Unfortunately, when I went to add logs to DoPut() it would no longer sync and requested a required db repair before it allowed it. Ran the db repair and so far its continuing to sync fine. (Noticed manual sync was failing to complete every other time at that time but no more.)
However, it looks like there are multiple problem points involved here.
For me it was getting stuck often at the following with the top being the last log entry:
Backend event: Put - Failed: duplicati-be18777c522d74254bc730f3409a93b67.dblock.zip.aes (49.93 MB)
Operation Put with file duplicati-be18777c522d74254bc730f3409a93b67.dblock.zip.aes attempt 6 of 6 failed with message: disk I/O error disk I/O error
Additional logs show the loops are never exiting when this happens. Further, DoPut() appears to never exit as no additional logs were being written.
Either that or an Exception without any further logs and with a catch received a big jump out of all the loops but unless I missed something this one doesn’t appear to be the case.And it would have to be not caught by Exception.
So I have to wait now until the problem surfaces again. It is interesting that a db repair has improved that situation.
In the mean time, I narrowed down one problem point to what looks like the final issue for that one anyway. Testing a test fix and don’t really want to post anything until I can confirm a fix works and the problem is as suspected.
Don’t know how it will relate to others. Could only hope it solves all of them but probably won’t. Could cause another problem lol.
(Edit - The actual Exception when Duplicati gets stuck is “Disk I/O error” in most if not all recent cases (not related to the below Exception seen). Currently drilling down into that Exception reason further and attempt some changes to see how Duplicati responds. Though the below one appears to at least be passable so far from what I can see. The I/O error is not with current code.)
Definitely Duplicati is using CoCoL project here which is throwing Exceptions (still waiting to see what the Exception is when it gets messed up).
var newBlock = await database.AddBlockAsync(b.HashKey, b.Size, blockvolume.VolumeID);
throw new Exception("Disk I/O error");
and the UI gets completely stuck and the code stops executing beyond what looks like some little itty bitty possibility of code in any other threads outside of there that can. But this completely breaks Duplicati so anything else that just might run is of no importance.
Unfortunately CoCoL is thrown in the mix which might be the problem. Almost seems like some weird issue between CoCoL, Exception throwing, and Task.WhenAll() use.
Throwing an Exception should not cause the code to stop executing but that’s exactly what it appears to do since nothing appears to happen after it.
Partially solved! Not CoCoL. Apparently, adding a timeout to WhenAll() allows the UI to continue and the backup to finish albeit in my test as a failed backup but at least it frees up the UI.
So halfway there lol
Updates added to here for now:
A test to improve Disk i/o error’s bad effect has proven successful (timeout not necessary with this one): Backup fails but completes within normal incremental backup time of a few minutes. Next time the backup runs, it is successful. Both unblock the UI. Still needs further testing.
Two steps to go at this time. To find out the real reason why that error is being thrown (if its SQL or an actual disk problem but either way Duplicati must deal with it) and to make sure all tasks have check and exit’s to the tasks where needed so WhenAll()'s use isn’t horribly negative when an Exception is thrown unexpectedly.
Could be mistaken at the moment but the disk i/o error is looking like at least one cause is by Duplicati in the block processing for use with the sqlite db. Will have to run some tests.
Definitely looks like it should be the code. Disk i/o error appears with Duplicati’s backup rename event of two of its backup files along with the sqlite-journal in there also (in other words it affects Duplicati’s own stuff and also next backup runs fine so its going to be very interesting).
The “Disk i/o error” am still waiting for Duplicati to rename files which it appears to be connected to. No rename, no “Disk i/o error”. Have a possible fix in testing for that, but until it happens again (its taking its good old time now when I want it most), can’t be sure as the fix may require a bit more.
Either way, happy to see two long standing bad issues (people complained about both of these years ago) with Duplicati coming to an end.
Updates attached to this:
The disk i/o error appearance changed when I added the attempted fix. It is definitely throwing at that spot but seemingly unknowingly now. Spotted another log after that where it throws the error then later during the same backup its all of a sudden working again. So this one will take a bit of time between waiting and testing. But looks like this is what causes the renaming to be tried as the log now shows it comes before that.
Latest log shows it happened with CommitAfterUpload. Upload was successful so the info should be committed to the db just fine. The db also works fine. I’m currently working on the theory of a race condition with lower performance hardware based on what I see happening in the logs compared to a higher performance one. That could take much longer. Did notice someone years ago felt it was file accessibility but it is not that (I tried to force that and it didn’t cause this issue) - I tested the same setup here as the other computer and it doesn’t happen here. Duplicati should deal with it anyway.
Have to put this on pause for now. Technically, its “solved” for me though with the various checks I have it. Just don’t know to what extent. Its at least not to the extent that I want it solved. The backup does complete “successfully” but haven’t looked to see what that actually fully means with the checks alone. Still, Duplicati now works like it should instead of the way it is for everyone else.
Further on the disk i/o error (normally seen as causing the lock up for me):
Looks to be only caused by very specific files (and only those files) such as Duplicati db in /AppData on certain systems as far as I can tell. One system doesn’t ever seem to have a problem backing it up but the other one often throws disk i/o on it and that’s one error thrown during the WhenAll() that causes the total lockup of the Duplicati UI.
I intend to look into why certain files do that as there’s probably something Duplicati should be looking at so as to not attempt to back it up, or in the worst case it should at least throw the disk i/o error, then remove that file from the backup and continue - since there’s a ton of stuff in the code here, it would take me a while to sort through what exactly needs to be done, and so I may not go that far with this.
For now, just blocking the files where this happens such as Duplicati’s AppData folder from the backup, appears to fix the situation until a better way to handle it is implemented. I have backups running without anything bad showing up for well over a week now and that wasn’t possible before.
Theory: At least with Duplicati’s db file, it could in theory be getting into a situation where its attempting to write and read the file at the same time. Perhaps with a slow enough system (eg the one where its happening for me), it runs into a problem and disk i/o error gets thrown. Could include or be another difference such as larger backups that I didn’t duplicate on other systems without the problem.