Automatically Restart Duplicati when it Crashes

Hi All, I’m interested in seeing if I can configure Duplicati to relaunch automatically when it crashes.

My Situation
I’m running Duplicati 2 on multiple Windows Server 2019 VMs. I basically stick to the default install, so it’s not running as a service. I also use Volume Shadow Copy so the backups can include open files. I use it to back up lots of files from hosted websites and other “console”-like applications.

Problem
In the last month or so I’ve been hitting a situation where my Duplicati 2 instances will all just die/crash and close. I’ve checked the stored logs and have also tried logging to a file at the “Profiling” level, but I can’t see any error occurring before it dies.

Manual Solution
In every case I basically have to log onto each VM and run Duplicati again from the desktop shortcut, so I’m now looking at how to get Duplicati to auto-start again after it has crashed.

Automated Solution
As far as I know there is no built-in feature for this, nor a simple “watchdog” application I could run to do it, so I’m looking at using Windows Task Scheduler instead.

I can easily set up a task that starts Duplicati when the user logs in and then re-runs every 15 minutes, with the options that only one instance of the task may run and that the task is not killed if it runs longer than X hours.
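
For reference, here is a rough sketch (in Python, just to illustrate the watchdog idea) of what such a helper could look like; the process name and install path below are taken from my default install, and I haven’t battle-tested this:

import subprocess
import time

# Assumed defaults from a standard tray-icon install; adjust as needed.
PROCESS_NAME = "Duplicati.GUI.TrayIcon.exe"
EXE_PATH = r"C:\Program Files\Duplicati 2\Duplicati.GUI.TrayIcon.exe"

def is_running(name):
    # tasklist prints the image name only when a matching process exists.
    out = subprocess.run(
        ["tasklist", "/FI", f"IMAGENAME eq {name}"],
        capture_output=True, text=True,
    ).stdout
    return name.lower() in out.lower()

while True:
    if not is_running(PROCESS_NAME):
        subprocess.Popen([EXE_PATH])  # relaunch and keep watching
    time.sleep(15 * 60)               # check every 15 minutes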

Community Request
If anyone has experience with this, or has suggestions for why this will not work or for things that may go wrong, please let me know.

Edit:
For those playing along: I’ve also deleted the shortcut from C:\ProgramData\Microsoft\Windows\Start Menu\Programs\StartUp, which I believe is responsible for starting the Duplicati tray icon and server. The reason is that if Windows Task Scheduler does not start the process itself, it doesn’t see it as running, so it would keep attempting to launch another instance every 15 minutes (my recurring schedule).

Edit 2:
I’ve also given the Windows Task Scheduler task two triggers. One fires at logon of any user, and the other runs daily and then repeats every 15 minutes for 24 hours. In the past I’ve noticed that the “at log on” trigger fails to repeat every X minutes when that option is enabled.

For anyone that is interested, the log file got this far and then just stopped, so I don’t think Duplicati has a way of logging its own failure. I also suspect it could be because my backend FTP server is sometimes offline: it is susceptible to power failures and in the last few months has gone offline during backups or before they started. But I’m not aware that Duplicati should crash if the backend is down or goes down during the process.

2022-03-01 01:00:18 +02 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteScalarInt64]: Starting - ExecuteScalarInt64: SELECT COUNT(*) FROM "FilesetEntry" WHERE "FilesetEntry"."FilesetID" = 3
2022-03-01 01:00:18 +02 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteScalarInt64]: ExecuteScalarInt64: SELECT COUNT(*) FROM "FilesetEntry" WHERE "FilesetEntry"."FilesetID" = 3 took 0:00:00:00.000
2022-03-01 01:00:18 +02 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteReader]: Starting - ExecuteReader: SELECT DISTINCT "Fileset"."ID", "Fileset"."Timestamp" FROM "Fileset", "RemoteVolume" WHERE "RemoteVolume"."ID" = "Fileset"."VolumeID" AND "Fileset"."ID" IN (SELECT "FilesetID" FROM "FilesetEntry")  AND ("RemoteVolume"."State" = "Uploading" OR "RemoteVolume"."State" = "Temporary")
2022-03-01 01:00:18 +02 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteReader]: ExecuteReader: SELECT DISTINCT "Fileset"."ID", "Fileset"."Timestamp" FROM "Fileset", "RemoteVolume" WHERE "RemoteVolume"."ID" = "Fileset"."VolumeID" AND "Fileset"."ID" IN (SELECT "FilesetID" FROM "FilesetEntry")  AND ("RemoteVolume"."State" = "Uploading" OR "RemoteVolume"."State" = "Temporary") took 0:00:00:00.000
2022-03-01 01:00:18 +02 - [Profiling-Timer.Begin-Duplicati.Library.Main.Operation.BackupHandler-PreBackupVerify]: Starting - PreBackupVerify
2022-03-01 01:00:18 +02 - [Profiling-Timer.Begin-Duplicati.Library.Main.BackendManager-RemoteOperationList]: Starting - RemoteOperationList
2022-03-01 01:00:18 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Started:  ()

These are the last few lines after which Duplicati died.

Strange, I don’t think I have ever experienced Duplicati outright crashing like that. Are you absolutely sure it is crashing and it’s not something like your user session getting logged out?

Can you provide some additional background about your setup?

  • Duplicati version
  • Size of data being backed up (front end size)
  • Destination storage type
  • Number of backup versions being retained
  • etc

I think the best thing is to try and fix the root issue.

If that isn’t possible for some reason, then I’d reconsider and use a service-based installation. Service installs are better for server machines anyway, as their operation does not depend on a particular account being logged in. In addition, you can configure Windows services to auto-restart when they crash.
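
As a sketch of that last part (hedged: “Duplicati” is the service name I’d expect after installing the Windows service, so verify it with sc query first), something along these lines sets the recovery actions from an elevated session, shown here as a small Python wrapper around sc.exe:

import subprocess

# Assumed service name -- verify with `sc query` after installing the
# Duplicati Windows service; adjust if yours differs.
SERVICE_NAME = "Duplicati"

# Tell the Service Control Manager to restart the service 60 seconds after
# each of its first three failures, and reset the failure count after a day.
subprocess.run(
    ["sc.exe", "failure", SERVICE_NAME,
     "reset=", "86400",
     "actions=", "restart/60000/restart/60000/restart/60000"],
    check=True,
)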

Tbh, I don’t understand at all why it would “stop running” like this. What I do know is that I have 8 installations of Duplicati, some running on Windows Server 2019 and some on Windows Server 2012.

The ages of the servers also differ: 4 are very recent (Server 2019), and of those 4, two are still basically empty, so their backups only cover a few KB or MB (not a lot of files to get locked or cause issues during backup).

They ALL connect to the same remote storage, an FTP server that has trouble staying online all the time, and they all connect through the same Hamachi VPN connection (so the FTP traffic runs over a VPN to the remote server).

All of them “stop running”, though not always at the same time; in some cases they have all stopped within the same 24 hours, while in other cases only some of them stopped and the others continued. When some of them are “hung up” on a backup job, they mostly don’t “stop”. By “hung up” I mean that the backup job stalls but doesn’t exit, so no new logs are generated for days while Duplicati still shows the backup as “verifying files” or “removing old backups”. Cancelling these backups never works; they just stay in the same state (using either the immediate or the “after current file” option). The only way around it is to kill the process and do a repair on the database after restarting.

I have 2 versions running:

  • 2.0.6.1_beta_2021-05-03
  • 2.0.6.3_beta_2021-06-17

Size:
Each server has 4 backup sets, some have 5. Sizes range from 7.00 bytes to 43 GB, and a lot of them are below 1 GB.

Backup Retention
I’ve got them all set to “Smart Retention”. They all back up each night after 12am, and they have been spaced so that each backup set runs on the hour starting from 1am.

I’ve run the same versions of Duplicati since May last year, and they did not “stop running” at all, so this is something new. The servers also did not change, except for generating more files to back up and the retained backups growing; the applications on them are still the same (they host specific websites).

Bottom Line
I can’t find any information on why Duplicati would “stop running”; I don’t even know whether it is Duplicati crashing or failing. The logs from Duplicati say nothing about it closing, and other than that I have very little to go on, as I did not make any changes to the servers that I can recall. The only commonality I see is the backup target that goes offline; it also switched to a different internet connection and location. This was not a problem last year, when it was on a very fast fibre line with no (or very few) power outages.

If anyone knows of any information that I could collect to help figure this out, you are welcome to share. Also @drwtsn32, if you can think of any more info about the setup that you would like, I can provide it.

Ok thanks for the additional information. This type of problem (backup job hangs) does happen sometimes and can be caused by various factors. I’m guessing that’s really what’s going on here, not that Duplicati “crashes” and the processes die (although that still may be possible in some situations).

It’s been a while since I used Hamachi VPN, but I seem to remember it is a user-mode service that only starts at logon. Your dependence on the VPN to reach your FTP destination may be why you didn’t set up Duplicati to run as a service.

This may be a factor that causes the job to hang. Duplicati is usually pretty robust at handling connection issues, retrying, and failing the job but maybe not in all types of situations.

Is there any chance you can experiment on one of these servers and change its destination to one that doesn’t use FTP (probably the most unreliable of the back end types, unless you meant SFTP which is pretty good) and one that doesn’t rely on Hamachi VPN? If it doesn’t have the hang issues, it will help confirm the issue might be with the FTP server and/or VPN.

What’s this new post? Is the issue a process death (should be easy to see, even if mysterious) or not?

EDIT:

If you see a working Duplicati UI in the web browser, you still have a Duplicati process (technically two because the first one looks for the most recent update and runs it, then the child does the actual work).

Task Manager also shows processes. I think a true Duplicati death will also close TrayIcon if you use it.

Hey, is it actually crashing or is it just stalling indefinitely?

I ran into Duplicati stalling at the start and end of a backup when I initially started testing it (see: Identified another slow query during backup).

I narrowed it down to slow database queries, identified indexes that fix the issues, and got them merged.

Could you share the profiling log on one of the stalled backups? That could let us see if it is a similar issue.

Hi guys, sorry for the confusion. I mentioned two problems in my post but was only referring to the other one (stalling of backups) there, so to be clear: in this forum thread I’m talking about the process exiting/dying/disappearing. As mentioned by @ts678, the tray icon disappears, the web GUI on port 8200 disappears (“The website cannot be found”), and the two Duplicati processes in Task Manager also disappear.

Once this happens, I have to double-click the Duplicati 2 shortcut to run the process again, and then everything is back up like a fresh run of Duplicati 2 (it basically starts with any backup that should have run while it was not running).

Not really, but it’s an interesting point. I just installed Duplicati with default settings. An important detail is that my VM servers automatically “log in”, so that the VPN comes up and other requirements are met; the VPN is therefore always up. But just for completeness: when the destination endpoint is offline, Duplicati will try to run the backup and just fail with a generic “An error occurred” message in the GUI. That is obviously expected and a fine handling of the situation (failing the backup means it’s ready for the next backup to run).

Agree, but this is the “other” problem I referred to (and posted about in the other thread that @ts678 pointed out).

Please elaborate here: is SFTP better than FTP (apart from the obvious security layer it includes)?

I do have one of the servers using a Backblaze B2 free account as the destination for 2 of the small backups; I switched them away from the unstable FTP destination. They just always work, so yes, I would say the problems are because of the unstable FTP destination.

For me this is not about destination stability. It would be great if it were perfect, but I basically accept its imperfections; I just don’t want Duplicati to crash/close, and would rather have it fail and try again on the next cycle.

I’m referring to crashing here; in the other thread I’m referring to the stalling issue. When it stalls it is always at one of 2 points: verifying files or deleting old backups. Both show that they keep making a call every so often, but then they get to a point where they do nothing for days. I have posted a screenshot of the profiling live log from a stall in the other thread.

If you could manually replicate this crash, we could build a way to test this edge case and see how to resolve it in the code. Try to emulate what your unstable FTP server does when the crash occurs. You will probably have to do some logging on the FTP server to identify what is going on with it at the time Duplicati crashes. You might also try logging the VPN to see whether it has connection issues on either side. Basically, we are looking for the failure condition.

If you provide profiling logs for each of the systems that crash, with at least 5 crashes’ worth of logs, we could probably narrow down the area of the code where it crashes.

Refer to this for how to grab the logs - Advanced Options - Duplicati 2 User's Manual

@tarianjed I’ll see if I can find a sweet spot where this happens. Luckily I can simulate the endpoint going offline in many ways, so hopefully I’ll get it right.

One nice thing about FTP is that it’s a very simple but old protocol that I kind of know, and one of the few allowing an unencrypted transport (other than what VPN provides), which makes packet capturing useful.

Another nice thing is that Duplicati offers two FTP backends. One uses Microsoft’s FTP client, the other is third-party (FluentFTP).

The first log snippet shows the crash (this topic is about the death, not the stall, right?) happening just as it tried to list its files.

If Duplicati is in a may-die situation (I can’t say what that is, but you’re trying to set one up), the Test connection button might fail, as might the URL from Export As Command-line when fed to Duplicati.CommandLine.BackendTool.exe.

I think this is what the button does for FTP:

FTP (Alternative) has a fancy test here, almost like a small Duplicati.CommandLine.BackendTester.exe.

It depends on how uncontrolled the failure is. An unexpected exception can produce a Duplicati-crashlog.txt file.

Crashes can also show up in Event Viewer → Windows Logs → Application, courtesy of Windows Error Reporting.

If you use one of the command line tools (you could even run a backup from your job export), there’s a chance that something relevant will show up as text in the Command Prompt window as things fail.

As for “I think the best thing is to try and fix the root issue”: first it needs to be found a little better.

Possibly the death and the stall are related. We do have a tiny bit of profiling log on the death, ending in:

2022-03-01 01:00:18 +02 - [Profiling-Timer.Begin-Duplicati.Library.Main.BackendManager-RemoteOperationList]: Starting - RemoteOperationList
2022-03-01 01:00:18 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Started:  ()

If the stall winds up looking different, maybe it should be chased in the other issue. If similar, stay here?

SFTP is based on SSH instead of the very old legacy FTP protocol. Some people confuse SFTP and FTPS: FTPS is basically the same old FTP protocol wrapped in a TLS layer for security, while SFTP is quite different, being based on SSH.

If you had the option to use SFTP instead of FTP(S), I would go for it.

But it’s (most likely) unrelated to your trayicon process dying.

Hi all, just to update on this again.

I’ve improved the situation with my storage destination that used to go offline during certain fixed time slots; it is now online as much as it can be. The process (the Duplicati server) still dies at random times on all 6 servers I have it running on.

So effectively, the backup destination going offline was not the cause of the Duplicati server process dying.

I have also managed to set up a scheduled task that starts Duplicati when I log in and then repeats every 15 minutes. Since the task does not finish, Task Scheduler just leaves it alone until Duplicati dies, at which point Task Scheduler starts it again within 15 minutes.

I have also set up UptimeRobot.com to monitor the Duplicati web interface, so it lets me know when it goes offline and when it comes back.

This does not help much to understand the problem, but it effectively resolves the problem well enough that my backups can continue on their scheduled runs.
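
For anyone who wants a purely local equivalent of that monitoring, something like the following sketch (Python; port 8200 and the install path are from my setup, and I haven’t run this exact script) could poll the web interface and relaunch Duplicati when it stops answering:

import subprocess
import time
import urllib.error
import urllib.request

URL = "http://localhost:8200/"  # default web interface port on my servers
EXE_PATH = r"C:\Program Files\Duplicati 2\Duplicati.GUI.TrayIcon.exe"

def ui_is_up():
    try:
        urllib.request.urlopen(URL, timeout=10)
        return True
    except urllib.error.HTTPError:
        # Any HTTP response, even an error page, means the server is alive.
        return True
    except (urllib.error.URLError, OSError):
        return False

while True:
    if not ui_is_up():
        subprocess.Popen([EXE_PATH])  # relaunch the tray icon + server
    time.sleep(5 * 60)                # poll every 5 minutes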

Perfect suggestion! I didn’t think of Event Viewer, sad but true. OK, so I found the point where Duplicati crashed. The information is limited, but I have an event trace from Event Viewer:

Source: .NET Runtime

Application: Duplicati.GUI.TrayIcon.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.UriFormatException
   at System.Uri.CreateThis(System.String, Boolean, System.UriKind)
   at HttpServer.HttpClientContext.OnRequestLine(System.Object, HttpServer.Parser.RequestLineEventArgs)
   at System.EventHandler`1[[System.__Canon, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]].Invoke(System.Object, System.__Canon)
   at HttpServer.Parser.HttpRequestParser.Parse(Byte[], Int32, Int32)
   at HttpServer.HttpClientContext.OnReceive(System.IAsyncResult)
   at System.Net.LazyAsyncResult.Complete(IntPtr)
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
   at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
   at System.Net.ContextAwareResult.Complete(IntPtr)
   at System.Net.LazyAsyncResult.ProtectedInvokeCallback(System.Object, IntPtr)
   at System.Net.Sockets.BaseOverlappedAsyncResult.CompletionPortCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)
   at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)

Source: Application Error

Faulting application name: Duplicati.GUI.TrayIcon.exe, version: 2.0.6.3, time stamp: 0x60cb15ba
Faulting module name: KERNELBASE.dll, version: 10.0.17763.2686, time stamp: 0x42f11d96
Exception code: 0xe0434352
Fault offset: 0x0000000000039319
Faulting process id: 0x1f74
Faulting application start time: 0x01d862a2f3aa0f7e
Faulting application path: C:\Program Files\Duplicati 2\Duplicati.GUI.TrayIcon.exe
Faulting module path: C:\Windows\System32\KERNELBASE.dll
Report Id: f9d88714-ae8b-42a8-ad2b-5239aa1ec966
Faulting package full name: 
Faulting package-relative application ID: 

Some more logs. I’ve pulled these from a server that crashes the same way but has a constantly updated log file; the file was 4.5 GB when I pulled these logs, since it has been logging for a while now.

Notice that Duplicati was doing nothing.

2022-05-10 02:35:11 +02 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-b9c42094d5acd403b8dfacecddfab57a4.dblock.zip.aes (99.92 MB)
2022-05-10 02:35:11 +02 - [Profiling-Timer.Finished-Duplicati.Library.Main.BackendManager-RemoteOperationGet]: RemoteOperationGet took 0:00:02:37.359
2022-05-10 02:35:11 +02 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: Starting - ExecuteNonQuery: UPDATE "RemoteVolume" SET "VerificationCount" = MAX(1, CASE WHEN "VerificationCount" <= 0 THEN (SELECT MAX("VerificationCount") FROM "RemoteVolume") ELSE "VerificationCount" + 1 END) WHERE "Name" = "duplicati-b9c42094d5acd403b8dfacecddfab57a4.dblock.zip.aes"
2022-05-10 02:35:11 +02 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: ExecuteNonQuery: UPDATE "RemoteVolume" SET "VerificationCount" = MAX(1, CASE WHEN "VerificationCount" <= 0 THEN (SELECT MAX("VerificationCount") FROM "RemoteVolume") ELSE "VerificationCount" + 1 END) WHERE "Name" = "duplicati-b9c42094d5acd403b8dfacecddfab57a4.dblock.zip.aes" took 0:00:00:00.093
2022-05-10 02:35:12 +02 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: Starting - ExecuteNonQuery: DELETE FROM "LogData" WHERE "Timestamp" < 1649550912
2022-05-10 02:35:12 +02 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: ExecuteNonQuery: DELETE FROM "LogData" WHERE "Timestamp" < 1649550912 took 0:00:00:00.031
2022-05-10 02:35:12 +02 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: Starting - ExecuteNonQuery: DELETE FROM "RemoteOperation" WHERE "Timestamp" < 1649550912
2022-05-10 02:35:12 +02 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: ExecuteNonQuery: DELETE FROM "RemoteOperation" WHERE "Timestamp" < 1649550912 took 0:00:00:00.047
2022-05-10 02:35:12 +02 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: Starting - ExecuteNonQuery: PRAGMA optimize
2022-05-10 02:35:12 +02 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: ExecuteNonQuery: PRAGMA optimize took 0:00:00:00.000
2022-05-10 02:35:12 +02 - [Profiling-Timer.Finished-Duplicati.Library.Main.Controller-RunBackup]: Running Backup took 0:00:05:12.203
2022-05-10 02:35:13 +02 - [Verbose-Duplicati.Library.Modules.Builtin.SendHttpMessage-HttpResponseMessage]: HTTP Response: 201 - Created: {"username":"bernarddt","ip":"102.130.118.85","backupsetname":"SVN_Repo","deletedFiles":"0","deletedFolders":"0","modifiedFiles":"0","examinedFiles":"97320","openedFiles":"0","addedFiles":"0","sizeOfModifiedFiles":"0","sizeOfAddedFiles":"0","sizeOfExaminedFiles":"6486009337","sizeOfOpenedFiles":"0","notProcessedFiles":"0","addedFolders":"0","tooLargeFiles":"0","filesWithError":"0","modifiedFolders":"0","modifiedSymlinks":"0","addedSymlinks":"0","deletedSymlinks":"0","partialBackup":0,"dryrun":0,"mainOperation":"Backup","parsedResult":"Success","version":"2.0.6.3 (2.0.6.3_beta_2021-06-17)","endTime":"2022-05-10 02:35:12","beginTime":"2022-05-10 02:30:00","duration":"00:05:11.9532422","result":"Success","backupset_id":23741,"updated_at":"2022-05-10 02:35:13","created_at":"2022-05-10 02:35:13","id":15009370}

>>>> The crash happened here at 2022-05-10 14:04:05 +02

2022-05-11 01:00:13 +02 - [Information-GetGpgProgramPath-gpg]: C:\ProgramData\Duplicati\updates\2.0.6.3\win-tools\gpg.exe
2022-05-11 01:00:13 +02 - [Information-Duplicati.Library.Main.Controller-StartingOperation]: The operation Backup has started
2022-05-11 01:00:13 +02 - [Profiling-Timer.Begin-Duplicati.Library.Main.Controller-RunBackup]: Starting - Running Backup
2022-05-11 01:00:14 +02 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteScalarInt64]: Starting - ExecuteScalarInt64: INSERT INTO "Operation" ("Description", "Timestamp") VALUES ("Backup", 1652223614); SELECT last_insert_rowid();
2022-05-11 01:00:14 +02 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteScalarInt64]: ExecuteScalarInt64: INSERT INTO "Operation" ("Description", "Timestamp") VALUES ("Backup", 1652223614); SELECT last_insert_rowid(); took 0:00:00:00.100
2022-05-11 01:00:14 +02 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteReader]: Starting - ExecuteReader: SELECT "Key", "Value" FROM "Configuration" 
2022-05-11 01:00:14 +02 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteReader]: ExecuteReader: SELECT "Key", "Value" FROM "Configuration"  took 0:00:00:00.000
2022-05-11 01:00:14 +02 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteReader]: Starting - ExecuteReader: SELECT "Key", "Value" FROM "Configuration" 
2022-05-11 01:00:14 +02 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteReader]: ExecuteReader: SELECT "Key", "Value" FROM "Configuration"  took 0:00:00:00.000

I suspect that Duplicati is not crashing due to a backup-related issue, but rather because of a GUI URL request that has a problem. Note that the Duplicati GUI is accessible via a public IP of the server, so it is also possible that a port scan, or some system/bad actor running an HTTP request scan, is hitting the Duplicati GUI and causing it to fail.
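
If someone wants to test that theory, a raw-socket probe along these lines could be pointed at a disposable test install (this is only an illustration of the approach; port 8200 is my setup’s default, and I don’t know which malformed request line, if any, actually triggers the UriFormatException above):

import socket

HOST, PORT = "127.0.0.1", 8200  # point this at a *test* install only

# Request lines with deliberately malformed URIs, to see whether the
# embedded HTTP server survives parsing them. These are only guesses.
probes = [
    b"GET http://[::1 HTTP/1.1\r\nHost: x\r\n\r\n",
    b"GET http://%zz/ HTTP/1.1\r\nHost: x\r\n\r\n",
    b"GET \x00:// HTTP/1.1\r\nHost: x\r\n\r\n",
]

for payload in probes:
    try:
        with socket.create_connection((HOST, PORT), timeout=5) as s:
            s.settimeout(5)
            s.sendall(payload)
            print(payload[:24], "->", s.recv(200)[:80])
    except OSError as exc:
        print(payload[:24], "-> connection error:", exc)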

The profiling log doesn’t show absolutely everything; it’s primarily there to profile SQL database operations.
The TrayIcon is constantly polling server status, for example, and those requests don’t show in the log.
Presumably those are safe, though, and this one was external. You could use Wireshark to watch the traffic…

The failure might have passed through the embedded web server seen in the stack trace above; there are probably no available experts on that code.
I think there’s also no access log, and if that’s true then external capture may be best (unless it’s HTTPS).

Regardless, one tiny step. Thanks for the update.

Good theory, you may be onto something. Can you lock down access to specific source IPs? Per the original developer, Duplicati’s web UI is not hardened and should not be exposed to the public internet.
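
If Windows Firewall is in use on those servers, the lockdown could look roughly like this (a sketch only: the rule name and allowed address are placeholders, run it elevated, and make sure no existing broader allow rule still exposes the port; with the default block-inbound policy, a single scoped allow rule is usually enough):

import subprocess

# Placeholder management address -- substitute your own trusted IP or range.
ALLOWED_SOURCE = "203.0.113.10"

# Allow the Duplicati web UI port only from the trusted address; with the
# default inbound policy (block), everything else stays unreachable.
subprocess.run(
    ["netsh", "advfirewall", "firewall", "add", "rule",
     "name=DuplicatiUI-mgmt-only", "dir=in", "action=allow",
     "protocol=TCP", "localport=8200",
     f"remoteip={ALLOWED_SOURCE}"],
    check=True,
)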

Ok, I’ve switched to “Verbose” logging, maybe that will give us more to work with.

Verbose is a subset of Profiling, so I doubt it will show more. You can try.