Duplicati thread count constantly growing

Hello!
Duplicati is generally a great software, which I am happy to use!
However, I have noticed that Duplicati is using more and more threads, which is causing a load on the server, and then Duplicati has to be manually restarted.

I run a Debian 10/11 server (AMD64) with Xamarin Mono 6.12.0.122.

Heres monitoring images the last two days, with a restart 1 day ago:



Heres monitoring images the last two months, with a four Duplicate restarts:



My guess is that a thread is not properly terminated and then a new thread is started for that task, and the thread count continously grows and causes extra load.

I have considered scheduling a restart in cron, but that may wreck things if Duplicati is currently running a backup.

Is this anything that is noticed on other systems?

Rickard

I wonder if the setting concurrency-max-threads works on that. Release: 2.0.3.6 (canary) 2018-04-23

Actually that gives me an idea for another test for another issue =)

It by default uses many threads. I don’t think more or less thread use could be as easily identified through an assumption especially as backups change and as well as what’s done eg verification, deletion of old files, etc.

I can add that I am running the latest beta, 2.0.6.3_beta_2021-06-17, and I have not changed any advanced parameters.

I was actually saying to change it but do as you like lol :slight_smile:

It would be helpful to ID such threads, however I don’t think Linux facilities can ID threads in C# context.
If you want to try, you can ps for mono-sgen for Duplicati. Look in /proc by PID at all its task stacks e.g.
under /proc/14837/task do head */stack. For me, I see them, but unsurprisingly get no C# context.

Ideal is probably a debugger, but I have none for Linux and am not sure which one could check threads.
I’m not seeing my threads on Windows heading way high. I backup to Google Drive. What is this using?

The mono system does have a last-resort thread dump which actually looks pretty nice, but says it may crash the program, so probably best to have it running as TrayIcon, launched from a shell, and paused.
After getting the stack traces, quit from TrayIcon menu (assuming it still works, and it did when I tried it).

It is possible to obtain a stack trace of all the active threads in Mono  by  sending  the
       QUIT signal to Mono, you can do this from the command line, like this:

            kill -QUIT pid

       Where  pid  is  the  Process ID of the Mono process you want to examine.  The process will
       continue running afterwards, but its state is not guaranteed.

       Important: this is a last-resort mechanism for debugging applications and  should  not  be
       used  to  monitor  or  probe a production application.  The integrity of the runtime after
       sending this signal is not guaranteed and the application might crash or terminate at  any
       given point afterwards.

Hello!
The first test, running your suggested command lists 35 identical blocks (same as number threads), where only the “*****/stack” is differing:

root:/proc/6722/task# head */stack
==> 13457/stack <==
[<0>] futex_wait_queue_me+0xc1/0x120
[<0>] futex_wait+0x13f/0x240
[<0>] do_futex+0x3f6/0xbe0
[<0>] __x64_sys_futex+0x143/0x180
[<0>] do_syscall_64+0x53/0x110
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[<0>] 0xffffffffffffffff

==> 13458/stack <==

The second test, with “kill”, didn’t do anything? Duplicati survived, but there was no output to the CLI, and I tried to send stderr to a file, but it was all empty.

I used Google Drive and Jotta Cloud as backends.

Are you looking at the CLI window where duplicati was manually launched with duplicati command?
Here’s an example of what ps -lp has. You can see that it’s on a terminal, and I guess it writes to there.

F S   UID   PID  PPID  C PRI  NI ADDR SZ WCHAN  TTY          TIME CMD
0 S  1000 14837 14833  0  80   0 - 244163 poll_s pts/0   00:00:09 mono-sgen
$ duplicati
Full thread dump:

"Thread Pool Worker"  at <unknown> <0xffffffff>
  at (wrapper managed-to-native) System.Threading.WaitHandle.Wait_internal (intptr*,int,bool,int) [0x00000] in <533173d24dae460899d2b10975534bb0>:0
  at System.Threading.WaitHandle.WaitOneNative (System.Runtime.InteropServices.SafeHandle,uint,bool,bool) [0x00044] in <533173d24dae460899d2b10975534bb0>:0
  at System.Threading.WaitHandle.InternalWaitOne (System.Runtime.InteropServices.SafeHandle,long,bool,bool) [0x00014] in <533173d24dae460899d2b10975534bb0>:0
  at System.Threading.WaitHandle.WaitOne (long,bool) [0x00000] in <533173d24dae460899d2b10975534bb0>:0
  at System.Threading.WaitHandle.WaitOne (int,bool) [0x00019] in <533173d24dae460899d2b10975534bb0>:0
  at Duplicati.Server.EventPollNotify.Wait (long,int) [0x00042] in <87f95256fc6a4c3ea353de8d2aacf89b>:0
  at Duplicati.Server.WebServer.RESTMethods.RequestInfo.LongPollCheck (Duplicati.Server.EventPollNotify,long&,bool&) [0x00119] in <87f95256fc6a4c3ea353de8d2aacf89b>:0
  at Duplicati.Server.WebServer.RESTMethods.ServerState.GET (string,Duplicati.Server.WebServer.RESTMethods.RequestInfo) [0x0000c] in <87f95256fc6a4c3ea353de8d2aacf89b>:0
  at Duplicati.Server.WebServer.RESTHandler.DoProcess (Duplicati.Server.WebServer.RESTMethods.RequestInfo,string,string,string) [0x00146] in <87f95256fc6a4c3ea353de8d2aacf89b>:0
  at Duplicati.Server.WebServer.RESTHandler.DoProcess (HttpServer.IHttpRequest,HttpServer.IHttpResponse,HttpServer.Sessions.IHttpSession,string,string,string) [0x00009] in <87f95256fc6a4c3ea353de8d2aacf89b>:0
  at Duplicati.Server.WebServer.RESTHandler.Process (HttpServer.IHttpRequest,HttpServer.IHttpResponse,HttpServer.Sessions.IHttpSession) [0x000b6] in <87f95256fc6a4c3ea353de8d2aacf89b>:0
  at HttpServer.HttpServer.HandleRequest (HttpServer.IHttpClientContext,HttpServer.IHttpRequest,HttpServer.IHttpResponse,HttpServer.Sessions.IHttpSession) [0x0003e] in <bed89f1655ee48029f6d6812f54c58ad>:0
  at HttpServer.HttpServer.ProcessRequest (HttpServer.IHttpClientContext,HttpServer.IHttpRequest) [0x00121] in <bed89f1655ee48029f6d6812f54c58ad>:0
  at HttpServer.HttpServer.ProcessRequestWrapper (HttpServer.IHttpClientContext,HttpServer.IHttpRequest) [0x0000b] in <bed89f1655ee48029f6d6812f54c58ad>:0
  at HttpServer.HttpServer.OnRequest (object,HttpServer.RequestEventArgs) [0x0004b] in <bed89f1655ee48029f6d6812f54c58ad>:0
  at (wrapper delegate-invoke) System.EventHandler`1<TEventArgs_REF>.invoke_void_object_TEventArgs (object,TEventArgs_REF) [0x00074] in <533173d24dae460899d2b10975534bb0>:0
  at HttpServer.HttpListenerBase.OnRequestReceived (object,HttpServer.RequestEventArgs) [0x00000] in <bed89f1655ee48029f6d6812f54c58ad>:0
  at (wrapper delegate-invoke) System.EventHandler`1<TEventArgs_REF>.invoke_void_object_TEventArgs (object,TEventArgs_REF) [0x00074] in <533173d24dae460899d2b10975534bb0>:0
  at HttpServer.HttpContextFactory.OnRequestReceived (object,HttpServer.RequestEventArgs) [0x00000] in <bed89f1655ee48029f6d6812f54c58ad>:0
  at (wrapper delegate-invoke) System.EventHandler`1<TEventArgs_REF>.invoke_void_object_TEventArgs (object,TEventArgs_REF) [0x00074] in <533173d24dae460899d2b10975534bb0>:0
  at HttpServer.HttpClientContext.OnRequestCompleted (object,System.EventArgs) [0x00052] in <bed89f1655ee48029f6d6812f54c58ad>:0
  at (wrapper delegate-invoke) <Module>.invoke_void_object_EventArgs (object,System.EventArgs) [0x00074] in <533173d24dae460899d2b10975534bb0>:0
  at HttpServer.Parser.HttpRequestParser.OnRequestCompleted () [0x00000] in <bed89f1655ee48029f6d6812f54c58ad>:0
  at HttpServer.Parser.HttpRequestParser.Parse (byte[],int,int) [0x00175] in <bed89f1655ee48029f6d6812f54c58ad>:0
  at HttpServer.HttpClientContext.OnReceive (System.IAsyncResult) [0x00059] in <bed89f1655ee48029f6d6812f54c58ad>:0
  at System.Net.Sockets.SocketAsyncResult/<>c.<Complete>b__27_0 (object) [0x0000b] in <9c6e2cb7ddd8473fa420642ddcf7ce48>:0
  at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem () [0x00008] in <533173d24dae460899d2b10975534bb0>:0
  at System.Threading.ThreadPoolWorkQueue.Dispatch () [0x00074] in <533173d24dae460899d2b10975534bb0>:0
  at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback () [0x00000] in <533173d24dae460899d2b10975534bb0>:0
  at (wrapper runtime-invoke) <Module>.runtime_invoke_bool (object,intptr,intptr,intptr) [0x0002a] in <533173d24dae460899d2b10975534bb0>:0

(etc.)

You can apparently also get a clue from ps as to what the threads are. Here’s an example of doing that:

$ ps -L -p 14837
  PID   LWP TTY          TIME CMD
14837 14837 pts/0    00:00:01 mono-sgen
14837 14838 pts/0    00:00:00 SGen worker
14837 14839 pts/0    00:00:00 Finalizer
14837 14841 pts/0    00:00:00 mono-sgen
14837 14842 pts/0    00:00:00 mono-sgen
14837 14843 pts/0    00:00:00 Timer-Scheduler
14837 14844 pts/0    00:00:00 UpdatePollThrea
14837 14845 pts/0    00:00:00 WorkerThread<IR
14837 14846 pts/0    00:00:00 TaskScheduler
14837 14847 pts/0    00:00:00 mono-sgen
14837 14850 pts/0    00:00:00 mono-sgen
14837 14851 pts/0    00:00:00 Thread Pool I/O
14837 14854 pts/0    00:00:00 TrayIcon Reques
14837 14855 pts/0    00:00:00 TrayIcon Longpo
14837 14858 pts/0    00:00:00 gmain
14837 14861 pts/0    00:00:00 gdbus
14837 17973 pts/0    00:00:00 Thread Pool Wor
14837 18014 pts/0    00:00:00 Thread Pool Wor
14837 18016 pts/0    00:00:00 Thread Pool Wor
14837 18088 pts/0    00:00:00 Timer-Scheduler
14837 18114 pts/0    00:00:00 Thread Pool Wor

I am running Duplicati as a service, that caused the missing result. I tried running it as you suggested, but did not really get it going. That method will means that Duplicati is just started anyway, and then it’s hard to find the run away threads?

The second method gave this:
ps -L -p 1496
PID LWP TTY TIME CMD
1496 1496 ? 00:00:01 mono-sgen
1496 1497 ? 00:00:00 SGen worker
1496 1498 ? 00:00:00 Finalizer
1496 1500 ? 00:00:00 mono-sgen
1496 1501 ? 00:00:00 Timer-Scheduler
1496 1502 ? 00:00:00 UpdatePollThrea
1496 1503 ? 00:00:02 WorkerThread<IR
1496 1504 ? 00:00:00 TaskScheduler
1496 1505 ? 00:00:00 mono-sgen
1496 1506 ? 00:00:01 Thread Pool I/O
1496 1508 ? 00:00:02 Thread Pool Wor
1496 1509 ? 00:00:02 Thread Pool Wor
1496 1510 ? 00:00:06 Thread Pool Wor
1496 1511 ? 00:00:00 Thread Pool Wor
1496 1512 ? 00:00:01 Thread Pool Wor
1496 1513 ? 00:00:01 Thread Pool Wor
1496 1514 ? 00:00:01 Thread Pool Wor
1496 1516 ? 00:00:02 Thread Pool Wor
1496 1953 ? 00:00:00 Timer-Scheduler
1496 1957 ? 00:00:01 Thread Pool Wor
1496 1958 ? 00:00:03 Thread Pool Wor
1496 1959 ? 00:00:01 Thread Pool Wor
1496 1960 ? 00:00:00 Thread Pool Wor
1496 1961 ? 00:00:00 Thread Pool Wor
1496 1962 ? 00:00:00 Thread Pool Wor
1496 1963 ? 00:00:01 Thread Pool Wor
1496 1964 ? 00:00:00 Thread Pool Wor
1496 1965 ? 00:00:00 Thread Pool Wor

The interesting part is when Duplicati has been running for some time and accumulated a lot of threads.

You’d probably want to look at threads early, wait for more, then look to see how thread view changed.
Based on your graphs, you can probably see something in a week or two, and you know the baseline.
It’s seemingly possible to capture output using the systemd-cat program if you wish to keep a service.

Hello!

I let the threads grow, I’m now up to 149.

I ran the list of threads, and it’s clear that it’s the “Thread Pool Wor” thread that is not ended properly.

ps -L -p 1496
PID LWP TTY TIME CMD
1496 1496 ? 00:00:07 mono-sgen
1496 1497 ? 00:12:39 SGen worker
1496 1498 ? 00:00:32 Finalizer
1496 1500 ? 00:00:07 mono-sgen
1496 1501 ? 13:17:37 Timer-Scheduler
1496 1502 ? 00:00:06 UpdatePollThrea
1496 1503 ? 01:24:08 WorkerThread<IR
1496 1504 ? 00:00:26 TaskScheduler
1496 1505 ? 00:00:06 mono-sgen
1496 1506 ? 00:05:40 Thread Pool I/O
1496 14371 ? 00:17:28 Thread Pool Wor
1496 16427 ? 00:59:30 Thread Pool Wor
1496 18669 ? 00:17:17 Thread Pool Wor
1496 18679 ? 00:18:25 Thread Pool Wor
1496 8238 ? 00:17:28 Thread Pool Wor
1496 8239 ? 00:18:23 Thread Pool Wor
1496 8240 ? 00:17:33 Thread Pool Wor
1496 8242 ? 00:18:13 Thread Pool Wor
1496 8243 ? 00:18:57 Thread Pool Wor
1496 8244 ? 00:18:23 Thread Pool Wor
1496 8248 ? 00:18:06 Thread Pool Wor
1496 25512 ? 00:19:57 Thread Pool Wor
1496 25513 ? 00:17:20 Thread Pool Wor
1496 25514 ? 00:16:59 Thread Pool Wor
1496 29959 ? 00:16:22 Thread Pool Wor
1496 29960 ? 00:16:48 Thread Pool Wor
1496 1935 ? 00:17:38 Thread Pool Wor
1496 1939 ? 00:16:13 Thread Pool Wor
1496 3073 ? 00:16:01 Thread Pool Wor
1496 4927 ? 00:15:14 Thread Pool Wor
1496 9654 ? 00:15:35 Thread Pool Wor
1496 9655 ? 00:16:15 Thread Pool Wor
1496 21751 ? 00:14:32 Thread Pool Wor
1496 21752 ? 00:15:44 Thread Pool Wor
1496 25672 ? 00:13:56 Thread Pool Wor
1496 25673 ? 00:13:53 Thread Pool Wor
1496 25674 ? 00:13:00 Thread Pool Wor
1496 25675 ? 00:14:24 Thread Pool Wor
1496 32586 ? 00:12:26 Thread Pool Wor
1496 32587 ? 00:12:42 Thread Pool Wor
1496 32588 ? 00:13:46 Thread Pool Wor
1496 2396 ? 00:13:52 Thread Pool Wor
1496 2397 ? 00:13:23 Thread Pool Wor
1496 2398 ? 00:13:03 Thread Pool Wor
1496 8807 ? 00:12:49 Thread Pool Wor
1496 10265 ? 00:14:31 Thread Pool Wor
1496 12020 ? 00:12:27 Thread Pool Wor
1496 12021 ? 00:12:26 Thread Pool Wor
1496 19321 ? 00:12:19 Thread Pool Wor
1496 19322 ? 00:12:08 Thread Pool Wor
1496 19323 ? 00:12:11 Thread Pool Wor
1496 19325 ? 00:12:12 Thread Pool Wor
1496 29964 ? 00:12:08 Thread Pool Wor
1496 29965 ? 00:12:48 Thread Pool Wor
1496 29966 ? 00:12:24 Thread Pool Wor
1496 29967 ? 00:11:58 Thread Pool Wor
1496 29968 ? 00:12:13 Thread Pool Wor
1496 29969 ? 00:12:39 Thread Pool Wor
1496 29970 ? 00:12:42 Thread Pool Wor
1496 29971 ? 00:11:47 Thread Pool Wor
1496 29972 ? 00:12:44 Thread Pool Wor
1496 29973 ? 00:11:43 Thread Pool Wor
1496 29019 ? 00:10:24 Thread Pool Wor
1496 29020 ? 00:10:25 Thread Pool Wor
1496 29021 ? 00:12:53 Thread Pool Wor
1496 29022 ? 00:11:01 Thread Pool Wor
1496 29023 ? 00:11:29 Thread Pool Wor
1496 29025 ? 00:11:27 Thread Pool Wor
1496 29026 ? 00:11:38 Thread Pool Wor
1496 29027 ? 00:11:57 Thread Pool Wor
1496 29875 ? 00:09:05 Thread Pool Wor
1496 29876 ? 00:09:46 Thread Pool Wor
1496 29877 ? 00:09:33 Thread Pool Wor
1496 29878 ? 00:09:23 Thread Pool Wor
1496 29879 ? 00:08:54 Thread Pool Wor
1496 29880 ? 00:09:37 Thread Pool Wor
1496 29881 ? 00:09:13 Thread Pool Wor
1496 29882 ? 00:09:55 Thread Pool Wor
1496 29883 ? 00:08:58 Thread Pool Wor
1496 29884 ? 00:09:01 Thread Pool Wor
1496 4443 ? 00:05:00 Thread Pool Wor
1496 4444 ? 00:05:31 Thread Pool Wor
1496 4445 ? 00:05:36 Thread Pool Wor
1496 4446 ? 00:04:56 Thread Pool Wor
1496 4447 ? 00:05:39 Thread Pool Wor
1496 4448 ? 00:05:02 Thread Pool Wor
1496 4449 ? 00:05:02 Thread Pool Wor
1496 4450 ? 00:05:40 Thread Pool Wor
1496 4451 ? 00:05:52 Thread Pool Wor
1496 4452 ? 00:05:38 Thread Pool Wor
1496 10831 ? 00:04:59 Thread Pool Wor
1496 25217 ? 00:04:02 Thread Pool Wor
1496 15708 ? 00:03:45 Thread Pool Wor
1496 25426 ? 00:03:20 Thread Pool Wor
1496 25427 ? 00:03:00 Thread Pool Wor
1496 25428 ? 00:03:03 Thread Pool Wor
1496 25429 ? 00:02:59 Thread Pool Wor
1496 25430 ? 00:03:27 Thread Pool Wor
1496 25431 ? 00:03:00 Thread Pool Wor
1496 25432 ? 00:03:02 Thread Pool Wor
1496 25433 ? 00:03:02 Thread Pool Wor
1496 25434 ? 00:03:04 Thread Pool Wor
1496 25435 ? 00:03:03 Thread Pool Wor
1496 11485 ? 00:02:51 Thread Pool Wor
1496 11486 ? 00:03:14 Thread Pool Wor
1496 24379 ? 00:02:55 Thread Pool Wor
1496 28905 ? 00:02:41 Thread Pool Wor
1496 28906 ? 00:02:45 Thread Pool Wor
1496 730 ? 00:02:56 Thread Pool Wor
1496 7607 ? 00:02:29 Thread Pool Wor
1496 7608 ? 00:02:29 Thread Pool Wor
1496 7610 ? 00:03:28 Thread Pool Wor
1496 7611 ? 00:02:33 Thread Pool Wor
1496 7612 ? 00:02:32 Thread Pool Wor
1496 7613 ? 00:02:29 Thread Pool Wor
1496 7614 ? 00:02:29 Thread Pool Wor
1496 7615 ? 00:02:32 Thread Pool Wor
1496 7616 ? 00:02:36 Thread Pool Wor
1496 7617 ? 00:02:53 Thread Pool Wor
1496 13280 ? 00:02:32 Thread Pool Wor
1496 18531 ? 00:00:20 Timer-Scheduler
1496 27808 ? 00:01:17 Thread Pool Wor
1496 27809 ? 00:01:18 Thread Pool Wor
1496 27810 ? 00:01:14 Thread Pool Wor
1496 30797 ? 00:00:36 Thread Pool Wor
1496 12423 ? 00:00:32 Thread Pool Wor
1496 12425 ? 00:00:33 Thread Pool Wor
1496 12426 ? 00:00:32 Thread Pool Wor
1496 12427 ? 00:00:34 Thread Pool Wor
1496 12428 ? 00:00:32 Thread Pool Wor
1496 12429 ? 00:00:33 Thread Pool Wor
1496 12430 ? 00:00:38 Thread Pool Wor
1496 12431 ? 00:00:35 Thread Pool Wor
1496 12432 ? 00:00:33 Thread Pool Wor
1496 12433 ? 00:00:33 Thread Pool Wor
1496 25397 ? 00:00:30 Thread Pool Wor
1496 25398 ? 00:00:29 Thread Pool Wor
1496 31556 ? 00:00:13 Thread Pool Wor
1496 6320 ? 00:00:10 Thread Pool Wor
1496 6321 ? 00:00:10 Thread Pool Wor
1496 6322 ? 00:00:10 Thread Pool Wor
1496 6323 ? 00:00:10 Thread Pool Wor
1496 6324 ? 00:00:11 Thread Pool Wor
1496 6325 ? 00:00:10 Thread Pool Wor
1496 6326 ? 00:00:10 Thread Pool Wor
1496 6327 ? 00:00:11 Thread Pool Wor
1496 6328 ? 00:00:11 Thread Pool Wor
1496 6329 ? 00:00:11 Thread Pool Wor

This might be the only debug method left that I can suggest. What happened when you tried to get it going?
Unfortunately the issue takes awhile to show up. You also have two different destinations mixed together…

If you can come up with faster and simpler steps to reproduce, then maybe send an Issue to development. This is more than a user forum can handle, unless there are some C# developers around who can advise.

It may very well be more than developers can handle unless they can reproduce it, or figure out some way that you can get a memory dump of some sort from your mono that can be fed into a tool (if one exists…).

Not on my Windows system hourly backups to Google Drive, and I can spot no reports in forum or Issues. Because there’s quite a bit of noise (thread may refer to a discussion thread), I also ran a title-only search.

EDIT:

It would be interesting to know how many of them increase their CPU times when a backup isn’t running.
Rather than eyeball extensively, maybe capture to a file and run diff on two to see which lines changed.

I’m no expert, but I think these workers are idle when there’s no work (see above test to see if that’s true).
The CLR Thread Pool ‘Thread Injection’ Algorithm is an article on how Microsoft CLR manages pool size.
Unfortunately, that’s no guarantee that mono works the same way, but I hope it adjusts the size somehow.

I’ve been casually inspecting the output of ps -L -p <pid> the past few days and it doesn’t seem to be growing out of control for me. The number of “Thread Pool Wor” goes up occasionally, then goes back down. This is on Linux with mono 6.12.0.

Are your backup jobs completing successfully?