Mono CPU utilization is absurdly high

What version of Duplicati are you using?

The latest beta, 2.0.3.3_beta_2018-04-02.

After the computer cooled down I restarted it. mono is now using less than 1% CPU, but Duplicati has been stuck on “Starting …” for a very long time. I think all this interrupting and restarting broke something.

Duplicati is pretty good at doing things as transaction sets so hopefully it’s just taking a while to clean up the previously unfinished transactions…

So it was stuck on “Starting …” for many hours, and now it “seems” to start the backup all over again. It lists the whole backup size as remaining, but the file count does not go down and the upload speed is stuck at 1.08 KB/s, so I think it is in fact stuck.
I stopped Duplicati one more time and restarted the computer, and now it seems to work again, although it no longer displays the speed (it probably heard me yelling at it too often).
Here are some of the messages from the Duplicati log:

May 4, 2018 9:24 AM: Failed while executing "Backup" with id: 3
May 4, 2018 9:24 AM: Error in worker
May 4, 2018 9:09 AM: Reporting error gave error
May 4, 2018 9:09 AM: Request for http://localhost:8200/api/v1/backup/3/log?pagesize=100 gave error
May 3, 2018 4:58 PM: Error in updater

I’m seeing this same problem: CPU utilization is quite high while the backup is initializing (i.e. while Duplicati is “Counting…”). Once that’s done and the actual backup proceeds, CPU utilization drops.

Duplicati 2.0.3.3_beta_2018-04-02
macOS High Sierra 10.13.4
Source data size: ~120 GB

If you click on the actual line with a date it should expand to show details - for example, clicking on the “Failed while executing…” line should expand to show more info about errors that might have occurred.

Is there any difference if you set the --thread-priority lower?

--thread-priority
Selects another thread priority for the process. Use this to set Duplicati to be more or less CPU intensive.
Default value: “normal”

When you’ve upgraded to 2.0.3.4 or newer you might also want to look at enabling --use-background-io-priority to see if that has any effect:

--use-background-io-priority
This option instructs the operating system to set the current process to use the lowest IO priority level, which can make operations run slower but will interfere less with other operations running at the same time.
Default value: “false”
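
For what it’s worth, a minimal sketch of adding both options to a CLI backup on 2.0.3.4 or newer (the destination URL and source path here are placeholders, not from this thread):

duplicati-cli backup "b2://my-bucket/backups" /home/user/Documents \
  --thread-priority=low \
  --use-background-io-priority=true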

I’ve played around with this issue some more. Here’s what I’m seeing:

  • The performance problem shows up when dealing with large numbers of files:
    • One of my machines has backup file counts in the four- and five-digit range and doesn’t seem to have any issues (though it’s also a Windows machine and uses Windows-specific features).
    • My problematic machines (Macs) have file counts in the six- and seven-digit ranges.

For example, my main working laptop has a backup file count of over one million files; backups take about two hours, with most of that time spent “counting” files (that’s when CPU utilization is at 400+ percent).

However, if I enable “check-filetime-only”, CPU utilization drops to less than 200% during the “counting” process. The amount of time it takes to “count” the files is still around two hours, though.
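
For reference, this is just the option added to the job; a hedged sketch on the CLI would look like the following (placeholder destination and path, not my real ones):

duplicati-cli backup "b2://bucket/prefix" /Users/me/Documents \
  --check-filetime-only=true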

It sounds like a combination of hashing the files and comparing those hashes to what’s in the database is what’s causing the issue. My guess is it’s mostly the database side of things.

There are some optimizations coming for the database, but I couldn’t say whether or not they’d help in this particular instance…

I am getting similar ~160% CPU utilization from the “mono-sgen” process with “2.0.4.5_beta_2018-11-28”. The average system load (per top) peaks at ~5.2, causing the Debian Linux box to generate an “overload” notification. I used to receive such warning messages only occasionally on Debian 8. Now, after upgrading to Debian 9 a week ago, I am getting these warnings on every Duplicati backup session.

Does the problem continue with the latest canary versions? The latest is 2.0.4.18_canary_2019-05-12.
Personally, I’ve found most of the canary versions quite stable (except certain ones like 2.0.4.13).

It does help to export the current backup configs to a file, install the canary as its own setup in a different folder, and then import the previous backups but point it to a different destination folder. This helps prevent the new version from potentially messing with your current version and backups.

Running the latest canary (105-1) plus the latest mono (6.8.0.105):
Recreate database on a local fileset of 250 GB flies to 90% completion, then slows to 2-5 thousandths of 1% every couple of minutes.

‘top’ shows mono-sgen gobbling 100+% of CPU. Since the last comments on this issue were about a year ago, and both Duplicati and mono have new versions, I think it should be re-opened.


I’m trying to diagnose this same issue. I’m using 2.0.5.1_beta_2020-01-18 (with mono 6.12.0.122 (2020-02/c621c35ffa0 Wed Feb 10 00:51:43 EST 2021)), and the CPU usage is just through the roof, making it almost impossible to use my laptop for anything else for 5-10 minutes at a time.

I’ve already enabled --use-background-io-priority, are there any other things I can try?

Welcome to the forum @mjpieters

thread-priority

Selects another thread priority for the process. Use this to set Duplicati to be more or less CPU intensive.

If you don’t mind more destination use, you might be able to reduce CPU use by compressing less:

  --zip-compression-level (Enumeration): Sets the Zip compression level
    This option controls the compression level used. A setting of zero gives no compression, and a setting of 9 gives
    maximum compression.
    * values: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
    * default value: BestCompression
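
For example, a hedged sketch with a lower level, trading some storage space for CPU (destination and source are placeholders):

duplicati-cli backup "b2://bucket/prefix" /Users/me/Documents \
  --zip-compression-level=1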

Do you have usage statistics by process?

Is that the whole backup? If not, what do you notice in the UI or elsewhere when it gets worse or better?

How large is your backup? I’m curious about what Duplicati shows for Source size, Backup size, and backup versions. Also have you customized the “blocksize” setting (default is 100KiB)?

Certain backup phase operations can be slow if you have a large backup. The evaluation whether or not to do a compaction, for instance, is CPU intensive (although I think it’s single threaded).

I’ll give that a try, now set to ‘low’. I’d like to keep the compression high (I pay for online storage and it’d cost me more if compression was less aggressive).

Any tips on how to collect such stats? I do not have monitoring installed on this laptop at the moment, and a Prometheus exporter I’d use in other circumstances feels like overkill here.

No, the backup takes longer than that. Duplicati’s CPU usage spikes to 200-300%, and various apps are marked as unresponsive, depending on what I was trying to use at the time.

Currently:

Source: 133.06 GB
Backup: 136.33 GB / 14 Versions

No, I did not set the --blocksize option; I stuck to the advice in Appendix C and left it at the default. I did set the remote volume size (--dblock-size=100MB) to take better advantage of my home fiber upload speeds.
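
For illustration only, the two settings side by side (placeholder URL and path; as far as I know, blocksize cannot be changed once a backup exists, so it is shown here only at its default for contrast):

# dblock-size sets the remote volume (upload chunk) size; blocksize sets the deduplication block size
duplicati-cli backup "s3://bucket/prefix" /Users/me \
  --dblock-size=100MB \
  --blocksize=100KB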

It depends a bit on what distro you have, and what it has (or will let you get). The most likely easy tool is top.
There is also an htop that might be a little nicer, and iotop if you think disk usage may also be a factor.
Unless you have an SSD, a mechanical drive (especially a laptop one) may slow things if it gets busy.

Troubleshooting High CPU Usage in Linux is one article, but there are many others, and other tools…
Depending on tool, Duplicati can show up as mono or mono-sgen or Duplicati. There will be several.
Only one should be eating CPU time, though. The first Duplicati process that starts merely launches the latest version.
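
For example, to single out the busy process (assuming a reasonably recent procps top, and that htop and pgrep are available):

top -o %CPU                        # sort by CPU; watch for mono-sgen / Duplicati
htop -p "$(pgrep -d',' -f mono)"   # htop limited to mono-related processes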

How do you know this without having process usage statistics? What tool are you looking at to see it?
What CPU is this? On Windows, Task Manager never shows above 100%, even with all cores fully loaded.
Different tools present this differently; e.g. if you have a quad core, 400% may mean all cores are loaded.

I misunderstood then what you were asking for. I thought you were looking for time-series like data of CPU utilisation across a Duplicati backup run of some sort; e.g. if this was on Linux I’d have used atop. But, this is macOS, so atop is not available. iotop won’t work either, not without disabling kernel protections. It is a Macbook Pro (15", late 2018 model), so has an SSD.

I’ve been monitoring the process with htop already, and have seen 15-minute load averages of 10-12 when Duplicati runs.

macOS Activity Monitor reports CPU utilisation across all cores.

Here are some screenshots of htop and Activity Monitor; at this point the OS is still pretty responsive. I filtered both tools to show only processes matching mono:
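
Screenshots aside, roughly the same information can be pulled as text on macOS with the stock tools (the commands below are just my choice, nothing Duplicati-specific):

top -o cpu                  # macOS top, sorted by CPU usage
ps aux | grep -i "[m]ono"   # quick snapshot of mono processes and their %CPU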

I’ve been experimenting with the thread-priority and use-background-io-priority options, and with the latest beta, backups now seem to complete without too much interference.

However, I do get an error each time, triggered by the latter option I presume:

Failed to reset background IO priority, exitcode: 1, stderr: renice: 33171: setpriority: Permission denied

Also, apologies for the slow burn on this thread, I’ve been a bit swamped with project work.

Thanks for isolating it to a process. You might be able to distinguish between user and system times.
I haven’t found a nice way to get Linux to tell me current per-process usage (probably just missing it).
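
(One thing that might fill that gap, if the sysstat package happens to be installed, is pidstat; a rough sketch:)

pidstat -u -C mono 5   # per-process %usr and %system every 5 seconds for commands containing "mono"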

Usage over a process run can be seen as below. The system spends lots of time looking through files.
That creates system load, but not much CPU use by the process itself. I wonder if your load is file checking?
One way to tell would be to run two backups close together, with very little data backed up (see logs).

# time find / -mount -mtime 0
...
real    0m58.415s
user    0m1.175s
sys     0m5.767s

If you can’t find another tool to show user versus system time, and you need a command like above,
Export As Command-line will make you one, but don’t run it at the same time as the original backup.
If you do, they will collide somewhere, maybe in the database, maybe at the destination, and create a mess.

I can’t find prior reports of exactly this, but the code might be the below. It is in DeactivateBackgroundIOPriority(), which possibly means it’s harmless to the objective of making Duplicati lighter on the system’s resources.
I don’t have macOS (and that may be a developer limitation as well, if you’re seeking further investigation).
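
For illustration only, this is not Duplicati’s code, just the general permission rule that the renice in your error message appears to run into:

renice -n 10 -p $$   # lowering our own priority (raising the nice value) is allowed
renice -n 0 -p $$    # raising it back typically fails for non-root: "setpriority: Permission denied"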

Sorry for the huge gaps in between responses, there is a client project going on right now that sucks up most of my time.

Running the same larger backup twice gives me this time for the second run:

real	26m42.191s
user	33m54.295s
sys	10m54.429s

The command I used was (edited for privacy and clarity):

time /Applications/Duplicati.app/Contents/MacOS/duplicati-cli backup "s3s://..." \
  /path (x10) \
  --use-background-io-priority=true --thread-priority=idle --backup-name=... --dbpath=/... \
  --encryption-module=aes --compression-module=zip --dblock-size=100MB --passphrase=... \
  --retention-policy="1W:1D,4W:1W,12M:1M" --exclude-files-attributes="temporary,system" \
  --disable-module=console-password-input --exclude=pattern (x28)

I note that the SQLite database file for this backup is 2.9 GB and wonder how much of an impact that has on performance.

I’m sorry to say that I am now in the process of evaluating other backup solutions that don’t have as big an impact on my laptop. I frequently find myself disabling Duplicati for the duration of my work day just to make sure I can get work done. :-/

Was there a time for the first run to compare, and can you show statistics from the job logs for both, e.g.

(screenshot: job log statistics such as “Examined” and “Opened”)

If the second run has very little file change, then “Opened” should get much smaller, while “Examined” should be little changed. We need to distinguish between file-finding time and the actual time spent backing up changes.

There are other statistics in the logs, especially the Complete log, where “BytesUploaded” shows how much was actually uploaded. One thing to beware of when comparing runs is that sometimes Compacting files at the backend will run and perform extra work, but that is visible in the job log’s Compact Phase section. Setting no-auto-compact while doing performance testing can keep compacting from skewing the timing.
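
A hedged sketch of adding it to a CLI run (placeholder destination and path, not the redacted ones above):

duplicati-cli backup "b2://bucket/prefix" /Users/me \
  --no-auto-compact=true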

If that’s 28 exclude options, that might slow things down. You could certainly try a test with the TEST-FILTERS command to see how long it takes just to walk through all the files. You could even redirect the output to /dev/null, provided you run from the true CLI (you can Export As Command-line and edit the syntax for the different command).
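
For example, reusing the redacted /path and pattern placeholders from your command above (and not run alongside the live backup):

time /Applications/Duplicati.app/Contents/MacOS/duplicati-cli test-filters /path \
  --exclude=pattern > /dev/null   # repeat --exclude once per pattern, as in the backup job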