Backup taking long time, log doesn't show why

Aloha fellow duplicati peoples,

I’m having a rather difficult time understanding why some of my backups are taking a looooong time to finish.

We’ve got a Hyper-V machine that runs Microsoft Exchange; total size is 1.20 TB.
The machine’s disks live both on an iSCSI disk from a TrueNAS box (connected at 10 Gbit/s) and on a locally attached SSD.

The backup is taking 8 hours and 38 minutes.

However, looking at the log, the Test phase took 12 seconds, the Compact phase 1 minute and 10 seconds, and the Delete phase 4 minutes and 48 seconds.

The backup gets placed on an SMB share on the TrueNAS, which it’s connected to at 10 Gbit/s (there’s one 10 Gbit/s switch between the Hyper-V host and the TrueNAS, same rack, no latency issues).

So what in God’s name is the server doing that takes such a long time?
The log in its entirety can be found here: EXCHANGE_DUPLICATI_LOG - Pastebin.com

Another server is facing a similar issue. It’s connected with a 1 Gbit/s connection to the TrueNAS, but when looking at Duplicati it’s showing:

Current action: Backup_ProcessingFiles
4 files (202.84GB) to go at 3.00 MB/s

However, when looking at the physical server itself during “Backup_ProcessingFiles”, it’s often receiving more data than it’s uploading; yesterday it was receiving hundreds of Mbit/s while Duplicati was in “Backup_ProcessingFiles”.

Exactly what is Duplicati doing during that time?

How can I troubleshoot where the slowdowns are happening?

I’ve run SSD benchmarks on the iSCSI disks from the Exchange guest OS, with the results being just fine:

(screenshot of the SSD benchmark results)

Welcome to the forum @emigu

I don’t run any of this, but what is where? Where is Duplicati, and where are all of its backup source files?

You have two physical servers, right? Duplicati would be on Hyper-V. If the Exchange data is elsewhere (e.g. the TrueNAS iSCSI disk), then it would have to be read through in order to locate the changes to back up.

"SizeOfModifiedFiles": 1537317273600 (about 1.4 TiB, in 5 files) is what needs reading. If backup is taking say 8 hours for the backup phase, that’s about 53 Mbytes per second to read then check blocks in local database to see if they’ve changed which would mean uploading those changed blocks into backup.

"BytesUploaded": 9678289105 or about 10 MB is the actual amount of the change, so a small fraction.

You should see a nice big summary of the activity in the job log. For example, my backup shows me this:

(screenshot of the job log summary)

and everything that’s modified is opened and read through. You can speed up block processing using the blocksize option. 100 KB is probably a bit too small for a backup of this size – too many tiny blocks to track.

Choosing sizes in Duplicati talks about this some. It looks like you raised remote volume size (500 MB?).
There are no hard-and-fast rules for sizes, but I’m thinking a 1 or 2 MB blocksize might be more suitable.
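For a rough sense of scale of why 100 KB feels small here (my own illustrative arithmetic, not numbers from your log):

```python
# Approximate number of blocks the local database must track for ~1.4 TiB
# of source data at different blocksize settings (illustrative only).
source_bytes = 1537317273600

for label, blocksize in (("100 KB", 100 * 1024), ("1 MB", 1024**2), ("2 MB", 2 * 1024**2)):
    print(f"{label:>6}: ~{source_bytes // blocksize:,} blocks to hash and look up")
```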

How the backup process works
Processing similar data is why it only uploaded the roughly 0.6 percent of source data that had changed.

Note that I’m guessing about this configuration, so if I guessed wrong about what is where, please clarify.

Hello and thanks for your explanations.

You are also 100% correct. Duplicati runs on the physical server, which is running Hyper-V Server 2019 (which, basically, is Windows Server 2019 Core).

From there we back up all our guests, and we send them directly to our NAS shares over SMB.

So the physical server has iSCSI drives from the TrueNAS connected to it, and those are attached to some VMs; other VMs have purely local storage.

All of these are then backed up to the TrueNAS via SMB.

So understandably it’s quite a lot of data that needs to be read, but regardless, 53 MB/s is a hell of a lot too little on a 10 Gbit/s connection.

I will try changing the blocksize, much appreciated. :slight_smile:

You’d have to try to locate the bottleneck. For example, is the TrueNAS on mechanical drives?

Spinning disks with an SSD read cache in front of them, but the data is most likely pulled from the spinning disks. Still, as you can see, the benchmarks at least go up to 500-600 MB/s over iSCSI; I’d expect slightly less given that it takes data from iSCSI and moves it to SMB.

They also go down to below 10. Unless the read cache does file read-ahead (which could require it to be aware of the filesystem, and I’m not sure how common that is), I wonder if using multiple default Duplicati threads is getting seek time slowdown similar to random reads with a few threads – numbers certainly fit.

concurrency-block-hashers reduced to 1 could possibly help, although it could hurt if the limit is the CPU.
Figuring out where the performance bottleneck is involves much guessing without measuring the system.
Don’t you at least have something on the NAS that would show whether or not the hard drive stays busy?

Below (I think) is where the reads happen. I’m assuming the read size there matches the blocksize option, so blocksize might impact I/O:
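I haven’t quoted the source section here, so the following is only a rough sketch of the idea in Python (Duplicati itself is C#, the real pipeline has more stages, and the function name is made up):

```python
import hashlib

def read_blocks(path, blocksize):
    """Rough sketch only: read the file sequentially in blocksize-sized
    chunks and hash each chunk. Not Duplicati's actual code."""
    with open(path, "rb") as f:
        while True:
            block = f.read(blocksize)                    # one read per block
            if not block:
                break
            digest = hashlib.sha256(block).hexdigest()   # hash used to spot already-known blocks
            yield digest, block
```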

After this is done, the block and its hash are sent downstream for further processing, including throwing away the block data if a database query shows it’s already backed up. I don’t know if the Duplicati database is on a fast drive, but a flood of database lookups as blocks are read also uses CPU and drive I/O.
What performance tools does Windows have? Task Manager, Resource Monitor, Performance Monitor?

The database is located on a regular SSD (450 MB/s R/W)

The hard drives mostly stay busy, but not near their full capacity on the NAS. That’s what I’m finding a bit strange.

On the Windows machine, nothing is even sweating. The CPU is a 2950X Threadripper and it’s barely sweating (stays below 10% usage, and that’s while running 10+ guests), the SSD isn’t sweating much either, and the network is unsaturated.

From what I gather, your recommendation is to increase the blocksize option to 1 or 2 MB? :slight_smile:

Sounds like plenty of processing power in total across what I think is 16 physical and 32 logical cores. Duplicati uses threads; however, I’m not sure how .NET Framework, Windows, Hyper-V, etc. handle them.

There are so many ways to set up a NAS. I’m hoping you have per-drive info. An aggregate means less, similar to how a single-threaded CPU workload could possibly max out at 3% total CPU use on a 2950X.

Detailed performance analysis requires equipment and skills that I’m not sure any current volunteer has.

You could try some high-level experiments for understanding purposes, e.g. test speed with nearby files.
With a total of five files to back up, perhaps one that’s big enough to test yet small enough to fit could be turned into a local (SSD?) test file, to see how quickly Duplicati can read through it to pick out a few (or zero) changes.

A few bogus dummy changes could be made with HxD - Freeware Hex Editor and Disk Editor or something similar.
Even easier is to trigger Duplicati to read the file through by changing its timestamp or (I think) its attributes.
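For example, a tiny Python script can bump the modification time so the next run reads the file through again (the path below is just a made-up example):

```python
import os
import time

# Touch the file's modification time so Duplicati sees it as changed
# and reads it through on the next backup run. The path is a made-up example.
path = r"D:\DuplicatiTest\exchange-copy.vhdx"
now = time.time()
os.utime(path, (now, now))   # (access time, modification time)
```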

People do put Duplicati on FreeNAS/TrueNAS systems, but it’s not always easy, especially on FreeBSD, which has trouble getting current mono versions (5.10, fortunately, will do) and has other setup oddities. .NET 5 or beyond (Duplicati’s future) doesn’t exist on FreeBSD yet. Meanwhile, TrueNAS SCALE is Linux.

One reason to keep Duplicati on Windows is that one has a chance to get Windows VSS using Duplicati snapshot-policy option. Without a snapshot, backup would probably get a very inconsistent view of files…

or maybe 1 MB. You’d have to create new backups (blocksize can’t be changed on an existing one) to see which tunings help.
Tuning concurrency-block-hashers down to 1, to get closer to a sequential file read, was the other thought.
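On a new test job, that would mean adding something along these lines under Advanced options (both options were mentioned above; double-check the exact value syntax in the options editor):

```
--blocksize=1MB
--concurrency-block-hashers=1
```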

Sysinternals Process Monitor is available if you want to actually look at the file activity. In addition to reads of the remote files, you’ll probably also see reads of your backup job database, maybe also internal SQLite files using an .etilqs suffix instead of .sqlite, and some dup-* files in a Windows user profile Temp folder.
The tempdir option will let you relocate the Duplicati temp files if they’re on a slow drive, but SQLite has limits:

Note that SQLite will always put temporary files in the system default temporary folder.

I take it that the Duplicati devs haven’t done any testing to see how fast it can go?

There’s so much scattered info on speed that I can’t get a good idea of whether it can even saturate 10 Gbit, for example.

Current thinking on speeding up Duplicati - #2 by drwtsn32

But according to this ^ (a roughly year-old post), it can do at least about 50 MB/s. I don’t know if it can get faster without a bunch of code work.

I really don’t think 10 Gbit was in mind when it was first coded, so it probably has to be reworked to achieve that kind of speed.

There was some SQL tuning to maximize backup speed, but the case here sounds more like read rates on hardware that no Duplicati developer has. Performance analysis can be a huge amount of work to get done, and there are plenty of competing goals, e.g. to get reliability up enough to actually get out of the long Beta…

Agree there’s plenty of possibility, but I don’t see Duplicati maxing out 10 Gbit.

Just saying Duplicati may be the biggest issue. It really needs a thorough look to see where it currently maxes out.

I suspect something else will be the limit first, especially on a backup that actually writes a lot of data.
There’s first preparation of compressed .zip files of blocks in a temporary area, then encryption and transmission and storage somewhere. There are also a lot of database actions of varying complexity.

How the backup process works reminds me that for many backups where little is changed, the time required sometimes is in walking the filesystem to see which files look like they were modified. Using
usn-policy on Windows can sometimes help that, but the walk is likely not an issue here, with 5 files.

The current case is atypical IMO because it’s source-read-intensive, not really backing up that much.
BTW there’s a progress bar on the home page that should show exactly how long it takes on one file.
This would go well with the test I suggested, which is how long it takes to read a big file to find changes.

Stated that way, what’s the point? Maximum hardware is not affordable, and time/skill is hugely limited.
I once worked for a company where the performance crown of the year helped sales. That doesn’t apply here.

If the statement is to do analysis (e.g. profiling) to tune code, that’s different, but just to get a max rate?
The main thing I can think of is that it would help in chases like this, if a fastest-possible-speed-this-year number existed.

There are definitely comparisons contributed occasionally of Duplicati versus other backups. There’s a dedicated forum category called Comparison that gets some of this (and others might exist elsewhere).

Big Comparison - Borg vs Restic vs Arq 5 vs Duplicacy vs Duplicati is a recent one, done on a desktop.

Anybody is most welcome to look for a max speed. Maybe we’re doing that here, but without the skills to identify the bottlenecks – which is why I suggest changing the configuration to see if local SSD is faster. Maybe it doesn’t match the actual backup needs, where the drive is remote. It would be to understand…

There are other things to potentially look into, e.g. is memory limited so that programs are page faulting. Task Manager shows these if you turn on the PF Delta column. Most are soft page faults, related to the processor capabilities (I think). You can open Resource Monitor from Task Manager to watch hard faults.

On the Disk tab, one can watch C:\pagefile.sys load when hard faults get high. How fast is the drive it’s on?
The answer is easy for this desktop: everything is on a local mechanical drive, so no need to ask any further.

That’s not really a point once one already has said hardware. As the years go on, more people will have that hardware. I will probably be getting 2.5 Gbit in the next few years and would want to know whether Duplicati could max out that bandwidth, as another data point for those looking into it.

The last point is that, for making optimizations in the code, one (at least I) would want an idea of how fast it can already go, so as to have a proper picture of whether one has accidentally slowed it down instead. Thinking it really is faster when it isn’t would be rather funny :wink:

Yes, I fully agree there are plenty of things other than Duplicati that can cause slowdowns. That doesn’t necessarily make chasing, e.g., an extra 20 MB/s on top of whatever Duplicati manages at this time on 10 Gbit an ideal thing to do. If one knows the max it can probably do, then one knows whether one wants to mess around or not. If Duplicati can already saturate 10 Gbit, I haven’t found anything that says so.

This will be my final reply on this. I’m not really disagreeing with you. I just see it differently :slight_smile:
