First backup excessively slow

I am seeing excessively slow initial backup times with my new install.
Duplicati 2.0.3.4 canary x64
OS: Server 2008R2 SP1 64-bit
2x Xeon E5405 @ 2.0GHz (4 cores each, 8 cores total)
32GB RAM
1Gb LAN connection
Backup storage location is a local volume (on Server “A”), 4x 2TB RAID 5 (5.75TB space, 5.5TB free).
Files are on Server “B” on a larger RAID array (built from a number of 1.6TB SSD drives); utilization (reads/writes) on that server shows under 5% even with Duplicati reading and copying the files on top of the rest of the company users accessing/reading/writing files. Server B is also connected via a 10Gb connection.

Current backup is handling 186GB worth of files (around 350,000 files) from Server B.

Backup started at 7PM local time last night, here it is 10:30AM and the progress bar is still only 75% of the way across. It shows 17,000 files (32GB) to go at a speed that varies from 8 to 11 KB/s.
Even taking the lowest common denominator in this scenario, which would be Server A’s drive read/write speed at an average of 100MB/s (the 1Gb LAN at full speed would be 125MB/s), that is still far above the shown 8-11KB/s.

We have tested it by other means: a direct Windows copy of files from Server A to B (or vice versa) typically sees 80-110MB/s and never drops below 60MB/s. Some may point to Duplicati and its file processing, but Duplicati is only using 70-80MB of RAM (unchanged from when it is idle) and up to 10% of total CPU, although the first core does show 80-90% usage. I assume this means Duplicati is not multi-threaded and is by default set to a low priority on the system. This is a clean OS on a server we use for development and testing, so there is almost no other software on there using the resources. At a normal 100MB/s it should take approximately 31 minutes to handle 186GB (6GB per minute), not 14-16+ hours. Even if high network load or other processes using the drive dropped it to 50MB/s, that would still mean about 1 hour to handle 186GB, not 16 hours.
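
Just to spell out the arithmetic I’m using (nothing more than size over speed, written out as a quick sketch; the speeds and sizes are the figures above):

```python
# Rough size-over-speed arithmetic using the figures quoted above.
def hours_to_process(size_gb: float, throughput_mb_s: float) -> float:
    """Hours needed to read/move size_gb of data at a sustained throughput_mb_s."""
    return (size_gb * 1000 / throughput_mb_s) / 3600

for speed in (100, 50):
    print(f"186 GB at {speed} MB/s: {hours_to_process(186, speed) * 60:.0f} minutes")
# 186 GB at 100 MB/s: 31 minutes
# 186 GB at  50 MB/s: 62 minutes
# The observed 14-16+ hours works out to an effective ~3-4 MB/s end to end.
```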

At these speeds I am dreading the attempt on the other storage drive which has 600GB worth of files. An hour per 10GB means that one would take at least 60 hours (2.5 days). So I think there is still some optimization needed for this software on the processing side since it is using only one CPU thread, and not enough memory to show it is actually doing anything (likely relying on temp files on the storage drive instead of handling it all in the much faster RAM).
If there are other options/switches such as --use-cpu-threads=8 and --use-max-ram=30G (for examples), please let me know, that alone should seriously speed things up with the processing.

First backup finally finished. 7:00PM (19:00) until 1:00PM (13:00) the next day. Approx. 10GB per hour, 18 hours total time for just 186GB worth of files. Not sure if this is the direction the company wants to go to back up 20TB worth of storage split among 18 volumes.
The reason for this testing is that the current commercial product they’ve been using is even slower, even after updating to the latest version. There is no reason for these slow speeds with any product when the servers in use have multi-terabyte, enterprise-level SSD storage via RAID at 700MB/s+ (R/W) across a 10Gb connection (1.25GB/s) to the local NAS server with 32TB of RAID storage at 600MB/s. Especially when partial/incremental backups are needed daily, full backups weekly (plus the additional full volume snapshots), and then a full backup to a remote site every 2 weeks.
At 10GB per hour, with approximately 10TB (of the 20TB) worth of data, not including the full volume snapshots, that means 1,000 hours for a single full backup via Duplicati. That is 41.6 days. Not very good for weekly or even monthly backups. Even if the LAN and processing could handle it, no company could stand to have their network dragging for 41 days straight.
From my perspective, it looks like Duplicati is for small scale local network home users with 50-100GB worth of data, not commercial level backups with multi-TB worth of data.

Hi @mikaitech,

It’s likely the slower speeds are mostly caused by a bottleneck in the single CPU thread. On the first run every single file has to be split into blocks and each block hashed. Along with that there are many database operations, which also require quite a bit of CPU.
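
Roughly speaking, the per-file work on a first run looks something like this (not Duplicati’s actual code, just a minimal sketch of splitting a file into blocks and hashing each one; the 100KB block size is the default assumption):

```python
import hashlib

BLOCK_SIZE = 100 * 1024  # assuming the default 100KB block size

def hash_blocks(path: str) -> list[bytes]:
    """Split one file into fixed-size blocks and hash each block.
    On a first backup this has to happen for every block of every file,
    which is where a single CPU thread quickly becomes the bottleneck."""
    digests = []
    with open(path, "rb") as f:
        while block := f.read(BLOCK_SIZE):
            digests.append(hashlib.sha256(block).digest())
    return digests
```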

Multi-threading isn’t supported right now, but it’s just around the corner: Implement multi-core processing option · Issue #3174 · duplicati/duplicati · GitHub

There are some flags to move the temp storage to a faster disk, or to store the block cache in memory with the use-block-cache flag, but you’re likely not seeing any performance hit from disk speeds since you’re mostly CPU bottlenecked.

Tweaking the dblock and block sizes can also improve database performance on very large backup jobs. Increasing these sizes increases the overhead that has to be re-uploaded when files are changed, but usually it ends up being the database that slows down at large backup sizes. However, I’m afraid we don’t have a lot of tests or data on the best settings for various backup sizes.
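
As a rough illustration of why the block size matters on the database side (purely illustrative arithmetic, assuming a 100KB default block size and your 186GB source):

```python
# Illustrative only: how many blocks (and therefore roughly how many database
# rows) a backup has to track at different block sizes.
GB = 1000**3

def block_count(source_bytes: int, block_size_bytes: int) -> int:
    return -(-source_bytes // block_size_bytes)  # ceiling division

for blocksize_kb in (100, 500, 1000):
    n = block_count(186 * GB, blocksize_kb * 1024)
    print(f"{blocksize_kb:>4}KB blocks -> ~{n:,} blocks to track for 186GB of source data")
# 100KB  -> ~1,816,407 blocks
# 500KB  -> ~363,282 blocks
# 1000KB -> ~181,641 blocks
```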

I know there are multiple users on the forum backing up TBs of data, but it’s worth mentioning that Duplicati is not entirely ready for large enterprise data yet, as there are some issues with very large datasets because of the large databases they generate. It works, but if you run into trouble, rebuilding the DB can take time.

It should also be mentioned that, in a sense, there is no “full backup” in Duplicati. All backups are “incremental” in that they just upload whatever file pieces are not yet at the backup target. So after the first backup, each new backup will only upload the changed file chunks. (But Duplicati still needs to check all the files for changes until USN or another file-watcher service is implemented.)
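
The change check itself is basically a per-file metadata comparison, something along these lines (a sketch of the general idea, not Duplicati’s code):

```python
import os

def needs_rescan(path: str, recorded: dict[str, tuple[int, float]]) -> bool:
    """Compare a file's current size and mtime against what was recorded on the
    previous run; only files that differ need their blocks re-read and re-hashed.
    Without a change journal (USN), every backup still runs this over every file."""
    st = os.stat(path)
    return recorded.get(path) != (st.st_size, st.st_mtime)
```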


I figured that would be the case. Too often I see these programs either use heavy resources with fast processing, leading to quick backups (and then get dinged on the review sites for being resource heavy), or use low resources with slow processing, leading to long backup times (and then get dinged for slow speeds). Sadly I see way too many requests to throttle or limit system resources on the faster programs, with the expectation that backup speeds would remain fast.
Right there on that GitHub issue you linked there is already a request: “Just a thought: Can we use the throttle to reduce the local resource usage to a friedly level?”… I am sitting here thinking that the slow backup time, with only a single CPU core used and low memory usage, is already minimal resource usage.
Honestly I would rather have a backup program using all 8+ cores at 90-100% and 8GB+ of memory to process the 200GB backup in an hour, not 5% usage taking 18 hours. I would rather not be throttled and limited by default, causing the company’s local network to bog down over 18 hours of backup time as I just dealt with. Even if it were to copy the files directly to the local system at the fastest speed it can, say 4-8GB at a time (or even the entire job, all 200GB of it), and then process everything locally, that would be much preferable for backups using local storage.
Although I also understand there are differences in how Linux uses multiple cores/threads versus how Windows handles them, so that would play into the programming aspect as well.
I just wish I knew more about coding to help out.

Being available to test other people’s code can also be a great way to contribute. :wink:

Hey @mikaitech,

I’m interested to hear about what kind of performance you’re seeing with the new multi-threading functionality in Release v2.0.3.6-2.0.3.6_canary_2018-04-23 · duplicati/duplicati · GitHub.

It should help quite a bit, especially if you’re putting it on an 8 core Xeon :slight_smile:

I just installed 2.0.3.6-canary-2018-04-23 over the prior version and noted the current backup times. I will check in the morning to see if the times improved on the larger incremental backups.
I did force one to “run now”, it is 82.4GB total size (56.61GB compressed backup), 3 different mapped network drives (which are stored on large SSD RAID arrays), and the 2.0.3.4-canary ran the previous backup with no network or file load in 19m:19s. With the 2.0.3.6 it ran in 15m:15s, so it is slightly faster on a smaller backup load. I will check the rest tomorrow morning after they’ve had a chance to run overnight (142GB and 602GB with a LOT of files in each).
I do suspect the bulk of the time is file searching and reading, as the processing itself should run quickly regardless of whether it is single- or multi-threaded, but at least this removes one bottleneck.

I let it run overnight. I had originally set concurrency-max-threads to 8 to match the 8 CPU cores. I have now removed that option to let it run in its own dynamic mode with tonight’s backups.
This is the comparison of last night’s incremental backups:

142GB - runs at 5:10PM (business closing time is 5PM)
2.0.3.4 - 50 minutes (49m:52s) 142.41GB
2.0.3.6 day 1 - 3h:00m:50s 142.55GB
2.0.3.6 day 2 - 2h:49m:49s 143.04GB
(3 times longer)

602GB - runs at 6:00PM
2.0.3.4 - 25m:14s 602.24GB
2.0.3.6 day 1 - 48m:53s 602.35GB
2.0.3.6 day 2 - 25m:50s 602.36GB

82GB - runs at 6AM (business opens at 8AM)
2.0.3.4 - 19m:44s 82.39GB
2.0.3.6 day 1 - 13m:16s 82.40GB
2.0.3.6 day 2 - 13m:38s 82.40GB

So I suspect either the 142GB job ran past the hour it used to take (50 minutes), so the 602GB job started before the 142GB was done and caused a delay; or a few employees stayed late dealing with a lot of files; or a lot of employees kept files open, causing locks that prevented/delayed proper backup. I have altered the schedule to delay the 602GB job to 7PM so it hopefully does not interfere with the 142GB one.
The primary reason for these timeframes is that Duplicati is only being tested here; the primary server backups run between 10PM and 4AM.

EDIT: added another day’s worth of results

So the next item I am curious about is “volume size”. The 142GB backup is set to 50MB, the 602GB to 200MB, and the 82GB to 50MB. Is there a general guideline for size to be backed up versus suggested volume size?
In my case the original files (to be backed up) are on local network locations, this server is on a 1Gb LAN connection (the original file servers are on 10Gb connections), and the Duplicati backup volumes are stored on a local drive/partition, so the network is not a factor in choosing a 50MB, 200MB, or 500MB volume size.

I don’t think there are any general rules here. There’s an article on it: Choosing Sizes in Duplicati - Duplicati 2 User’s Manual

What’s worth knowing is:

  1. The size of a block determines how much data must be uploaded if even a single byte is changed in the source. As such, larger block sizes will increase the overhead of “unnecessary” upload.
  2. Larger blocks reduce the number of blocks that have to be stored and tracked, which in turn reduces the DB size. It may improve performance on large datasets.
  3. Larger volumes are really only necessary if your backend has trouble with many files in one directory. In general they just make it more expensive to download your files because you have to download entire volumes. With unstable connections smaller volumes can be useful.
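
To put some rough numbers on point 3 (illustrative only; this pretends the backend holds about as much data as the source, ignoring compression and deduplication):

```python
# Illustrative only: number of dblock volumes on the backend at different
# volume sizes, using the 602GB job as the example.
GB, MB = 1000**3, 1000**2

def volume_count(stored_bytes: int, dblock_size_bytes: int) -> int:
    return -(-stored_bytes // dblock_size_bytes)  # ceiling division

for dblock_mb in (50, 200, 500):
    n = volume_count(602 * GB, dblock_mb * MB)
    print(f"{dblock_mb:>3}MB volumes -> ~{n:,} files on the backend; "
          f"a restore has to fetch whole {dblock_mb}MB volumes")
# 50MB  -> ~12,040 volumes
# 200MB -> ~3,010 volumes
# 500MB -> ~1,204 volumes
```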


Thank you. I figure since the backups/files are stored locally, there is no upload via the network to worry about. Then the question is: does the volume size option affect the backup speed? Say there is only 5MB worth of file additions/changes, does it still create a 50MB volume or does it compress it into a smaller volume by itself and list that in the DB?

Relating to the speeds mentioned, especially the massive increase in backup time since going to 2.0.3.6:

For example, with 2.0.3.4 the 143GB backup used to take 50 minutes; now with 2.0.3.6 it takes closer to 3 hours. No idea where the time is actually going, because the log shows the bulk of the list, block, and index work being done in a matter of minutes.
List started to completed per volume is a matter of seconds.
Writing the backup volume (regardless if it is 50MB or 200MB) takes 1-3 seconds.

The log shows
DeletedFiles: 38
DeletedFolders: 1
ModifiedFiles: 34
ExaminedFiles: 296567
OpenedFiles: 434
AddedFiles: 400
SizeOfModifiedFiles: 0
SizeOfAddedFiles: 0
SizeOfExaminedFiles: 153588298045
SizeOfOpenedFiles: 1449353785
NotProcessedFiles: 0
AddedFolders: 34
TooLargeFiles: 0
FilesWithError: 0
ModifiedFolders: 0
ModifiedSymlinks: 0
AddedSymlinks: 0
DeletedSymlinks: 0
PartialBackup: False
Dryrun: False
MainOperation: Backup
CompactResults:
DeletedFileCount: 0
DownloadedFileCount: 0
UploadedFileCount: 0
DeletedFileSize: 0
DownloadedFileSize: 0
UploadedFileSize: 0
Dryrun: False
MainOperation: Compact
ParsedResult: Warning
EndTime: 4/25/2018 7:59:44 PM (1524704384)
BeginTime: 4/25/2018 7:59:20 PM (1524704360)
Duration: 00:00:23.9896755
Messages: [
2018-04-25 17:10:00 -05 - [Information-Duplicati.Library.Main.Controller-StartingOperation]: The operation Backup has started,
2018-04-25 17:10:56 -05 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Started: (),
2018-04-25 17:10:57 -05 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Completed: (5.85 KB),
2018-04-25 17:44:39 -05 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Started: duplicati-b1cf6b02fa4ed4a1187b6569ee5dd1883.dblock.zip.aes (199.96 MB),
2018-04-25 17:44:40 -05 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Completed: duplicati-b1cf6b02fa4ed4a1187b6569ee5dd1883.dblock.zip.aes (199.96 MB),
]

Most of the processes seem to be done fairly quickly, but something else is eating up the bulk of the time, extending it by an additional 2-2.5 hours.
This one is scheduled for 17:10 (5:10PM), but even though the bulk of the messages and warnings end around 17:45, as shown above the end time is 7:59PM (19:59).
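
If I want to pin down where the missing ~2 hours go, I figure I can diff the timestamps in the Messages list; a quick sketch I might use (it assumes the log lines start with a timestamp in the format shown above):

```python
import re
from datetime import datetime

# Pull the leading timestamps off message lines like the ones above and report
# the biggest gaps between consecutive entries.
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")

def largest_gaps(lines, top=3):
    stamps = [datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
              for line in lines if (m := TS.match(line.strip()))]
    return sorted(((b - a, a) for a, b in zip(stamps, stamps[1:])), reverse=True)[:top]

# With the excerpt above this would surface the ~34-minute gap between the
# List completing at 17:10:57 and the first Put starting at 17:44:39.
```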

Something else worth mentioning: this system has 2 CPUs x 4 cores each, and even under 2.0.3.6, while running a backup, CPU usage across the 8 cores still seems pretty low, with the bulk of it on the first core. System Idle Process typically sits between 75-80% even in the middle of a backup.
I also set the option “use-block-cache” to true, and despite the system having 32GB of memory, java is using 725MB and Duplicati is using 200MB, versus 80MB with the option disabled.