Backup from SATA to SATA really slow


#1

Hello,

I was using Duplicati (via GUI) for the first time yesterday to back up data from an internal hard drive to another hard drive also connected via SATA.
Setting up the process was wonderfully easy, but performance was a bummer.

Most files are compressed .NEF files, so I edited the default_compressed_extensions.txt file to also contain the following line:
.NEF #Nikon RAW Image Format
(I guess those entries are not case sensitive, but I wasn’t 100% sure, so I capitalized NEF anyway.)

This is my backup command line:

"C:\Program Files\Duplicati 2\Duplicati.CommandLine.exe" backup 
"file://E:\bup_desktwo_duplicati\D_Daten"
"D:\Pictures\\" 
"D:\Documents"
--backup-name=D_Daten 
--dbpath="U:\bastian\AppData\Local\Duplicati\CTVDEHNYLK.sqlite" 
--encryption-module= 
--compression-module=zip 
--dblock-size=100MB 
--retention-policy="1W:1D,4W:1W,12M:1M" 
--no-encryption=true 
--compression-extension-file="C:\Program Files\Duplicati 2\default_compressed_extensions.txt" 
--disable-module=console-password-input 
--exclude="*.cof" --exclude="*.cop" --exclude="*.cot" --exclude="*.ini" --exclude="*.bak~" --exclude="*.txt~" --exclude="%MY_DOCUMENTS%\Adobe\\" 

So in the end, Duplicati spent 648 minutes backing up 235GB (source), producing 217GB in the backup location.
That is a rate of 6.2MB/s.
By the way, the rate displayed in the backup window was way lower, usually somewhere around 50KB/s, with peaks rarely hitting the 1MB/s mark.

I then did an ordinary file copy of another 254GB from the same internal drive to the same backup drive, and it took only 48 minutes to complete (88.2MB/s).

I watched CPU and memory usage during the backup, but neither appeared to hit a limit.

What am I doing wrong? How can I increase performance?

Thanks in advance, Bastian


#2

Hello @bastian, welcome to the forum!

Most likely you weren’t doing anything wrong. Remember that, unlike a straight/blind file copy, Duplicati does a lot more work during a backup, including:

  • chopping all your files up into 50kb (default value) blocks (for deduplication and versioning)
  • hashing the blocks (for deduplication and verification)
  • checking the local database to see if the hash already exists and, if not, storing it (for deduplication and versioning)
  • compressing a set of blocks into a file of about “Upload volume size” size (50MB by default, for space saving)
  • encrypting that file (for security)
  • sending it to the destination (local drive in your case, so likely not much time there)
  • misc. database work to track files, sizes, CRCs, etc. of uploaded content (for versioning and verification)

So there’s a lot going on during a Duplicati backup. The first backup is almost always pretty slow, but since subsequent ones only back up the file BLOCKS that have changed, they should run much more quickly.

That being said, there are some things that can be done to improve performance such as:

  • no or simpler encryption
  • no or lower compression ratios (likely not applicable to you since you already added .NEF to your default_compressed_extensions.txt file)
  • splitting large jobs into multiple smaller ones (this usually helps more with large file counts than with large file sizes)
  • storing the database on a faster drive
  • specifying a temp folder on a faster or RAM drive, or at least different than the backup drive
  • larger block (file chunk) or dblock (Upload volume size) settings
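
As a sketch, the last three of those map onto the command line roughly like this (`--blocksize`, `--dblock-size`, and `--tempdir` are Duplicati 2 advanced options; the sizes and the `R:\` RAM-drive path are purely illustrative, not recommendations):

```shell
"C:\Program Files\Duplicati 2\Duplicati.CommandLine.exe" backup ... ^
  --blocksize=1MB ^
  --dblock-size=200MB ^
  --tempdir="R:\duplicati-temp"
```

Note that `--blocksize` cannot be changed after the first backup of a job, so it’s worth deciding on it up front.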

Also, when working with local drives, there are settings that will move files ready to be “uploaded” rather than copy them. For example, if you told Duplicati to use your destination SATA drive as its temp folder, it would create the “to be uploaded” files on that drive, and you could then tell Duplicati to simply move each file from the temp folder to the destination rather than make a copy.
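
A minimal sketch of that setup, assuming the destination drive is E: as in your command (`--tempdir` and `--use-move-for-put` are the relevant options; paths here are illustrative):

```shell
"C:\Program Files\Duplicati 2\Duplicati.CommandLine.exe" backup ... ^
  --tempdir="E:\duplicati-temp" ^
  --use-move-for-put=true
```

This avoids writing each finished volume twice (once to temp, once to the destination) when both live on the same drive.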

I’d suggest running another backup or two AFTER the initial backup has completed and seeing if performance is still an issue at that time.


#3

Jon already covered the details, but in my experience a new backup usually runs at around 8-10GB per hour (because of the steps Jon mentioned above). An internal drive-to-drive backup on a single system is likely a little faster, but not much. The benefit is that in the future it only has to compare what is there now against what was there in the last series of backups.
I have one backup that was 600GB worth of all kinds of files. The files are on a storage server with RAID 10 across multiple 1.6TB enterprise SSDs, and Duplicati runs on an older server with 8 CPU cores and 32GB RAM, connected via a 10Gb LAN. Even then, the initial backup of that 600GB took just over 5 days.
Now the daily backups of that same set of files take only 45 minutes per day. Other backups that vary from 80GB to 200GB, but whose files are in use by servers or computers, can take anywhere from 25 minutes to 2+ hours, even though their total size is smaller than the 600GB. This shows there are a lot of factors involved in these kinds of backup processes.
I’ve used other backup programs that take over local resources, so while a backup is running, any other program or PC connected to the files being backed up slows down or loses its connection to the files. That is usually a bad thing, since some corporate systems require access at all times, even when a file is not currently in use, so dropping a connection causes errors or program crashes.


#4

In my experience with Duplicati, a lot of activity takes place in the database as it is being populated for the first time. When I had a backup creating a 400MB database (with an 8kb block-size, which turned out to be a crazy thing to do), running on a mechanical HDD, the database generated a ton of HDD noise as the SQL journal was merged every couple of seconds, and this caused very noticeable slowness.

Moving the database to an SSD and increasing the block-size made things much easier on the hardware, and faster.