How to back up a huge number of files?

Hello,

I run Duplicati in a Docker container on my machine. Last month I started a backup of about 7 TB of media files.
That backup took 50 days. The write speed became slower and slower over time. I assumed that my RAID controller was too hot or something like that.
But I have done similar backups before with simple tar cfvz ... commands, which were very quick.
I believe the long backup time is caused by some Duplicati configuration of the internal compression mechanism. Maybe it did not use all CPU cores (?), or some other setting was bad.

I chose a remote volume size of 1 gigabyte.
Now I have some other files (Nextcloud stuff) that I also want to back up. But before that I wanted to check whether my configuration is well suited for that purpose.

Do you have any hints and tips?

T

Hi @TiTan. Could you please provide:

  1. The destination backend type (and if a cloud/remote destination, your line speed)
  2. System specifications (CPU/RAM/Disk Types)
  3. Duplicati version
  4. How compressible your dataset is

While other community members may have valuable insights to share, having this baseline information will help us develop an effective parameter tuning strategy.

  1. Destination: mounted external USB hard drive
  2. CPU: Intel(R) Xeon(R) CPU E3-1275 v5 @ 3.60GHz
    RAM: 8 GB + 8 GB swap
    Disk types: backup destination is a 3.5-inch USB 3 HDD; source is a RAID 5 array (4 HDDs)
    At the end I had a write speed of 3 MB/s, which is clearly too slow for this setup. At the
    beginning I had about 30-40 MB/s, but it quickly dropped to under 10 MB/s and became
    slower and slower over time.
  3. Duplicati version: 2.1.0.2_beta_2024-11-29
  4. A lot of video files with different codecs. I don’t know how compressible the dataset is.

The dramatic drop in write speed (40 MB/s to 3 MB/s) seems unusually severe. While it’s normal for external HDDs to show declining write performance as their cache fills up, the extent of this slowdown suggests a deeper issue. 3 MB/s is notably low for a modern drive; in my personal experience I usually get 20 MB/s+ sustained writes on my 2 TB disks.

Several factors influence external HDD performance:

  1. The USB-to-SATA controller in the external enclosure
  2. The hard drive’s own cache size and management
  3. The drive’s inherent performance capabilities
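
To rule out the drive and enclosure themselves, a quick raw sequential write test on the destination mount can be revealing. A minimal sketch, assuming a placeholder mount point of /mnt/backupdisk; oflag=direct bypasses the page cache so you see sustained speed rather than buffered bursts:

```
# Write 4 GiB directly to the backup disk (placeholder path), bypassing the page cache
dd if=/dev/zero of=/mnt/backupdisk/ddtest.bin bs=1M count=4096 oflag=direct status=progress

# Remove the test file afterwards
rm /mnt/backupdisk/ddtest.bin
```

If this also drops toward single-digit MB/s after a few gigabytes, the bottleneck is in the USB/enclosure/drive chain rather than in Duplicati.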

Regarding Video Compression: Most video files are already heavily compressed by their codecs, leaving little room for additional compression. To improve transfer speeds, consider disabling compression in the advanced settings using ‘zip-compression-level=0’. While this will reduce CPU usage, the main bottleneck appears to be the write throughput rather than compression overhead.
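
If you want a rough idea of how compressible the data actually is before deciding, compressing one representative file with a deflate-based tool is enough. A small sketch; the filename is just an example:

```
# Keep the original (-k) and compress a copy at a typical deflate level
gzip -k -6 sample.mkv
ls -lh sample.mkv sample.mkv.gz
```

Comparing the two sizes gives a rough estimate of the saving you could expect across the whole dataset.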

Hm, I would prefer not to disable compression because of the large total size. Even a benefit of 1% would save 70 GB of space.

I observed that Duplicati did not use all of my CPU cores while compressing. Maybe this could be one reason for the slow speed?
Also, maybe the remote volume size of 1 GB was too big or too small?

For video, you are unlikely to see any compression benefit from the default deflate compression. You can try one of the other compression methods, but it will be slower, and likely not gain much.

Video is already encoded in a way that removes information that is not visible (at the cost of some quality loss), whereas general-purpose compression is lossless to guarantee input == output, so it can compress such data far less.

You can set --concurrency-compressors to the number of concurrent compressions you want.
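
Since you run Duplicati in Docker, it is also worth checking how many cores the container actually sees. A quick check, assuming the container is named duplicati (adjust to your compose setup):

```
# Inside the running Duplicati container: how many logical cores are visible?
docker exec duplicati nproc
```

The E3-1275 v5 should report 8 logical cores, so setting --concurrency-compressors to that number is a reasonable starting point to experiment with.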

I think 1 GiB is a good choice for a local disk. I would not expect to see any gains from, say, 2 GiB.

You could experiment with --blocksize and choose a large value, say 10MiB. If you have mostly video, the blocks will only be identical if it is the same file, so increasing the value should speed it up on multiple levels, at the cost of less de-duplication, which you are not likely to benefit from anyway. Note: changing the blocksize requires that you start over with the backup.
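
For a sense of scale: 7 TB split into 100 KiB blocks is on the order of 70 million block entries for the local database to track, while 10 MiB blocks bring that down to roughly 700,000, so there is far less hashing and database bookkeeping per gigabyte written.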

How big were those tar backups? There are plenty of buffers, and you will not see the true speed until you have saturated all of them (userspace/kernel/driver/interface/disk).

Another thing is that your disk may not deal well with multiple concurrent transfers. You can try setting --asynchronous-upload-limit=1 to write only a single file at a time. This should reduce seek time on the disk, but may leave the transfer bandwidth unsaturated.
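
Watching the destination disk while a backup runs can show whether it is seek-bound rather than throughput-bound. A sketch using iostat from the sysstat package; the device name sdb is a placeholder:

```
# Extended stats for the backup drive, refreshed every 5 seconds
iostat -x sdb 5
```

High %util combined with low MB/s written is a typical sign that the drive spends its time seeking between concurrent writes, which is exactly what --asynchronous-upload-limit=1 is meant to reduce.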

Thank you for the ideas. I will add them to my configuration. Where do I need to set --blocksize, --concurrency-compressors, … ?
In the docker-compose file of Duplicati? In the backup configuration?

You can set them in each backup configuration.
But you can also set them in the general settings, so they are applied to all backups.
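
For reference, if you ever drive Duplicati from the command line instead of the web UI, the same option names are simply passed on the backup command. This is only a sketch, not a tested command: the destination URL, source path, size suffix, and the --no-encryption choice are placeholders/assumptions to adapt.

```
# Sketch only: adjust destination, source, and encryption settings to your setup
duplicati-cli backup file:///mnt/backupdisk/duplicati /srv/media \
    --blocksize=10MB \
    --concurrency-compressors=8 \
    --asynchronous-upload-limit=1 \
    --zip-compression-level=1 \
    --no-encryption=true
```

In the web UI, the same option names go into the advanced options, either per backup or in the general settings as mentioned above.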