Ideal settings for a 9TB initial backup (Mac -> Drime via Rclone)?

I’m preparing for a 9TB initial backup from a Mac (source: external HDD via USB 3.0) to Drime via the Rclone backend. I have a 1Gb/s connection and enough internal SSD space for a dedicated temp directory.

Since this dataset is significantly larger than what the defaults are designed for, I’m looking for community consensus on the “sweet spot” for a setup of this scale.

What would your ideal configuration be for:

  • Block size: To prevent the local SQLite database from becoming a bottleneck?

  • Remote volume size (dblock): To maximize a 1Gb/s upload without making compacting or restores a nightmare?

  • Asynchronous upload limit: Given the mechanical source drive but a fast network?

  • Any other essential flags for macOS or large-scale reliability?

Regarding Compression: I am willing to disable compression entirely if it means dramatically faster upload speeds or reduced CPU overhead.

  • Is it better to set the compression module to none?

  • Or should I keep it as deflate and set the compression level to 0?

Thanks for the help!

I can’t give exact numbers, but happy to share my experience:

For me, with 5 MB blocks, a 2.5 TB backup produces a 4 GB database and a 4.5 TB backup produces an 11 GB database. Based on this experience, I configured my latest backup of 8 TB with a 50 MB block size. My attitude was to just say *f###k it* and choose a block size large enough to keep the DB size down. Yes, I lose granularity in de-duplication, but for me I think it’s worth it.
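
To put rough numbers on that trade-off, here is a back-of-the-envelope sketch in Python. It only counts blocks (data size divided by block size), assuming decimal units and ignoring de-duplication and the partial last block of each file. The function name is just for illustration, and actual database size also depends on things like file count and retained versions, so treat this as a relative comparison, not a prediction.

```python
TB = 10**12  # decimal terabyte (assumption: sizes above are decimal)
MB = 10**6   # decimal megabyte

def block_count(data_bytes: int, block_bytes: int) -> int:
    """Approximate number of blocks the database has to track.

    Ignores de-duplication and per-file partial blocks.
    """
    return -(-data_bytes // block_bytes)  # integer ceiling division

# Backup sizes and block sizes mentioned in this thread.
for size_tb, block_mb in [(2.5, 5), (4.5, 5), (8, 50), (9, 50)]:
    n = block_count(int(size_tb * TB), block_mb * MB)
    print(f"{size_tb} TB at {block_mb} MB blocks: about {n:,} blocks")
```

For the same amount of data, going from 5 MB to 50 MB blocks cuts the number of tracked blocks by a factor of ten, which is the whole point of the larger setting.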

For local targets I use 1 GB volumes. The volume size is really only there to work around file-system limitations, and if everything is working correctly it shouldn’t affect anything else. Consider how large individual files can be, then think about how many files can be stored in a folder. For remote targets, think about how many files can be listed over the remote interface. If your connection is fast and you don’t have filesystem limits, then 1 GB should be fine, but go to 10 GB if you want to keep the file count down.
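
As a rough illustration of the file-count side of this, the sketch below estimates how many files end up on the remote for a given volume size. It assumes decimal units, no compression or de-duplication savings, and that each data volume is accompanied by one index file (hence `files_per_volume=2`, which is my assumption; the function name is also just for illustration). File-list files are ignored.

```python
TB = 10**12  # decimal terabyte
GB = 10**9   # decimal gigabyte

def remote_file_count(data_bytes: int, volume_bytes: int,
                      files_per_volume: int = 2) -> int:
    """Estimate remote files: data volumes plus an assumed index file each."""
    volumes = -(-data_bytes // volume_bytes)  # integer ceiling division
    return volumes * files_per_volume

for volume_gb in (1, 10):
    files = remote_file_count(9 * TB, volume_gb * GB)
    print(f"9 TB with {volume_gb} GB volumes: about {files:,} remote files")
```

Under those assumptions, 1 GB volumes give roughly 18,000 files in one remote folder and 10 GB volumes roughly 1,800, so it is worth checking what the remote handles comfortably when listing.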

I can’t comment on the compression and will let others pitch in here.

To speed things up, please note that volume creation occurs in your OS temp folder. If this is also on your source drive (unlikely, I know), then everything will be copied (assembled into volumes) through the temp folder and then out to the remote destination. In my recent experience, placing the temp folder on a RAM disk improved performance considerably. Even with the temp folder on a local SSD, I found the I/O load would saturate the SSD (it’s not just copying the file across: it has to be split into blocks, the blocks get hashed, then looked up in the database and added, so the I/O load on the database is considerable), and the RAM disk worked a lot better.
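
If you try the RAM disk route, it helps to size it before picking a volume size. The sketch below is a rough estimate only: it assumes the temp folder holds the volume being built plus however many finished volumes are queued for upload (the `queued_volumes=4` value is an assumption, so plug in your own asynchronous upload limit), and both function names are made up for illustration. The sector conversion reflects that macOS `hdiutil` sizes `ram://` devices in 512-byte sectors.

```python
GB = 10**9          # decimal gigabyte
SECTOR_BYTES = 512  # unit used by hdiutil's ram://<sectors> device size

def temp_bytes_needed(volume_bytes: int, queued_volumes: int,
                      in_progress: int = 1) -> int:
    """Temp space for volumes waiting to upload plus those being assembled."""
    return volume_bytes * (queued_volumes + in_progress)

def ram_sectors(size_bytes: int) -> int:
    """Number of 512-byte sectors needed to hold size_bytes (rounded up)."""
    return -(-size_bytes // SECTOR_BYTES)

for volume_gb in (1, 10):
    need = temp_bytes_needed(volume_gb * GB, queued_volumes=4)  # assumed limit
    print(f"{volume_gb} GB volumes: about {need / GB:.0f} GB of temp space, "
          f"{ram_sectors(need):,} sectors")
```

With these assumed numbers, 1 GB volumes need about 5 GB of temp space, while 10 GB volumes need about 50 GB, which will not fit in a RAM disk on a 16 GB machine.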

The main reality (as you no doubt already know) is that once you get into terabytes of data, everything just takes ages to process, no matter your specs. Best of luck with it.

I have some follow-up numbers based on a recent large backup job that I ran. This is backing up an archive of files from one local HDD to another local HDD. The host is modest, just a Mac mini with 16 GB of RAM.

The source is not quite as large as yours: 7 TB, and the backup has created 6.6 TB of files.

I used a 50 MB block size and 1 GB volume size.

The database is 3.6 GB.

The first backup ran for a little over a week; however, it has not truly finished. There seems to be a problem where the Duplicati task hard-crashes the host when it gets to the end (or near it). I have restarted Duplicati three times with the same result each time: it resumes the backup, processes files, appears to work for a few hours, then hard-crashes the server. The log (at warning level) is empty. I fear that setting the log level to something useful like “Information” will produce a truly massive file, but I’ll probably have to do this.

So it’s a bust for me for now. I might just wait for the next version and try that instead…