What is a good remote file size for large media?

I've been playing around with Duplicati for a few days now, and I'm ready to start backing up to the cloud. My main question: what would be a good remote file size for backing up mainly LARGE media files? Most files are between 20 GB and 80 GB each, and I just don't want Duplicati to start choking as the storage grows.

Thanks in advance

Remote volume size of 50MB is usually fine for most back ends. What back end do you plan on using?

A more important thing is possibly the deduplication block size (default is 100KB). What is the total amount of data you intend to back up with Duplicati?

Google Drive and about 5TB, which will grow over time.

For a 5TB backup that is expected to grow, I'd probably set the blocksize to 5-10MB. This setting must be in place before the first backup, and it cannot be changed later.

I don't know whether Google Drive has a file count limit on the back end. You could consider raising the remote volume size from the default of 50MB if you want to reduce the number of files on the back end, but there really is no need if Google Drive doesn't limit file count. If you do change it, I wouldn't go too crazy by setting it too large. 250MB or 500MB might be OK if the default 50MB won't work.
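To put rough numbers on both settings, here is a back-of-the-envelope sketch in Python. It is not part of Duplicati; it assumes binary units (1 KB = 1024 bytes), no compression and no deduplication, and it ignores the index and file-list files that Duplicati also uploads, so real counts will differ.

```python
# Rough estimate only: assumes binary units, no compression, no deduplication.
KB = 1024
MB = 1024 * KB
TB = 1024 * 1024 * MB


def ceil_div(total: int, unit: int) -> int:
    """Integer division rounded up, without going through floats."""
    return -(-total // unit)


source = 5 * TB  # size of the backup discussed in this thread

# Blocks the local database has to track for a given --blocksize
for label, blocksize in [("100KB", 100 * KB), ("5MB", 5 * MB), ("10MB", 10 * MB)]:
    print(f"blocksize {label:>5}: ~{ceil_div(source, blocksize):,} blocks to track")

# Data volumes stored on the back end for a given remote volume size
for label, volume in [("50MB", 50 * MB), ("250MB", 250 * MB), ("500MB", 500 * MB)]:
    print(f"volume size {label:>5}: ~{ceil_div(source, volume):,} data volumes")
```

Under those assumptions, 5TB at the default 100KB blocksize is roughly 54 million blocks, versus about 1 million at 5MB. Likewise, 50MB volumes mean roughly 105,000 data files on the back end, versus about 21,000 at 250MB.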

More info here: Choosing Sizes in Duplicati - Duplicati 2 User's Manual

By the way - I suggest starting with a smaller set of data just to make sure things are working well for you. Once that backup succeeds you can then of course add more folders to the selection list and do another backup, working your way up to the 5TB you want to protect. Sometimes this approach can lead to better success.

OK, thanks for the input. I had to set the remote file size to 250MB because smaller files were being written too fast, causing an API access error on Google's back end. My main concern is file integrity: as I fill up my external drive, I erase it and add more data for the backup to grow.

I also have my advanced settings set like this. Let me know if anything on this list will slow things down as the data grows:

--file-hash-algorithm=SHA256
--backup-test-percentage=100
--upload-verification-file=true
--asynchronous-concurrent-upload-limit=5
--list-verify-uploads=true
--verbose=true
--full-block-verification=true
--use-block-cache=true
--log-level=Warning
--full-remote-verification=true
--blocksize=5MB

edit: These are my default options, meaning they will apply to every backup I create, correct?

I would suggest higher values. If your source contains only large video files, I would set both block size and remote volume size as high as possible.

If your source location contains only compressed video files, in-file deduplication is not likely to happen, regardless of the block size. File-level deduplication always works (for example, when you move a file to another folder), regardless of the block size. A smaller block size will result in a larger local database and dramatically increase database recreation time.

For the remote volume size, I would also choose a high value, because all source files are multiple GBs. This will result in fewer files at the back end and a somewhat smaller local database. You will not benefit from smaller remote volumes if most source files are larger than the remote volume size. If you have to restore a small file (3 MB) and have chosen a remote volume size of 1 GB, you have to download 1 GB to restore your 3 MB file, but this doesn't apply to your scenario: to restore a 38 GB file, about 76 volumes of 500 MB have to be downloaded, or about 760 volumes if you use a 50 MB remote volume size. With a 500 MB remote volume size, you download at most about 500 MB more than needed in the worst case, when the last few bytes of the video file end up in a new remote volume.
I don’t expect much difference in total amount of up/downloaded data for compacting, because I suppose that most video files are never modified after they are backed up.
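The restore arithmetic can be checked with a few lines of Python. This is only a sketch; it assumes decimal units (1 GB = 1000 MB), no compression, and a file that does not share its volumes with other data.

```python
# Rough restore estimate: decimal units, no compression assumed.
MB = 1000 * 1000
GB = 1000 * MB


def min_volumes(file_bytes: int, volume_bytes: int) -> int:
    """Fewest remote volumes that can hold a file of the given size."""
    return -(-file_bytes // volume_bytes)  # ceiling division


file_size = 38 * GB
for volume_mb in (50, 500, 1000):
    n = min_volumes(file_size, volume_mb * MB)
    # Worst case: the file's last few bytes spill into one extra volume.
    print(f"{volume_mb:>4} MB volumes: at least {n} downloads, "
          f"worst-case overhead about {volume_mb} MB")
```

The overhead column shows why large volumes cost little here: even with 1000 MB volumes, the extra download is small compared with a 38 GB file.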

So my suggestion is to use a very large block size, say 50-200 MB and a remote volume size of 500-1000 MB.

To get an indication of the consequences of different block and volume sizes, you can use this Google sheet:
