Duplicati to B2 storage: question about autonomous 'inspection' downloads


#1

Now that CrashPlan for Small Business actually crashed my Mac by dumping cores until the disk was full, I’ve had it. I am looking into using Duplicati with B2 storage. As I understand it, Duplicati regularly downloads data from remote storage to spot-check data integrity. With a 1 TB backup, how much download traffic (and thus cost) should I expect from Duplicati’s standard behaviour?


#2

I think Duplicati downloads and tests one block per backup (default 50 MB). Can someone confirm?


#3

What are the pros and cons of smaller/larger blocks in Duplicati?


#4

Bigger block size = fewer files on the remote storage. But when restoring a tiny file, Duplicati always needs to download a full block, so it uses more data to get to that one file.

If your remote storage has a file limit, use a bigger block size. If you have an unstable or slow internet connection, it’s probably better to use a smaller block size to prevent wasting data when you have to reconnect.
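
To put rough numbers on that trade-off, here is a minimal sketch. It assumes the "block size" above means the remote volume size (--dblock-size), and it ignores compression, de-duplication, and the accompanying dindex/dlist files, so treat the counts as ballpark figures only. The function name is made up for illustration.

```python
def remote_volume_count(backup_bytes: int, dblock_bytes: int) -> int:
    """Approximate number of dblock volumes needed to hold backup_bytes."""
    # Integer ceiling division: a partly filled last volume still counts.
    return -(-backup_bytes // dblock_bytes)

MB = 1000 ** 2
TB = 1000 ** 4

# Compare a few candidate volume sizes for a 1 TB backup.
for size_mb in (50, 200, 1000):
    volumes = remote_volume_count(1 * TB, size_mb * MB)
    print(f"{size_mb} MB volumes: about {volumes} dblock files; "
          f"restoring one small file may download up to {size_mb} MB")
```

So going from 50 MB to 1000 MB volumes cuts the file count by a factor of 20, at the price of a larger minimum download per restore.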


#5

Please see Choosing sizes in Duplicati. Beware of naming: the above comment fits --dblock-size, not --blocksize.

Regarding download charges due to the sampled verification, it’s adjustable all the way down to 0 if you like. However, that would increase the risk of a bad backup unless you periodically took steps to test / verify your backup.

Understanding B2 Pricing Structure makes me think that doing verification in small chunks might wind up being free, and the Duplicati job report will tell you how many bytes it downloaded (I can’t speak for other apps’ usage, of course). Note that downloads will also happen when the automatic but adjustable compact operation reclaims space.

Forum comments are kind of mixed on whether --backup-test-samples counts source or destination files. However, from source and tests, it looks like a sample is a triplet of a dlist, dindex, and dblock file. A dblock is capped in size at --dblock-size, the dindex indexes the dblock and is usually small, and the dlist contains the pathnames in the backup (plus information to allow restore). For a big backup, the dlist might be the biggest of the three.
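
If the triplet reading is right, the download per backup can be bounded with simple arithmetic. The sketch below is only that: the dindex and dlist sizes are made-up placeholders (check the real file sizes on your destination), and the function name is hypothetical.

```python
def max_verification_download(samples: int, dblock_bytes: int,
                              dindex_bytes: int, dlist_bytes: int) -> int:
    """Upper bound on bytes downloaded by one post-backup verification.

    Assumes each sample is one dlist + one dindex + one dblock, and that
    the dblock is full (it is capped at --dblock-size).
    """
    return samples * (dblock_bytes + dindex_bytes + dlist_bytes)

MB = 1000 ** 2

# Placeholder sizes, NOT measured values: 50 MB dblock,
# 0.1 MB dindex, 20 MB dlist.
per_backup = max_verification_download(
    samples=1,
    dblock_bytes=50 * MB,
    dindex_bytes=100_000,
    dlist_bytes=20 * MB,
)
print(f"Per backup: up to {per_backup / MB:.1f} MB")
print(f"30 daily backups: up to {30 * per_backup / MB:.1f} MB")
```

Setting samples to 0 gives 0 bytes, which matches the "adjustable all the way down to 0" point above. Compact downloads are not included here.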


#6

Thank you. Can I change the block size of an existing backup set? Probably not, I suspect.


#7

You can change the data file size (dblock), but you can’t change the internal de-duplication block size (blocksize).


#8

A new --dblock-size will affect only new dblocks, including ones made by “compact” operations (if you allow them). Therefore you won’t be facing a huge complete download, repackaging, and reupload if you change the size.


#9

Smaller block size = more blocks and files to work with = more API calls to B2.
B2 charges per API call too. B2 Transaction API Price
However, I don’t know how much of a factor it plays in the overall cost. It might be insignificant, or it might be something to consider.


#10

Smaller blocks mean a lot of files, and that means huge file listings that may slow things down. I think 50 MB is tiny for any medium-sized backup.