Backup bigger than source


#1

Hi,

I have half a dozen backup tasks, and one of them is particularly annoying me.

It’s the only backup job I’ve ever seen that is bigger than the source. Even with one version.


The backup content is a bunch (more than 900, fewer than a thousand) of ~500MB mkv video files.

I’m using a 50MB upload size, so I expected some deduplication. This is a very large backup, and I was hoping to save some space with Duplicati.

Nonetheless, I’d like to say that I’ve been a user for over 2 years, and the solution has never let me down. So I thank you guys for this amazing tool.


#2

It’s mkv, which is already compressed, and unless there are duplicate videos in your set, you will not benefit from deduplication. So all looks normal. The difference is in the overhead of dlist & dindex files.
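
To illustrate why already-compressed data does not shrink further, here is a minimal Python sketch. It is not Duplicati code: the `compression_ratio` helper is hypothetical, `zlib` stands in for whatever compression the backup uses, and random bytes stand in for compressed video, since both look like high-entropy data to a compressor.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, 9)) / len(data)


# Repetitive text compresses very well.
text_like = b"the quick brown fox jumps over the lazy dog\n" * 10_000

# Assumption: random bytes approximate already-compressed video data.
video_like = os.urandom(len(text_like))

print(f"text-like:  {compression_ratio(text_like):.3f}")
print(f"video-like: {compression_ratio(video_like):.3f}")
```

The text-like input shrinks to a small fraction of its size, while the video-like input stays at roughly its original size plus a little container overhead, which is consistent with a backup ending up slightly bigger than its source.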


#3

Hm, I see.

I really did not know that mkv was compressed.

But there is still a question. If Duplicati has block-based deduplication, I would expect some deduplication in this backup, since the upload size is 50MB and there are over 9,680 dblocks.

Not even a single one of those 9,680 dblocks can be deduplicated?

That is what seems strange to me.


#4

Actually, the deduplication works on chunks (called blocks) of the files themselves, not on the uploaded dblock volumes. The block size defaults to 100 KB.

You may think that with that many blocks, some of them must be duplicates, right? Probably not. Keep in mind that 100 KB is 102,400 bytes, all of which must be identical for the hash to be the same. Perhaps the headers of the binary files are the same and a few blocks are duplicates, but there really isn’t much to gain in your backup scenario.
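
As a rough sketch of the idea (not Duplicati’s actual implementation), the Python below splits data into fixed-size 100 KB blocks and counts how many distinct hashes appear. The `dedup_stats` helper is hypothetical, SHA-256 is used purely for illustration, and random bytes are an assumed stand-in for compressed video.

```python
import hashlib
import os

BLOCK_SIZE = 100 * 1024  # 102400 bytes, the default block size mentioned above


def dedup_stats(data: bytes, block_size: int = BLOCK_SIZE) -> tuple[int, int]:
    """Return (total_blocks, unique_blocks) for fixed-size blocks of data."""
    seen = set()
    total = 0
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        seen.add(hashlib.sha256(block).digest())
        total += 1
    return total, len(seen)


# Highly repetitive data: every block is identical, so only one is stored.
repetitive = bytes(BLOCK_SIZE) * 20

# Same data with a single byte changed in one block: that block no longer matches.
one_byte_off = bytearray(repetitive)
one_byte_off[5 * BLOCK_SIZE] = 1

# Assumption: random bytes approximate compressed video content.
video_like = os.urandom(20 * BLOCK_SIZE)

print(dedup_stats(repetitive))
print(dedup_stats(bytes(one_byte_off)))
print(dedup_stats(video_like))
```

A block only deduplicates when all 102,400 bytes match. The single changed byte is enough to produce a second unique block, and for video-like data essentially every block ends up unique.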


#5

Hm, that should be it.

But again, it’s very odd that there isn’t any deduplication in this, across over 400 GB.

Anyway, I’m not going to keep this backup in Duplicati, since there’s no space gain from deduplication and I’m not going to use versioning.

It’s better having the actual files available.

Thanks guys, I’m going to close this topic.

Felipe Pereira;

RJ, Brasil;


#6

I agree. Since you don’t want versioning, it’s better to just sync the folder or similar.

Cheers from Brazil too!


#7

If it is all video, then it is highly unlikely that two blocks are the same. Even if parts of the video are the same, the encoding process makes the results wildly different.

Yes, you will not gain anything from a backup tool here, except versions, which probably have no value.


#8

I’ll mark the post as solved. If you have any new inquiries, please let me know.