Unreasonable disk space requirements

It could be that I’m just a bit more dense than the typical user also! :slight_smile:

1 Like

I’d say the fact that the default is 50MB (Megabytes) should make it pretty clear that this is not the total space allocation for the full backup… :wink:

1 Like

What about a warning when more than 1 GB (?) is selected? A more detailed explanation could prevent users from choosing this option by mistake…

I remember, though, that I was also confused about this and thought about increasing it when I first installed Duplicati. Or did I actually increase it? I think I did. So I have to admit that some things that seem obvious to some are not obvious to others…

Can you change this value after a backup has been done or will that require a new backup?

For many people, the word “Volume” is a synonym for “the complete disk”. Add “Upload” to it, and “Upload Volume size” is easily interpreted as “the total backend storage quota”.
If there’s confusion about this term, why not rename it? What about Block Container File?

The Upload Volume size (AKA the dblock files, default 50 MB) can be changed; the block size (the file fragments that are stored in the dblock files, default 100 KB) cannot be changed after the first backup has been made.
Note that already uploaded dblock files keep their original size. So if you chose an initial size of 10 GB, made the first backup, and changed the size afterwards, you still have to download a 10 GB volume when restoring a small file that was uploaded during that first backup operation.
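To make the difference between the two settings concrete, here is a rough sketch in plain Python (my own back-of-the-envelope numbers, ignoring compression and metadata overhead, so treat them as ballpark figures only):

```python
# How many blocks fit into one upload volume (dblock file)?
# Compression and metadata overhead are ignored, so these are only rough figures.

KB = 1024
MB = 1024 * KB
GB = 1024 * MB

block_size = 100 * KB     # default block size; fixed after the first backup
volume_size = 50 * MB     # default Upload Volume size; can be changed later

print(f"~{volume_size // block_size} blocks per {volume_size // MB} MB volume")
# -> ~512 blocks per 50 MB volume

# With the 10 GB example from above, a single volume packs ~100,000 blocks,
# and restoring any one of those blocks means fetching that whole 10 GB file.
big_volume = 10 * GB
print(f"~{big_volume // block_size} blocks per {big_volume // GB} GB volume")
# -> ~104,857 blocks per 10 GB volume
```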

1 Like

Well, after playing with this a bit I found that my backups appeared to run faster when I set the value significantly higher than the default 50MB.
I currently have it set to 5GB, which I think is reasonable enough given the average size of most of the files I want to back up (3-5GB, pictures).
IMHO, 50MB is much too conservative; nothing is that small these days.

Does the Upload Volume Size have any impact on compression efficiency btw?

Otto

1 Like

A larger Upload Volume size will decrease the number of files at the remote side. A large number of files in one folder can decrease the performance of the backend server.

Restoring files will take more time when choosing a large Upload Volume size. I guess 3-5GB is the total size of the pictures you want to back up, not the size of each picture.

Let’s assume the average file size of a picture is about 4 MB. With an Upload Volume size of 5 GB, more than 1000 pictures will fit in a single volume. If you want to restore a single picture, Duplicati has to download the complete volume (5 GB) to rebuild the original 4 MB picture. If something has changed and the needed blocks for the picture file are divided over 2 archives, even 10 GB or more has to be downloaded to restore a single picture.
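To put some rough numbers on that (just a sketch of the arithmetic, not how Duplicati actually schedules its downloads):

```python
# Rough estimate of the download needed to restore one picture.
# Assumptions: a 4 MB picture, 100 KB blocks, and that a whole volume has to be
# fetched to get at any block stored inside it.

KB = 1024
MB = 1024 * KB

picture_size = 4 * MB
block_size = 100 * KB
volume_size = 5 * 1024 * MB          # 5 GB Upload Volume size

blocks_needed = -(-picture_size // block_size)   # ceiling division -> 41 blocks

for volumes_touched in (1, 2):       # best case vs. blocks spread over two volumes
    download = volumes_touched * volume_size
    print(f"{volumes_touched} volume(s): download ~{download // MB} MB "
          f"to restore a {picture_size // MB} MB picture ({blocks_needed} blocks)")
# -> 1 volume(s): download ~5120 MB ...
# -> 2 volume(s): download ~10240 MB ...
```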

Not much, if any. All volumes are standard Zip files. Maybe there’s a little more overhead when writing to many small Zip files, but I guess this will not be a big difference.

1 Like

Ok, I see. Thanks for explaining.
I should switch it back to 50MB then once I put this into production.

Otto

BTW: A good way of showing your appreciation for a post is to like it: just press the :heart: button under the post.

If you asked the original question, you can also mark an answer as the accepted answer which solved your problem using the tick-box button you see under each reply.

All of this also helps the forum software distinguish interesting from less interesting posts when compiling summary emails.

I have an “unreliable” network connection, so I set this to 10MB. Maybe an option (or default) to use automatic sub-folders for the files would help. Adding a sub-folder based on the first character would reduce the number of files per folder to about 1/34, and using two sub-folder levels to about 1/1156 (I also use BackupPC, which uses 3 sub-levels to reduce the number of files in one folder).
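Something like this (a purely hypothetical sketch in Python, nothing Duplicati actually does today; the prefix would have to come from the random part of the file name, not the constant “duplicati-” prefix):

```python
# Hypothetical prefix-based sub-folders for the remote volume files, only to
# illustrate the 1/34 and 1/1156 figures (34 possible characters, 34^2 = 1156
# two-character prefixes). Not an existing Duplicati feature.

def subfolder_for(random_id: str, levels: int = 1) -> str:
    """Map a volume's random identifier to a nested sub-folder path."""
    return "/".join(random_id[:levels]) + "/" + random_id

print(subfolder_for("b7f3a9c1", levels=1))   # -> b/b7f3a9c1
print(subfolder_for("b7f3a9c1", levels=2))   # -> b/7/b7f3a9c1

total_files = 100_000
alphabet = 34                                # characters the identifier can use
for levels in (1, 2):
    print(f"{levels} level(s): ~{total_files // alphabet ** levels} files per folder "
          f"instead of {total_files} in a single folder")
# -> 1 level(s): ~2941 files per folder ...
# -> 2 level(s): ~86 files per folder ...
```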

This could help with large lists of files at the backend, but I can think of some drawbacks:

  • Implementing this structure in a new version would make existing backups unusable, unless both file structures are supported, which would make the software more complex.
  • This would add a new requirement to the very limited requirements for backends (PUT, GET, LIST, DELETE). Some backends could potentially not be able to create folders, which would rule them out as backends for Duplicati.
  • Internal administration would be more complex, which could slow down backup/restore operations.

I don’t know how difficult it would be to implement, but if the issues above can be addressed, it could help with large backups.

1 Like

Also, it would require multiple LIST requests to ensure that all files are found.

I wrote this document some time ago, to try to explain the tradeoff between the size settings:

1 Like

Unless I’m thinking incorrectly about how this all works, it might also be worth having a warning on the larger sizes that at least DOUBLE the selected size might be needed in free disk space to do a restore. So if 500MB was selected, then restoring a 750MB file would require 500MB for the download of the first dblock, another 500MB for its decompressed contents, then ANOTHER 500MB for the next dblock, and finally 250MB for the remaining decompressed contents.

So in that admittedly abusive example, we actually needed 1,250MB of free space to restore a single 750MB file with a 500MB dblock size.
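Here’s that bookkeeping as a small sketch, following the download-extract-delete model assumed above (which may well not match Duplicati’s actual restore pipeline):

```python
# Peak free-space estimate for a restore, assuming the model from the post:
# download one dblock, extract what it holds, delete the dblock, repeat.
# This is the assumed mental model, not a verified description of Duplicati.

def peak_disk_usage_mb(file_size_mb: float, volume_size_mb: float) -> float:
    """Return the peak temporary disk usage (MB) while restoring one file."""
    restored = 0.0
    peak = 0.0
    while restored < file_size_mb:
        # one downloaded volume plus everything restored so far sits on disk
        restored = min(file_size_mb, restored + volume_size_mb)
        peak = max(peak, restored + volume_size_mb)
    return peak

print(peak_disk_usage_mb(750, 500))   # -> 1250.0, matching the example above
```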

1 Like

Yes, that is a good suggestion, comparing the volume size to the size of the local disk, and giving a warning if this is too high.

2 Likes

A “curiosity” question that’s hopefully relevant enough to include in this thread:

Does the upload volume size impact how much is downloaded when a file is downloaded for archive verification? (I hope that made sense…) Another way to put it: if I have the upload volume size set to 50MB, does that mean that 50MB is downloaded for each verification? If I set the volume size to 750MB, does each verification download 750MB?

Also, when restoring a single file, is Duplicati able to just download the file, or does it have to download the entire 50MB (or whatever size it is) volume(s) and then extract the file from there?

Mostly I’m just trying to understand how it works so I can come up with the volume size that makes most sense for the amount of bandwidth I have.

As a side note…after playing with duplicati for a few weeks, I’m very, very impressed with the whole thing.

Tyson_Bryner, when asking multiple questions it might be easier to use bullets or a numbered list. Hopefully I got all of your questions covered:

  1. Yes - whatever the upload volume size is set to is how much will have to be downloaded for a single verification to occur.
  2. Yes - the entire archive file must be downloaded so that a single file can be restored. But it’s not quite that simple: Duplicati actually chops up your files into smaller blocks and stores each of them in the archive (I believe this is how de-duplication is managed, at the block level), so ALL the archive files holding ANY of the blocks of your actual file-to-be-restored will need to be downloaded. So if you have a volume size of 50MB and a 100KB block size, then your theoretical 1MB data file might actually be stored in the destination in anywhere from 1 to 10 (1MB / 100KB = 10) different archive files (ALL of which need to be downloaded for restoration) - see the sketch just below this list.
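To put rough numbers on item 2, here is a small sketch (it assumes the simple model above, where every volume holding at least one needed block is downloaded in full; the real behavior may differ):

```python
# Worst-case estimate for item 2: how many archive (dblock) files, and how many
# bytes, might have to be downloaded to restore a single file. Assumes every
# volume holding at least one needed block is downloaded in full.

KB = 1024
MB = 1024 * KB

def restore_worst_case(file_size, block_size=100 * KB, volume_size=50 * MB):
    """Return (max dblock files touched, max bytes downloaded) for one file."""
    blocks = -(-file_size // block_size)   # ceiling division: blocks in the file
    volumes = blocks                       # worst case: each block in a different volume
    return volumes, volumes * volume_size

volumes, download = restore_worst_case(1 * MB)
print(f"1 MB file: up to {volumes} dblock files, ~{download // MB} MB downloaded")
# -> 11 dblock files with binary megabytes (10 if you count 1 MB as 1000 KB, as
#    in the post), i.e. potentially several hundred MB fetched for a 1 MB file.
```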

Hopefully that clears a few things up for you!

1 Like

Gotcha… That’s kind of how I imagined it all working. So I figure it makes more sense to stick with smaller volume sizes in most cases. If I’m storing a terabyte or two of data, that’s going to make a huge number of archive files. Can anyone think of any downside to having such a large number of archive files in your destination?

Some filesystems don’t like lots of files, and I think some cloud providers might have some limitations on such things.

Your best bet is probably to check out the Choosing Sizes in Duplicati page and see if it gives you any ideas (sorry, I should have linked to that earlier like kendkendk did above):

1 Like

I’ve had failing backups for the last few days with my disk filling to 100%. Solved now thanks to this discussion. Come on guys, someone create a user manual for this software. It looks pretty good but people can’t guess how to use it!

1 Like