Is my initial backup too large? Thinking about Duplicacy or qbackup instead

JoeyDee · March 31, 2019, 1:49am

Long story short, I have about 5TB of mostly media that I want to back up to Azure Blob storage. My problem is that I’ve made two attempts with Duplicati so far, and both ended with the ‘Detected non-empty blocksets with no associated blocks!’ error, after which I had to start over (repair never worked).

The first wasn’t a huge deal, I was only about 200GB in, but the last one was quite frustrating as I only had 1TB left.

I have 460down/20up cable internet, so I’ve been using Speedify VPN in conjunction with a couple of LTE connections to boost my upload. It helped me immensely; I was averaging about 60Mbit/s upload. However, since the backup job can’t truly be paused, it seems to break and cause the ‘Detected non-empty blocksets with no associated blocks!’ error if I lose connectivity. For that reason, Speedify was a huge benefit, since if one connection broke, the other one was fine. However, I had some incredibly bad luck: I was running off only my cable connection when my modem literally locked up and I had to power cycle it. BOOM ‘Detected non-empty blocksets with no associated blocks!’

I’ve been playing with Duplicacy and QBackup, and so far QBackup seems to handle start/stop very well, with (knock on wood) no errors. My issue is that I had really sold myself on Duplicati. Even though jobs can’t be paused, I was hoping that once my first backup finished, I wouldn’t have any more problems, since future jobs won’t take days.

Am I doing something wrong? Should I just stick with QBackup, or is there a reason to give Duplicati another shot?

Thanks in advance!

simsim · March 31, 2019, 9:11pm

Repair does not work on an initial backup set.

I suggest you start with a smaller set (e.g. 1TB) of your media files and, once that backup has finished, add another TB at a time until all your files have been backed up.

What dblock size do you have? I assume you have rather large files, so think about setting a dblock size >1GB.

JoeyDee · March 31, 2019, 9:24pm

That would explain why repair didn’t work then. I’ve seen a lot of people say repair fixes the error I got and it was really concerning.

Block size was set to 100MB I believe. Larger would be better you think?

So I should select only a couple of small folders to get the initial backup done, then just add additional folders over time?

simsim · March 31, 2019, 9:44pm

For repair to work it needs an index file (the dlist file), which is created when a backup finishes. If that file doesn’t exist, no repair is possible.

If you have a 5TB backup set and a dblock size of 100MB, that would result in 100’000 files (50k zip + 50k dindex). Not sure about the limits of Azure, but checking the contents of a directory with 100k items will be slow. Even if you set the dblock size to 1GB you will still create 10’000 files.
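The file counts above can be reproduced with a quick calculation. This is only a rough sketch: it assumes decimal units, no compression or deduplication savings, and exactly one dindex file per dblock volume. The function name `remote_file_count` is made up for illustration and is not part of Duplicati.

```python
TB = 10**12
GB = 10**9
MB = 10**6

def remote_file_count(total_bytes: int, volume_bytes: int) -> int:
    """Estimate remote files: one dblock (zip) plus one dindex per volume.

    Assumption: the source data is not shrunk by compression or dedup,
    and the handful of dlist files is ignored.
    """
    volumes = -(-total_bytes // volume_bytes)  # ceiling division
    return 2 * volumes

print(remote_file_count(5 * TB, 100 * MB))  # 100000
print(remote_file_count(5 * TB, 1 * GB))    # 10000
```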

Yes, start with a smaller set and let it finish. Those files will be “safe” if you ever have to run a repair.

JoeyDee · March 31, 2019, 9:59pm

I appreciate your replies. Can you think of any negatives of significantly larger block sizes?

simsim · March 31, 2019, 10:18pm

Glad I’m able to help.

If your backup set consists of large files, there is no negative impact from having a very large dblock size.

However, if your backup set consists of both large (e.g. movies) and small (e.g. photos) files, I suggest you create two backup sets: one for movies, the other for photos. Otherwise, if you ever have to restore one photo, you might end up having to download 2 large files from Azure.

JoeyDee · March 31, 2019, 10:31pm

Gotcha, thanks a ton! I didn’t want to give up on Duplicati, so I’m glad that it sounds like I can work around this

JoeyDee · April 1, 2019, 1:11am

Ok, one last question. Once the initial backup is done, if I add a new 100GB folder and the backup fails after 99GB so that I have to do a repair… if the repair works, would I have to reupload that 99GB, or will the repair ‘see’ what’s been uploaded already, so I’d only have the last 1GB left to upload?

Thanks again!

simsim · April 1, 2019, 8:01pm

I can’t say this for sure. Documentation is scarce on this topic. A developer would have to comment.

ts678 · April 2, 2019, 5:25pm

“Initial backup should be able to be repaired” points to developer comments about what should happen in general. Ideally the next backup resumes where the last one stopped, but that may depend on how the prior backup failed.

More ideally it even creates a synthetic file list at resume’s start showing where the previous one ended, however I explain there why I think that’s not working. I wonder if fixing it would help initial backup issue?

“Choosing sizes in Duplicati” offers some advice on the block size and remote volume size question. The --blocksize default of 100KB should probably be higher, unless you want your 5 TB divided into 50 million blocks that the database has to track. Operations get slow when the database gets big. Because you’re backing up mostly media, which possibly has few block-level similarities between files, losing some deduplication might not be a huge loss, and very different files might even help, because otherwise incremental changes could wind up in different huge dblock files and you might need to download a lot to gather the changes. There’s a tradeoff between the burden of a large set of small dblocks and that of a small set of large dblocks. There’s never a partial download, so there’s some merit to the idea of smaller dblocks for smaller sources.
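The 50 million figure follows from dividing the source size by the block size. A minimal sketch, assuming decimal units and that every block is unique (no deduplication); `block_count` is an illustrative helper, not a Duplicati function, and the 1MB value is only an example of a larger setting:

```python
TB = 10**12
MB = 10**6
KB = 10**3

def block_count(total_bytes: int, blocksize_bytes: int) -> int:
    """Number of blocks the local database would have to track,
    assuming no two blocks are identical."""
    return -(-total_bytes // blocksize_bytes)  # ceiling division

print(block_count(5 * TB, 100 * KB))  # 50000000
print(block_count(5 * TB, 1 * MB))    # 5000000
```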

Pectojin · April 2, 2019, 7:36pm

Any data that has been uploaded will still be present after the repair, so “roughly” 1GB is left.

It’s “roughly” because you may have a couple of volumes locally that had been processed but not yet uploaded, so you’d be missing those. With default volume sizes, that’s maybe a few hundred megabytes to re-upload plus the remaining 1GB.
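As a back-of-the-envelope illustration of that estimate: the numbers below (a 50MB volume size and 4 volumes waiting to be uploaded) are assumptions chosen to match “a few hundred megabytes”, not values confirmed in this thread, and `estimated_reupload_bytes` is a hypothetical helper.

```python
GB = 10**9
MB = 10**6

def estimated_reupload_bytes(remaining_bytes: int,
                             volume_bytes: int,
                             pending_volumes: int) -> int:
    """Data still to upload after a repair: the unprocessed remainder
    plus volumes that were built locally but never reached the remote."""
    return remaining_bytes + volume_bytes * pending_volumes

# Assumed values: 1GB unprocessed, 50MB volumes, 4 volumes not yet uploaded.
total = estimated_reupload_bytes(1 * GB, 50 * MB, 4)
print(total / MB)  # 1200.0
```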

The issue with having to start over only happens because Duplicati is handling “no dlists present” poorly. This handling is good if you suddenly have no dlists left on an existing backup (something would definitely be wrong). However, this exact same scenario on a new backup is obviously not an issue that should cause a failure.

JoeyDee · April 3, 2019, 2:57am

Thanks for the help everyone, you’ve renewed my faith in this. I don’t plan on using this for single item recovery at all; it’s strictly for a DR situation where my entire storage array turns to dust, so having 1GB dblocks doesn’t sound like a bad idea. Maybe I should’ve made them 2GB? ha

Anyways, thanks again!