I recently got the dreaded ‘we expected X files but found Y files’ error. I tried a repair to no avail and eventually did a delete/recreate of the local database. I watched the machine spin for about 10 days before a brief power outage killed it. I am now considering starting over with more manageable chunking.
I have a roughly 750GB archive with about 250k files. This is mostly coldish storage: family digital junk accumulated over several years. New data streams in on a fairly constant basis as new email and pics come in, and periodically docs etc. are updated, but for the most part I am not worried about any significant ‘file edit’ churn.
I am on a pretty stable line of about 20MB (the power outage above is probably the second in 5 years), using Wasabi and B2 as offsite providers.
My observation over the last 10 days was that Duplicati spent about 80-90% of the time pegging the IO of my old fileserver (a pair of mirrored spinning platters), hammering on the SQLite file; then it would spin the CPU for a while and go back to pegging IO. I have read several articles on block and dblock sizes, and from these I am surmising that the problem was my block size being too small, i.e. too many rows in the SQLite db.
So, that is my situation. A few questions, if I may. I think the main goal here would be to alleviate some of the local load (mostly SQLite IO) to make operations run more smoothly:
- Am I correct that there is no way to alter my local block size after a backup has already been created? (In any case, that would probably just blow up my SQLite row count further.)
- As I am considering starting over from scratch, does anybody have any suggestions on the local block size? I see the default is 100KB. I was thinking maybe 2MB? 5MB?
- I actually think 50MB seems like a fine remote (dblock) size after reading "Choosing Sizes in Duplicati". Maybe, as "Block Size for Cloud Backup" suggests, I’ll go to 200MB to avoid too many pagination re-requests. Open to thoughts here.
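To put rough numbers on the sizes above, here is a back-of-the-envelope sketch. It assumes no deduplication or compression, treats 750GB as 750 * 10^9 bytes and KB/MB as powers of 1024, and ignores the extra partial block at the end of each file. The function and variable names are just mine for illustration, not anything from Duplicati, so treat the results as ballpark figures only.

```python
# Rough estimate of how many blocks (local DB rows) and remote volumes
# a given pair of sizes produces. Assumptions: no dedup, no compression,
# per-file partial blocks ignored.

ARCHIVE_BYTES = 750 * 10**9  # ~750GB archive, from the post
KIB = 1024
MIB = 1024 * KIB


def units_needed(total_bytes: int, unit_bytes: int) -> int:
    """Number of fixed-size units needed to hold total_bytes (ceiling division)."""
    return -(-total_bytes // unit_bytes)


if __name__ == "__main__":
    # Candidate local block sizes: the 100KB default, then 2MB and 5MB.
    for label, size in [("100KB", 100 * KIB), ("2MB", 2 * MIB), ("5MB", 5 * MIB)]:
        print(f"block size {label:>5}: ~{units_needed(ARCHIVE_BYTES, size):,} blocks")

    # Candidate remote volume sizes: 50MB and 200MB.
    for label, size in [("50MB", 50 * MIB), ("200MB", 200 * MIB)]:
        print(f"volume size {label:>5}: ~{units_needed(ARCHIVE_BYTES, size):,} volumes")
```

By this estimate the 100KB default works out to roughly 7.3 million blocks for my archive, versus roughly 358 thousand at 2MB, which is the kind of reduction in database size I am hoping for.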
Thanks for your time
Edit to add one more question: what do we want for file retention on the remote (B2/Wasabi)? For Arq backup, their support told me it was a requirement that I not enable versioning support on the destination. I don’t think it should matter, really, but does Duplicati have any requirements here?