Restore operation very slow and causes huge network traffic

Hi @hbl2bk! I’m the author of the reworked restore flow, and while I’m a bit late to the discussion (sorry!), I thought I’d confirm everything that’s been stated.

The outlines provided by @kenkendk and @aureliandevel are both correct and uncover the underlying issue: poor volume cache utilization. I originally designed the flow to focus on increasing parallelism and sequential disk writing. This required layers of cache to keep the individual parts busy and to reduce repeated work.

For backups that have been running for quite some time, it turns out that many files are spread across many volumes, leading to a substantial number of volumes being kept in cache at once (makes sense now, but everything is clear in hindsight :slight_smile: ). This in turn resulted in temp storage filling up, which led to the introduction of the cache size limiting option.

My initial thought was that while some volumes would have to be redownloaded, evicting some would keep storage usage in check until the remaining volumes could be evicted as “no longer needed”. Turns out I was wrong; the remote volumes are downloaded multiple times, resulting in heavy disk writes and network usage.
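
To illustrate the effect, here is a toy sketch in Python. It is not the actual restore code: the LRU eviction policy, the function name `count_downloads`, and the block layout are all assumptions made for illustration. It counts how often a volume has to be fetched when blocks are restored in file order and the cache holds only a limited number of volumes.

```python
from collections import OrderedDict


def count_downloads(block_volumes, cache_capacity):
    """Count volume downloads for a restore that reads blocks in order.

    block_volumes: volume id for each block, in restore order (hypothetical layout).
    cache_capacity: max number of volumes kept in the local cache.
    Eviction is LRU here, which is an assumption, not the real policy.
    """
    cache = OrderedDict()
    downloads = 0
    for volume in block_volumes:
        if volume in cache:
            # Cache hit: mark the volume as most recently used.
            cache.move_to_end(volume)
            continue
        # Cache miss: the volume has to be fetched from the remote.
        downloads += 1
        cache[volume] = True
        if len(cache) > cache_capacity:
            # Evict the least recently used volume to respect the limit.
            cache.popitem(last=False)
    return downloads


# Hypothetical layout: 3 files, each with one block in each of 4 volumes.
restore_order = [0, 1, 2, 3] * 3

print(count_downloads(restore_order, cache_capacity=4))  # 4
print(count_downloads(restore_order, cache_capacity=3))  # 12
```

With room for all four volumes, each one is fetched once. With the limit one volume short, every access misses in this access pattern, so the same four volumes are downloaded twelve times in total.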

I do have some ideas in mind, and I like the suggestions that have already been proposed here, so hopefully we’ll be able to fix this issue. I will, however, move the discussion to the other thread.
