Restore browsing still slow on 2.0.3.3 beta

File browsing during restore doesn’t seem to have improved for me since receiving this update. Does anything have to be done on the user side to facilitate it? Are existing backups not affected by the changes?

The improvements are in the backend, so they should kick in right away, and they work on all backups.

Did you go straight from the last beta to the current one?

I’m running the docker container maintained by linuxserver. They updated their image last Friday, and the “About” tab reads “Duplicati - 2.0.3.3_beta_2018-04-02” for the version. From what I understand, they update every Friday, and I’ve been keeping mine up to date as they push out new images.

Hmm, then it should work.

It’s worth noting that the speed is still somewhat slow; I get around 3 seconds to open a subfolder. Also, if LinuxServer was bundling the canary before this, then the speed increase has already been present for a few months.

According to them, they don’t do canary builds.

3 seconds per subfolder would be a dream. Right now it takes nearly ten minutes just to present a file structure to browse after hitting “restore” (getting file versions… fetching path data… etc.), and then about another three minutes per folder I try to open.

This also might be a good time to add that if a user has hundreds of back-versions in their backup set, they may want to consider enabling the retention policy settings and letting some version thinning happen to get their total version count under control before trying this. I’ve heard from some users who saw dramatic improvements in Restore performance after trimming out a few hundred prior versions.
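
If it helps, that’s a single advanced option on the backup job; the exact timeframes below are just an illustration, adjust to taste:

```
# --retention-policy takes comma-separated "timeframe:interval" pairs.
# This example keeps one version per day for a week, one per week for a
# month, one per month for a year, and discards anything older:
--retention-policy="1W:1D,4W:1W,12M:1M"
```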

How many versions do you have? And what’s the location for your backup file set?

I only keep 8 versions, and it’s on a local USB 3 drive.

Wow… then I think something else might be going on. Also, I forgot to ask: what’s the total size of the backup set?

2.6 TB.

The actual backup process doesn’t seem too bad. In fact, I think that part actually improved with the update. After a day with nothing added, a nightly backup takes 15-20 minutes, and I’m assuming most of that time is spent backing up my Plex database, since that changes a lot throughout the day. I added about 7 GB to used storage yesterday, and the backup took about 25 minutes last night.

It really seems like something else might be bottlenecking you on that, though I’m not sure where to go next in terms of figuring out what.
For reference: my home machine (I don’t back up my Plex DB at the moment, but I do back up all my self-ripped media) has around 800 GB backed up to B2. When I test the restore operation remotely from my work laptop, getting the file listing takes around 2 minutes, and that includes rebuilding the database, which adds an extra minute or so. After that, opening each folder takes around 3 seconds. I would expect that to scale up for a 2.6 TB backup set, but linearly, not exponentially.

I just downloaded and started the linuxserver/duplicati image for testing. It seems to be working correctly on my system when I restore: about 6 seconds for the first listing and then 2-3 seconds for subsequent folders.

Given that you’re backing up Plex metadata, I think we might be seeing an extreme case here. The problem with the current restore lookup is that it does regex matching on a table join whose complexity grows with both the number of files and the number of backup versions available.

But to answer your question, everything looks correct with your docker image. It has the new patches, provided that you have the same :latest that I just downloaded (ca9f730aa9d74e407ea0a27fe1e7ea31896328ef7373149ab7e792b29c2ceb75).
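
If you want to double-check that we’re on the same image, the standard docker CLI can show the digest of what you have pulled:

```
# Show the digest of the locally pulled image
docker images --digests linuxserver/duplicati

# Or inspect the image ID and repo digests directly
docker inspect --format '{{.Id}} {{.RepoDigests}}' linuxserver/duplicati:latest
```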

The slowness is caused by the time it takes to look up the data in the database, so if the database is on particularly slow storage, or if you have CPU limits on your container, that may be slowing you down further.
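
For reference, limits like these on the container would throttle a CPU-bound lookup (standard docker flags; unRAID may expose them through its own template fields):

```
# Examples of container limits that would slow down the database lookups:
docker run --cpus="0.5" linuxserver/duplicati        # cap the container at half a core
docker run --cpuset-cpus="0" linuxserver/duplicati   # pin it to a single core
docker run --memory="512m" linuxserver/duplicati     # tight memory caps can hurt SQLite too
```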

However, if we want to troubleshoot this, perhaps we should ask @JonMikelV to split this conversation out to its own thread so we can continue? :slight_smile:

If the Plex database is possibly the culprit, I honestly don’t mind excluding it from the backups. If the number of files has a direct impact, that makes sense, since the database is hundreds of thousands of tiny files. I’m running a separate backup of my appdata share using the CA Backup plugin in unRAID about once a week anyway, so I’ve got the Plex data covered there.
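
For anyone else reading, the exclusion is just a filter on the backup job; on the commandline it would look something like this (the path is an example, point it at your own Plex appdata location):

```
# Hypothetical path - adjust to wherever your Plex appdata actually lives
--exclude="*/Plex Media Server/*"
```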

I suppose after I do this, I’ll need to wait 8 days for the Plex data to get flushed out of my versions before seeing any difference. I wouldn’t mind keeping this open, though, in case the Plex database isn’t the issue and further troubleshooting is necessary.

Thanks to you both for the input on this!

Yeah, the Plex database is incredibly big. I don’t even back mine up because it’s so many files. I’d say it’s a good bet as to why it takes so long.

It’s possible to purge files with the purge command from the commandline, rather than waiting for them to age out. But obviously you have to be careful when manually purging files :slight_smile:
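
A sketch of what that looks like (duplicati-cli on Linux, Duplicati.CommandLine.exe on Windows); the storage URL and filter here are examples, and --dry-run lets you preview what would be removed before committing:

```
# Preview which files the purge would remove (URL and filter are examples):
duplicati-cli purge file:///backups/myjob "*/Plex Media Server/*" --dry-run

# Run it for real once the list looks right:
duplicati-cli purge file:///backups/myjob "*/Plex Media Server/*"
```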

Just excluded it. I’ll go ahead and just let it naturally get rolled out over time and re-visit this next week. Thanks again!

The Plex database is now flushed out of all my versions, and I’m still not seeing any improvement in browsing times =/

Hmm, I’m sorry to hear that…

It’s quite poor performance, but the question is whether it’s normal (and requires code changes to improve) or whether it’s somehow a configuration issue.

Are there any quotas on the system? What kind of CPU/memory/disk usage do you see while waiting for the restore page to fetch the next level of files?
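
Since it’s running in a container, watching it while the restore page loads should show whether it’s starved for CPU, memory, or I/O (the container name here is an example):

```
# Live resource usage while browsing the restore tree
docker stats duplicati

# Or a one-shot snapshot instead of a live stream
docker stats --no-stream duplicati
```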

@cpo has backup sets containing up to 3.2 million files and 7 TB of data; maybe he can provide another data point here. @cpo, what kind of restore browsing performance do you see?

Just took a quick peek at a local backup with 1.6 million files and nearly 2 TB of data: initial loading for restore takes around 30 seconds, opening subfolders around 10-15 seconds.

Win64, i7 @ 4 GHz, 16 GB RAM, databases on SSD, Duplicati 2.0.3.1_experimental_2018-03-16

Running unRAID here on an i7 @ 3.3 GHz with 16 GB RAM. The current docker image is based on 2.0.3.3_beta_2018-04-02.

There are now no databases present in the backup. The backup size is 2.6 TB with roughly 122,000 files: slightly larger in size than yours, but quite a bit lower in file count.

My system is slightly lower spec than yours, but even if my backup size and file count were identical to yours, I wouldn’t expect that hardware difference to matter this much. It’s still taking about 10 minutes for the initial load and around 3 minutes per opened subfolder.

Hmm…
122k files seems like a fairly normal file count.
I would imagine that file count has far more impact on database size and loading times than backup size does (imagine 2 TB in a single file: that should give an instant restore browsing response, since it’s the smallest possible database).

I do not have any experience with docker images though… Sorry for not being of much help here.

Hmm, having the database on SSD might give cpo a little edge, but not that much. What cpo is getting is what I’d expect to see.

The restore browsing is almost entirely CPU bound, so for troubleshooting this locally I would focus on making sure the container is allowed access to 100% of a core. But if it is, then I’m pretty stumped about what’s causing this huge skew in real-world performance.
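
A quick way to verify no cap is configured on the container (the name is an example; 0 or empty values mean no limit):

```
# Check configured CPU limits on the running container
docker inspect --format 'cpus={{.HostConfig.NanoCpus}} quota={{.HostConfig.CpuQuota}} cpuset={{.HostConfig.CpusetCpus}}' duplicati
```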

Do you have any special flags set for block or volume size?
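
For reference, these are the two options I mean, shown with what I believe are the defaults; if you’ve never set them, you’re running with these values:

```
--blocksize=100KB     # size of the individual data blocks (default 100KB)
--dblock-size=50MB    # size of the uploaded volume files (default 50MB)
```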