I just took my first backup with the latest Duplicati 2.0.4.23_beta_2019-07-14, about 2 million files (319 GB), and it takes minutes to browse and open each folder just to see what’s inside (even where the directory structure is single dirs inside single dirs until you get deeper).
Would definitely love some optimizations on this front.
Have you ever vacuumed your database? If not, try turning on the auto-vacuum option and let it vacuum at the end of the next backup job… See if it helps…
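If you’d rather try a manual vacuum first, here’s a minimal sketch using Python’s built-in sqlite3 module. The database path is an assumption (use the .sqlite file shown on the job’s Database page), and make sure no backup is running while you do it:

```python
# Minimal sketch: manually VACUUM a Duplicati local job database.
# The path below is an assumption -- use the one shown on the job's
# Database page, and make sure Duplicati isn't using the file.
import sqlite3

DB_PATH = "/path/to/duplicati-job.sqlite"  # hypothetical path

con = sqlite3.connect(DB_PATH)
con.execute("VACUUM")  # rebuilds the file and reclaims free pages
con.close()
```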
The topic Very slow folder browsing during restore said it’s “caused by a large table of paths needing to be filtered”, which I think refers to the File table where the Path for each file is kept, with a new row created when the file changes. That’s a flat list to search through. There didn’t use to be folder organization, but there is now.
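As a rough illustration of why that flat list hurts (the table and column names here are a sketch, not necessarily Duplicati’s actual schema or query): listing one folder still means string-matching every stored path.

```python
# Rough illustration only -- hypothetical names, not Duplicati's real query.
# With full paths stored as one flat list, showing the contents of a single
# folder means comparing every row's Path against the folder prefix.
import sqlite3

con = sqlite3.connect("/path/to/duplicati-job.sqlite")  # hypothetical path
folder = "C:\\Users\\me\\"
cur = con.execute("SELECT Path FROM File WHERE Path LIKE ? || '%'", (folder,))
# Reducing matches to direct children (one level deep) is more per-row string
# work, which is part of why deep or wide trees feel slow to expand.
children = {row[0][len(folder):].split("\\", 1)[0] for row in cur}
con.close()
```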
Local database is a somewhat dated, simplified view showing how a fileset points to files, which point to a set of data blocks and a set of metadata blocks. What has changed, and might be relevant to fixing the issue:
Changed the internal storage of paths to use a prefix method. This should reduce the size of the database significantly and enable much faster database queries later on
I didn’t write that text, but because the prefix is a folder, it possibly opens the door to faster folder opens…
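My guess at what the prefix method buys (an illustrative schema only, not the actual tables): store each folder path once and have file rows reference it, so opening a folder can become an indexed equality lookup instead of a LIKE scan over full paths.

```python
# Illustrative schema only -- a guess at the prefix idea, not Duplicati's
# actual tables. Each folder path is stored once; file rows reference it.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE PathPrefix (
    ID     INTEGER PRIMARY KEY,
    Prefix TEXT UNIQUE            -- e.g. 'C:\\Users\\me\\'
);
CREATE TABLE FileLookup (
    ID       INTEGER PRIMARY KEY,
    PrefixID INTEGER REFERENCES PathPrefix(ID),
    Name     TEXT                 -- file name relative to the prefix
);
CREATE INDEX FileLookupPrefixIdx ON FileLookup(PrefixID);
""")

# Listing a folder becomes: find the prefix row, then an indexed equality
# match on PrefixID -- no per-row comparison of full path strings.
rows = con.execute("""
    SELECT f.Name
    FROM FileLookup f
    JOIN PathPrefix p ON p.ID = f.PrefixID
    WHERE p.Prefix = ?
""", ("C:\\Users\\me\\",)).fetchall()
con.close()
```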
I’m still backing up with and testing v2.0.4.21-2.0.4.21_experimental_2019-06-28, which I presume has this change, and in my initial smaller test the file-list speed was very good. I’ll know whether it’s actually better than the beta in a few hours when the backup finishes.
Sadly, you’re right. Now that the large backup is complete, the restore list still takes as long as it did with the beta. At least rebuilding the database on my local machine took only 25 minutes with the experimental, instead of days as with the beta.
Open issue Listing directories for restore very slow #1715 just got a note to look here. Maybe someone will volunteer to change the code to make better use of the path prefix, if the new design actually helps with this.
On the first tab, select one of your backup sets and click Restore files. When you see the file tree with C:\ (or whatever), leave it there.
Go to the second Duplicati web browser tab, go to About / Show Log / Live, and select Profiling.
Then go back to the first tab and expand C:. When it completes, go back to the tab that shows the Live log. Near the top (the second line when I tried this) it shows the total time it took, but below that it should show several lines like “… ExecuteNonQuery … took [time]” or “… ExecuteReader … took [time]”.
Just curious which one of those Execute lines stands out as taking the biggest chunk of the total time.
Takes about 40 seconds to get the root restore directory tree loaded and ready
Takes about 20 seconds to expand a directory in the tree and get it loaded and ready
25-30% CPU while backing up
What I realized while testing is that this backup has a much more complex directory structure. It contains hundreds of directories, as it’s a backup of my developer workstation. I read in this thread that expanding a node in the tree needs to parse the whole directory list each time, making it slower with such a complex directory layout. So that would explain my issue, IMO.
Windows Duplicati Config:
Duplicati Beta (2.0.5.1_beta_2020-01-18)
2-5% CPU while browsing restore
Takes about 2 seconds to get the root restore directory tree loaded and ready
Takes about 2 seconds to expand a directory in the tree and get it loaded and ready
In addition to CPU usage, I was curious about the CPU model. This is largely a CPU-intensive task, and I believe it’s single-threaded, like most (all?) SQLite queries, so single-threaded CPU performance will also affect this.
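A quick, hedged way to see that single-core effect is to time a folder-listing style query against a copy of the job database (same hypothetical table names as in the sketch above) and watch one core sit near 100% while it runs:

```python
# Hedged sketch: time a folder-listing style query against a COPY of the job
# database. Table/column names are assumptions, as in the earlier sketch.
# SQLite does this work on the single thread issuing the query, so one core
# should peg while it runs.
import sqlite3
import time

con = sqlite3.connect("/path/to/copy-of-duplicati-job.sqlite")  # hypothetical
folder = "C:\\Users\\me\\"

t0 = time.perf_counter()
rows = con.execute(
    "SELECT Path FROM File WHERE Path LIKE ? || '%'", (folder,)
).fetchall()
print(f"{len(rows)} paths matched in {time.perf_counter() - t0:.1f}s")
con.close()
```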
The Windows box CPU (the one with the fastest restore browsing) is an Intel Core i5-4670 quad-core 3.4 GHz processor (LGA1150, Haswell, 6 MB cache).
The Linux laptop (workstation) has an 8th Generation Intel® Core™ i7-8550U Processor (8M Cache, up to 4.0 GHz).
Anyway, in my book, performance is very acceptable. I used to be a CrashPlan client before they messed my whole account up, and their app was MUCH slower for browsing. So yay for Duplicati and open-source software!
I use B2. I also tried Wasabi and had no complaints. B2 offers the ability to ship a bucket snapshot on a USB drive, which I thought might be a nice option for faster disaster recovery.