Best practices to reduce database slowdowns.

Hello guys,

I’m using duplicati and I really like it!

I use b2 cloud.
What are the best practices in relation to the database? I have seen that it really takes a long time to create.

Thanks, everyone!

Welcome to the forums @Julival_Santos_Reis.

Good day,

Are you talking about the first backup or the speed when trying to do a restore?

I’m not much of an SQL person, but to my knowledge the blocksize (not dblocksize, though related) is directly responsible for the number of entries created in the database. The larger the blocksize, the fewer database entries get created, and with fewer entries to read the database will generally be faster.

The thing is, it really depends on your data, in particular how many files you have, as each file’s details will take up a bit of space in each block file. I’ve seen suggestions to increase the blocksize to 150KB but seldom beyond that. It’s generally suggested to create a backup job with one blocksize, then create another backup job with a different blocksize, and see which is more performant for you.

I think the rough rule of thumb is to set the blocksize to approximately one millionth of the size of your dataset. So if you are protecting 1TB of data, you might want to use 1MB for the blocksize. (The default 100KB blocksize on a 500+ GB backup will result in a very large sqlite database and slower operations.)
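If it helps to see that rule of thumb as a calculation, here’s a rough sketch in Python (the one-millionth ratio and the list of candidate sizes are just the suggestion above, not anything Duplicati enforces):

# Rough sketch of the "blocksize ≈ source size / 1,000,000" rule of thumb.
# The candidate sizes and the ratio are the suggestion above, not Duplicati internals.

def suggest_blocksize_kb(source_bytes):
    target = source_bytes / 1_000_000              # aim for roughly 1 million blocks
    candidates_kb = [100, 250, 500, 1024, 2048, 5120]
    for kb in candidates_kb:
        if kb * 1024 >= target:
            return kb
    return candidates_kb[-1]

print(suggest_blocksize_kb(10**12), "KB")          # 1 TB of data -> 1024 KB (1 MB)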

You might want to use an even larger blocksize if you think the backup set will grow significantly over time, or if you want to retain a LOT of backup versions.

The downside to using a larger block size is reduced deduplication efficiency.

Now the dblocksize (remote volume size) is less important. The default of 50MB is usually fine unless your target storage limits object count (B2 does not). If you do want to increase it, just don’t go crazy and set it to 1+ GB. Remember that any restore operation will have to download one or more dblocks, even if the file to restore is tiny. If you have it set to something really large like 1GB, restores will take a lot longer. Compaction operations are also slowed down by excessively large dblock sizes.

See Choosing Sizes in Duplicati - Duplicati 2 User's Manual
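To put some rough numbers on that restore cost, here’s a small back-of-the-envelope sketch in Python (the download speed and the volume sizes are illustrative assumptions, not measurements):

# Worst case to restore one tiny file: download the whole dblock that holds it.
# The download speed below is an assumed 100 Mbit/s link; adjust for your own.

def restore_overhead_seconds(dblock_size_mb, download_mbit_per_s=100):
    return dblock_size_mb * 8 / download_mbit_per_s

for size_mb in (50, 200, 1024):                    # default, a modest bump, "don't go crazy"
    print(f"{size_mb:>5} MB dblock -> ~{restore_overhead_seconds(size_mb):.0f} s "
          "just to fetch the volume holding one small file")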


Note we’re still seeking clarification on this. Create what? When? Database creation is typically gradual during a backup, which has a lot of other speed limiters such as file uploads (depending on destination).

There are a few other times when concentrated database work happens, such as recreating the database, and sometimes the database sanity checking before the backup is allowed to truly begin.

For those who really want fast database work, sometimes an SSD or more memory can help. It’s rarely done because database speed is frequently CPU-limited. Smaller tables (e.g. from a larger blocksize) go faster.

Note that the database isn’t the backup. That’s at the Destination. Database caches information on that.

My big observation is the question of recreating the database. I did a test on the web interface: I deleted the database and had it recreated. My data is in B2 cloud, the current bank is 1.6 GB, and it took about 1 hour and 20 minutes. Is that acceptable?

What’s a “bank”? Is that size of destination folder (if so, tiny backup) or database (if so, big backup)?

That’s for you to decide. How big is the backup, as seen on the home screen? How much of it did the database recreate download, as seen in the Repair log’s Complete log under BytesDownloaded? Ideally the download is only a small fraction of the backup (just dlist and dindex). You can also watch the server log at About → Show log → Live → Verbose to see commentary on what’s downloading, or watch the progress shown on the bar at the top of the screen. If it gets past 70% (and especially past 90%), that’s a trouble sign.
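If you want to turn that check into a number, here’s a quick sketch (the two values are placeholders taken from the example logs further down this thread; paste in your own):

# Compare BytesDownloaded from the Repair "Complete log" against the total
# backend size (KnownFileSize, or the size shown on the home screen).

bytes_downloaded = 52_510            # from the Recreate/Repair BackendStatistics
known_file_size  = 9_213_594_793     # total size of the backup at the destination

print(f"Recreate downloaded {bytes_downloaded / known_file_size:.4%} of the backup")
# A tiny fraction (dlist + dindex only) is healthy; getting anywhere near
# 70-100% means dblock files are being fetched, which is the trouble sign above.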

Whether that 1 hour 20 minutes for 1.6 GB is acceptable to you should be based more on whether you really need to recreate or not.

Recreate is really slow. Only use it if you feel that you have to. At 1.5GB, if you get e.g. 20MB/sec upload speed, you’d finish a total redo in a few minutes start to finish. 20MB/sec * 60 seconds is 1.2GB :slight_smile:

And Duplicati can back up at speeds faster than that if you have it available. My Duplicati backups are currently over 30MB/sec, so in theory I could redo about 190GB in about the time it took to recreate 1.6GB. I know it can do about 17MB/sec in that time, for about 80GB or so.
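Here’s that arithmetic spelled out (the figures are the ones quoted in this thread; treat them as rough assumptions for your own link):

# Recreate vs. simply redoing the backup, using the numbers in this thread.

backup_gb    = 1.6      # size of this particular backup
upload_mb_s  = 20       # e.g. sustained upload throughput
recreate_min = 80       # the observed 1 hour 20 minutes for the recreate

redo_min = backup_gb * 1024 / upload_mb_s / 60
print(f"Fresh backup at {upload_mb_s} MB/s: ~{redo_min:.1f} min "
      f"vs. {recreate_min} min for the recreate")
# Roughly 1.4 minutes vs. 80 minutes, which is why recreate is a last resort here.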

My storage is on B2 cloud. When I send the bank to repair, I notice a certain slowdown in the download; it doesn’t download quickly, something around 2 megabits. A question: in the advanced options, what options can I select to make the backup and restore faster and more reliable?

Thanks.

You quoted a big paragraph but gave no response to anything there. How about at least sizes?

The rest is helpful too because with no information given to us, it’s hard to say what needs help.

What’s quick for your Internet? How are you measuring the speeds quoted above? Is it an average?

I think you’ll generally see a download of a file, followed by some time processing data in the file.

Is there a reliability problem that wasn’t mentioned? Speed depends on factors questioned above.

Another way to check for problems and find out actual sizes is in the job log’s Complete log statistics.
The BackendStatistics for a backup summarize sizes and versions, and they could be significant.
For Recreate, they show whether or not there were any retries, which help reliability but can lose speed.
Both have lots of options if the Internet connection or destination is bad, but we don’t know if that’s true.

Example Backup log:

"BackendStatistics": {
  "RemoteCalls": 12,
  "BytesUploaded": 36043741,
  "BytesDownloaded": 38349399,
  "FilesUploaded": 4,
  "FilesDownloaded": 3,
  "FilesDeleted": 3,
  "FoldersCreated": 0,
  "RetryAttempts": 0,
  "UnknownFileSize": 101366,
  "UnknownFileCount": 1,
  "KnownFileCount": 477,
  "KnownFileSize": 9213594793,
  "LastBackupDate": "2022-12-22T10:50:04-05:00",
  "BackupListCount": 58,

Example Recreate log:

"BackendStatistics": {
  "RemoteCalls": 7,
  "BytesUploaded": 0,
  "BytesDownloaded": 52510,
  "FilesUploaded": 0,
  "FilesDownloaded": 6,
  "FilesDeleted": 0,
  "FoldersCreated": 0,
  "RetryAttempts": 0,

An example of what I mean by download and processing being separate (so what did you measure?):

Backend event: Get - Started: duplicati-20221222T165000Z.dlist.zip.aes (912.84 KB)
Downloading file (912.84 KB) …
Backend event: Get - Completed: duplicati-20221222T165000Z.dlist.zip.aes (912.84 KB)
Processing filelist volume 57 of 58
Backend event: Get - Started: duplicati-20221222T175004Z.dlist.zip.aes (912.86 KB)
Downloading file (912.86 KB) …
Backend event: Get - Completed: duplicati-20221222T175004Z.dlist.zip.aes (912.86 KB)
Processing filelist volume 58 of 58
Filelists restored, downloading 217 index files
Backend event: Get - Started: duplicati-i004303c6fb9249edb4f96c2a8ebd577d.dindex.zip.aes (29.84 KB)
Downloading file (29.84 KB) …
Backend event: Get - Completed: duplicati-i004303c6fb9249edb4f96c2a8ebd577d.dindex.zip.aes (29.84 KB)
Processing indexlist volume 1 of 217
Backend event: Get - Started: duplicati-i058fea28c2a74628a24e1259cf23c640.dindex.zip.aes (56.28 KB)
Downloading file (56.28 KB) …
Backend event: Get - Completed: duplicati-i058fea28c2a74628a24e1259cf23c640.dindex.zip.aes (56.28 KB)
Processing indexlist volume 2 of 217

Above is from a verbose console log. About → Show log → Live → Verbose would timestamp to the minute, and a log file could get seconds. The ratio of download time to processing time varies with volume types.
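If you do write a log file (e.g. --log-file with --log-file-log-level=verbose), a rough sketch like this can split download time from processing time. The timestamp format and the file name are assumptions, so check a line of your own log and adjust:

# Measure gaps between Get - Started / Get - Completed / Processing lines.
# Gaps after "Get - Started" are download time; gaps after "Get - Completed"
# are mostly local database/processing time.
import re
from datetime import datetime

TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")   # assumed timestamp format
events = []

with open("duplicati.log", encoding="utf-8") as f:           # your --log-file path here
    for line in f:
        m = TS.match(line)
        if m and any(k in line for k in ("Get - Started", "Get - Completed", "Processing")):
            events.append((datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S"), line.strip()))

for (t1, text), (t2, _) in zip(events, events[1:]):
    print(f"{(t2 - t1).total_seconds():6.0f} s  after: {text[:80]}")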

You might also consider looking at Task Manager (if on Windows) to see how the Performance tab looks, because the performance limiter could be your drive (less likely if it’s an SSD), or CPU, or network speed (probably more of a factor if it becomes necessary to study dblock files in the 70% to 100% range on the progress bar).

There’s no just-right-for-all-cases speedup option (or it would be there already), so the situation is important.

EDIT:

For whatever it’s worth, on my ten-year-old quad-core desktop with a 7200 RPM drive and 1 Gbit Internet, downloads of dlist files happen in a few seconds but take minutes to process. Its dindex downloads are closer together, but the files are tiny. Reading either speed is difficult; it’s intermittent and off the graph… Disk utilization was about 65% during dlist processing and over 90% during dindex, but other things ran. CPU utilization was about 20%, so in my case I think it’s a hard-drive limit (I have an SSD to install soon). Database recreate took 67 minutes and downloaded 18 MBytes in 137 files, making a 60 MB database.

Source for this backup was 7 GBytes in 6530 files, and there were 11 versions. There’s my data point…

That drive should be able to do at least 100MB/sec, which is still quite good, though interestingly it may not reach the full speed of the internet connection you have there. The SSD will do much better, of course. You’ll definitely like the improvement :smiley:

It depends entirely on the workload. Mechanical drives have seek and rotation time considerations.
Random access is far slower, by a factor of over 100 (supported by UserBenchmark averages and
running CrystalDiskMark just now – getting similar result ratios to others published on the Internet).
There are a variety of things that put the drive into overload (100% busy and a long queue waiting).
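For anyone curious where that factor comes from, here’s a rough model of a 7200 RPM drive (the seek time and sequential rate are typical assumed figures, not measurements of that particular disk):

# Small random reads pay a seek plus, on average, half a platter rotation.
rpm          = 7200
avg_seek_ms  = 9.0                       # typical desktop-drive assumption
rotation_ms  = 60_000 / rpm / 2          # ~4.2 ms average rotational latency
io_kb        = 4                         # SQLite-style small random read

iops          = 1000 / (avg_seek_ms + rotation_ms)
random_mb     = iops * io_kb / 1024
sequential_mb = 150                      # same class of drive streaming large files

print(f"~{iops:.0f} random IOPS -> ~{random_mb:.2f} MB/s random, "
      f"vs ~{sequential_mb} MB/s sequential (~{sequential_mb / random_mb:.0f}x slower)")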

Thanks for the thought.