You’re probably thinking just about scaling up. There’s also scaling down, e.g. to back up a NAS. Embedded databases can sometimes be persuaded to scale up, but the awkwardness of a non-embedded database may make it hard to go small and maintain it. It’s likely a separate process using more files for its databases than SQLite does, and possibly one that has to run in a Docker container.
For maintenance, SQLite is (more or less) a single-file database. You don’t want to end up where a special backup procedure and tool is needed to back up the database for your backup program. Recently, we’ve been dodging that with database recreate from the destination, but that gets slow when the backup grows large, because there are too many files and too many blocks to insert back in.
A fancier database comes up sometimes, and you can see the other opinions. The most recent:
The “many roundtrips” probably refers to the opposite side of database use. I talked about the problem of a few large queries getting slow, but there are also millions of tiny ones that add up.
To see what I mean (and make a really big profiling log), you can also add the option below to enjoy what I think are, by default, mostly per-block SQL requests on small blocks. They add up, but they don’t stand out in the profiling log the way the occasional really slow queries do.
Some profiling tools can add up the total time cost of repeated fast queries. Duplicati doesn’t, but there’s only so much that can be done short of a redesign that stops issuing them.
Generally in SQL, I think things like inserting rows into the database one at a time are expensive.
--profile-all-database-queries (Boolean): Activates logging of all database
queries
To improve performance of the backups, frequent database queries are not
logged by default. Enable this option to log all database queries, and
remember to set either --console-log-level=Profiling or
--log-file-log-level=Profiling to report the additional log data
* default value: false
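To put a rough number on that row-at-a-time insert cost, here’s a minimal sketch in plain Python sqlite3 (a toy table and sizes, not Duplicati’s actual schema or code) comparing per-row committed inserts against one batched transaction:

```python
import sqlite3
import time

def timed(label, fn):
    start = time.perf_counter()
    fn()
    print(f"{label}: {time.perf_counter() - start:.2f}s")

con = sqlite3.connect("toy.sqlite")
con.execute("CREATE TABLE IF NOT EXISTS Block (Hash TEXT, Size INTEGER)")
rows = [(f"hash{i}", 1024) for i in range(10_000)]

def one_at_a_time():
    # Every row pays its own commit (journal write and sync).
    for r in rows:
        con.execute("INSERT INTO Block VALUES (?, ?)", r)
        con.commit()

def batched():
    # One transaction for the whole batch; the per-row overhead mostly vanishes.
    with con:
        con.executemany("INSERT INTO Block VALUES (?, ?)", rows)

timed("row-at-a-time", one_at_a_time)
timed("batched", batched)
con.close()
```

As I understand it, most of the gap is per-commit journal and sync cost, which is why the difference tends to be largest on spinning disks.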
In a sense, this is also a scaling up and down problem. Does any one solution cover all cases?
The introduction of “Duplicati, Inc.” is an unknown, but it might force a greater emphasis on large scale.
If you look at other comments on SQLite, you’ll see some saying it’s single-threaded, which is a consequence of its being a library that runs on its caller’s thread. I think a caller that wants threading could implement it in the application. It might not be simple, and it requires thought about how the database is used.
Threading here is aimed at the annoying Task Manager (etc.) view of getting only one CPU core’s worth of SQL processing on your far-higher-core system. There’s also the problem that the hard drive may become the bottleneck first. I’m not quite sure how the really big databases avoid that issue.
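As a toy illustration of what “the application does the threading” could look like (this is not how Duplicati works today, and the table and query are invented), each worker thread opens its own connection, and WAL mode lets readers run concurrently:

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

DB = "toy.sqlite"

def setup():
    con = sqlite3.connect(DB)
    con.execute("PRAGMA journal_mode=WAL")  # readers don't block the writer
    con.execute("CREATE TABLE IF NOT EXISTS Block (Hash TEXT, Size INTEGER)")
    with con:
        con.executemany("INSERT INTO Block VALUES (?, ?)",
                        [(f"hash{i}", i % 4096) for i in range(50_000)])
    con.close()

def count_range(bounds):
    # Each worker thread uses its own connection; SQLite itself still runs on the calling thread.
    lo, hi = bounds
    con = sqlite3.connect(DB)
    try:
        (n,) = con.execute(
            "SELECT COUNT(*) FROM Block WHERE Size BETWEEN ? AND ?", (lo, hi)
        ).fetchone()
        return n
    finally:
        con.close()

if __name__ == "__main__":
    setup()
    ranges = [(i, i + 1023) for i in range(0, 4096, 1024)]
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(sum(pool.map(count_range, ranges)))
```

Writes are a different story: SQLite allows only one writer at a time, so splitting up read-heavy work is the easier win.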
Personally, I have a feeling that SQLite’s nested-loop joins are where Duplicati’s big joins get slow.
I am not an SQLite wizard, but here are some docs:
Using SQLite In Multi-Threaded Applications
The current implementation of SQLite uses only loop joins. That is to say, joins are implemented as nested loops.
For each table read by the query, the output of EXPLAIN QUERY PLAN includes a record for which the value in the “detail” column begins with either “SCAN” or “SEARCH”. “SCAN” is used for a full-table scan, including cases where SQLite iterates through all records in a table in an order defined by an index. “SEARCH” indicates that only a subset of the table rows are visited.
Ways to gain speed include adding indexes to encourage the query planner to choose a search over a scan. I think I had a query the other day that chose to scan on the innermost loop. It was slow.
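For a concrete (made-up) example of reading those plans, EXPLAIN QUERY PLAN shows whether an index moves a lookup from SCAN to SEARCH; the exact wording varies a bit by SQLite version:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Block (Hash TEXT, Size INTEGER)")

def plan(sql):
    # Print the "detail" column of each EXPLAIN QUERY PLAN row.
    for row in con.execute("EXPLAIN QUERY PLAN " + sql):
        print(row[-1])

query = "SELECT Size FROM Block WHERE Hash = 'abc'"
plan(query)  # roughly: SCAN Block  (full-table scan)

con.execute("CREATE INDEX BlockHashIndex ON Block (Hash)")
plan(query)  # roughly: SEARCH Block USING INDEX BlockHashIndex (Hash=?)
```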

Or, using a hierarchical structure for storing metadata of files/blocks?
You’d have to clarify that. The database doesn’t store any of that, but it does track it; for example, metadata such as a timestamp is stored in some dblock file alongside blocks of some files’ data. Blocks don’t have metadata, but they may be metadata, or they may be a blocklist. It’s all in a rather flat layout in storage. Inside the database, SQLite already uses B-tree tables for performance, and how to execute a query is left to the database. That doesn’t say a hierarchy can’t be used, but how?

The best practice for current implementation of making a large backup is to stay below the above-mentioned upper limit by splitting into small backup tasks and/or adjusting block size carefully, and enable the zip64 support.
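To show why “adjusting block size carefully” matters so much: each block corresponds to roughly one row in the local database, so the block count largely sets the table sizes the queries have to chew through. A back-of-the-envelope sketch, with purely illustrative sizes:

```python
# Illustrative only: estimate how many block rows different blocksizes imply.
source_bytes = 2 * 1024**4  # pretend 2 TiB of source data
for blocksize in (100 * 1024, 1024**2, 5 * 1024**2):  # 100 KiB, 1 MiB, 5 MiB
    blocks = source_bytes // blocksize
    print(f"{blocksize // 1024:>5} KiB blocksize -> about {blocks:,} blocks to track")
```

Deduplication and compression change the exact numbers, but the order of magnitude is what drives both the insert volume and the join sizes.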
There are quite a few other performance paths, such as usn-policy on Windows to avoid scanning the drive. If you’re in a millions-of-files situation, the scan can be slow; however, the USN journal itself has a fixed size and may need enlarging with Windows tools such as fsutil usn.
Options exist to increase upload parallelism, CPU concurrency, and other things; it’s tuning for specific situations and desires. If you want to play with SQLite cache tuning, search the forum for CUSTOMSQLITEOPTIONS_DUPLICATI. Feel free to look into whichever of these settings work well for your case. Favor results over general opinions, as situations vary a lot, and both the equipment and the people to investigate are scarce. Try some ideas.
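For a sense of what those cache tunings end up doing at the SQLite level (this is plain sqlite3 with example values, not Duplicati’s own option handling), the knobs are per-connection PRAGMAs:

```python
import sqlite3

con = sqlite3.connect("toy.sqlite")

# Example values only, not recommendations.
con.execute("PRAGMA cache_size = -200000")   # negative means KiB, so roughly a 200 MB page cache
con.execute("PRAGMA temp_store = MEMORY")    # keep temporary tables/indexes for sorts in RAM
con.execute("PRAGMA mmap_size = 268435456")  # allow memory-mapping up to 256 MB of the file

for pragma in ("cache_size", "temp_store", "mmap_size"):
    print(pragma, con.execute(f"PRAGMA {pragma}").fetchone()[0])
con.close()
```

Whether any of this helps depends on how much of the database fits in memory, so measure rather than assume.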

I actually have more than 20 million files in this specific backup.
Wow. I made a 10-million-file tree of empty files (the 1 TB drive on this desktop won’t hold big ones), tried to back it up with Duplicati, found it taking too long, and added an exclude. I formerly used Macrium Reflect Free for a fast backup to a USB drive. That’s gone now, and file backups are slower.
I’m not sure why. Probably the image backup just dumps the Windows Master File Table as data, rather than having to walk it and inspect individual files for clues suggesting they need to be read.