SQLite database optimization

Yes, this is the path rewrite in the database.

Even though SQLite is pretty fast, these extremely large tables with strings (some are even 4GB+) are slow to query.

The rewrite is to split the paths into (parent, filename) where “parent” is just an integer. This makes it super fast to get all files in a folder. But paths are used in many places in the code…

2 Likes

I’m more of a MS SQL guy so don’t know much about SQLite, and it’s not the “best” way to do this but rather than a giant change that takes a long time to implement could we do something more transitional where a more optimized lookup table lives alongside the current active table? Then the places that access the old table can be updated in small chunks at a time in order of the most expensive performance wise.

Also - is SQLite more efficient with integer lookups than strings? If so, perhaps a hashed ID could be created for the initial lookup and only if multiple results come back on the hash do fall back to the expensive string lookup (optimized by being only on the returned hash records).

I need to find a tool for looking at SQLite schemas but maybe there’s something that can be done with indexes as well. SQLite has indexes, right? :slight_smile:

Yes, that would be an option. I have considered storing paths like:

ID | Folder  |  Path
1   | 0      |  C:\
2   | 0      |  C:\abc\
3   | 2      |  xyz.txt

Is should then be possible to build a view that simply has:

CREATE VIEW OldPaths AS SELECT ID, (SELECT A.Path From A.NewPaths WHERE A.ID = Folder) || Path

This is a partial decomposition of the paths (as opposed to storing, C:, abc, xyz.txt). I would like to use the partial version because it has a faster resolution (i.e. no need to recursively look up the path elements).

The downside is that a setup with many (mostly empty) folders will still have a large number of entries. But for backups and restores, it is very fast to list all files in a single folder (find the folder, then all files with that parent ID), much faster than a string scan.

Yes, naturally it has schemas and indexes, and a query planner:
https://www.sqlite.org/eqp.html

1 Like

Do the extremely large tables have strings that are extremely large? How do the strings become so large?

The strings themselves are the file paths, so they are usually not too long (less than 256 bytes each), but there are a lot of them. Since they are stored each in full, it takes up a lot of space causing the slowdowns.

Can the large database be split up into smaller ones?

I suspect that would make things worse. I think the problem is it’s stored as a whole path meaning lots of expensive substring searches. Spilt the paths into multiple “tree” parts, even in a single big database, will likely speed things up.

1 Like

To View SQLite Database schema, tables, indexes for free, you can download Revove SQLite Viewer Tool. It available for free download. With help of this tool you can easily view all components of SQLite database. It also allows you to view deleted records of SQLite Database. Most importantly SQLite Viewer tool supports .db file of multiple applications such as Android Devices, Skype, Chrome and many other such applications. For more info visit: Free SQLite Viewer Software to Open & Read SQLite DB Files

1 Like