Large Database File .sqlite

I have a VM that I’m backing up, keeping files for 180 days. There are changes but I’m excluding lots of folders which are super dynamic (cache folders, etc.) so for the most part, things shouldn’t be changing all that much.

My database file is approaching 30GB. Is there a way to compress or reduce? Should I consider delete and rebuild?

Thoughts? Looking to save some server space on the drives.

Thanks,
John

Hi.

Don’t delete and rebuild!

In all databases, including huge enterprise ones, deleting rows rarely (if ever) frees space in the data files. The pages the rows occupied are instead marked as free and reused for future writes, so the file stays the same size on disk.

There is actually a way to answer this objectively, but it's not convenient. I found it while looking into some recent performance issues. You have to use the sqlite3 command-line client; I've not yet found a GUI tool that can do it, because it requires the DBSTAT virtual table, which apparently isn't compiled into the GUI tools. That system catalogue view can reveal how much 'unused' space there actually is:

SELECT name, SUM(unused) / 1048576 AS "unused mib"
  FROM dbstat
 GROUP BY name
 ORDER BY 2 DESC;

https://www.sqlite.org/dbstat.html

(depending on what Duplicati is doing, you might need to shut down Duplicati temporarily before running this)
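If your SQLite build doesn't include the dbstat table (it needs SQLITE_ENABLE_DBSTAT_VTAB at compile time), a rougher whole-file figure is available from the always-present `freelist_count` and `page_size` pragmas. A minimal sketch in Python (the file path is just a placeholder; point it at your own database, with Duplicati shut down):

```python
import sqlite3

def free_space_mib(db_path: str) -> float:
    """Return the space on the database's freelist, in MiB.

    freelist_count * page_size is the total of pages that are
    allocated in the file but currently unused, i.e. the space
    VACUUM could give back to the filesystem.
    """
    conn = sqlite3.connect(db_path)
    try:
        page_size = conn.execute("PRAGMA page_size").fetchone()[0]
        freelist = conn.execute("PRAGMA freelist_count").fetchone()[0]
        return page_size * freelist / 1048576
    finally:
        conn.close()
```

Unlike the dbstat query, this can't break the number down per table, but it's enough to decide whether a vacuum is worth the time.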

If the amount of empty space is rather high, try vacuuming. (Note: vacuuming is (1) slow and (2) temporarily requires enough free disk space to build a complete new copy of the database file. SQLite does not support vacuuming individual tables.)

Vacuuming can also significantly improve performance, and it can be turned on via a configuration option. Also consider running ANALYZE.
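To see the delete-doesn't-shrink / vacuum-does behaviour for yourself, here is a small self-contained Python sketch against a throwaway database (not Duplicati's actual file):

```python
import os
import sqlite3
import tempfile

# Throwaway database file just to demonstrate the effect.
path = os.path.join(tempfile.mkdtemp(), "demo.sqlite")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE blobs (data BLOB)")
conn.executemany("INSERT INTO blobs VALUES (?)",
                 [(b"x" * 1024,) for _ in range(2000)])
conn.commit()

conn.execute("DELETE FROM blobs")   # rows are gone...
conn.commit()
before = os.path.getsize(path)      # ...but the file has not shrunk

conn.execute("VACUUM")              # rebuilds the file, dropping free pages
after = os.path.getsize(path)
conn.close()

print(before, after)  # 'after' is a small fraction of 'before'
```

The rebuild-a-new-copy step is also why VACUUM needs that extra temporary disk space: SQLite writes a compacted copy and then swaps it in.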

Great tip Ingmyv! Thank you!!

This will be fixed in the next canary and beta.


Specifically, the --auto-vacuum and --auto-vacuum-interval options might help, or Commandline can vacuum:

C:\ProgramData\Duplicati\duplicati-2.0.4.5_beta_2018-11-28>Duplicati.CommandLine.exe help vacuum

Usage: vacuum <storage-URL> [<options>]

  Rebuilds the local database, repacking it into a minimal amount of disk
  space.

More recent changes in this multi-part effort:

Run PRAGMA optimize upon closing database connection #3745 (issue)
Run PRAGMA optimize when closing database connection #3749 (PR)
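The idea behind those changes can be sketched in a few lines of Python (this is an illustration of the technique, not Duplicati's actual code). `PRAGMA optimize` selectively re-runs ANALYZE where SQLite thinks the statistics are stale, which is why it's cheap enough to call on every close:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
conn.execute("CREATE INDEX t_v ON t (v)")
conn.executemany("INSERT INTO t (v) VALUES (?)",
                 [(str(i),) for i in range(1000)])
conn.commit()

# ANALYZE gathers query-planner statistics unconditionally;
# the results land in the sqlite_stat1 catalogue table.
conn.execute("ANALYZE")
rows = conn.execute("SELECT * FROM sqlite_stat1").fetchall()

# What the linked change does: run optimize just before closing,
# so the planner statistics stay fresh with minimal extra work.
conn.execute("PRAGMA optimize")
conn.close()
```

Neither statement shrinks the file; they help the query planner pick better plans, which is the performance half of this effort.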

How much of a change should we see? I upgraded to 2.0.4.29, performed a backup, then did a vacuum. Database size is about the same.

It depends on the number of volume records marked deleted. The Remotevolume table shouldn't have many records with a "State" of 'Deleted'.
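If you want to check that yourself, a hedged sketch (the Remotevolume table and State column are as described above for Duplicati's local database; verify against your own schema, and shut Duplicati down before opening its database):

```python
import sqlite3

def count_deleted_volumes(db_path: str) -> int:
    """Count Remotevolume rows still marked as deleted.

    Table and column names are assumptions based on the discussion
    above; COLLATE NOCASE covers either 'Deleted' or 'deleted'.
    """
    conn = sqlite3.connect(db_path)
    try:
        (n,) = conn.execute(
            "SELECT COUNT(*) FROM Remotevolume "
            "WHERE State = 'Deleted' COLLATE NOCASE"
        ).fetchone()
        return n
    finally:
        conn.close()
```

A low count here means there simply aren't many reclaimable volume records, so a vacuum after upgrading wouldn't be expected to shrink the file much.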