High disk space consumption by duplicati-2.0.1.72

I have duplicati-2.0.1.72 installed on my Windows Server 2012 R2 box. I recently found the directory “C:\Users\Administrator\AppData\Local\Duplicati” consuming around 70 GB of disk space. The files and folders consuming this space are:

  1. WQHJUFNNOU.sqlite
  2. WEOCDJWAOG.sqlite
  3. VQFBEWDNTV.sqlite
  4. C:\Users\Administrator\AppData\Local\Duplicati\Signature Cache

Is there a way I can reduce or recycle this disk space used by Duplicati? Unfortunately I don’t have much free space available on the server, so I need to find a way to reduce Duplicati’s disk space consumption.

The .sqlite files hold all the information about the files and versions within a backup job,
so one can assume you have 3 backup jobs running on this machine.

The only way to shrink these databases is to either reduce the number of files in your backups (back up fewer files ;-)) or lower the number of versions kept by Duplicati.

What you can do is relocate these databases. This can be done through the GUI (or CLI). You could move them off C: to another drive, internal or external, connected to your server.
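
In case the manual route is easier, here’s a rough sketch of what the move itself looks like (the target drive and folder are made-up examples, Duplicati must not be running while you move the file, and the job has to be pointed at the new path afterwards):

```python
# Minimal sketch: relocate a Duplicati job database by hand.
# Paths are examples only -- adjust them to your own setup, and make sure
# Duplicati is stopped while the file is moved.
import shutil
from pathlib import Path

old_db = Path(r"C:\Users\Administrator\AppData\Local\Duplicati\WQHJUFNNOU.sqlite")
new_db = Path(r"D:\DuplicatiDatabases\WQHJUFNNOU.sqlite")  # hypothetical second drive

new_db.parent.mkdir(parents=True, exist_ok=True)
shutil.move(str(old_db), str(new_db))
print(f"Moved {old_db} -> {new_db}")

# Afterwards, update the job's local database path so Duplicati finds the
# database at the new location.
```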

Hope that helps.

Thanks very much for your help.
Unfortunately I have only one drive on the server, so relocating the databases won’t work for me. Currently my Duplicati backups are configured to retain 30 days of data; if I lower this to 15 days, will it reduce the DB size and the “Signature Cache” folder size?
Isn’t there a way to get the consumed space back down to zero without losing any backup information?

70GB sounds extreme. How many files is Duplicati backing up?

What’s the actual usage between each sqlite file?

I can’t speak for the signature cache (mine is always empty).
For the databases, the decrease to 15 days will reduce the size, but you can’t really tell by how much - you have to try it, as it depends on a lot of things like file count, path length, etc. You also have to consider that not only the “keep time” parameter is relevant, but also the frequency of your backups. I think of it as the number of versions that keeps the DB growing, and that is the product of keep time and frequency.
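
A quick back-of-the-envelope sketch of that relationship (the numbers are made up, not taken from your setup):

```python
# Rough rule of thumb: retained versions ~= keep time x backup frequency.
# Both numbers below are hypothetical examples.
keep_days = 30        # e.g. keep backups for 30 days
backups_per_day = 4   # e.g. one backup every 6 hours

print(keep_days * backups_per_day)         # -> 120 versions the DB has to track
print((keep_days // 2) * backups_per_day)  # -> 60 versions if keep time is halved
```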

I do have some backup jobs with 130k files, but the databases for those are around 1-2 GB. 70 GB seems like a lot.

Theoretically you can delete these databases, as they can be rebuilt from the backup data. But they are there for speedy and reliable backups, so I don’t think deleting them would do any good. And don’t forget that rebuilding them can take a lot of time…

Perhaps @Pectojin can shed some light :slight_smile:

Edit: he is here already! :slight_smile:

It depends on what’s causing this issue. Fewer backups should produce somewhat smaller databases, but I don’t think that’s what’s causing this. So, first things first, we need to figure out what’s using the space :slight_smile:

This is exactly my experience too. Around 100-200k files, and databases of less than 2 GB.

Yup, but if his database is 70GB for a reason, then it’ll take weeks to rebuild :frowning:

The server in question is my Prod web server, and in addition to backing up my application data, I’m also backing up my application logs and web server logs to S3. So the amount of data to back up is higher than normal.

Could you tell us the source size, file count, and frequency of these jobs?
And what size are the three databases and the cache folder? Not all together, but each one individually?

One thing we can try is looking into the database using sqlite browser and checking how many rows are in each table.

You can simply open the sqlite file and go to Browse Data and check each table.
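
If you’d rather script it than click through the GUI, something like this small sketch should list the row counts (run it against a copy of the database; the path is just an example):

```python
# Sketch: print the row count of every table in a Duplicati job database.
# Run this against a *copy* of the .sqlite file, not the live one.
import sqlite3

DB_PATH = r"C:\temp\WQHJUFNNOU.sqlite"  # example path -- point it at your copy

con = sqlite3.connect(DB_PATH)
tables = [row[0] for row in con.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
for table in tables:
    (count,) = con.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
    print(f"{table}: {count} rows")
con.close()
```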

Notably, here are the number of rows in each of my larger tables:

  • block: 49040 rows
  • blockset: 3002 rows
  • blocksetEntry: 49248 rows
  • file: 2674 rows
  • filesetEntry: 19373 rows

These are the tables that will typically scale with either the number of files or the number of backups/versions. Which tables do you see the highest number of rows in?

This was from a pretty small backup set that I had on this machine, so yours will be a lot larger, but it’ll help give an idea of what is taking up the space.

For comparison, one of my larger (1.2 GB) DBs:

  • block: 4,635,484 rows
  • blockset: 362,069 rows
  • blocksetEntry: 4,612,692 rows
  • file: 238,483 rows
  • filesetEntry: 13,725,931 rows

@Pectojin: one more thought: what about the block and upload block sizes? Perhaps these are rather small (and, because of that, there are a lot of them)?

Well, if they’re smaller there will be more block rows.

Each row in block corresponds to a block hash, a size, and a reference to the volume it’s stored in.
BlocksetEntry maps blocks to blocksets, so there should be around one row for each block.

So those three tables depend on the blocksize.
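
To get a feel for the scale, here’s a rough estimate (both the source size and the 100 KB blocksize below are assumptions - check your job’s actual settings):

```python
# Back-of-the-envelope: block / blocksetEntry rows ~= source size / blocksize.
# Both numbers are assumptions, not values read from any real job.
source_bytes = 100 * 1024**3   # hypothetical 100 GB of source data
blocksize = 100 * 1024         # assumed blocksize of 100 KB

print(f"~{source_bytes // blocksize:,} block rows")  # -> ~1,048,576 rows
```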

Here are the sizes of the DB files and the folder:
WQHJUFNNOU.sqlite - 1.83 GB
WEOCDJWAOG.sqlite - 1.25 GB
VQFBEWDNTV.sqlite - 421 MB
"C:\Users\Administrator\AppData\Local\Duplicati\Signature Cache" - 61.3 GB

The sizes of the backup source directories are:
81 GB
14.5 GB
7 GB

Please let me know if any other information is needed.

To access the DB files with sqlite browser I’ll have to move the files to my desktop first. I don’t want to risk my Prod server for this :slight_smile:

Ok, so the databases are really not your problem, as they seem quite normal in size.
I think you can relax about file count, versions, and so on :smile:

It’s the signature cache for sure.
Sadly that thing is out of my knowledge base. :frowning: (mine is empty)

Ditto. I don’t even have the Signature Cache folder on any of my systems as far as I know.

There is an old issue about Signature Caching here:
https://github.com/duplicati/duplicati/issues/324

The signature files are usually small (compared to the volume files) and are generated by Duplicati. The signature cache stores the signature files locally after they are uploaded to the remote server. This has the benefit that they do not have to be downloaded before an incremental backup can be performed. To ensure that the signature files are valid, the manifest file is always downloaded (i.e. never cached). Inside the manifest is the hash of the signature file, and Duplicati verifies that the signature file in the cache matches the one expected to be on the server.

If the local signature file is broken or missing, Duplicati downloads it from the remote server.

To answer your questions more directly:

The benefit of the signature cache is reduced download activity from incremental backups, the drawback is increased disk space usage.

It is harmless (apart from a slowdown) to clear the cache or even disable it. Duplicati should delete signature files when they are deleted from the backend.

I do not know if this information is still accurate, as it’s very old. But maybe it’s relevant?
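
If it does still apply and you want to try reclaiming the space, a cautious sketch would be to measure the folder first and only then clear it, while no backup is running (the path is the one from the first post; the deletion part is commented out on purpose):

```python
# Sketch: report the size of the Signature Cache folder and optionally empty it.
# Only run this while no Duplicati job is running, and only if you accept that
# cached signature files may need to be re-downloaded later.
import shutil
from pathlib import Path

CACHE = Path(r"C:\Users\Administrator\AppData\Local\Duplicati\Signature Cache")

total = sum(f.stat().st_size for f in CACHE.rglob("*") if f.is_file())
print(f"Signature Cache: {total / 1024**3:.1f} GB")

# Uncomment to actually clear the cache:
# for entry in CACHE.iterdir():
#     if entry.is_dir():
#         shutil.rmtree(entry)
#     else:
#         entry.unlink()
```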

I think @kenkendk can help here.

Just out of curiosity, is there a reason you’re staying on the 2.0.1.72 canary version?

It’s not even up to the more stable 2.0.2.1 beta version…

I’ll surely upgrade my Duplicati version hoping it could do something good here :slight_smile:

6 posts were split to a new topic: Reducing size of sqlite databases