"Deleting unwanted files" never stops

I’m having a weird issue with Duplicati 2.0.7.1_beta_2023-05-25. I’m using it to back up a bunch of large-sized Hyper-V VMs. I’ve kept all the settings default, except for the threshold, which I’ve set to 100 to avoid compacting. This makes the backup fast. The backup grows in size, but very slow.

It works well daily, but on some rare occasions, for some reason, “Deleting unwanted files” never stops. When this happens, the log starts showing unusual events like “Backend event: Get - Started / Completed” and “Backend event: Put - Started / Completed”. Usually, on “Deleting unwanted files”, the log shows only “Backend event: Delete - Started / Completed”.

Terminating causes missing files from backup error. I wonder if it’s still trying to do Compacting which I have set not to do? Is it possible to fix this by adjusting some settings?

“Deleting unwanted files” should show while old backup versions are deleted. This will also trigger a compact to get rid of unnecessary old data.

The way you set threshold to 100, compact should only delete files that contain no used data at all. It will still need to check all the volumes to see how much wasted space they contain. That might be why it is slow. How long did you wait for it to finish?

You can disable compacting completely with --no-auto-compact. Then volumes which are entirely filled with unused data will still remain, and only the list files of old versions are removed.

I’ve tried --no-auto-compact, but the backup starts growing very quickly. I left it running all night once, and it still hadn’t finished by the next day. I don’t understand why it’s trying to Put something to the destination while “Deleting unwanted files” in my case. Shouldn’t it only be deleting blocks that have become 100% obsolete?”

While waiting to see if anyone has that answer, I’ll give some general info.

The expected pattern for a compact that’s not a complete file delete is a bunch of dblock get that eventually drive a dblock and dindex put (containing whatever is still useful), then delete old files.
Repeat until done.

If you ever did let that finish, look in the job log, e.g.

image

The COMPACT command also uses:

--small-file-max-count=<int>
The maximum allowed number of small files.

What was status bar showing? It’s not obvious to me why not compacting would slow the backup, however there’s definitely the potential for size growth. That’s what compact tries to keep down…

There are actually two styles of waste removal. When there is any bit of data left, the cycle given
towards the top of this post happens, but in the case where there is no data left, it’s just a delete.

Maybe whatever changes inside them winds up making many dblock files that can be fully deleted. Turning off compact completely prevents that, leaving only the threshold-based compact to get run, possibly due to some error in computing wasted space. I thought I read talk of a computing error…

Regardless, that’s something that a developer can chase. Maybe the error is small. You can guess
at how empty your dblock files are, if you look at how many get there are before the put gets done.

Having to get a lot of dblock files to make one full one suggests few remaining blocks in old dblocks.
Having to get few (e.g. at default 25% threshold) takes few old blocks, but churns the dblocks more.

It would probably be worth confirming some of the above with some actual log information at least at Information level, and maybe even Verbose or with a pattern to look at tag in brackets. Examples:

2023-11-17 08:02:31 -05 - [Verbose-Duplicati.Library.Main.Database.LocalDeleteDatabase-SmallVolumeCount]: Found 2 small volumes(s) with a total size of 9.05 MB
2023-11-17 08:02:31 -05 - [Verbose-Duplicati.Library.Main.Database.LocalDeleteDatabase-WastedSpaceVolumes]: Found 9 volume(s) with a total of 4.12% wasted space (387.13 MB of 9.19 GB)
2023-11-17 08:02:31 -05 - [Information-Duplicati.Library.Main.Database.LocalDeleteDatabase-CompactReason]: Compacting not required
2023-11-18 19:50:38 -05 - [Verbose-Duplicati.Library.Main.Database.LocalDeleteDatabase-SmallVolumeCount]: Found 3 small volumes(s) with a total size of 17.92 MB
2023-11-18 19:50:38 -05 - [Verbose-Duplicati.Library.Main.Database.LocalDeleteDatabase-WastedSpaceVolumes]: Found 12 volume(s) with a total of 6.09% wasted space (588.84 MB of 9.45 GB)
2023-11-18 19:50:38 -05 - [Information-Duplicati.Library.Main.Database.LocalDeleteDatabase-CompactReason]: Compacting because there are 1 fully deletable volume(s)
2023-11-18 19:55:45 -05 - [Information-Duplicati.Library.Main.Operation.CompactHandler-CompactResults]: Downloaded 15 file(s) with a total size of 564.92 MB, deleted 32 file(s) with a total size of 616.60 MB, and compacted to 6 file(s) with a size of 119.12 MB, which reduced storage by 26 file(s) and 497.47 MB
2023-11-20 08:06:14 -05 - [Verbose-Duplicati.Library.Main.Database.LocalDeleteDatabase-SmallVolumeCount]: Found 1 small volumes(s) with a total size of 1.91 MB
2023-11-20 08:06:14 -05 - [Verbose-Duplicati.Library.Main.Database.LocalDeleteDatabase-WastedSpaceVolumes]: Found 3 volume(s) with a total of 1.45% wasted space (133.07 MB of 8.94 GB)
2023-11-20 08:06:14 -05 - [Information-Duplicati.Library.Main.Database.LocalDeleteDatabase-CompactReason]: Compacting not required

You can see wasted space computation, small volume count, and download/upload versus full delete.
log-file
log-file-log-level
log-file-log-filter

Thanks! I hadn’t thought there was another way of compacting. This might be the reason it’s still trying to compact.

The status bar was showing “Deleting unwanted files…”. In my case, it might be that second style of waste removal.

I’ll try to run the backup with also --small-file-size=0. If it’ll happen again or never happen, I will go back to this thread.

I have another hypothesis, which is that the wasted space exceeded the volume size (default 50 MB). Getting more than 50 MB of source data into a 50 MB volume is easy given compression, so possibly wasted space measured in uncompressed source can exceed the volume size. I’m looking at the line:

and looking at the comparison of WastedSize to volsize.

The help text for threshold is

–threshold (Integer): The maximum wasted space in percent
As files are changed, some data stored at the remote destination may not be required. This option controls how much wasted space the destination can contain before being reclaimed. This value is a percentage used on each volume and the total storage.

and the last sentence is important. It might be deciding to compact based on something other than total storage (for example based on small-file-max-count), then compact looks at individual volume use.