I just checked the source code quickly, and it seems that the current auto-vacuum parameter runs vacuum every time, because the default value for --auto-vacuum-interval is 0m. Did I get that right?
I suggested earlier that it would be a good idea to run the vacuum when compaction runs. That would set a reasonably sane interval for vacuuming, and it would run only when there is a likelihood of free space in the database, on top of defragmenting the database by restructuring it. In general, running vacuum after every run is absolutely excessive.
Just thinking about sane default values, any thoughts? I’ve seen many vacuum-related discussions, and I’ve thought about this specific case several times, but I couldn’t quickly find discussions covering it. It also seems that auto-vacuum-interval is quite a new option; I only found out about it today.
Correct, the minimum interval defaults to 0m, but it only has an effect if the --auto-vacuum option itself is enabled. (The reason the default is 0m is so that the addition of this option doesn’t change behavior for people who expect auto-vacuum to happen every time.)
It was added recently by me. I like auto-vacuum but didn’t want it running after every single backup operation. On my larger backups I set the interval to 1 week.
I’m certainly open to the default being different. When I added this option (and the compact interval option) I didn’t want to affect anything unless these options were set and customized. I was trying to be conservative.
Would be nice to get feedback from a wider audience.
I am not opposed to changing the default, but I am unsure what a good default would be. If we can assume that most people run a backup every day, would 14D be a good default?
Ideally, we should not count time, but rather the number of times data has been deleted. There is a general operations log that has so far only been used for debugging, but maybe we could use it to count how many delete operations have been performed.
There is also a settings table where we could increment a value for each delete/compact that actually removes data, and then check/reset that value as needed.
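The settings-table approach could look roughly like this with Python’s sqlite3. The `Settings` table layout, the `DeletesSinceVacuum` key, and the threshold are all hypothetical, just to illustrate the count-deletes-then-vacuum idea:

```python
import sqlite3

# Autocommit mode, because VACUUM cannot run inside an open transaction
con = sqlite3.connect(":memory:", isolation_level=None)
con.execute("CREATE TABLE Settings (Key TEXT PRIMARY KEY, Value TEXT)")
con.execute("INSERT INTO Settings VALUES ('DeletesSinceVacuum', '0')")

def record_delete(con):
    # Called whenever a delete/compact actually removes data
    con.execute(
        "UPDATE Settings SET Value = CAST(Value AS INTEGER) + 1 "
        "WHERE Key = 'DeletesSinceVacuum'"
    )

def maybe_vacuum(con, threshold):
    # Vacuum only once enough deletes have accumulated, then reset the counter
    (count,) = con.execute(
        "SELECT CAST(Value AS INTEGER) FROM Settings "
        "WHERE Key = 'DeletesSinceVacuum'"
    ).fetchone()
    if count < threshold:
        return False
    con.execute("VACUUM")
    con.execute("UPDATE Settings SET Value = '0' WHERE Key = 'DeletesSinceVacuum'")
    return True
```

The counter only moves when a delete actually removed something, so a database that only ever grows would never trigger a vacuum, which matches the point above.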
I really don’t know, but that would be better than every day. My guess is that a sane default would be something like monthly, yet depending on activity even that can be too often; just a personal feeling.
I’ve done a lot of work with SQLite3 in production environments, and vacuuming itself is practically never a problem. But a complete lack of it is, as is doing it all the time, especially when databases are in the gigabyte range. In most cases I just store the year and month of the last vacuum in the DB, and if that value differs from the current one, I vacuum. Even that is often far more frequent than actually necessary, especially for an ever-growing database without any deletes or updates.
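The year-and-month check is simple enough to sketch; the `Settings` table and the `LastVacuumMonth` key here are made up for illustration:

```python
import datetime
import sqlite3

# Autocommit mode, because VACUUM cannot run inside an open transaction
con = sqlite3.connect(":memory:", isolation_level=None)
con.execute("CREATE TABLE Settings (Key TEXT PRIMARY KEY, Value TEXT)")

def vacuum_if_month_changed(con):
    # Vacuum at most once per calendar month, remembering the last run in the DB
    current = datetime.date.today().strftime("%Y-%m")
    row = con.execute(
        "SELECT Value FROM Settings WHERE Key = 'LastVacuumMonth'"
    ).fetchone()
    if row is not None and row[0] == current:
        return False
    con.execute("VACUUM")
    con.execute(
        "INSERT OR REPLACE INTO Settings VALUES ('LastVacuumMonth', ?)", (current,)
    )
    return True
```

Calling this at the end of every run then does the actual vacuum only on the first run of each month.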
I’ve seen discussion on the Duplicati forum about this matter as well. Personally I think that chasing the ideal option is just extra work without real benefits.
The fact is that SQLite3 reuses pages from the free-space list, which means that a deletion doesn’t necessarily leave anything in the database that needs to be vacuumed out.
SQLite3 also supports incremental vacuum, etc. But in that case the whole database doesn’t get rewritten / defragmented / compacted. Full compaction means that only full pages remain. Incremental vacuum just releases the pages that are on the free list, which also increases fragmentation. A database will get fragmented to some degree even if no data is ever deleted or modified, because new pages are always added at the end.
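To make the free-list behavior concrete, here is a small experiment with plain sqlite3 (the file name and row counts are arbitrary): with `auto_vacuum = INCREMENTAL`, pages freed by a DELETE sit on the free list until `PRAGMA incremental_vacuum` truncates them off the file, but nothing gets rewritten or defragmented in the process.

```python
import os
import sqlite3
import tempfile

# auto_vacuum must be chosen before the first table is created
path = os.path.join(tempfile.mkdtemp(), "demo.db")
con = sqlite3.connect(path, isolation_level=None)  # autocommit
con.execute("PRAGMA auto_vacuum = INCREMENTAL")
con.execute("CREATE TABLE t (x BLOB)")
for _ in range(200):
    con.execute("INSERT INTO t VALUES (randomblob(4096))")

# Deleting rows puts their pages on the free list; the file does not shrink yet
con.execute("DELETE FROM t WHERE rowid % 2 = 0")
free_before = con.execute("PRAGMA freelist_count").fetchone()[0]

# Releases free-list pages back to the OS, without defragmenting anything
con.execute("PRAGMA incremental_vacuum").fetchall()
free_after = con.execute("PRAGMA freelist_count").fetchone()[0]
```

After the incremental vacuum the free-list count drops to zero and the file is truncated, yet the remaining pages stay wherever they were, which is exactly the fragmentation trade-off described above.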
Anyway, at this stage this is a non-issue, in my personal opinion. Just set 14D or 1M or something like that, and enable auto-vacuum. Users who have a different opinion about how often to vacuum can change the value. But setting the default to always or never is a bad option.