If I may be a pest and butt in here, I’m encountering what I think is the same issue through a different path. Trying to vacuum any of my backup databases, whether using the command line tool from the GUI or using the auto vacuum option, gives a ‘database is locked’ error. Checking out the code myself and stepping over and into some methods shows there’s an underlying instance of that ‘cannot VACUUM from within a transaction’ exception. Version .120 is the last one that works, .121 onwards fail to vacuum. Git bisect shows commit 5ff613b, merging the big async PR, introduced the issue.
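For anyone who wants to see the underlying error outside Duplicati, here’s a minimal sketch using Microsoft.Data.Sqlite (just an illustration, not necessarily the exact provider or code path Duplicati uses) showing that SQLite itself refuses a VACUUM issued inside an open transaction:

```csharp
using System;
using Microsoft.Data.Sqlite;

// demo.sqlite is a throwaway database used only for this illustration.
using var conn = new SqliteConnection("Data Source=demo.sqlite");
conn.Open();

using var tx = conn.BeginTransaction();
using var cmd = conn.CreateCommand();
cmd.Transaction = tx;
cmd.CommandText = "VACUUM;";

try
{
    cmd.ExecuteNonQuery();
}
catch (SqliteException ex)
{
    // Prints: SQLite Error 1: 'cannot VACUUM from within a transaction'.
    Console.WriteLine(ex.Message);
}
```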
Sysinternals Process Explorer also shows that every attempt to vacuum increases the number of handles open to whatever backup database I’m working with. I can’t say for sure that it’s related, but it seems suspicious.
I’m able to reproduce the issue with any attempt to vacuum any of my databases, so I suspect it’s easy enough for you to reproduce too, but if you’d like me to create a dedicated thread or Github issue I’ll be happy to do so.
I’m glad to see an issue opened, although posts on the forum may be easier if it turns into a lot of back-and-forth. It probably depends a bit on personal preference. I can go to GitHub if it heads that direction.
At least for me it’s pretty disruptive, as it seems to block most operations using the job DB, the major ones being backup and restore. I was looking for a simple failure case after a vacuum.
is what I settled on, but I haven’t tried it yet on a long-lived (but stuck) TrayIcon or Server, because after finding that CommandLine hangs just as easily, I was trying to strace it on Linux. Nothing jumped out at me, and I’m afraid I don’t know enough about syscall-level behavior.
Back at a higher level, I tried messing with its options (on a test backup, for safety), including
and they didn’t avoid the hang when trying a vacuum. But at least the CLI hang is temporary, as presumably whatever is going wrong with the DB locking goes away when the process does.
Welcome to the forum @ItEndsWithTens and thanks for finding this.
Canary testing is meant to be open to all takers, and reports here (especially ones as well done as yours) help shake bugs out before Duplicati moves on to more testers, who find the bugs missed in Canary. It’s harder to fix things later on, so earlier finds are quite helpful.
Glad to be here, thank you for the warm welcome! I’m also content to go where the action is; I’m comfortable on GitHub and was in fact originally going to file a report there, but the new issue creation flow prompted me to search the forums. I’d already done so, but tried again just to be sure and found that potentially related post here that I hadn’t previously noticed. There is one note I want to leave on the GitHub issue, just because it involves linking to some code in the Duplicati repo and elsewhere, but otherwise I’ll direct my comments wherever necessary.
Interesting, as I have no trouble with backup or restore from the Web GUI, either ngax or ngclient, on two separate Windows 10 Pro 22H2 systems. OS details slipped my mind earlier, sorry; I didn’t intend the mention of Sysinternals tools to be enough to identify my systems.
The first appearance of the issue for me was that a few of my backups started failing earlier in the week, while others were fine. I couldn’t figure it out that day, then the next day all my other backups started failing too. I was eventually able to suss out that because I had auto vacuum enabled, with a minimum interval of 4 days, those initial failing backups were the ones due for a vacuum while the others weren’t. A day later they were all due, and all failed. Simply disabling auto vacuum got my backups completing again.
I also just now tried the compact and list-filesets commands through the Web GUI’s commandline feature, to resounding success: no locked database message or hangs of any sort.
Is this after a GUI Commandline (or other) vacuum gets in trouble? For me, it stays in trouble when I do other things, possibly due to the leftover handles such as you noted.
If we keep this up on the forum topic, I’m thinking maybe I (or another authorized person) should break it into its own topic, as it’s not a 2.1.0.120 bug. That’s the last good version.
Ooh, excellent question, no, this is after I disabled auto vacuum, restarted Duplicati, and have been successfully running backups for a day or two. Testing again now I see that yes indeed, once I get into a failed state, subsequent operations like list-filesets also fail with ‘database is locked’.
Aha, I see my mistake! I misunderstood this line as “the bug blocks most operations”, when you meant that the failure itself blocks most operations afterwards. That’s true for me too, I just misread what you were saying.
As far as a dedicated thread, that sounds reasonable to me, do whatever you feel is appropriate.
Right. Sorry about that. I meant when it gets hit, and it’s a good thing that you found a way. Automated tests don’t cover everything, and neither do users, especially in small numbers.
I moved it here using a name similar to its GitHub issue, but with a user-relatable version number.
I now have a backup job that won’t even start because it gets this error. Is there any way to prevent it for the job or will an update with the fix be released soon? I tried adding --auto-vacuum=false but it still happens.
Frustratingly, it’s a backup that I was trying to reduce in size because I needed to free up backup storage quickly, so I reduced its retention. Luckily I found some space in another folder to help me out.
Can you share the stack trace of that? It sounds like we are calling VACUUM somewhere and not respecting the flag. Additionally, this might be a different place from the one we have already fixed, so I would love to include a fix for it in the next build.
It could be a while before I can test again. Without thinking I restarted Duplicati, and even though it was “paused”, it became “unpaused” and the scheduled job decided to start. It then failed because my backup storage is down while I try to recover freed space (the Windows Server VM running a trim on a 13TB ReFS disk is stuck and not finished, so the backend ZFS pool can’t recover the space).
No worries. I did find a place where the database was opened with “pooling” enabled. That generally causes the database to be kept open for a long time and can cause locked-database errors. This has been fixed at least, so hopefully that problem is solved.
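For context on what pooling changes, here’s a rough sketch with Microsoft.Data.Sqlite (assuming that is the provider in play; the connection string and file name are just placeholders for illustration). With pooling on, disposing a connection hands it back to a pool instead of closing the native handle, which would line up with the growing handle count and the file staying locked:

```csharp
using Microsoft.Data.Sqlite;

// With Pooling=True (the provider default), Dispose() returns the connection
// to a pool, so the underlying SQLite file handle stays open even though
// the code has "closed" the connection.
using (var conn = new SqliteConnection("Data Source=backup.sqlite;Pooling=True"))
{
    conn.Open();
    // ... run queries ...
} // the handle is still held by the pool here

// Opting out per connection makes Dispose() really close the handle...
using (var conn = new SqliteConnection("Data Source=backup.sqlite;Pooling=False"))
{
    conn.Open();
}

// ...or the pools can be flushed before anything that needs exclusive
// access to the file (VACUUM, deleting the database, and so on).
SqliteConnection.ClearAllPools();
```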
So I got my storage running again and am now catching up on backups. In the end I had to crash the server and increase its memory from 24G to 32G, and then the trim worked. But I’ll have to do that again once I can get this broken backup running so its freed space can also be recovered.
How can I get the stack trace you need? I’ve tried a few things, but nothing shows much more, except this from the live log:
8 Aug 2025 13:31: SQLite Error 5: 'database is locked'.
8 Aug 2025 13:31: Rollback during transaction dispose took 0:00:00:00.000
8 Aug 2025 13:31: Starting - Rollback during transaction dispose
8 Aug 2025 13:30: ExecuteReader: SELECT "ID", "Timestamp" FROM "Operation" ORDER BY "Timestamp" DESC LIMIT 1 took 0:00:00:00.000
8 Aug 2025 13:30: Starting - ExecuteReader: SELECT "ID", "Timestamp" FROM "Operation" ORDER BY "Timestamp" DESC LIMIT 1
8 Aug 2025 13:30: Setting custom SQLite option 'shared_cache=true'.
8 Aug 2025 13:30: Setting custom SQLite option 'threads=8'.
8 Aug 2025 13:30: Setting custom SQLite option 'mmap_size=67108864'.
8 Aug 2025 13:30: Setting custom SQLite option 'cache_size=-65536'.
8 Aug 2025 13:30: Setting custom SQLite option 'journal_mode=WAL'.
8 Aug 2025 13:30: Setting custom SQLite option 'temp_store=MEMORY'.
8 Aug 2025 13:30: Setting custom SQLite option 'synchronous=NORMAL'.
8 Aug 2025 13:30: The operation Backup has failed
8 Aug 2025 13:30: Running Backup took 0:00:01:00.073
8 Aug 2025 13:30: ExecuteScalarInt64Async: INSERT INTO "Operation" ( "Description", "Timestamp" ) VALUES ( "Backup", 1754652612 ); SELECT last_insert_rowid(); took 0:00:00:30.014
8 Aug 2025 13:30: Starting - ExecuteScalarInt64Async: INSERT INTO "Operation" ( "Description", "Timestamp" ) VALUES ( "Backup", 1754652612 ); SELECT last_insert_rowid();
8 Aug 2025 13:30: Setting custom SQLite option 'shared_cache=true'.
8 Aug 2025 13:30: Setting custom SQLite option 'threads=8'.
8 Aug 2025 13:30: Setting custom SQLite option 'mmap_size=67108864'.
8 Aug 2025 13:30: Setting custom SQLite option 'cache_size=-65536'.
8 Aug 2025 13:30: Setting custom SQLite option 'journal_mode=WAL'.
8 Aug 2025 13:30: Setting custom SQLite option 'temp_store=MEMORY'.
8 Aug 2025 13:30: Setting custom SQLite option 'synchronous=NORMAL'.
8 Aug 2025 13:30: PreBackupVerify took 0:00:00:30.057
8 Aug 2025 13:30: ExecuteScalarInt64Async: INSERT INTO "Operation" ( "Description", "Timestamp" ) VALUES ( "Backup", 1754652582 ); SELECT last_insert_rowid(); took 0:00:00:30.053
8 Aug 2025 13:29: Starting - ExecuteScalarInt64Async: INSERT INTO "Operation" ( "Description", "Timestamp" ) VALUES ( "Backup", 1754652582 ); SELECT last_insert_rowid();
8 Aug 2025 13:29: Setting custom SQLite option 'shared_cache=true'.
8 Aug 2025 13:29: Setting custom SQLite option 'threads=8'.
8 Aug 2025 13:29: Setting custom SQLite option 'mmap_size=67108864'.
8 Aug 2025 13:29: Setting custom SQLite option 'cache_size=-65536'.
8 Aug 2025 13:29: Setting custom SQLite option 'journal_mode=WAL'.
8 Aug 2025 13:29: Setting custom SQLite option 'temp_store=MEMORY'.
8 Aug 2025 13:29: Setting custom SQLite option 'synchronous=NORMAL'.
8 Aug 2025 13:29: Starting - PreBackupVerify
8 Aug 2025 13:29: Starting - Running Backup
Sorry for the late reply; I could easily have checked out the fixed code while it was still a pull request and tested it, but I’ve had a stressful couple of weeks that demanded my attention elsewhere. Happily I can report that with the release of Canary 2.1.1.100 my issue appears to be fixed, and I can run vacuum on all of my test databases and real ones without any of the errors I’d previously seen!
One peculiarity I notice now is that running a vacuum leaves the .sqlite-shm and .sqlite-wal files behind in the database’s directory when the operation is finished. Subsequent backups seem to work, and those files eventually get cleaned up somehow, so I’m not too worried. Should I be?
No worries there. The two files are the write-ahead log (.wal) and the shared-memory index that coordinates access to it (.shm). They are created and removed as needed by SQLite.
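If you ever want to watch that happen outside Duplicati, here’s a small sketch (again Microsoft.Data.Sqlite, with throwaway names, purely as an assumption about the setup) showing the -wal/-shm files appearing in WAL mode and disappearing once the WAL is checkpointed and the last connection closes cleanly:

```csharp
using Microsoft.Data.Sqlite;

// Pooling=False so Dispose() really closes the handle in this demo.
var cs = "Data Source=demo.sqlite;Pooling=False";

using (var conn = new SqliteConnection(cs))
{
    conn.Open();
    using var cmd = conn.CreateCommand();

    cmd.CommandText = "PRAGMA journal_mode=WAL;";
    cmd.ExecuteNonQuery();   // demo.sqlite-wal and demo.sqlite-shm appear

    cmd.CommandText = "CREATE TABLE IF NOT EXISTS t (x INTEGER); INSERT INTO t VALUES (1);";
    cmd.ExecuteNonQuery();   // the write lands in the -wal file first

    cmd.CommandText = "PRAGMA wal_checkpoint(TRUNCATE);";
    cmd.ExecuteNonQuery();   // folds the WAL contents back into the main file
}
// When the last connection closes cleanly, SQLite deletes the -wal and -shm
// files; if something (for example a pooled connection) still holds the
// database open, they linger until that handle goes away too.
```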