Backup Runtime after 2.0.7.1 update

I have a job that backs up my media library (mostly music, TV shows, and photos) to a dedicated HDD in the same server. Prior to updating to 2.0.7.1, the job would take 8-12 minutes each day. However, after the update, it’s consistently taking about 1.5 hours.

Any ideas as to why that would be?

2023-06-12 05:15:00 -07 - [Information-Duplicati.Library.Main.Controller-StartingOperation]: The operation Backup has started
2023-06-12 05:16:08 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Started:  ()
2023-06-12 05:16:08 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Completed:  (1.90 KB)
2023-06-12 05:18:27 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Started: duplicati-ba917ba2cf3f54263bd72f1e4de996837.dblock.zip (409.50 MB)
2023-06-12 05:18:57 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Completed: duplicati-ba917ba2cf3f54263bd72f1e4de996837.dblock.zip (409.50 MB)
2023-06-12 05:19:08 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Started: duplicati-i30eb5dcb3368487b972084083a3943a0.dindex.zip (276.11 KB)
2023-06-12 05:19:08 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Completed: duplicati-i30eb5dcb3368487b972084083a3943a0.dindex.zip (276.11 KB)
2023-06-12 05:19:10 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Started: duplicati-20230612T121500Z.dlist.zip (3.77 MB)
2023-06-12 05:19:10 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Completed: duplicati-20230612T121500Z.dlist.zip (3.77 MB)
2023-06-12 05:19:10 -07 - [Information-Duplicati.Library.Main.Operation.DeleteHandler:RetentionPolicy-StartCheck]: Start checking if backups can be removed
2023-06-12 05:19:10 -07 - [Information-Duplicati.Library.Main.Operation.DeleteHandler:RetentionPolicy-FramesAndIntervals]: Time frames and intervals pairs: 7.00:00:00 / 1.00:00:00, 181.00:00:00 / 31.00:00:00
2023-06-12 05:19:10 -07 - [Information-Duplicati.Library.Main.Operation.DeleteHandler:RetentionPolicy-BackupList]: Backups to consider: 6/11/2023 5:15:40 AM, 6/9/2023 5:20:16 AM, 6/8/2023 5:15:00 AM, 6/7/2023 5:15:00 AM, 6/6/2023 5:15:00 AM, 6/5/2023 5:15:00 AM, 5/24/2023 5:22:23 AM, 4/21/2023 5:35:27 AM, 3/20/2023 5:35:47 AM, 2/17/2023 4:26:56 AM, 1/17/2023 4:23:55 AM
2023-06-12 05:19:10 -07 - [Information-Duplicati.Library.Main.Operation.DeleteHandler:RetentionPolicy-BackupsToDelete]: Backups outside of all time frames and thus getting deleted: 
2023-06-12 05:19:10 -07 - [Information-Duplicati.Library.Main.Operation.DeleteHandler:RetentionPolicy-AllBackupsToDelete]: All backups to delete: 6/5/2023 5:15:00 AM
2023-06-12 05:19:10 -07 - [Information-Duplicati.Library.Main.Operation.DeleteHandler-DeleteRemoteFileset]: Deleting 1 remote fileset(s) ...
2023-06-12 06:43:48 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Started: duplicati-20230605T121500Z.dlist.zip (3.77 MB)
2023-06-12 06:43:48 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-20230605T121500Z.dlist.zip (3.77 MB)
2023-06-12 06:43:50 -07 - [Information-Duplicati.Library.Main.Operation.DeleteHandler-DeleteResults]: Deleted 1 remote fileset(s)
2023-06-12 06:47:06 -07 - [Information-Duplicati.Library.Main.Database.LocalDeleteDatabase-CompactReason]: Compacting not required
2023-06-12 06:47:06 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Started:  ()
2023-06-12 06:47:06 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Completed:  (1.90 KB)
2023-06-12 06:47:07 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Started: duplicati-20230612T121500Z.dlist.zip (3.77 MB)
2023-06-12 06:47:07 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-20230612T121500Z.dlist.zip (3.77 MB)
2023-06-12 06:47:07 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Started: duplicati-i870390d73dab411cb513956fcbe41870.dindex.zip (13.19 KB)
2023-06-12 06:47:07 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-i870390d73dab411cb513956fcbe41870.dindex.zip (13.19 KB)
2023-06-12 06:47:07 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Started: duplicati-ba917ba2cf3f54263bd72f1e4de996837.dblock.zip (409.50 MB)
2023-06-12 06:47:19 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-ba917ba2cf3f54263bd72f1e4de996837.dblock.zip (409.50 MB)

Hello

Please be kind enough to say which version you upgraded from. If it was from the canary channel, there is little change; if it was from the beta channel, the previous version is getting old and there are more changes to consider.

From what I see, the backup itself is not taking long; the odd part seems linked to the retention policy. Duplicati takes quite a long time deleting a fileset, then goes on to do more work. That’s not a part of the code I have looked at yet. Maybe a bug was fixed in the retention policy handling since your last upgrade, but I can’t tell without knowing the previous version.

Ah, good point. I upgraded from 2.0.6.3 beta.

I’ll set my log level to Verbose and see what info we get.

I’m not sure how much Verbose helps… below are the relevant time periods (I excluded all of the verbose messages at the beginning, which were a bunch of [Verbose-Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess-IncludingPath] and [Verbose-Duplicati.Library.Main.Operation.Backup.FilePreFilterProcess.FileEntry-SkipCheckNoMetadataChange] entries).

There’s still no information between 5:18am and 6:44am.

Any ideas?

2023-06-14 05:16:10 -07 - [Information-Duplicati.Library.Main.Controller-StartingOperation]: The operation Backup has started
...
2023-06-14 05:18:14 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Started: duplicati-b82b209a797a44194a69fb1671d971f0b.dblock.zip (9.73 MB)
2023-06-14 05:18:40 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Completed: duplicati-b82b209a797a44194a69fb1671d971f0b.dblock.zip (9.73 MB)
2023-06-14 05:18:51 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Started: duplicati-i47889597c3c24531a3d63e4aa6e94997.dindex.zip (9.06 KB)
2023-06-14 05:18:51 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Completed: duplicati-i47889597c3c24531a3d63e4aa6e94997.dindex.zip (9.06 KB)
2023-06-14 05:18:52 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Started: duplicati-20230614T121610Z.dlist.zip (3.77 MB)
2023-06-14 05:18:53 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Completed: duplicati-20230614T121610Z.dlist.zip (3.77 MB)
2023-06-14 05:18:53 -07 - [Information-Duplicati.Library.Main.Operation.DeleteHandler:RetentionPolicy-StartCheck]: Start checking if backups can be removed
2023-06-14 05:18:53 -07 - [Information-Duplicati.Library.Main.Operation.DeleteHandler:RetentionPolicy-FramesAndIntervals]: Time frames and intervals pairs: 7.00:00:00 / 1.00:00:00, 181.00:00:00 / 31.00:00:00
2023-06-14 05:18:53 -07 - [Information-Duplicati.Library.Main.Operation.DeleteHandler:RetentionPolicy-BackupList]: Backups to consider: 6/13/2023 5:15:00 AM, 6/11/2023 5:15:40 AM, 6/9/2023 5:20:16 AM, 6/8/2023 5:15:00 AM, 6/7/2023 5:15:00 AM, 5/24/2023 5:22:23 AM, 4/21/2023 5:35:27 AM, 3/20/2023 5:35:47 AM, 2/17/2023 4:26:56 AM, 1/17/2023 4:23:55 AM
2023-06-14 05:18:53 -07 - [Information-Duplicati.Library.Main.Operation.DeleteHandler:RetentionPolicy-BackupsToDelete]: Backups outside of all time frames and thus getting deleted: 
2023-06-14 05:18:53 -07 - [Information-Duplicati.Library.Main.Operation.DeleteHandler:RetentionPolicy-AllBackupsToDelete]: All backups to delete: 6/7/2023 5:15:00 AM
2023-06-14 05:18:53 -07 - [Information-Duplicati.Library.Main.Operation.DeleteHandler-DeleteRemoteFileset]: Deleting 1 remote fileset(s) ...
2023-06-14 06:44:59 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Started: duplicati-20230607T121500Z.dlist.zip (3.77 MB)
2023-06-14 06:44:59 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-20230607T121500Z.dlist.zip (3.77 MB)
2023-06-14 06:45:01 -07 - [Information-Duplicati.Library.Main.Operation.DeleteHandler-DeleteResults]: Deleted 1 remote fileset(s)
2023-06-14 06:48:10 -07 - [Verbose-Duplicati.Library.Main.Database.LocalDeleteDatabase-FullyDeletableCount]: Found 0 fully deletable volume(s)
2023-06-14 06:48:10 -07 - [Verbose-Duplicati.Library.Main.Database.LocalDeleteDatabase-SmallVolumeCount]: Found 14 small volumes(s) with a total size of 1.43 GB
2023-06-14 06:48:10 -07 - [Verbose-Duplicati.Library.Main.Database.LocalDeleteDatabase-WastedSpaceVolumes]: Found 9 volume(s) with a total of 0.26% wasted space (4.39 GB of 1.68 TB)
2023-06-14 06:48:10 -07 - [Information-Duplicati.Library.Main.Database.LocalDeleteDatabase-CompactReason]: Compacting not required
2023-06-14 06:48:10 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Started:  ()
2023-06-14 06:48:10 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Completed:  (1.90 KB)
2023-06-14 06:48:11 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Started: duplicati-20230614T121610Z.dlist.zip (3.77 MB)
2023-06-14 06:48:11 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-20230614T121610Z.dlist.zip (3.77 MB)
2023-06-14 06:48:11 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Started: duplicati-i30eb5dcb3368487b972084083a3943a0.dindex.zip (276.11 KB)
2023-06-14 06:48:11 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-i30eb5dcb3368487b972084083a3943a0.dindex.zip (276.11 KB)
2023-06-14 06:48:11 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Started: duplicati-b82b209a797a44194a69fb1671d971f0b.dblock.zip (9.73 MB)
2023-06-14 06:48:11 -07 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-b82b209a797a44194a69fb1671d971f0b.dblock.zip (9.73 MB)

From your most recent post I notice that your backup is 1.6 TB. I am beginning to suspect a DB performance issue. You’ll see that by enabling the profiling log level.
Enabling the profiling log level may overflow the log window in the UI, though.
A precaution would be to add the log-file and log-file-log-level options to your job, so a Profiling-level log is written to a file instead.

OK, I’ll set the log level and report back tomorrow.

I’m a bit confused as to how it could be a database performance issue though, given that it was running quite quickly before the update. Maybe there’s a factor that I’m not considering.

I have no idea why it would now be a performance issue, but checking for one is the next logical step. Duplicati is doing something during this time, and raising the log level could be a way to find out what.

Given the limited log, that gap is the big mystery. What is logged after it looks like it’s just the verification step.

To give an idea of expected new information (SQL timings might be very helpful), here’s an example:

2023-06-14 12:26:43 -04 - [Information-Duplicati.Library.Main.Operation.DeleteHandler:RetentionPolicy-AllBackupsToDelete]: All backups to delete: 6/13/2023 12:20:01 PM
2023-06-14 12:26:43 -04 - [Information-Duplicati.Library.Main.Operation.DeleteHandler-DeleteRemoteFileset]: Deleting 1 remote fileset(s) ...
2023-06-14 12:26:43 -04 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: Starting - ExecuteNonQuery: DELETE FROM "Fileset" WHERE "Timestamp" IN (1686673201) 
2023-06-14 12:26:43 -04 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: ExecuteNonQuery: DELETE FROM "Fileset" WHERE "Timestamp" IN (1686673201)  took 0:00:00:00.001
2023-06-14 12:26:43 -04 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: Starting - ExecuteNonQuery: DELETE FROM "FilesetEntry" WHERE "FilesetID" NOT IN (SELECT DISTINCT "ID" FROM "Fileset")
2023-06-14 12:26:43 -04 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: ExecuteNonQuery: DELETE FROM "FilesetEntry" WHERE "FilesetID" NOT IN (SELECT DISTINCT "ID" FROM "Fileset") took 0:00:00:00.323
2023-06-14 12:26:43 -04 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: Starting - ExecuteNonQuery: DELETE FROM "ChangeJournalData" WHERE "FilesetID" NOT IN (SELECT DISTINCT "ID" FROM "Fileset")
2023-06-14 12:26:43 -04 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: ExecuteNonQuery: DELETE FROM "ChangeJournalData" WHERE "FilesetID" NOT IN (SELECT DISTINCT "ID" FROM "Fileset") took 0:00:00:00.000
2023-06-14 12:26:43 -04 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: Starting - ExecuteNonQuery: DELETE FROM "FileLookup" WHERE "ID" NOT IN (SELECT DISTINCT "FileID" FROM "FilesetEntry") 
2023-06-14 12:26:43 -04 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: ExecuteNonQuery: DELETE FROM "FileLookup" WHERE "ID" NOT IN (SELECT DISTINCT "FileID" FROM "FilesetEntry")  took 0:00:00:00.039
2023-06-14 12:26:43 -04 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: Starting - ExecuteNonQuery: DELETE FROM "Metadataset" WHERE "ID" NOT IN (SELECT DISTINCT "MetadataID" FROM "FileLookup") 
2023-06-14 12:26:43 -04 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: ExecuteNonQuery: DELETE FROM "Metadataset" WHERE "ID" NOT IN (SELECT DISTINCT "MetadataID" FROM "FileLookup")  took 0:00:00:00.029
2023-06-14 12:26:43 -04 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: Starting - ExecuteNonQuery: DELETE FROM "Blockset" WHERE "ID" NOT IN (SELECT DISTINCT "BlocksetID" FROM "FileLookup" UNION SELECT DISTINCT "BlocksetID" FROM "Metadataset") 
2023-06-14 12:26:43 -04 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: ExecuteNonQuery: DELETE FROM "Blockset" WHERE "ID" NOT IN (SELECT DISTINCT "BlocksetID" FROM "FileLookup" UNION SELECT DISTINCT "BlocksetID" FROM "Metadataset")  took 0:00:00:00.041
2023-06-14 12:26:43 -04 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: Starting - ExecuteNonQuery: DELETE FROM "BlocksetEntry" WHERE "BlocksetID" NOT IN (SELECT DISTINCT "ID" FROM "Blockset") 
2023-06-14 12:26:43 -04 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: ExecuteNonQuery: DELETE FROM "BlocksetEntry" WHERE "BlocksetID" NOT IN (SELECT DISTINCT "ID" FROM "Blockset")  took 0:00:00:00.327
2023-06-14 12:26:43 -04 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: Starting - ExecuteNonQuery: DELETE FROM "BlocklistHash" WHERE "BlocksetID" NOT IN (SELECT DISTINCT "ID" FROM "Blockset") 
2023-06-14 12:26:43 -04 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: ExecuteNonQuery: DELETE FROM "BlocklistHash" WHERE "BlocksetID" NOT IN (SELECT DISTINCT "ID" FROM "Blockset")  took 0:00:00:00.019
2023-06-14 12:26:43 -04 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: Starting - ExecuteNonQuery: INSERT INTO "DeletedBlock" ("Hash", "Size", "VolumeID") SELECT "Hash", "Size", "VolumeID" FROM "Block" WHERE "ID" NOT IN (SELECT DISTINCT "BlockID" AS "BlockID" FROM "BlocksetEntry" UNION SELECT DISTINCT "ID" FROM "Block", "BlocklistHash" WHERE "Block"."Hash" = "BlocklistHash"."Hash") 
2023-06-14 12:26:44 -04 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: ExecuteNonQuery: INSERT INTO "DeletedBlock" ("Hash", "Size", "VolumeID") SELECT "Hash", "Size", "VolumeID" FROM "Block" WHERE "ID" NOT IN (SELECT DISTINCT "BlockID" AS "BlockID" FROM "BlocksetEntry" UNION SELECT DISTINCT "ID" FROM "Block", "BlocklistHash" WHERE "Block"."Hash" = "BlocklistHash"."Hash")  took 0:00:00:00.174
2023-06-14 12:26:44 -04 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: Starting - ExecuteNonQuery: DELETE FROM "Block" WHERE "ID" NOT IN (SELECT DISTINCT "BlockID" FROM "BlocksetEntry" UNION SELECT DISTINCT "ID" FROM "Block", "BlocklistHash" WHERE "Block"."Hash" = "BlocklistHash"."Hash") 
2023-06-14 12:26:45 -04 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: ExecuteNonQuery: DELETE FROM "Block" WHERE "ID" NOT IN (SELECT DISTINCT "BlockID" FROM "BlocksetEntry" UNION SELECT DISTINCT "ID" FROM "Block", "BlocklistHash" WHERE "Block"."Hash" = "BlocklistHash"."Hash")  took 0:00:00:01.267
2023-06-14 12:26:45 -04 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: Starting - ExecuteNonQuery: UPDATE "RemoteVolume" SET "State" = "Deleting" WHERE "Type" = "Files" AND "State" IN ("Uploaded", "Verified", "Temporary") AND "ID" NOT IN (SELECT "VolumeID" FROM "Fileset") 
2023-06-14 12:26:45 -04 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: ExecuteNonQuery: UPDATE "RemoteVolume" SET "State" = "Deleting" WHERE "Type" = "Files" AND "State" IN ("Uploaded", "Verified", "Temporary") AND "ID" NOT IN (SELECT "VolumeID" FROM "Fileset")  took 0:00:00:00.000
2023-06-14 12:26:45 -04 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteReader]: Starting - ExecuteReader: SELECT "Name", "Size" FROM "RemoteVolume" WHERE "Type" = "Files" AND "State" = "Deleting" 
2023-06-14 12:26:45 -04 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteReader]: ExecuteReader: SELECT "Name", "Size" FROM "RemoteVolume" WHERE "Type" = "Files" AND "State" = "Deleting"  took 0:00:00:00.000
2023-06-14 12:26:47 -04 - [Profiling-Timer.Begin-Duplicati.Library.Main.BackendManager-RemoteOperationDelete]: Starting - RemoteOperationDelete
2023-06-14 12:26:47 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Started: duplicati-20230613T162001Z.dlist.zip.aes (923.40 KB)

The DELETE command in the GUI Commandline would be a way to get a smaller log sooner, if you have a backup version you’re willing to delete. Although the posted logs already make it look clear that the time is spent in a delete, running one by itself would add more proof. You could also set the backup to not delete versions and see if it gets fast again, but that probably won’t produce the kind of SQL log we’re looking for; it just dodges the problem.

The SQL shown in my example log above is from my smaller backup (I hope your big one has its blocksize raised to >= 2 MB).

These are files that are mostly static, so the only potential downside is if I delete a backup version that has the only copy of a file, and I have since deleted that file from the source, and I then want to restore it for some reason… does that sound right? If so, I’m OK with that downside and I can pick a version from earlier this week to delete.

Well… I have bad news. This backup does not have a blocksize set explicitly, so that means it’s using the default of 100 KB, right?

Yes, and whatever retention setting you have deletes backups anyway, so you would just be accelerating that a bit…

Yes. That can slow the SQL down, sometimes by a lot; however, I don’t know why 2.0.7.1 hurts more (presuming the profiling log shows slow SQL, which we don’t know yet). It added four indexes which sped up the backup, but I don’t know whether anything got slower. Did the database size get any bigger? There should be a backup copy of that job database in the same folder, with the update date in its name.
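
If it helps, the size can also be checked from inside an SQLite tool; a minimal sketch using standard pragmas (the file size is roughly page_count times page_size):

-- Approximate database size in bytes: page_count * page_size
PRAGMA page_count;
PRAGMA page_size;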

A bigger database might no longer fit in the default SQLite cache, so it would basically start thrashing. Fortunately (if that’s the issue) 2.0.7.1 can increase the cache, but this is all guessing far ahead…

The job is running now, but I think it’s already clear exactly what’s taking so long:

2023-06-14 12:35:05 -07 - [Profiling-Timer.Begin-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: Starting - ExecuteNonQuery: INSERT INTO "DeletedBlock" ("Hash", "Size", "VolumeID") SELECT "Hash", "Size", "VolumeID" FROM "Block" WHERE "ID" NOT IN (SELECT DISTINCT "BlockID" AS "BlockID" FROM "BlocksetEntry" UNION SELECT DISTINCT "ID" FROM "Block", "BlocklistHash" WHERE "Block"."Hash" = "BlocklistHash"."Hash") 

This entry has been running for over 20 minutes already.

My immediate thought: why is it using a NOT IN to find the records needed to write to the DeletedBlock table? Shouldn’t it know what was deleted so we don’t have to do such an expensive query?

Expensive queries are the price paid for managing deduplicated data.
That said, your DB should have about 16 million blocks (1.6 TB at the default 100 KB blocksize). With my biggest database, one with 1.5 million blocks (and the same Duplicati version), this query takes about 3.5 s, so yours ‘should’ not take so long. It smells of an engine starved of memory.
I have no idea at all why your system should be so much slower. Or is it? Did you monitor this backup this closely before the update? Was it deleting filesets with each backup? Maybe that’s what changed; the retention policy could be interpreted differently.
Note that this version does allow allocating more memory for the engine: set the environment variable CUSTOMSQLITEOPTIONS_DUPLICATI to something like cache_size=-200000 (about 200 MB for the DB engine).
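
In SQLite terms, that setting amounts to something like the following (illustrative; a negative cache_size is a size in KiB, so -200000 is roughly 200 MB of page cache for the connection):

-- Roughly 200 MB of page cache instead of the small default
PRAGMA cache_size = -200000;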

Attached is the full job log.
DuplicatiDeleteJob.zip (3.6 KB)

The operation took 1:26:40 total. I think this clearly demonstrates that it’s the delete operation that has caused my backup job to balloon in execution time.

Here is the memory usage on the host during the delete operation: [screenshot omitted]

I’ve reviewed logs from before the update. Every single log showed the backup job completing in 8-12 minutes. I looked at enough consecutive logs to rule out the off-chance that on any given day it had no versions that it needed to delete.

The retention policy for this particular job is: 7D:1D,6M:1M

Perhaps in some cases, but I’m not sure I agree in this case.

The below queries should be functionally the same as the delete queries performed by the delete operation (timestamp for the Fileset table doesn’t match because I’m working with a different database for testing purposes).

Block:

Execution finished without errors.
Result: 2323096 rows returned in 8315ms
At line 1:
SELECT 
  * 
FROM 
  block 
WHERE 
  id IN (
    SELECT 
      blockid 
    FROM 
      blocksetentry 
    WHERE 
      blocksetid IN (
        SELECT 
          id 
        FROM 
          blockset 
        WHERE 
          id IN (
            SELECT 
              blocksetid 
            FROM 
              filelookup 
            WHERE 
              id IN (
                SELECT 
                  fileid 
                FROM 
                  filesetentry 
                WHERE 
                  filesetid IN (
                    SELECT 
                      id 
                    FROM 
                      fileset 
                    WHERE 
                      "timestamp" IN (1685361600)
                  )
              )
          )
      )
  ) 
  OR id IN (
    SELECT 
      id 
    FROM 
      block 
      JOIN blocklisthash ON block.hash = blocklisthash.hash 
    WHERE 
      blocksetid IN (
        SELECT 
          id 
        FROM 
          blockset 
        WHERE 
          id IN (
            SELECT 
              blocksetid 
            FROM 
              filelookup 
            WHERE 
              id IN (
                SELECT 
                  fileid 
                FROM 
                  filesetentry 
                WHERE 
                  filesetid IN (
                    SELECT 
                      id 
                    FROM 
                      fileset 
                    WHERE 
                      "timestamp" IN (1685361600)
                  )
              )
          )
      )
  )

BlocklistHash:

Execution finished without errors.
Result: 54666 rows returned in 1455ms
At line 1:
SELECT 
  * 
FROM 
  BlocklistHash 
WHERE 
  BlocksetID in (
    SELECT 
      ID 
    FROM 
      Blockset 
    WHERE 
      ID in (
        SELECT 
          BlocksetID 
        FROM 
          FileLookup 
        WHERE 
          ID in (
            SELECT 
              FileID 
            FROM 
              FilesetEntry 
            WHERE 
              FilesetID in (
                SELECT 
                  ID 
                FROM 
                  Fileset 
                WHERE 
                  "Timestamp" IN (1685361600)
              )
          )
      )
  )

BlocksetEntry:

Execution finished without errors.
Result: 2356291 rows returned in 1744ms
At line 1:
SELECT 
  * 
FROM 
  BlocksetEntry 
WHERE 
  BlocksetID in (
    SELECT 
      ID 
    FROM 
      Blockset 
    WHERE 
      ID in (
        SELECT 
          BlocksetID 
        FROM 
          FileLookup 
        WHERE 
          ID in (
            SELECT 
              FileID 
            FROM 
              FilesetEntry 
            WHERE 
              FilesetID in (
                SELECT 
                  ID 
                FROM 
                  Fileset 
                WHERE 
                  "Timestamp" IN (1685361600)
              )
          )
      )
  )

Blockset:

Execution finished without errors.
Result: 119250 rows returned in 1141ms
At line 1:
SELECT 
  * 
FROM 
  Blockset 
WHERE 
  ID in (
    SELECT 
      BlocksetID 
    FROM 
      FileLookup 
    WHERE 
      ID in (
        SELECT 
          FileID 
        FROM 
          FilesetEntry 
        WHERE 
          FilesetID in (
            SELECT 
              ID 
            FROM 
              Fileset 
            WHERE 
              "Timestamp" IN (1685361600)
          )
      )
  )

Metadataset:

Execution finished without errors.
Result: 224674 rows returned in 944ms
At line 1:
SELECT 
  * 
FROM 
  Metadataset 
WHERE 
  ID in (
    SELECT 
      MetadataID 
    FROM 
      FileLookup 
    WHERE 
      ID in (
        SELECT 
          FileID 
        FROM 
          FilesetEntry 
        WHERE 
          FilesetID in (
            SELECT 
              ID 
            FROM 
              Fileset 
            WHERE 
              "Timestamp" IN (1685361600)
          )
      )
  )

FileLookup:

Execution finished without errors.
Result: 261258 rows returned in 511ms
At line 1:
SELECT 
  * 
FROM 
  FileLookup 
WHERE 
  ID in (
    SELECT 
      FileID 
    FROM 
      FilesetEntry 
    WHERE 
      FilesetID in (
        SELECT 
          ID 
        FROM 
          Fileset 
        WHERE 
          "Timestamp" IN (1685361600)
      )
  )

ChangeJournalData:

Execution finished without errors.
Result: 0 rows returned in 5ms
At line 1:
select 
  * 
from 
  ChangeJournalData 
WHERE 
  FilesetID in (
    SELECT 
      ID 
    FROM 
      Fileset 
    WHERE 
      "Timestamp" IN (1685361600)
  )

FilesetEntry:

Execution finished without errors.
Result: 261259 rows returned in 44ms
At line 1:
SELECT 
  * 
FROM 
  FilesetEntry 
WHERE 
  FilesetID in (
    SELECT 
      ID 
    FROM 
      Fileset 
    WHERE 
      "Timestamp" IN (1685361600)
  )

Fileset:

Execution finished without errors.
Result: 1 rows returned in 4ms
At line 1:
SELECT 
  ID 
FROM 
  Fileset 
WHERE 
  "Timestamp" IN (1685361600)

The BlocksetEntry and BlocklistHash changes above look like they feed into this calculation:

INSERT INTO "DeletedBlock" (
	"Hash",
	"Size",
	"VolumeID"
	)
SELECT "Hash",
	"Size",
	"VolumeID"
FROM "Block"
WHERE "ID" NOT IN (
		SELECT DISTINCT "BlockID" AS "BlockID"
		FROM "BlocksetEntry"
		
		UNION
		
		SELECT DISTINCT "ID"
		FROM "Block",
			"BlocklistHash"
		WHERE "Block"."Hash" = "BlocklistHash"."Hash"
		)

Pretty-printing courtesy of poorsql.com. The UNION covers the two ways a block can still be in use:
it is either part of a blockset (e.g. file data or metadata), detailed by BlocksetEntry, or it is a blocklist,
which is the ordered concatenation of the 32-byte SHA-256 hashes of the blocks in a blockset.

Taken literally, that’s a huge subquery; however, I can never forecast what a query planner will do.
SQLite has the EXPLAIN QUERY PLAN command, which can probably show what it would actually run.
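
For instance, running it against the SELECT part of the statement above in sqlitebrowser or the sqlite3 shell (a sketch, using the same tables and columns as the query above):

EXPLAIN QUERY PLAN
SELECT "Hash", "Size", "VolumeID"
FROM "Block"
WHERE "ID" NOT IN (
    SELECT DISTINCT "BlockID" FROM "BlocksetEntry"
    UNION
    SELECT DISTINCT "ID" FROM "Block", "BlocklistHash"
    WHERE "Block"."Hash" = "BlocklistHash"."Hash"
    );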

If you wonder if it could find which BlocksetEntry and BlocklistHash blocks got deleted, maybe so.
It’s now doing it sort of indirectly, and I don’t know how much that hurts, e.g. in the temporary files.

None of this directly explains why things got slower, but improving queries (correctly) can be good.

SQLite won’t use an arbitrary amount of host memory. It’s capped, and I wonder if the cap is now too low, causing the query to work at the speed of the drive rather than the speed of RAM. That would hurt.
OS caching probably helps some, though. Do you have a way to measure drive reads during the slow spot?

EDIT:

PRAGMA cache_size sets the limit, and this Beta just added the ability to change it in case it helps.

The problem is not the total host memory; it’s the memory that SQLite allocates for its own work.

Here are the changes in the db structure:

CREATE INDEX "nnc_Metadataset" ON Metadataset ("ID","BlocksetID");
CREATE INDEX "nn_FilesetentryFile" on FilesetEntry ("FilesetID","FileID");

-- Line 602 & 603 LocalBackupDatabase.cs 
-- CREATE INDEX "tmpName1" ON "{0}" ("Path"),tmpName1
-- CREATE INDEX "tmpName2" ON "{0}" ("Path"),tmpName2

CREATE INDEX "nn_FileLookup_BlockMeta" ON FileLookup ("BlocksetID", "MetadataID");

CREATE INDEX "nnc_BlocksetEntry" ON "BlocksetEntry" ("Index", "BlocksetID", "BlockID");

UPDATE "Version" SET "Version" = 12;

Given the query, the only change that can be related is this index:

CREATE INDEX "nnc_BlocksetEntry" ON "BlocksetEntry" ("Index", "BlocksetID", "BlockID");

A simple experiment could be to delete it

drop index nnc_BlocksetEntry;
pragma optimize;

and see if it changes anything in the next backup’s performance.

Or find the 2.0.6.3 backup of the database made at the time of first 2.0.7.1 use, and compare results.
Dropping the index doesn’t shrink the database, and possibly the overall size still has an impact…

By compare results, I don’t mean run Duplicati, but run the query in sqlitebrowser or a similar tool.

Do you have an idea for the slow one that finds what blocks are actually deleted?

Deduplication is one of the challenges: a block can’t be deleted until nothing references it anymore.

Possibly one can simplify a little to try to give the query a little less data to digest.

For example, use the blocks just released as a candidate list for further checking.
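
A hedged sketch of that idea (not Duplicati’s actual code): capture the candidate blocks before the fileset’s rows are removed, then restrict the expensive check to those candidates instead of scanning the whole Block table. For brevity this only collects the data blocks of the deleted fileset; a complete version would also need to include metadata blocksets and blocklist blocks.

-- Before the delete cascade: remember which blocks the doomed fileset referenced
-- (the timestamp value is just the example used in the exploratory queries above).
CREATE TEMP TABLE "CandidateBlock" AS
SELECT DISTINCT "BlockID" AS "ID"
FROM "BlocksetEntry"
WHERE "BlocksetID" IN (
    SELECT "BlocksetID" FROM "FileLookup"
    WHERE "ID" IN (
        SELECT "FileID" FROM "FilesetEntry"
        WHERE "FilesetID" IN (
            SELECT "ID" FROM "Fileset" WHERE "Timestamp" IN (1685361600))));

-- ... the existing cascade of DELETE statements runs here ...

-- After the cascade: only the candidates need the "is it still referenced?" test.
INSERT INTO "DeletedBlock" ("Hash", "Size", "VolumeID")
SELECT "Hash", "Size", "VolumeID"
FROM "Block"
WHERE "ID" IN (SELECT "ID" FROM "CandidateBlock")
  AND "ID" NOT IN (
      SELECT DISTINCT "BlockID" FROM "BlocksetEntry"
      UNION
      SELECT DISTINCT "ID" FROM "Block", "BlocklistHash"
      WHERE "Block"."Hash" = "BlocklistHash"."Hash");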

If size information can be collected, the BlockHashSize index might avoid a scan.
There are other Block table indexes, but it looks like none of them help this query.

This is EXPLAIN QUERY PLAN of the SELECT for the INSERT INTO "DeletedBlock":

id	parent	notused	detail
2	0	0	SCAN TABLE Block
7	0	0	LIST SUBQUERY 2
8	7	0	COMPOUND QUERY
9	8	0	LEFT-MOST SUBQUERY
12	9	0	SCAN TABLE BlocksetEntry USING COVERING INDEX nnc_BlocksetEntry
18	8	0	UNION USING TEMP B-TREE
21	18	0	SCAN TABLE BlocklistHash
23	18	0	SEARCH TABLE Block USING COVERING INDEX BlockHashVolumeID (Hash=?)

Huh. I can’t quite understand why a query that wants to get all distinct block IDs

SELECT DISTINCT "BlockID" FROM "BlocksetEntry"

uses an index such as ("Index", "BlocksetID", "BlockID") when the engine has another index available:

CREATE INDEX "BlocksetEntry_IndexIdsBackwards" ON "BlocksetEntry" ("BlockID");

That seems backward of the engine, but who am I to judge? Anyway, with 1.7 million entries in this table it is not making much difference. And trying to optimize a query that has not changed, in order to solve a regression, doesn’t seem like a worthwhile effort to me unless one understands exactly why the regression happens.
An idea to investigate could be whether some new data pattern hit the database at around the same time the problem appeared. For example, massive inserts and deletes: if previously there were 10 new MB of data in the DB every day and basically very few deletes, and suddenly there are 510 MB of new data every day while 500 MB of data from the previous day are deleted, the long-term growth is the same, but it could seriously impact retention performance.

Edit: on my smallish example (1.7 million rows in BlocksetEntry), the query without the new index clocks in at 3.2 s, while with the new index it clocks in at 3.7 s. Not only is using the new index logically strange, but the net result is actually slower.

Edit 2: replacing UNION with UNION ALL is a bigger win in this case.
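
For reference, a sketch of the subquery from the original statement with only UNION replaced by UNION ALL; duplicates in a NOT IN subquery do not change the result, so skipping UNION’s duplicate-elimination pass is safe here:

-- Same NOT IN result, but without UNION's duplicate-elimination step
SELECT "Hash", "Size", "VolumeID"
FROM "Block"
WHERE "ID" NOT IN (
    SELECT DISTINCT "BlockID" FROM "BlocksetEntry"
    UNION ALL
    SELECT DISTINCT "ID" FROM "Block", "BlocklistHash"
    WHERE "Block"."Hash" = "BlocklistHash"."Hash"
    );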