Hello
this is what I have found
select * from fileset;
679|1408|5937|1|1673958235
695|1439|6030|1|1676636816
720|1470|6258|1|1679315747
752|1502|6428|1|1682080527
784|1535|6614|1|1684930943
800|1551|6706|1|1686313216
802|1553|6721|1|1686485740
804|1555|6734|1|1686658500
805|1556|6741|1|1686744970
807|1562|6750|1|1686917700
this is what I did to emulate a retention handling:
delete from fileset where id in (800,802,804);
DELETE FROM "FilesetEntry" WHERE "FilesetID" NOT IN (SELECT DISTINCT "ID" FROM "Fileset");
delete from fixedfile WHERE "ID" NOT IN (SELECT DISTINCT "FileID" FROM "FilesetEntry") ;
DELETE FROM "Metadataset" WHERE "ID" NOT IN (SELECT DISTINCT "MetadataID" FROM "Fixedfile");
DELETE FROM "Blockset" WHERE "ID" NOT IN (SELECT DISTINCT "BlocksetID" FROM "Fixedfile" UNION SELECT DISTINCT "BlocksetID" FROM "Metadataset");
DELETE FROM "BlocksetEntry" WHERE "BlocksetID" NOT IN (SELECT DISTINCT "ID" FROM "Blockset");
DELETE FROM "BlocklistHash" WHERE "BlocksetID" NOT IN (SELECT DISTINCT "ID" FROM "Blockset");
And here are a few tests with my system (according to cpubenchmark, my processor is scoring 3000 passmark while yours is at 3500 if it’s a xeon e5-2407v2 like you implied in Error: The socket has been shut down - #63 by tkohhh.
I begin with the database at its primary state (as extracted),with data deleted as above.
sqlite> select count(*) from (SELECT "Hash", "Size", "VolumeID" FROM "Block" WHERE "ID" NOT IN (SELECT DISTINCT "BlockID" AS "BlockID" FROM "BlocksetEntry" UNION SELECT DISTINCT "ID" FROM "Block", "BlocklistHash" WHERE "Block"."Hash" = "BlocklistHash"."Hash"));
37
Run Time: real 185.880 user 81.729881 sys 89.865376
sqlite> select count(*) from (SELECT "Hash", "Size", "VolumeID" FROM "Block" WHERE "ID" NOT IN (SELECT DISTINCT "BlockID" AS "BlockID" FROM "BlocksetEntry" UNION ALL SELECT DISTINCT "ID" FROM "Block", "BlocklistHash" WHERE "Block"."Hash" = "BlocklistHash"."Hash"));
37
Run Time: real 29.486 user 21.750463 sys 3.445500
sqlite> select count(*) from (SELECT "Hash", "Size", "VolumeID" FROM "Block" WHERE "ID" NOT IN (SELECT "BlockID" AS "BlockID" FROM "BlocksetEntry" UNION ALL SELECT "ID" FROM "Block", "BlocklistHash" WHERE "Block"."Hash" = "BlocklistHash"."Hash"));
37
Run Time: real 179.944 user 69.144392 sys 87.384809
sqlite> select count(*) from (SELECT "Hash", "Size", "VolumeID" FROM "Block" WHERE "ID" NOT IN (SELECT "BlockID" AS "ID" FROM "BlocksetEntry" UNION ALL SELECT "ID" FROM "Block", "BlocklistHash" WHERE "Block"."Hash" = "BlocklistHash"."Hash"));
37
Run Time: real 165.787 user 69.276944 sys 88.195351
sqlite> select count(*) from (SELECT "Hash", "Size", "VolumeID" FROM "Block" WHERE "ID" NOT IN (SELECT DISTINCT "BlockID" AS "ID" FROM "BlocksetEntry" UNION ALL SELECT DISTINCT "ID" FROM "Block", "BlocklistHash" WHERE "Block"."Hash" = "BlocklistHash"."Hash"));
37
Run Time: real 24.444 user 21.569854 sys 2.658428
sqlite> pragma cache_size=-200000;
Run Time: real 0.000 user 0.000000 sys 0.000057
sqlite> select count(*) from (SELECT "Hash", "Size", "VolumeID" FROM "Block" WHERE "ID" NOT IN (SELECT DISTINCT "BlockID" AS "ID" FROM "BlocksetEntry" UNION ALL SELECT DISTINCT "ID" FROM "Block", "BlocklistHash" WHERE "Block"."Hash" = "BlocklistHash"."Hash"));
37
Run Time: real 24.499 user 21.804207 sys 2.648285
sqlite> drop index "nnc_BlocksetEntry";
Run Time: real 8.284 user 1.432280 sys 5.766403
sqlite> pragma analyze;
Run Time: real 0.000 user 0.000050 sys 0.000000
sqlite> pragma optimize;
Run Time: real 0.002 user 0.000601 sys 0.000000
sqlite> select count(*) from (SELECT "Hash", "Size", "VolumeID" FROM "Block" WHERE "ID" NOT IN (SELECT DISTINCT "BlockID" AS "ID" FROM "BlocksetEntry" UNION ALL SELECT DISTINCT "ID" FROM "Block", "BlocklistHash" WHERE "Block"."Hash" = "BlocklistHash"."Hash"));
37
Run Time: real 24.581 user 22.079502 sys 2.182641
sqlite> select count(*) from (SELECT "Hash", "Size", "VolumeID" FROM "Block" WHERE "ID" NOT IN (SELECT DISTINCT "BlockID" AS "ID" FROM "BlocksetEntry" UNION ALL SELECT DISTINCT "ID" FROM "Block", "BlocklistHash" WHERE "Block"."Hash" = "BlocklistHash"."Hash"));
37
Run Time: real 24.190 user 21.979442 sys 2.172846
sqlite> select count(*) from (SELECT "Hash", "Size", "VolumeID" FROM "Block" WHERE "ID" NOT IN (SELECT DISTINCT "BlockID" AS "ID" FROM "BlocksetEntry" UNION SELECT DISTINCT "ID" FROM "Block", "BlocklistHash" WHERE "Block"."Hash" = "BlocklistHash"."Hash"));
37
Run Time: real 35.799 user 32.849523 sys 2.872120
Even in the worst case I am in the 3 minutes ballpark.
Fun details: adding an index has a (bad) impact only because the query is using (erroneously) a simple ‘UNION’. If using UNION ALL having (and using) this index has no impact. Using DISTINCT has actually a good impact on performance.
Your system is more in the 40 minutes territory.
So it seems to have terrible performance. Not sure if it’s because you have a HDD and my system as SSD. It don’t seem enough to explain it to me.
A very common reason for bad performance with Linux systems is swap fully used.
It’s difficult to check for that if not having the backup running while connected to the computer - I don’t remember having seen an alarm logged somewhere ‘swap full !’. OTOH if you have a monitoring system it’s easy to check.