The query is simply checking that all files are recorded correctly in the database (i.e. that the length recorded for each file is the same as the sum of the sizes of the file’s chunks).
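For illustration, here is a minimal sketch of that kind of consistency check. The schema below is a made-up, simplified stand-in (the table and column names are assumptions for the example, not Duplicati’s actual DDL):

```python
import sqlite3

# Assumed, simplified schema for illustration only -- not Duplicati's real DDL.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Blockset      (ID INTEGER PRIMARY KEY, Length INTEGER);
CREATE TABLE Block         (ID INTEGER PRIMARY KEY, Size INTEGER);
CREATE TABLE BlocksetEntry (BlocksetID INTEGER, BlockID INTEGER);
""")

# One consistent blockset (100 = 60 + 40) and one broken one (50 != 60).
con.executemany("INSERT INTO Blockset VALUES (?, ?)", [(1, 100), (2, 50)])
con.executemany("INSERT INTO Block VALUES (?, ?)", [(1, 60), (2, 40), (3, 60)])
con.executemany("INSERT INTO BlocksetEntry VALUES (?, ?)",
                [(1, 1), (1, 2), (2, 3)])

# Find blocksets whose recorded length disagrees with the sum of their blocks.
rows = con.execute("""
SELECT bs.ID, bs.Length, SUM(b.Size)
FROM Blockset bs
JOIN BlocksetEntry be ON be.BlocksetID = bs.ID
JOIN Block b ON b.ID = be.BlockID
GROUP BY bs.ID
HAVING bs.Length != SUM(b.Size)
""").fetchall()
print(rows)  # → [(2, 50, 60)]
```

With large row counts, the cost of this join/aggregate is exactly where an index (or the lack of one) starts to matter.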
I am guessing that you have a non-SSD disk, and that the backup database is fairly large. Can you provide some numbers:
- What is the size of the database?
- How many rows are in “Blockset” ?
- How many rows are in “Block” ?
- How many rows are in “BlocksetEntry” ?
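For anyone who wants to pull these numbers themselves, here is a quick sketch using Python’s built-in `sqlite3` module. The path and the empty demo tables are placeholders so the example runs standalone; against a real Duplicati database you would point `db_path` at the actual file and skip the CREATE step:

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Placeholder path; substitute the location of your real database file.
    db_path = os.path.join(tmp, "backup.sqlite")
    con = sqlite3.connect(db_path)

    # Demo-only: create empty versions of the three tables so this runs
    # standalone. Skip this against a real database.
    con.executescript("""
    CREATE TABLE IF NOT EXISTS Blockset      (ID INTEGER PRIMARY KEY);
    CREATE TABLE IF NOT EXISTS Block         (ID INTEGER PRIMARY KEY);
    CREATE TABLE IF NOT EXISTS BlocksetEntry (BlocksetID INTEGER);
    """)
    con.commit()

    counts = {}
    for table in ("Blockset", "Block", "BlocksetEntry"):
        # Table names are identifiers and cannot be bound as parameters.
        counts[table] = con.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]

    print("database size:", os.path.getsize(db_path), "bytes")
    print(counts)
    con.close()
```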
Maybe I can artificially create a database of that size and run some performance tests against it.
Heh - I was thinking the same thing. A few VM disk images with a --block-size of 1KB should help isolate high BLOCK count from high FILE count issues…
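A rough sketch of what such a synthetic test database could look like, again using an assumed, simplified schema rather than Duplicati’s real one. Scaling the constants up toward the real row counts would let you time the length-check query in isolation:

```python
import random
import sqlite3
import time

# Scale these up (e.g. toward ~750k blocksets) to stress-test the query.
N_BLOCKSETS = 1000
BLOCKS_PER_SET = 2

# Assumed, simplified schema for perf testing -- not Duplicati's real DDL.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Blockset      (ID INTEGER PRIMARY KEY, Length INTEGER);
CREATE TABLE Block         (ID INTEGER PRIMARY KEY, Size INTEGER);
CREATE TABLE BlocksetEntry (BlocksetID INTEGER, BlockID INTEGER);
""")

block_id = 0
for bs_id in range(1, N_BLOCKSETS + 1):
    sizes = [random.randint(1, 102400) for _ in range(BLOCKS_PER_SET)]
    # Record the true sum, so the generated data is internally consistent.
    con.execute("INSERT INTO Blockset VALUES (?, ?)", (bs_id, sum(sizes)))
    for size in sizes:
        block_id += 1
        con.execute("INSERT INTO Block VALUES (?, ?)", (block_id, size))
        con.execute("INSERT INTO BlocksetEntry VALUES (?, ?)", (bs_id, block_id))
con.commit()

start = time.perf_counter()
bad = con.execute("""
SELECT COUNT(*) FROM (
  SELECT bs.ID FROM Blockset bs
  JOIN BlocksetEntry be ON be.BlocksetID = bs.ID
  JOIN Block b ON b.ID = be.BlockID
  GROUP BY bs.ID
  HAVING bs.Length != SUM(b.Size))
""").fetchone()[0]
print(bad, "mismatches found in", round(time.perf_counter() - start, 3), "s")
```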
Wanted to revive this topic. Recently came back to trying duplicati again - grabbed the latest version - 22.214.171.124_canary_2017-11-22 - and set up the same backup.
What I’m backing up is an email system - so lots of small files.
This time around, I only added about 1/3 of the total file collection (a smaller subset of domains). The initial backup ran successfully, but subsequent backups, run manually, where only a small handful of files changed (30-40), would take about 3 hours to complete. Watching the logs, it’s generally sitting on the SQLite queries, the longest of which is the one I posted above (Oct 20).
Then I added the complete collection of accounts, and performance is back to what I described in the original post. The initial backup that included the extra files went fine, but subsequent runs hang forever on that file-checking query at the beginning.
The server this runs on is just a single-CPU, 2 GHz virtual machine with a SATA drive attached. So, no SSD, but I’m wondering: do we really need higher horsepower and disk performance to run this? Watching the machine, the CPU is pretty much pegged during this query, but disk I/O is almost nil, so it doesn’t seem to be swapping.
To return to @kenkendk’s questions above, this current database is as follows:
- What is the size of the database? --> 509M
- How many rows are in “Blockset”? --> 753146 rows
- How many rows are in “Block”? --> 1283320 rows
- How many rows are in “BlocksetEntry”? --> 1257365 rows
Going on 15 hours with the file-check query - happy to provide the DB for testing if necessary…
Quick followup - this HAS to be something with my server / virtual host.
As a test, I copied the file up to a droplet at DigitalOcean (2 GB RAM, 2 CPUs, SSD drive) just to see the results. I ran the long-running query manually on the DB and it took about 10 seconds to complete. That’s insane compared to the DAYS it’s taking on the other server, especially since on paper the droplet isn’t that much beefier than the primary server (1 CPU, 2 GB RAM). Can’t explain it. The same server runs an email server for about a dozen domains, plus webmail, all day long with no performance issue. I noticed on the DO instance that SQLite only uses a single CPU, which is expected.
*shrugging shoulders* No idea on the vast performance difference. I’ll try migrating said server and see if that helps at all.
I am very surprised by your results as well. I hope you find out what is causing it.
Quick idea: could it be that the SQLite version on the slow machine is older?
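One quick way to compare is to print the SQLite library version on each machine. Note this shows the version Python is linked against, which is only indicative: Duplicati may well ship its own bundled SQLite, so check Duplicati’s own logs or about page for the version it actually uses.

```python
import sqlite3

# Version of the SQLite library this Python process is linked against.
# Compare the output on the slow machine vs. the fast one.
print("SQLite library version:", sqlite3.sqlite_version)
```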