At a 1 KB blocksize (the minimum), I guess that means about that many GB. I don’t have that much lying around as installed programs (which are less likely to contain personal paths), so maybe I can ransack my collection of Windows ISOs (all obsolete now…), copy those, and hope that having only a few files still shows the issue.
Before trying that, I guess I should ask @rsevero whether some similar approach is possible on local systems, because transferring the database might be challenging. I might try WeTransfer, which I think allows 2 GB.
Running now, but the poor mechanical hard drive in this desktop is very busy (queue length is about 5).
This is just an unencrypted backup with 1 KB blocksize of four folders checked in GUI to a local folder.
The only slightly tricky part should be that blocksize is on Options screen 5, under Advanced options.
I can certainly help with any usage questions, and to some extent with the DB layout and how it’s used. Knowing how the backup process works helps too, although that doesn’t really get down to the SQL table level.
As for my testing to build a big database, I got one, but it’s not slow. I wonder whether more files would slow it down?
I’m not too keen on going wholly custom on data building. I wonder whether throwing in lots of small files might slow it?
Kahomono’s DB ran the new query (after editing File to FixedFile) for 11 hours before I stopped it.
Process Monitor showed it reading the database heavily. Process Explorer suggested about 250 MB/sec, yet almost none of it got through Windows to the disk, which Task Manager often showed as idle.
DB Browser for SQLite results of the query on my database (not using a dump, just directly with it):
rows 0 2970 with 760 good, meaning the behavior of the query changed.
time 20 325
One approximation would be to use the FixedFile view. Admittedly, performance behavior may differ from the actual case, because there’s no longer a File view that has to go to other tables to return the result to Duplicati.
CREATE VIEW "File" AS SELECT "A"."ID" AS "ID", "B"."Prefix" || "A"."Path" AS "Path", "A"."BlocksetID" AS "BlocksetID", "A"."MetadataID" AS "MetadataID" FROM "FileLookup" "A", "PathPrefix" "B" WHERE "A"."PrefixID" = "B"."ID"
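To make the view’s mechanics concrete, here is a minimal, self-contained sketch of how that join reassembles full paths. The FileLookup and PathPrefix tables here are stand-ins with only the columns the view references; Duplicati’s real schema has more columns, and the sample path is made up.

```python
import sqlite3

# Hedged sketch: minimal stand-ins for Duplicati's FileLookup and
# PathPrefix tables (only the columns the CREATE VIEW above mentions).
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE "PathPrefix" ("ID" INTEGER PRIMARY KEY, "Prefix" TEXT);
CREATE TABLE "FileLookup" (
    "ID" INTEGER PRIMARY KEY,
    "PrefixID" INTEGER,
    "Path" TEXT,
    "BlocksetID" INTEGER,
    "MetadataID" INTEGER
);
CREATE VIEW "File" AS
SELECT "A"."ID" AS "ID",
       "B"."Prefix" || "A"."Path" AS "Path",
       "A"."BlocksetID" AS "BlocksetID",
       "A"."MetadataID" AS "MetadataID"
FROM "FileLookup" "A", "PathPrefix" "B"
WHERE "A"."PrefixID" = "B"."ID";
""")
con.execute('INSERT INTO "PathPrefix" VALUES (1, ?)', ("C:\\Users\\me\\",))
con.execute('INSERT INTO "FileLookup" VALUES (1, 1, ?, 10, 20)', ("notes.txt",))

# The view joins the two tables to rebuild the full path on demand,
# which is why every query through "File" pays for the join.
full_path = con.execute('SELECT "Path" FROM "File"').fetchone()[0]
```

This illustrates why querying through the File view costs more than reading a single table: the prefix lookup happens for every row returned.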
Do you have an opinion on how to slow down the query, after looking at the statistics differences between the slow-but-private DB and my not-so-slow-but-not-so-private DB? Is file count a plausible explanation?
@Kahomono, do you still have the equipment (and can you do scripting) to write dummy data to try to match the problem on your actual data? The guidance in the first paragraph might help figure out the recipe for “slow”.
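As a starting point for such a dummy-data script, here is a minimal sketch that fills a folder with many small random files, so a test backup gets a large file count without any personal data. The file count, size, and naming scheme are guesses for illustration, not a proven recipe for reproducing the slowness.

```python
import os
import random
import string
import tempfile

# Hedged sketch: generate a tree of small random files as non-private
# backup source data. Counts and sizes are arbitrary placeholders.
def make_dummy_tree(root, n_files=1000, size=512, seed=42):
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    os.makedirs(root, exist_ok=True)
    for _ in range(n_files):
        name = "".join(rng.choices(string.ascii_lowercase, k=12)) + ".bin"
        with open(os.path.join(root, name), "wb") as f:
            f.write(bytes(rng.getrandbits(8) for _ in range(size)))
    return len(os.listdir(root))

# Small run for demonstration; a real test would use far more files.
root = os.path.join(tempfile.mkdtemp(), "dummy_src")
count = make_dummy_tree(root, n_files=50)
```

Pointing a test backup job at a folder like this keeps the database shareable, since none of the paths or contents are real.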
Probably possible, given enough tools and coding. A fixed value might be easier, but would that defeat the tuning?
What I’m talking about is scrambling data in the DB (we’re not DB experts), assuming that is what was meant. Duplicati’s dump does a nice job, but it also gets rid of the view for a table (where performance may vary).
The options might be:
1. Try to figure out how to make my non-private backup slow.
2. Try to figure out how to dummy up data to do a slow backup.
3. Try to figure out how to sanitize the DB without hurting tunings.
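For the sanitize-without-hurting-tunings idea, one approach would be to replace each path character with a random one of the same class, keeping lengths and separators intact, on the theory that this disturbs the value-size distributions the query planner sees as little as possible. This is only an illustration of the idea, not a vetted anonymizer, and the example path is made up.

```python
import random
import string

# Hedged sketch: scramble a path while preserving its length, its
# separators, and the letter/digit class of each character, so the
# statistics SQLite keeps about the column change as little as possible.
def scramble_path(path, seed=0):
    rng = random.Random(seed)
    out = []
    for ch in path:
        if ch.isalpha():
            out.append(rng.choice(string.ascii_lowercase))
        elif ch.isdigit():
            out.append(rng.choice(string.digits))
        else:
            out.append(ch)  # keep separators and punctuation as-is
    return "".join(out)

scrambled = scramble_path("C:\\Users\\alice\\tax-2023.pdf")
```

Applied over the path columns with an UPDATE loop, something like this would keep table sizes and string-length distributions while removing the actual names.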
Option 1 is not going well by simply adding more files. I added 40,000 empty files and it didn’t slow down, and the odd thing is that the File view showed only the folder, not its files. I made the files non-empty. Still no change.
What finally got the File view to update (the Refresh button wasn’t enough) was closing and reopening the browser; however, the original query is still pretty fast (22 seconds). One interesting thing is that the same query running in Duplicati hits the disk hard (unlike the query in the browser), and that makes it slow, but not ridiculously so.
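When the same query behaves so differently in DB Browser and in Duplicati, a cheap first check is EXPLAIN QUERY PLAN, which shows whether SQLite picked an index or a full scan in each environment. A minimal sketch with a placeholder table (not Duplicati’s real schema):

```python
import sqlite3

# Hedged sketch: EXPLAIN QUERY PLAN shows the planner's choice without
# running the query. Table and index names here are placeholders.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("CREATE INDEX idx_name ON t(name)")

plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM t WHERE name = ?", ("x",)
).fetchall()
# Each row's last column is a human-readable description of one step.
plan_text = " ".join(row[-1] for row in plan)
```

Running the same EXPLAIN QUERY PLAN against both copies of a real database would show whether a missing index or stale statistics explains the gap.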
Still looking for a good recipe for a complete database that shows slowness…
The summary is that SQLite is reading at a crazy rate (maybe around 200 MB/sec, but not actually from the drive). Writing doesn’t seem to be happening much, which makes me wonder whether the 17-day run made any headway.
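Reads that never reach the physical drive are usually being served from the OS file cache, and the database page size governs how many pages a scan has to touch. As a small illustration, the relevant settings can be inspected with PRAGMA statements; the values returned are simply whatever the local SQLite build defaults to.

```python
import sqlite3

# Hedged sketch: inspect the settings that govern SQLite's I/O
# granularity. page_size is the unit of every read; cache_size is
# SQLite's own in-process page cache (negative means KiB).
con = sqlite3.connect(":memory:")
page_size = con.execute("PRAGMA page_size").fetchone()[0]
cache_size = con.execute("PRAGMA cache_size").fetchone()[0]
```

A huge logical read rate with an idle disk is consistent with the whole file fitting in the OS cache, so the time is going into CPU work over pages rather than physical I/O.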
The other issue at least may have some hope of producing a database bug report with the relevant tables intact, to attempt a performance analysis (and maybe an SQL revamp), because it doesn’t use the File/FixedFile view.
It also had a query that was semi-slow before things fell off the cliff. Maybe that one could be tuned instead.
Ick. I’ve been sort of wondering if there are any SQLite issues in here, but it’s hard to prove. Have you seen any well-explored-and-documented writeups? If any workarounds were identified, that would be even better.
I skimmed this thread and appreciate all the research you’ve done. It seems I would benefit from increasing blocksize from 0.5MB to 5MB. Is a blocksize conversion feature likely to be added at some point? It sure would be helpful so we could keep our backup history without doubling the backup size.
Looks unlikely with current level of community support. All progress depends on community volunteers.
Volunteers are a bit unpredictable. Sometimes they go. Sometimes new ones arrive. It’s hard to plan…
Please read further here on why IMO this is more than just a feature. I’ll highlight, but offer good news.
So in theory you can evaluate it on your own; however, there are other untested changes in there too.
The usual Canary advice applies. Be careful because it contains new fixes and potentially new bugs.
This one is unusually rough in packaging (you can do an install from .zip file) due to signing issues.
It was mainly a quick attempt to solve a Dropbox problem, but it does exist and I’ll be running it soon.
I don’t actually have a working backup job yet to validate. The sanitized database will probably be OK as something to look at, but since there’s no backup we could run operations against, all it will be useful for is seeing table sizes and one-off queries. So it would be good for troubleshooting one person’s issues, but not for broad performance troubleshooting.