Database recreate performance

I have an older machine where I’ve done a test database Recreate that took about 4.5 days (for a 335 MB SQLite file) - so I thought I’d take a look at why database recreates take so long.

I started by copying the destination files to another folder pointed to by a test backup created on a faster machine, then running a Recreate there. This took 3.7 days - I had expected it to be a lot shorter, since this other machine is a lot more powerful and the destination files were local.

Note that all numbers / times here are approximate - all I did was get counts for time ranges then multiply by the middle of the time range. If I get access to a big enough machine to open all 2 GB of the SQL call log file, maybe I can narrow things down a bit… :slight_smile:

What I found was that of the 89.5 hours spent doing the recreate, 24.5 of them involved SQL commands with a duration breakdown as follows (percents are of total SQL time, not total recreate):

Pct   Dur       Calls    Duration range
 0%   0.0 hrs         0  0+ days
 0%   0.0 hrs         0  10-24 hours
10%   2.5 hrs         1  1-10 hours
33%   8.0 hrs        27  10-60 minutes
20%   5.0 hrs        61  1-10 minutes
20%   5.0 hrs       594  10-60 seconds
 9%   2.3 hrs     1,672  1-10 seconds
 6%   1.5 hrs    10,468  0.1-1.0 seconds
 1%   0.2 hrs    16,537  0.01-0.1 seconds
 0%   0.2 min     2,206  0.001-0.01 seconds
 0%   0.9 min   105,165  under 0.001 seconds

(Overlapping ranges assume upper range ends at .999999…)
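For the curious, the estimation method described above can be sketched like this (bucket counts copied from the table; note that a pure midpoint estimate comes out somewhat higher than the table’s 24.5 hours - the table’s “Dur” column for the biggest buckets looks closer to actual logged sums):

```python
# Approximate total SQL time from a duration histogram by assuming each
# call in a bucket took the bucket's midpoint duration (in seconds).
buckets = [
    # (low_s, high_s, calls)
    (3600,  36000,      1),   # 1-10 hours
    (600,   3600,      27),   # 10-60 minutes
    (60,    600,       61),   # 1-10 minutes
    (10,    60,       594),   # 10-60 seconds
    (1,     10,      1672),   # 1-10 seconds
    (0.1,   1.0,    10468),   # 0.1-1.0 seconds
    (0.01,  0.1,    16537),   # 0.01-0.1 seconds
    (0.001, 0.01,    2206),   # 0.001-0.01 seconds
    (0.0,   0.001, 105165),   # under 0.001 seconds
]

total_s = sum((low + high) / 2 * calls for low, high, calls in buckets)
print(f"midpoint estimate: {total_s / 3600:.1f} hours of SQL time")
```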

Some of the slow execs seem to be these…

2.5 hrs / 1 exec STYLE call
2019-01-12 02:28:34 -06 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: ExecuteNonQuery:
INSERT INTO "BlocksetEntry" ("BlocksetID", "Index", "BlockID") SELECT DISTINCT "H"."BlocksetID", "H"."Index", "H"."BlockID" FROM (SELECT "E"."BlocksetID" AS "BlocksetID", "D"."FullIndex" AS "Index", "F"."ID" AS "BlockID" FROM (
        SELECT
            "E"."BlocksetID",
            "F"."Index" + ("E"."BlocklistIndex" * 3200) AS "FullIndex",
            "F"."BlockHash",
            MIN(102400, "E"."Length" - (("F"."Index" + ("E"."BlocklistIndex" * 3200)) * 102400)) AS "BlockSize",
            "E"."Hash",
            "E"."BlocklistSize",
            "E"."BlocklistHash"
        FROM
            (
                SELECT * FROM
                (
                    SELECT
                        "A"."BlocksetID",
                        "A"."Index" AS "BlocklistIndex",
                        MIN(3200 * 32, ((("B"."Length" + 102400 - 1) / 102400) - ("A"."Index" * (3200))) * 32) AS "BlocklistSize",
                        "A"."Hash" AS "BlocklistHash",
                        "B"."Length"
                    FROM
                        "BlocklistHash" A,
                        "Blockset" B
                    WHERE
                        "B"."ID" = "A"."BlocksetID"
                ) C,
                "Block" D
                WHERE
                    "C"."BlocklistHash" = "D"."Hash"
                    AND "C"."BlocklistSize" = "D"."Size"
            ) E,
            "TempBlocklist-C0F15C75B70BE74D98B94D68ECAA5A6B" F
        WHERE
            "F"."BlocklistHash" = "E"."Hash"
        ORDER BY
            "E"."BlocksetID",
            "FullIndex"
) D, "BlocklistHash" E, "Block" F, "Block" G WHERE "D"."BlocksetID" = "E"."BlocksetID" AND "D"."BlocklistHash" = "E"."Hash" AND "D"."BlocklistSize" = "G"."Size" AND "D"."BlocklistHash" = "G"."Hash" AND "D"."Blockhash" = "F"."Hash" AND "D"."BlockSize" = "F"."Size"  UNION SELECT "Blockset"."ID" AS "BlocksetID", 0 AS "Index", "Block"."ID" AS "BlockID" FROM "Blockset", "Block", "TempSmalllist-61B9E4C601A2E24BA94356D21CC3248A" S WHERE "Blockset"."Fullhash" = "S"."FileHash" AND "S"."BlockHash" = "Block"."Hash" AND "S"."BlockSize" = "Block"."Size" AND "Blockset"."Length" = "S"."BlockSize" AND "Blockset"."Length" <= 102400 ) H WHERE ("H"."BlocksetID" || ':' || "H"."Index") NOT IN (SELECT ("ExistingBlocksetEntries"."BlocksetID" || ':' || "ExistingBlocksetEntries"."Index") FROM "BlocksetEntry" "ExistingBlocksetEntries" )
took 0:02:30:27.740
8.0 hrs / 27 execs STYLE call
2019-01-11 23:58:06 -06 - [Profiling-Timer.Finished-Duplicati.Library.Main.Database.ExtensionMethods-ExecuteNonQuery]: ExecuteNonQuery:
INSERT INTO "Block" ("Hash", "Size", "VolumeID") SELECT "FullHash" AS "Hash", "Length" AS "Size", -1 AS "VolumeID"  FROM (SELECT "A"."FullHash", "A"."Length", CASE WHEN "B"."Hash" IS NULL THEN '' ELSE "B"."Hash" END AS "Hash", CASE WHEN "B"."Size" is NULL THEN -1 ELSE "B"."Size" END AS "Size" FROM (SELECT DISTINCT "FullHash", "Length" FROM (SELECT "BlockHash" AS "FullHash", "BlockSize" AS "Length" FROM (
        SELECT
            "E"."BlocksetID",
            "F"."Index" + ("E"."BlocklistIndex" * 3200) AS "FullIndex",
            "F"."BlockHash",
            MIN(102400, "E"."Length" - (("F"."Index" + ("E"."BlocklistIndex" * 3200)) * 102400)) AS "BlockSize",
            "E"."Hash",
            "E"."BlocklistSize",
            "E"."BlocklistHash"
        FROM
            (
                SELECT * FROM
                (
                    SELECT
                        "A"."BlocksetID",
                        "A"."Index" AS "BlocklistIndex",
                        MIN(3200 * 32, ((("B"."Length" + 102400 - 1) / 102400) - ("A"."Index" * (3200))) * 32) AS "BlocklistSize",
                        "A"."Hash" AS "BlocklistHash",
                        "B"."Length"
                    FROM
                        "BlocklistHash" A,
                        "Blockset" B
                    WHERE
                        "B"."ID" = "A"."BlocksetID"
                ) C,
                "Block" D
                WHERE
                    "C"."BlocklistHash" = "D"."Hash"
                    AND "C"."BlocklistSize" = "D"."Size"
            ) E,
            "TempBlocklist-C0F15C75B70BE74D98B94D68ECAA5A6B" F
        WHERE
            "F"."BlocklistHash" = "E"."Hash"
        ORDER BY
            "E"."BlocksetID",
            "FullIndex"
 ) UNION SELECT "BlockHash", "BlockSize" FROM "TempSmalllist-61B9E4C601A2E24BA94356D21CC3248A" )) A LEFT OUTER JOIN "Block" B ON "B"."Hash" =  "A"."FullHash" AND "B"."Size" = "A"."Length" ) WHERE "FullHash" != "Hash" AND "Length" != "Size"
took 0:00:37:33.427

Hopefully I’ll get some time to dig into those two above that account for 43% of the current SQL execution time. Of course if somebody else wants to take a look at them that would be OK too… :slight_smile:

Oh, and I’m running another Recreate test on a smaller job to see if the same percentages apply to the same call types.


For the record, the way this SQL is built dynamically makes it very difficult to optimize things - not just the SQL but also the calling method (I suspect we’re not benefiting from stored execution plans on these repeated calls).

Testing (and ultimately final code) for this may entail “hard coded” SQL. If it improves performance enough, that should be a valid reason to suffer through the pain of maintaining it.


Just want to mention there has been some progress discussed here:

I am close to making a merge request for some significant improvements.


It’s wonderful to hear that, and it’s been a long time coming. Thanks hugely!

And improve recreate too :smiley:

I tried recreate recently and the time was so poor that I deleted the backup instead and redid it in far less time. We definitely need an exceedingly large improvement there (or the feature dropped, in my opinion, as it’s that poor; if it takes longer than a redo then it’s almost worthless).

I’ve been complaining about (and proving) this problem for years. What I’ve noticed recently is that the recreate process gets progressively slower, and it slows down everything else on my macbook as well. I recently upgraded to an M2 macbook with 96GB of RAM, and my biggest rebuild is still taking weeks.

Meanwhile, performance of everything else, including the Chrome browser, is getting worse by the day, even though the macOS Activity Monitor shows little CPU or memory pressure.

For example, when I started the current rebuild a week ago, the GETs from the GDrive remote storage were taking about 15 seconds. Now they’re taking about one minute for the same size blocks. Chrome is taking a long time to load pages, some of which are timing out. Yet Speedtest shows that my gigabit internet (wired) is still performing at full speed.

(Rant on)

What kills me is that the database is so darn delicate!

Time Machine tries to back up the database while Duplicati is running a task: database corrupted, delete and recreate. Crash/reboot the machine while Duplicati is running a task: database corrupted, delete and recreate. The cat walks over the laptop and disconnects the external source hard drives while Duplicati is running a task: database corrupted, delete and recreate.

Simple database repair is rarely successful (“database repair was attempted but did not complete, please delete and recreate the database”).

And… there is no checkpoint feature (that I know of) to allow me to pause ANY operation (backup, rebuild, etc.), reboot my machine, and resume the operation. I haven’t been able to run Time Machine for weeks while waiting for the current delete/recreate to complete.

I am a long-time, early adopter, user of Duplicati, and have worked closely with the developers on occasion to test bug fixes. I truly believe in Duplicati, as a way to ensure that if my on-site storage is destroyed or corrupted, I’ll always have an off-site backup. I am deeply indebted to the volunteers who develop and improve Duplicati, and hope you can forgive my frustration.

(Rant off)

Thanks for listening… Steve


re: database prone to corruption: all databases can be corrupted; the specificity of SQLite is that it is an embedded database, so if you kill the application, the database engine is killed with it and no database transaction can unroll the changes cleanly.
The only workaround is to avoid trouble, really - I don’t want to give my thoughts about the cat, this being a family forum after all.

This said, Duplicati has code for repairing the database without recreating it. I have tried it a few times and it has mostly worked. Code can always be made more robust, but recreate is not always needed.

re: recreate slow
Recreate is slow because some queries are not well optimized for big databases. The kill factor is not so much the size of the data to back up as the number of blocks (the size of the data divided by the unit of deduplication, 100 K by default). So people with big backups, usually over 1 TB - though the true unit of size with Duplicati is the block count - are raising the dedup block size.
If with a 100 GB database you have, say, 3 million blocks, it will begin to be seriously slow, but if you raise the block size by a factor of 10 (1 MB), you have only 300,000 blocks and it’s tolerably fast, the price being that you may get a bit less deduplication. Disk speed obviously matters too, since SSDs are at least 10 times faster. If you raise the block size, you have to recreate your backup, obviously.
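The block arithmetic behind that advice can be sketched as follows (a rough estimate that ignores per-file rounding; the sizes are illustrative):

```python
def block_count(source_bytes: int, blocksize: int) -> int:
    # Roughly, each blocksize-sized chunk of source data becomes one block
    # (real counts are a bit higher since each file rounds up separately).
    return -(-source_bytes // blocksize)  # ceiling division

TB = 1024 ** 4
KB, MB = 1024, 1024 ** 2

# Default 100 KB blocksize on a 1 TB source:
print(block_count(1 * TB, 100 * KB))   # ~10.7 million blocks
# Raising blocksize to 1 MB cuts that by roughly a factor of 10:
print(block_count(1 * TB, 1 * MB))     # ~1 million blocks
```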

I have pushed experimental code for better database recreation toward the Github project:

but so far no one has tried it; obviously using code written by someone who is not a Duplicati expert is a risk. I have tested it and it seems to work, but there are a lot of unknown unknowns :slight_smile:

I suppose I could test it, but I’d rather work around the problem by not recreating that way. Just create new instead and problem solved. New is a better way to go as it’s a protection.

If you ask me, Duplicati is too complicated and code should be removed. Remove the recreate old stuff and just stick a call to new. Simplified, project is less complicated, less to fix, less problems. Done.

Do you mean zap all the backups, empty the backend and create a new backup ?

Have you tested Exclude files from a Time Machine backup on Mac (macOS User Guide)?
Note Apple’s note about APFS local snapshots, but I see you’re talking about backup here.

That might be an SSD downgrade, although it might depend on what model you got.
I don’t know all the Apple model details, but you can search for yours. One example:

512GB version of the new MacBook Pro has a slower SSD than the Mac it replaces
“Apple is using fewer chips in M2 Macs to provide the same amount of storage.”

View disk activity in Activity Monitor on Mac sounds tricky to interpret, but any clues?
I’d prefer queue length or a load average including I/O wait. I’m not a macOS expert.

Duplicati effects on other things come from something, so keep on looking if you like.
You can also make Duplicati more polite, yielding to other demands but slowing itself.

use-background-io-priority would be the one for drive contention not yet investigated.
thread-priority may help CPU, however the breadth of the slowdown suggests it’s OK
except maybe for programs which can’t use multiple cores easily, such as Duplicati’s
SQLite database, meaning it could be going flat out on one core at 10% load over 10.

Big backups (e.g. when over 100 GB) such as some of yours need a larger blocksize,
because without that some of the SQL queries get really slow. You can watch them in
About → Show log → Live → Profiling if you like. Also see the @gpatel-fr comments.

That’s likely RAM resident. If so it wouldn’t notice an issue if the SSD was being slow.
Chrome and pretty much any browser will write the download to the drive as a cache.

Assuming that’s Duplicati Get, first make sure this is the Started to Completed time.

2022-10-17 16:52:32 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Started: (45.16 MB)
2022-10-17 16:52:39 -04 - [Profiling-Duplicati.Library.Main.BackendManager-DownloadSpeed]: Downloaded and decrypted 45.16 MB in 00:00:06.6162567, 6.83 MB/s
2022-10-17 16:52:39 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: (45.16 MB)

File does get written to the drive. Above “politeness” options can slow your download.
I’m not sure if any decryption slowness (per middle line) will delay the Completed line.

If you want to try a more pure download test, Export As Command-line and run get in
Duplicati.CommandLine.BackendTool.exe and see if that also takes a minute to finish.

Any breakage needs a detailed description. To just say it “corrupted” leads us nowhere.
Feel free to cite previous topics if you think we’ve beaten against some of those cases.
I’m pretty sure I haven’t been in one involving the cat, but that’s especially odd, since a
source interruption (while maybe risky to the source drive) is very far removed from DB.

Ideally description has steps reproducible on any OS, thus allowing more people to test.
If need be, we could probably ask @JimboJones who I think actually has macOS to run.

I sure hope not. Some people like their file history. We try hard to try to let them keep it…
Beyond that, if a database recreate isn’t there, it impedes disaster recoveries (drive loss).
Other ways such as Duplicati.CommandLine.RecoveryTool.exe are for emergencies only.

In the current case, especially with any large backup, blocksize change needs fresh start.
I heard talk of 12 TB of backups, so keep the current 100 GB rough rule of thumb in mind.

Please change default blocksize to at least 1MB #4629 can boost 100 GB advice to 1 TB.
Despite years of SQL improving, Duplicati still slows down when holding too many blocks.
Solving that is simple (bigger blocksize) if one knows in advance or is willing to fresh-start.

I understand the reasoning on why someone wouldn’t want that. That’s why they’re not the same. But there are pros and cons to both. At the same time, waiting days, weeks, or months for a recreate to hopefully finish isn’t a good thing either.

Big con is that Duplicati is too big a project, too slow to fix everything, too much to fix, etc. It really should be slimmed down. But, that’s just my viewpoint and not that it would happen.

You’re welcome to say nooooo NOOOOO lol. I’d axe right out instantly.

Sure thing, let me know what you need. I don’t have any M1/M2s but they are up-to date versions of macOS.

would maybe not be coming up if, for example, there was only CLI (but people would Control-C more).
Do CLI-only backup programs attempt real-time control I wonder? My guess is no, but might be wrong.
You give people a GUI with buttons that work in some situations, not in others, and you get complaints.

The pause issue is explained elsewhere in the forum. At least currently, it mainly stops scheduled work.

Back to Control-C or whatever equivalent one’s OS uses to rudely kill some CLI process that’s running:

SQLite puts up with this quite well as far as I see, but application also needs to handle commit/rollback.
There are a couple of open issues on that which I found by accident then confirmed with lots of killing…
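The commit/rollback behavior is easy to demonstrate with Python’s sqlite3 (my own minimal sketch, not Duplicati code): an uncommitted transaction simply disappears when the connection goes away, which is the journal doing its job - the application’s responsibility is to commit at points where the on-disk state is actually consistent.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.sqlite")

con = sqlite3.connect(path)
con.execute("CREATE TABLE t (x INTEGER)")
con.execute("INSERT INTO t VALUES (1)")
con.commit()                              # durable: survives anything
con.execute("INSERT INTO t VALUES (2)")   # sits in an open transaction
con.close()  # closing without commit discards it; a rude kill would have
             # the same effect via journal recovery on the next open

con2 = sqlite3.connect(path)
rows = con2.execute("SELECT x FROM t").fetchall()
print(rows)  # [(1,)] - the uncommitted row is gone
```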

My current work is to try to see if I can find how to make recreate get into its dreaded dblock downloads. Without those, it’s not so bad. Until then, we rely on working with the user community to get their details.

Maybe you could take a try at seeing if you can get Time Machine interference with Duplicati database, however a really quick Internet search wasn’t showing me impacts on SQLite, which might be possible. Both programs might want exclusive use of the database at the same time, and the OS would not allow.

If so, maybe you can test the exclusion instead of risking actual slow operation @StevenKSanford is in. Consulting on macOS would also be helpful. How and using what tools can one examine performance?

Some of the other issues may be cross-platform, but they pretty much all are too vague to comment on. Suggestion might be to either open or locate relevant topics. This one is already off-topic, per topic title.

“Database recreate performance” is related to hardware, backup size, blocksize, damage to destination causing dblock downloads (from 70% and beyond on progress bar, and can also be seen in the live log), SQL query code, internal code design which might reuse general code rather than recreate-specific, etc.

@StevenKSanford how about detailing exactly and accurately (quoted message doesn’t exist) the steps leading to any database corruption? If you can do it, maybe someone else (on macOS or not) could test. Even after you know the right message, you might be able to search for some relevant prior forum notes.

You might be trying to say “The database was attempted repaired, but the repair did not complete. This database may be incomplete and the repair process is not allowed to alter remote files as that could result in data loss.” which could be looked up in code to see what it means. Or maybe you saw something else.

the real problem, I think, is that for each dblock, Duplicati runs these 2 queries that I have tried to optimize - and their performance seems to degrade exponentially with the number of blocks. That’s what I hope to fix; if performance instead degrades proportionally, it’s as good as it can get.

Yes it may seem like that, but like in all things it’s necessary to look at the problems and evaluate them to solve the more pressing one.
In my opinion, the biggest short-term Duplicati problem is by far that it does not scale well. I think that my 2 PRs fix one of the big performance roadblocks (database list), and make a good advance toward solving the other major performance problem (database recreate).

I don’t think that the Duplicati code is unmaintainable. The file LocalDatabaseRecreate.cs is mostly fine; my only nit is that the main routine should be subdivided into 3 sub-functions for the 3 steps needed to recreate the database - it’s too big, and that’s what makes it difficult to understand.

Now, the function I touched in LocalDatabase.cs, that’s another story. I don’t rewrite a function instead of fixing it for nothing. There are some dark parts that were probably written in bad conditions while time-starved; they are not easy to read at all and it’s probably better to redo them.

Some people describe it as a cliff. If it can degrade more slowly with blocks, that’d be great, however there’s also quite a bit of download time if it winds up downloading all dblocks, especially if that doesn’t find enough. My SQL skills are not quite up to digging deeply into the three phases. Third is the worst.

Both directions may help, and there are also the general (somewhat off topic here) corruption mysteries (some, and some others are known with issues filed, waiting for work). Maybe we’ll hear what’s up here.

Thanks for digging in.

why would Duplicati download all dblocks? You have to understand what it is doing. From the dlist files, Duplicati gets a description of the existing files, each with either its first block or a pointer to a list of blocks. The problem is then to find where in the dblock files these blocks are. This is solved by the index files, which provide the map of blocks (the 102 K elemental units) to dblocks (the 50 MB files). Duplicati needs to download dblocks only when it can’t find the available data for some files. If all index files are healthy, there is no need to download dblocks, and Duplicati doesn’t download any in this case.
So downloading dblocks happens when the backend is damaged. The backend can be damaged for many reasons: bad hardware, network glitches, client computer crashes, bugs, and (maybe the worst of all) operator errors such as misconfigurations, mixing databases, or manual cleaning of the backend.

Basically, the only thing that can be done at the Duplicati level - fixing bugs - is extremely difficult based on this forum’s postings, which are most often half baked. So the real good thing to do is to remedy, that is, make the time to recover less painful.

if the crash is happening because the application crashes ‘cleanly’ - that is, it abnormally ends but exits according to the operating system rules - yes. If the task is killed by something like kill -9 under Linux, or the operating system itself abends, it’s very doubtful that SQLite can always unroll the transaction. Not to mention the case where the transactions are not programmed correctly, of course.

Thank you all for listening. I’ll try to address your questions…

re: error message, here are the specific error messages I’ve seen, drawn from the logs; each is followed by paragraphs of code traces:

“Attempt to write a read-only database”

this one I’m pretty sure happens because Time Machine is trying to back up the database, and locks it. Suspending use of Time Machine while running Duplicati has made this one go away.

“The database was attempted repaired, but the repair did not complete. This database may be incomplete and the backup process cannot continue. You may delete the local database and attempt to repair it again.”

this one appears when I try to back up to, or rebuild, a database whose previous rebuild was interrupted, usually because I had to reboot my machine or kill Duplicati.

“Some kind of disk I/O error occurred”

this one happens when I lose contact with the source (external) drive because of a USB error (all of the external USB disks disconnecting simultaneously) or Thunderbolt 2 cable being disturbed by me or the cat :frowning: . This should happen less often with the newer USB C connectors, which are less easily disturbed. However, I cannot keep my (two) cats out of my home office.

re: blocksize

years ago I decided to use 1GB blocksize because I have a lot of video files, and because I determined from testing that Duplicati target file (…aes) processing was not so much dependent on blocksize as on the number of files (blocks), so I opted for larger blocks. I now have gigabit fiber-optic internet from my ISP, and am wired directly into the router. I am backing up source folders that are in the 1-6 TB range.

re: hardware, I’m on a brand new (2023) M2 Max MacBook Pro, running macOS Ventura, with 96GB of RAM and 4TB SSD.

this is considerably faster than my previous (2014) Intel I7 MacBook Pro with 16GB RAM and 1TB SSD, but still suffers from overall slowdown on rebuilds lasting more than a day.

re: pause ANY operation

this means that I rebooted, crashed, or killed the mono-sgen64 process in Activity Monitor. My experience with the GUI “Stop Now” and “Stop after current file” features has been that they do not stop Duplicati, at least in the timeframe I’m willing to wait, particularly if my whole system is creeping along. Once Duplicati is killed, performance returns to normal.

re: waiting weeks or months for recreate to hopefully finish

I’ve waited as long as six weeks for a recreate to complete, but usually the impact (slowdown) on my system causes something else to break, requiring a reboot, or at least killing the recreate. For example, after my post yesterday, the macOS Finder app stopped responding and could not be relaunched. I let it run overnight, but the next morning the machine was frozen, so I was forced to reboot.

re: the database is so darn delicate

by “corrupted”, I mean that I cannot continue the previous operation, and Duplicati recommends that I rebuild the database. The rebuild then fails, and advises me to recreate the database. Or, I get the infamous count-mismatch error (sorry, I don’t have one to paste, so this is from memory), “…in version ##, found xxxxxxx blocks, expected yyyyyy blocks”, for which the best solution I found (in the fora) is to delete the version and try again, which usually finds a count-mismatch in another version, so I delete the versions until I get a clean run.

re: “Gets … were taking 15 seconds”

looking at the log with the “explicit” filter, I see a message that says (from memory), “the remote GET operation took 00:00:0x.xx.xx”

re: SSD

this is a brand new MacBook Pro with a 4TB internal SSD. I keep the Duplicati databases on the internal SSD.

re: Speedtest

Speedtest tells me if there is an issue with my ISP. I’ve recently switched to a fiber-optic provider, so upload and download have similar speeds in the 900 Mbps range. My previous ISP was cable, with 800 Mbps download speeds but only 10-30 Mbps upload, and subject to local network congestion, which lowered transfer speeds.

re: Activity Monitor

CPU, Memory, Disk, and Network loading do not seem to indicate any bottlenecks, and are consistent with normal use of the system.

re: Exclude Duplicati files from backup

This is an option, of course, but I’d rather have the Duplicati DBs backed up to Time Machine, as a rule. I could use it as a temporary workaround when doing long-running rebuilds, I guess.

re: kill the DB and start over

I have been using Duplicati for years, and am trying to retain my backup history, rather than starting over, but that seems to be unavoidable now.

re: some background on what’s going on here:

Last Spring, I was advised that my GDrive (provided by my University) was going away, and that I should migrate to a personal GDrive. I had to migrate about 20TB of data, most of which is Duplicati blocks. Over the last year, through many iterations and issues with Google Drive support, I’ve managed to get most of my Duplicati target folders moved. One of my workarounds was to download the target folders to a local drive, then upload to the new GDrive. Various errors occurred, and the local DBs have had to be rebuilt multiple times, using either the local or remote copy of the target blocks. Usually a simple rebuild failed for reasons discussed above, and I’ve had to recreate the DB. Of the half-dozen Duplicati backup-sets that I’ve migrated, I’m down to the largest target folder (4TB), which has been uploaded, but getting the DB to work without throwing the aforementioned errors has been problematic. When my machine froze this morning, it had been rebuilding since Feb 11th. Once (last Fall) I was able to rebuild this DB using the local copy of the target blocks in “only” a single six-week stretch without interruptions (on a slower laptop). My ISP customer service told me that I hold the all-time record for data volume in several months.

Again, thank you for your patience and perseverance as I struggle with (and vent about) this experience.

are you talking about this parameter:

–blocksize = 100kb
The block size determines how files are fragmented.

or this one:

–dblock-size = 50mb
This option can change the maximum size of dblock files.

if the former, I am afraid that it could make your block larger than the dblock… I don’t think that this is a normal case or that it could even work, and if it works by some miracle it could be a bigger problem than anything. A block size of 1 GB seems, well, extremely dubious to me.

If the latter, it will not fix anything about the database queries. Database query performance depends almost entirely on the block size, not so much on the dblock size.