Database recreate performance

Why would Duplicati download all dblocks? You have to understand what it is doing. From the dlist files Duplicati gets a description of the existing files, each one with either its first block or a pointer to a list of blocks. The problem is then to find where these blocks are in the dblock files. This is solved by the dindex files, which provide the map of blocks (the ~100 KB elemental units) to dblocks (the 50 MB files). Duplicati needs to download dblocks when it can’t find the available data for some files. If all dindex files are healthy, there is no need to download dblocks, and Duplicati doesn’t download any in this case.
So downloading dblocks happens when the backend is damaged. The backend can be damaged for many reasons: bad hardware, network glitches, client computer crashes, bugs, and (maybe the worst of all problems) operator errors such as misconfigurations, mixing databases, or manual cleaning of the backend.

Basically, the only thing that can be done at the Duplicati level - fixing bugs - is extremely difficult when working from these forum postings, which are most often half-baked. So the really useful thing to do is remediation, that is, making the time to recover less painful.

If the crash happens because the application crashes ‘cleanly’, that is, ends abnormally but exits according to the operating system rules, yes. If the task is killed by something like kill -9 under Linux, or the operating system itself abends, it’s very doubtful that SQLite can always roll back the transaction. Not to mention the case where the transactions are not programmed correctly, of course.

Thank you all for listening. I’ll try to address your questions…

re: error message, here are the specific error messages I’ve seen, drawn from the logs; each is followed by paragraphs of code traces:

“Attempt to write a read-only database”

this one I’m pretty sure is because Time Machine is trying to back up the database and locks it. Suspending use of Time Machine while running Duplicati has made this one go away.

“The database was attempted repaired, but the repair did not complete. This database may be incomplete and the backup process cannot continue. You may delete the local database and attempt to repair it again.”

this one appears when I try to back up to, or rebuild, a database whose previous rebuild was interrupted, usually because I had to reboot my machine or kill Duplicati.

“Some kind of disk I/O error occurred”

this one happens when I lose contact with the source (external) drive because of a USB error (all of the external USB disks disconnecting simultaneously) or the Thunderbolt 2 cable being disturbed by me or the cat :frowning: . This should happen less often with the newer USB-C connectors, which are less easily disturbed. However, I cannot keep my (two) cats out of my home office.

re: blocksize

years ago I decided to use 1GB blocksize because I have a lot of video files, and because I determined from testing that Duplicati target file (…aes) processing was not so much dependent on blocksize as on the number of files (blocks), so I opted for larger blocks. I now have gigabit fiber-optic internet from my ISP, and am wired directly into the router. I am backing up source folders that are in the 1-6 TB range.

re: hardware, I’m on a brand new (2023) M2 Max MacBook Pro, running macOS Ventura, with 96GB of RAM and 4TB SSD.

this is considerably faster than my previous (2014) Intel i7 MacBook Pro with 16GB RAM and a 1TB SSD, but it still suffers from overall slowdown on rebuilds lasting more than a day.

re: pause ANY operation

this means that I rebooted, crashed, or killed the mono-sgen64 process in Activity Monitor. My experience with the GUI “Stop Now” or “Stop after current file” features has been that they do not stop Duplicati, at least not in the timeframe I’m willing to wait, particularly if my whole system is creeping along. Once Duplicati is killed, performance returns to normal.

re: waiting weeks or months for recreate to hopefully finish

I’ve waited as long as six weeks for a recreate to complete, but usually the impact (slowdown) on my system causes something else to break, requiring a reboot, or at least killing the recreate. For example, after my post yesterday, the macOS Finder app stopped responding and could not be relaunched. I let it run overnight, but the next morning the machine was frozen, so I was forced to reboot.

re: the database is so darn delicate

by “corrupted”, I mean that I cannot continue the previous operation, and Duplicati recommends that I rebuild the database. The rebuild then fails and advises me to recreate the database. Or I get the infamous count-mismatch error (sorry, I don’t have one to paste, so this is from memory), “…in version ##, found xxxxxxx blocks, expected yyyyyy blocks”, for which the best solution I found (in the fora) is to delete that version and try again, which usually turns up a count mismatch in another version, so I keep deleting versions until I get a clean run.

re: “Gets … were taking 15 seconds”

looking at the log with the “explicit” filter, I see a message that says (from memory), “the remote GET operation took 00:00:0x.xx.xx”

re: SSD

this is a brand new MacBook Pro with a 4TB internal SSD. I keep the Duplicati databases on the internal SSD.

re: Speedtest

Speedtest tells me if there is an issue with my ISP. I’ve recently switched to a fiber-optic provider, so upload and download have similar speeds in the 900 Mbps range. My previous ISP was cable, with 800 Mbps download speeds but only 10-30 Mbps upload, and it was subject to local network congestion, which lowered the transfer speeds.

re: Activity Monitor

CPU, Memory, Disk, and Network loading do not seem to indicate any bottlenecks, and are consistent with normal use of the system.

re: Exclude Duplicati files from backup

This is an option, of course, but I’d rather have the Duplicati DBs backed up to Time Machine, as a rule. I could use it as a temporary workaround when doing long-running rebuilds, I guess.

re: kill the DB and start over

I have been using Duplicati for years, and am trying to retain my backup history, rather than starting over, but that seems to be unavoidable now.

re: some background on what’s going on here:

Last Spring, I was advised that my GDrive (provided by my University) was going away, and that I should migrate to a personal GDrive. I had to migrate about 20TB of data, most of which is Duplicati blocks. Over the last year, through many iterations and issues with Google Drive support, I’ve managed to get most of my Duplicati target folders moved. One of my workarounds was to download the target folders to a local drive, then upload them to the new GDrive. Various errors occurred, and the local DBs have had to be rebuilt multiple times, using either the local or the remote copy of the target blocks. Usually a simple rebuild failed for the reasons discussed above, and I’ve had to recreate the DB. Of the half-dozen Duplicati backup sets that I’ve migrated, I’m down to the largest target folder (4TB), which has been uploaded, but getting the DB to work without throwing the aforementioned errors has been problematic. When my machine froze this morning, it had been rebuilding since Feb 11th. Once (last Fall) I was able to rebuild this DB using the local copy of the target blocks in “only” a single six-week stretch without interruptions (on a slower laptop). My ISP customer service told me that I hold the all-time record for data volume in several months.

Again, thank you for your patience and perseverance as I struggle with (and vent about) this experience.

are you talking about this parameter:

--blocksize = 100kb
The block size determines how files are fragmented.

or this one:

--dblock-size = 50mb
This option can change the maximum size of dblock files.

if the former, I am afraid that it could make your block larger than the dblock… I don’t think that this is a normal case, or that it could even work, and if it works by some miracle it could cause a bigger problem than anything else. A block size of 1 GB seems, well, extremely dubious to me.

If the latter, it will not fix anything about the database queries. Database query performance depends almost entirely on the block size, not so much on the dblock size.

In addition to those, damage can be caused by software handling files wrongly. Those causes are more controllable.
Damage is also in the eye of the software. In one open issue, an interrupted compact got confused.
It deletes a dindex file (as it should), then gets interrupted before commit, so the next run sees a “missing” file.
This is a fairly harmless nuisance though. I’d like to know whether the software can currently lose a dindex file.

Another approach, in addition to speeding up the worst case, might be detecting pending recreate risk.
Yesterday I was working on a Python script heading towards sanity-checking dblocks versus dindex files.

No argument with any of that. Just wanting to say this may be difficult, just as fixing bugs is difficult…

How To Corrupt An SQLite Database File

If an application crash, or an operating-system crash, or even a power failure occurs in the middle of a transaction, the partially written transaction should be automatically rolled back the next time the database file is accessed.

SQLite is Transactional

The claim of the previous paragraph is extensively checked in the SQLite regression test suite using a special test harness that simulates the effects on a database file of operating system crashes and power failures.

is the SQLite claim anyway. I know some systems must have something in them that goes wrong, as there are sometimes cases where SQLite flat out can’t open its database – a step worse than bad data inside.

It’s not that it’s unmaintainable. It’s that it’s a lot, and apparently too much. It’s been many years on some things. For a backup application where stability is vital, it’s too slow at fixing things. The only two ways around that are more time spent fixing things and slimming it down.

If you’re fine with various things being broken for years, then it’s fine. I’m also fine with it, as I’m not experiencing any issues at the moment, and the issues I know about I know how to avoid.

But, I’d still instantly axe a bunch of things to make it easier to maintain. I will do that with my own projects where I need to.

Of course, recreate might be worth keeping here, assuming it receives enough improvements. It’s a valid viewpoint to say it’s necessary. Personally, I’d never wait more than a day on it. I’d find another way of doing things, or focus on its performance until it can be made fast enough to be happy with.

I would not send PRs to fix them if I were fine with it.
It’s not right, at this point in time, to have dog-slow performance while recreating a DB with a data size of 500 GB. In my tests I have seen performance begin to degenerate at about 1.2 M blocks (equivalent to 120 GB of data with the standard block size). Having to change the block size with 10 TB of data or more could be expected: many users would ask themselves whether backing up 10 TB needs special consideration before starting to configure a backup. Not many will do so with 500 GB.

Try not backing up the database. APFS local snapshots are probably invisibly low-level.
Windows NTFS ones are that way at least, but they do cause a brief suspension of I/O.

So the situation is just what the message says. It’s not really corruption, just unfinished work.
The way to avoid this is to avoid intentional interruption. That may be hard to do with it being so slow…

Is the database on that drive? If so, don’t do that if the drive is prone to being unplugged.
If the database is on the permanent drive, can a source disconnect reliably break a test DB?
How exactly is it messaged? Source drive errors are usually caught and just produce a warning.

Agree with @gpatel-fr’s question. Maybe you’re thinking of the Options screen’s Remote volume size.

(screenshot: the Remote volume size setting on the Options screen)

There’s a link on the GUI screen there to click, or direct link is Choosing sizes in Duplicati.
Remote volume size is a simpler-but-still-quite-confusing term for the dblock-size option.

It’s got to be from something. I don’t know macOS much, so I can’t give step-by-step advice.
Doing a Google search for troubleshooting macOS performance claims 43 million hits though.

This is interesting. For me, it reliably stops after finishing what it’s doing and uploading data.
I’m on Windows. Any one of the Duplicati folks want to see if they can reproduce this issue?

It’s pretty clear some resources are being exhausted, so keep looking for what that could be.
Do you know how to measure the size of processes, such as the mono ones running Duplicati?
One Linux user was observing memory growth, although we don’t know exactly which operation.

A little too vague, though I have some guesses of similar ones.

looks like (on mine) the same as the Started to Completed time I posted. It still leaves open the
question of whether there’s some other work such as decryption that might be pushing time up.

2022-10-17 16:52:39 -04 - [Profiling-Timer.Finished-Duplicati.Library.Main.BackendManager-RemoteOperationGet]: RemoteOperationGet took 0:00:00:06.617
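
If it helps to summarize those, a rough Python sketch like this could pull the GET durations out of a profiling log (the parsing is based on the line format above, where the duration appears to be days:hours:minutes:seconds):

```python
# Rough sketch: pull RemoteOperationGet durations from a profiling log and
# summarize them. Parsing is based on the line format above, where the
# duration appears to be days:hours:minutes:seconds.
import re
import sys

pattern = re.compile(r"RemoteOperationGet took (\d+):(\d+):(\d+):([\d.]+)")
durations = []
with open(sys.argv[1], errors="replace") as log:
    for line in log:
        match = pattern.search(line)
        if match:
            days, hours, minutes, seconds = match.groups()
            durations.append(int(days) * 86400 + int(hours) * 3600
                             + int(minutes) * 60 + float(seconds))

if durations:
    durations.sort()
    print(f"{len(durations)} GETs, median {durations[len(durations) // 2]:.1f}s, "
          f"max {durations[-1]:.1f}s")
```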

If that’s true of the external drive, then how a source error damages a DB gets more mysterious,
although the definition of “corrupted” DB being used here isn’t the usual one that one would use.

as does disk congestion, which is why I’ve been asking. There’s a CLI-based test I described too.

Make sure you understand my comment on CPU cores, but if normal use means not straining it with a
database recreate or something, are you saying the recreate slows things down but all monitors look fine?

If you somehow back it up during backup, it goes instantly obsolete, as it’s changing during backup.
If you back it up while idle and restore a stale copy later, it mismatches and Repair hurts destination.
Database and destination must always match, e.g. you can copy it with Duplicati script after backup.
Configuration data in Duplicati-server.sqlite is less active but usually you Export and save the config.
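
As a minimal sketch of that idea (the paths are placeholders, and Duplicati’s --run-script-after option could invoke something like this once the backup has finished, so the copy always matches the destination):

```python
# Hypothetical post-backup copy of the job database, meant to be run via
# Duplicati's --run-script-after so the copy always matches the destination.
# Both paths below are placeholders - point them at your own job database
# and at a folder that Time Machine is welcome to pick up.
import shutil
import time

DB_PATH = "/Users/me/.config/Duplicati/XXXXXXXXXX.sqlite"  # placeholder
DEST_DIR = "/Users/me/DuplicatiDBCopies"                   # placeholder

stamp = time.strftime("%Y%m%d-%H%M%S")
shutil.copy2(DB_PATH, f"{DEST_DIR}/job-{stamp}.sqlite")
```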

To soften the impact a little, if space allows and the old one is still intact, save it in case you need older files.
A newer one, with a better blocksize and who knows what other old problems removed, may work better.
Sometimes hidden damage is possible and may reveal itself in a few ways, e.g. dblock downloading.

I can think of an ISP action that would be a lot worse than that. It’s nice that yours wasn’t very upset…

if the block size is at the default value, that is, 100 KB, a total data size of 6 TB would mean 60 million blocks to manage. The database size could be, well, maybe 30 GB? In the abstract, such a database size could work all right with simple, optimized queries. But with the current queries it’s way too much. Recreating the backup with a block size of 5 MB would reduce the block count to about 1.2 M, and that would be much more manageable.
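
To make the arithmetic concrete, a quick back-of-the-envelope sketch (treating KB/MB loosely as powers of ten; the exact multiplier doesn’t change the order of magnitude):

```python
# Back-of-the-envelope block counts for ~6 TB of source data at a few
# blocksize choices (KB/MB treated loosely as powers of ten; the exact
# multiplier doesn't change the order of magnitude).
SOURCE_BYTES = 6 * 10**12

for label, blocksize in (("100 KB", 100_000), ("1 MB", 1_000_000), ("5 MB", 5_000_000)):
    print(f"{label:>6}: ~{SOURCE_BYTES / blocksize / 1e6:.1f} million blocks")
# 100 KB: ~60.0 million blocks
#   1 MB: ~6.0 million blocks
#   5 MB: ~1.2 million blocks
```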

@StevenKSanford : open an SQLite DB browser, open the job database and enter
select count(*) from block
if it’s over 2 million, you have to consider raising the block size.

For example DB Browser for SQLite or any one you like.
While overloading Duplicati is known but hard to fix, how
exactly that leads to whole system slowdown is not clear.
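
On the DB browser point: if installing one is a hassle, the same count(*) check can be done with Python’s built-in sqlite3 module, pointed at the job database (ideally a copy, while Duplicati is idle):

```python
# Same count(*) check, using Python's built-in sqlite3 module.
# Point it at the job database (ideally a copy, while Duplicati is idle).
import sqlite3
import sys

db = sqlite3.connect(sys.argv[1])  # e.g. ~/.config/Duplicati/XXXXXXXX.sqlite
(count,) = db.execute("select count(*) from block").fetchone()
print(f"{count:,} blocks")
if count > 2_000_000:
    print("consider raising the block size and recreating the backup")
```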

Another approach might be to measure the I/O done at the mono level, which maybe doesn’t actually hit the SSD.
OS caching can do that, but keeping macOS busy handling I/O requests might slow other programs.

It’s nice (in an odd way) to see that your idea of the rough limit is the same one I got. I think I tested with increasingly small blocksizes, and at some point it spent all its energy endlessly reading etilqs files, making me wonder if making some cache larger might avoid that – same idea as you’ve been trying.
Watching file activity on Windows would be done with Process Monitor. I suppose Linux could run strace.

On another topic, I might test disconnecting a USB-attached SD while backing up to see how it goes.

that’s exactly what I did. I did not want to mess with hundreds of gigabytes to do tests, so I set the block size and dblock size to one tenth of their defaults.


The test with the USB SD card was to do a first backup with one file, to get at least one dlist up.
Next I added a folder, began a backup, pulled the USB, and got a yellow popup at the bottom that said:

(screenshot of the yellow warning popup)

The log truncates that list and only shows the first line (I opened an issue asking for more), e.g.

2023-02-18 18:20:57 -05 - [Warning-Duplicati.Library.Main.Operation.Backup.FileBlockProcessor.FileEntry-PathProcessingFailed]: Failed to process path: I: …

The profiling log that I run gives the important second line with an exception summary, e.g.

System.IO.DirectoryNotFoundException: Could not find a part of the path '\?\I: …

The live log could have found that too. One can click for details, but I’d prefer a better regular log.

The Verify files button has no complaints, so at first glance I’m not sure I can reproduce the bug.
The next backup starts just fine and gets 86 warnings when I kill it prematurely by disconnecting.
If macOS is different, I can’t run it, but I’d still like test-backup steps from @StevenKSanford

@gpatel-fr

“blocksize” in my reply is “--dblock-size”, which I set to 1 GB using the GUI config screen #5.

My local sqlite DB in the ~/.config/Duplicati folder, for the recreate that just failed (froze my system), is currently 13GB. This is the largest DB in that directory, although I have several more that are nearly as large.

so your block size is still at the default value of 100 K. That’s way too low for good performance currently. You should raise it to 5 MB and recreate your backups.

Seems to me, raising the default -blocksize to a larger value could be a good thing moving forward. I’m thinking 100K just isn’t quite enough for the amounts of data people are now backing up.

@gpatel-fr Thanks for the effort on a fix.

This is in the developer section, so I’ll toss a bit more onto the fire…

1- As mentioned above, changing the default -blocksize to a larger value (YTBD) is likely a simple change that could be implemented in the next release (I know, also YTBD). Short of re-compiling with the new default, there shouldn’t be anything else required; no re-writing of code or conflicts, it’s just a default change that would likely prevent a lot of user frustration moving forward, as this sort of thing seems to be coming up more and more often.

If that’s not in the books for the foreseeable future, could we create a “hot fix” of sorts, something users can run post-install to adjust the -blocksize setting to a larger default value? This leads to item 3, but stop by item 2 first.

2- Change the wording in the Duplicati documentation around -blocksize to emphasize how it should be scaled up based on data size. Maybe add a note such as: if you’re backing up more than 0.5 TB, you absolutely should increase the value to “xyzKB/xyzMB” or more. I started re-writing the manual page on the topic but lost the edit in a stray reboot; I’ve been trying to get back to it, but things have been busy as of late.

3- Create a routine that can change the -blocksize for an existing backup set. Now, I haven’t really looked into this all that hard, but I know there are many threads around here on the topic, and at the end of the day it can’t currently be done, but holy moly would it be nice.

I get that the whole DB and backup set would probably have to be recreated, and that storage use would likely increase following the process. I also expect that the process would require at least 2x the storage space in the first place to make it even close to safe, but I don’t think that’s out of reach for all that many users. I think this should be a local process rather than processing directly against a remote destination.

Storage is cheap and time is money… An external 12TB USB3 drive is only a few hundred bucks and getting cheaper by the day. If the process has to download the existing backup set to a sufficiently large local drive, convert/verify it, remove the old one (with an option to keep either version locally), purge the old backup set, and then upload the new set, that doesn’t seem like the worst thing. You can at least reuse the external drive afterwards to create an additional local backup.

I’m by no means saying that’s a simple set of tasks to complete, but there has to be a way it could be done, and if so it would probably really help with user retention.

If the user’s backup set is too large to be converted locally, then they would need to create a new backup with a “better” -blocksize value. That brings up another thing: in a changing data set, the “best” -blocksize could change next week, or most certainly after a few years. Chances are your data is going to grow, not shrink, so being able to move up to a new -blocksize at some point seems like a really valuable feature.

Sure, if you’d rather just make a new backup set then by all means, but I really think that option only appeals to a very small percentage of users when faced with that reality. One of Duplicati’s best features, in my mind, is its versioning, and to lose years of versions would be a huge hit for many, if not most.

I’ll do some more reading on the subject…

4- I’ll try to get some tests set up to see if I can catch TM and Duplicati clashing.


re: --blocksize

@gpatel-fr

Thanks for the clarifications on blocksize vs dblock-size. The backup I’m currently repairing only had two versions, so I’ve deleted it and am recreating it with a 5MB blocksize and a 1GB dblock-size. Let’s see how this works.

@JimboJones

Thanks for your thoughts on hash blocksize. Being stuck with an initial blocksize for years of backups is, as you rightly pointed out, undesirable. I really like your idea for a utility to re-blocksize the backups without losing all of one’s versions.

I’d love to contribute to the coding efforts, but my programming skills were built in the 1970s-1980s. COBOL anyone? :smiley:

Maybe here:

I’d agree, although there might still be improvements possible for some areas such as recreate.
If, for example, it’s doing anything (e.g. SQL) per-block, that will really add up with many blocks.
@gpatel-fr has likely looked at that more than I have, in addition to being better at C# and SQL.

That would simplify it, if it winds up being a standalone. A less complex rewriter is described at:

Re encrypt remote back end files (referencing a Python script from forum user @Wim_Jansen)

and it also gets into more of the little details of the files. First thing to do might be to open a few.
Format doesn’t seem super complicated. A file is either a block or a blocklist that lists its blocks.
Both cases identify a block by its SHA-256 hash, sometimes directly, sometimes after a Base64.
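
For example, computing a block identifier the way it seems to appear in those files would look roughly like this (my understanding of the format; verify it against real files):

```python
# How a block seems to be identified: the Base64 of its SHA-256 digest.
import base64
import hashlib

def block_id(block_bytes: bytes) -> str:
    return base64.b64encode(hashlib.sha256(block_bytes).digest()).decode("ascii")

# A blocklist is itself stored as a block whose content is the concatenation
# of the raw 32-byte hashes of the blocks it lists, in order.
print(block_id(b"example block contents"))
```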

The challenge is that a hash is one-way, so the hash of (say) 10 blocks in sequence can’t be known
unless the blocks are around. Unfortunately they’re packed into dblock files, but that can be undone;
however, it takes more storage space temporarily while repacking the data into larger blocks per the spec.

Following the idea of opening a few files, unzipping a dblock can just drop all its blocks right there.
The dindex file has a vol folder with a file of JSON data describing the contents of its dblock file.
There might also be a list folder holding redundant copies of blocklists that are also in its big dblock file.
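
To look at that yourself on a dindex that is unencrypted (or already decrypted with AES Crypt), a quick peek could be something like this sketch; the JSON field names are from memory, so treat them as assumptions:

```python
# Peek inside a (decrypted) dindex file: list its vol/ entries and how many
# blocks each says its dblock contains. Field names ("blocks") are from
# memory - check them against a real file.
import json
import sys
import zipfile

with zipfile.ZipFile(sys.argv[1]) as zf:  # e.g. duplicati-i....dindex.zip
    for name in zf.namelist():
        if name.startswith("vol/"):
            data = json.loads(zf.read(name))
            print(f"{name}: {len(data.get('blocks', []))} blocks described")
        elif name.startswith("list/"):
            print(f"{name}: redundant blocklist copy")
```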

EDIT:

Duplicati.CommandLine.RecoveryTool.exe is another rewriter, in C#, working on recompression.
Duplicati.CommandLine.BackendTool.exe might possibly be the basis for a file uploader where it
matters that the uploaded files wind up looking like Duplicati made them. Google Drive likes that.

Not knowing the code at all, this may be impossible:

Could “blocksize” be made an attribute of the backup version? That way, one could change the blocksize going forward, like “dblock-size” can be changed. I imagine the local database might have to keep separate tables for each version, though, unless it’s already doing that now.

That would open up an opportunity to re-blocksize a single version at a time, which would use local temp space to download the files, but not all the versions at once?

Infeasible would be a better word. Given unlimited resources, lots of things would become possible.

It’s not. Fixed block size is quite firmly embedded. Short blocks (e.g. at end of file) are of course OK.

Files of a backup version are mostly older blocks (only changes are uploaded), so concept doesn’t fit.
There might be some similar way to do this, but developer resources are too scarce. Any volunteers?