Copy DB to backup storage to speed up disaster restore on another computer? Blocksize on restore?

Hi,

I’ve increased the blocksize during backup. Do I need to set the blocksize explicitly when creating a restore job (temporary database, direct restore mode)?

Kind regards
Catfriend1

Is there any way to also back up the database to my external HDD?

well, when you read the documentation, not to mention what the Web UI says, you see:

--blocksize = 100kb
(...)
Note that the value cannot be changed after remote files are created.

if that’s not clear enough, it means that if you change it, you need to create a new backup.

Hi,

thanks for your answer. I did not change the blocksize after the first backup job run. My question is: how do I tell a restore job on a different computer, in direct restore mode, what the correct block size of this backup set is? Or does Duplicati determine the block size automatically on restore?

if you only ever did one backup, it’s certainly set up with the original block size.

No, the block size is correct and I can restore using the original machine where Duplicati runs on schedule and backs up the files. I’ve double-checked this. I initially set it to 5 MByte.

How do I tell another Duplicati installation on a different computer in direct restore mode what the correct block size is? Will it automatically determine from the backup set’s files that it was originally set to 5 MByte? I can’t find such an extra option on the direct restore UI.

it should work without setting a block size.


Ok, thank you. Then I guess it just takes soooo long during “Building temporary database” because the backup set is very large.

@gpatel-fr Do you know if I could manually copy over an existing database from the original machine to the machine which should only do an emergency restore, to speed up the process?

You should not do that unless what you want is to risk your backup data. Having 2 databases targeting the same data files seems like something NOT to do. Restoring to another computer is normally done when the original computer is dead.

I understand it might be a bit risky. The second instance would need exactly the corresponding database copy, in the same state as the backup files, and no backup execution must take place on the second instance (which would “confuse” the first instance).

Is there a better way to achieve a faster restore of ~800 GB of files? I thought that when a disaster hits the file server machine - which is also the machine running the Duplicati backups - the database and files would be gone. The backup files are stored on another machine, hopefully still restorable.

For recovery from a failed computer/disk, you need an export of your config - not saved on the original computer, of course. Then on the new computer you create a new job by importing the JSON file.
Before saving the new job, change it to disable any automatic backup, since starting a backup while your new computer is still empty of any data would be a bad move.
Then after saving the new job, you recreate the database. If your backups are not messed up, it should be relatively fast, since Duplicati backs up information for this very purpose. Then you restore data from this new config.
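For reference, roughly the same recovery can be done from the command line (a sketch, assuming the Linux duplicati-cli wrapper, an example FTP target URL, and made-up local paths; the import/repair/restore flow described above through the GUI is the equivalent):

    # recreate the local job database from the remote backup files
    duplicati-cli repair "ftp://backuphost/duplicati-target" \
        --dbpath=/tmp/restored-job.sqlite --passphrase=<your passphrase>

    # then restore everything into an empty folder on the new machine
    duplicati-cli restore "ftp://backuphost/duplicati-target" "*" \
        --dbpath=/tmp/restored-job.sqlite --passphrase=<your passphrase> \
        --restore-path=/mnt/recovered

As discussed above, the block size does not need to be passed; it is determined from the backup files.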


To what? The historical rule of thumb has been to try not to track more than a few million blocks.
That would suggest a blocksize of 1 MB or so before the database and SQL become too slow.
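For the ~800 GB mentioned above, the block count works out roughly like this (a back-of-the-envelope calculation that ignores deduplication and multiple versions):

    800 GB / 100 KB (default blocksize) ≈ 8,000,000 blocks  (well past “a few million”)
    800 GB / 1 MB                       ≈   800,000 blocks
    800 GB / 5 MB                       ≈   160,000 blocks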

You can move the database to any folder you like. Is that drive also the backup’s destination?
Sometimes keeping the database with the files it tracks is a way to handle a drive rotation plan.
It’s also a no-DB-rebuild-waiting-required way to get a restore going if the original system breaks.

If you’re talking about a backup (not the “live” location), the path is in an environment variable
DUPLICATI__dbpath=<whatever> which you can use in a run-script-after to run the DB copy.
We’ve already discussed how trying to put stale DB copies back in use can be catastrophic…
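If you decide to copy it anyway, the run-script-after can be as small as this (a sketch; /mnt/backupdrive is a made-up destination, and DUPLICATI__dbpath is the variable mentioned above):

    #!/bin/sh
    # copy the job database to a (hypothetical) locally mounted backup drive after the run
    [ -n "$DUPLICATI__dbpath" ] && cp "$DUPLICATI__dbpath" /mnt/backupdrive/jobdb-copy.sqlite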

A large backup can do it. Too small a blocksize for the backup’s size can do it. Problems with the backup can also do it.
Typically when it goes looking for data, the progress bar is in the final 10%. Did yours get there?
You can also see About → Show log → Live → Verbose. Download of dblock files is a bad sign.

This is probably the best way to get back in business. The DB rebuild might be a little slower, and potentially more troublesome compared to Direct restore from backup files, but you only run it once.

If that doesn’t work because the config wasn’t saved, you can still restore your data, just not the config. Sometimes browsing the restore will give you some clues about how to re-enter the config, though.

I’m not certain how the direct restore code gets blocksize, but one guess is the manifest in the zip:

{"Version":2,"Created":"20210620T130850Z","Encoding":"utf8","Blocksize":204800,"BlockHash":"SHA256","FileHash":"SHA256","AppVersion":"2.0.5.114"}
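If you want to check it yourself, the manifest can be read out of any volume of the backup set, for example like this (the file name here is just an example, and an encrypted .zip.aes volume would need to be decrypted first):

    unzip -p duplicati-20210620T130850Z.dlist.zip manifest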


Thanks, that’s exactly what I wanted to hear :-). I’ll do this.

Yes, I’ve used a ramdisk location where Duplicati could build the temporary database much faster. The long-running step was the last 10% of the progress bar. I’ll retry it and capture the logs next time.

5 MByte.

@ts678 If I copy the job database, is the encryption password part of the job DB? I’m asking because if the password were in it, it would not make sense to encrypt the backup file set.

I wouldn’t bother backing up the large, job-specific databases. Those can be recreated from the back end data, although a bug in how some older versions of Duplicati wrote dindex files can make that process painful. (There is a way to proactively fix that situation before you need to do a database recreate.)

Backing up the Duplicati-server.sqlite may be quite useful as it contains job definitions and other settings. Note that it contains credentials to back end storage, so treat the file accordingly. I would just make a copy of it and store it somewhere secure; don’t back it up with Duplicati itself. You can repeat the copy whenever you make config changes.

Alternatively, you could export the job configurations to JSON format and store them in a secure location. That’s what I actually do instead of backing up Duplicati-server.sqlite.


No. The passphrase is also kept in Duplicati-server.sqlite, not in the job database.


My target storage space is reachable via FTP. Is there a way to read the FTP login credentials from the job DB via a bash script, use them with lftp to connect to the FTP storage, and upload the DB file?

Even better would be a feature request (FR): a checkbox in the job settings, “also copy the DB to the target storage after the job run”, and Duplicati would do this itself.

Options are in the server DB Duplicati-server.sqlite. The target URL is in the TargetURL column of the Backup table.

Unless bash reads databases (I doubt that) you would have to run something from bash that does.
Command Line Shell For SQLite may be one option, but there are many. Few handle encryption, so you’d set
--unencrypted-database so that you (and malware) will have an easier time getting the desired access.
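For example, pulling the target URL of one job could look like this (a sketch; the database path is a typical Linux per-user location and “MyJob” is a hypothetical job name):

    sqlite3 ~/.config/Duplicati/Duplicati-server.sqlite \
        "SELECT TargetURL FROM Backup WHERE Name = 'MyJob';"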

You can do much better if you use the target URL directly from Duplicati in a script run after the backup.
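For example, a run-script-after could hand that URL straight to lftp (a sketch; it assumes the DUPLICATI__REMOTEURL variable mentioned at the end of this thread, and a plain ftp://user:password@host/path style URL; a target URL carrying auth-username/auth-password query parameters would need to be parsed first):

    #!/bin/sh
    # upload the job database to the same FTP destination as the backup itself
    lftp -c "open '$DUPLICATI__REMOTEURL'; put '$DUPLICATI__dbpath'"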

Because uploading and downloading large databases is not instant, you might compare the timing against a DB recreate on a backup that is not corrupted, meaning either try the proposed fix first or time a recreate on a fresh backup. Repeat periodically (you test restores anyway, I hope) to make sure it doesn’t go into dblock downloads.

All feature requests (and the existence of Duplicati) depend on volunteers. There is a severe shortage.
Anybody who is able to contribute to code, test, documentation, forum, GitHub support, etc. is needed.


Thanks, I’ll check out DUPLICATI__REMOTEURL.