Backup corrupted, need to rebuild a few files

I was backing up my RAID-5 to a local hard drive with duplicati’s most recent canary. While manually copying my duplicati backup folder to a spare USB drive, I hit a handful of read errors during the hours-long copy. As a result, 3 data files of 500 MB each did not get copied to the spare drive and are unreadable.

My thought was to forgo recovering the 3 corrupt files and just rebuild them from the RAID-5 source, but I don’t know how. The Home screen GUI says that I have 14 versions of the backup.

So here’s the error I got. In Windows 10, the copy gave me this:
Can’t read from the source file or disk.
[I clicked *skip* so I could worry about those files later]

All 3 of my corrupted files are 500 MB dblock files. Two were from last August, one from last month. I would like to re-point duplicati to my new backup file location and give it the command to rebuild, but don’t understand how. My RAID system is degraded but still functioning for now, so I need to ensure the duplicati backup is pristine while I wait for my replacement hard drive.

And maybe I should use smaller dblock files next time. I thought large files would increase the efficiency of drive usage, but a corrupted file that big (in case something catastrophic happens to the degraded RAID-5 array) would likely be harder to retrieve data from than, say, a 25 or 100 MB file. NTFS and ReFS seem to be so reliable … more corruption happens on physical drives than in the file system nowadays, especially since UPS systems and modern OSes are able to avoid most hard crashes.
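To put rough numbers on that tradeoff, a quick sketch (illustrative arithmetic only, not anything Duplicati itself computes; the 2.22 TB figure is just this backup’s approximate size):

```python
def worst_case_loss_mb(volume_size_mb: int) -> int:
    """Upper bound on unique data lost when one dblock volume is
    unreadable: everything stored only in that volume."""
    return volume_size_mb

def volume_count(backup_gb: float, volume_size_mb: int) -> int:
    """Roughly how many dblock files a backup of this size produces."""
    total_mb = backup_gb * 1024
    return int(-(-total_mb // volume_size_mb))  # ceiling division

# Smaller volumes bound the damage per bad file, at the cost of many
# more files on the destination.
for size_mb in (500, 100, 25):
    print(size_mb, "MB ->", volume_count(2273, size_mb), "files,",
          worst_case_loss_mb(size_mb), "MB max loss per bad file")
```

So dropping from 500 MB to 100 MB volumes multiplies the file count by five, but also divides the worst-case exposure per corrupted file by five.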

Welcome to the forum @uruiamme

I’m losing track of the drives, but it sounds like there’s:

  1. A degraded RAID-5 which needs a good backup.
  2. A local hard drive which got the original backup, but is also having problems such as read errors.
  3. A USB drive which got a Windows 10 copy of most of the backup files on drive 2, except for the 3 bad files.

Is the new location the USB drive? Pointing is easy (just change the destination), but rebuilding is not, and source data from August (or even last month) might be gone by now. There’s an option to try to rebuild by looking around to see what blocks are still available, but it’s slow and generally can’t manage. How about reading Disaster Recovery and running The AFFECTED command in Commandline with the bad dblock names?

Pristine is hard to test without a restore test with the no-local-blocks option set to force reads from the backup. You can readily check file integrity using The TEST command, but that only proves the destination files are intact.

I’m not sure how much extra space you have, but you might have 2 dying drives plus a bad backup copy? That’s a dangerous situation. I wonder if you should put a robocopy or Macrium Reflect image onto USB?
The Duplicati backup can stay on the local drive (which we hope holds together awhile), and it has the old versions.

  1. A degraded RAID-5 which needs a good backup.

Yes. Drive 1 MAINDATA is degraded after a few hard errors showed up and the controller failed one physical disk in the array.

  2. A local hard drive which got the original backup, but is also having problems such as read errors.

Yes, Drive 2 DUPE is my normal location for duplicati backups. During the copy it developed >100 physical errors, which affected exactly 3 files. It is going back for warranty replacement.

  3. A USB drive which got a Windows 10 copy of most of the backup files on drive 2, except for the 3 bad files.

Yes, Drive 3 USB-FTW is my emergency copy of Drive 2 DUPE. So I want it to be the new go-to place for backups and the likely source for a restore, if ever needed.

(None of those drives are the C: drive, by the way.)

I am trying the affected command using copies of both the dbpath and the data, but I just got this error and am troubleshooting it (the K: drive is Drive 3):

K:\>"c:\Program Files\Duplicati 2\Duplicati.CommandLine.exe" affected

ErrorID: DatabaseDoesNotExist
Database file does not exist:

True enough, that database file is nonexistent. I guess I’ll see if I have an old backup of it, or figure out what it’s all about. But I surmise that XXWCNKZDJT.sqlite wants to link to QGPAQZJYXU.sqlite, and I wonder why that is?

There’s some kind of bug going on here. If I slightly change my command, the nonexistent database file error is different. See if you can spot the subtle difference:

C:\Program Files\Duplicati 2>Duplicati.CommandLine.exe affected "K:\Dupe" -dbpath="K:\Duplicati\XXWCNKZDJT.sqlite" --full-result
ErrorID: DatabaseDoesNotExist
Database file does not exist: C:\Users\Admin\AppData\Local\Duplicati\QGPAQZJYXU.sqlite

C:\Program Files\Duplicati 2>Duplicati.CommandLine.exe affected "file://K:\Dupe" -dbpath="K:\Duplicati\XXWCNKZDJT.sqlite" --full-result
ErrorID: DatabaseDoesNotExist
Database file does not exist: C:\Users\Admin\AppData\Local\Duplicati\PXAMPWGQIO.sqlite


I had left off one of my hyphens for dbpath. Trying it with two hyphens got results! (Those were some off-the-wall errors, though. Leaving them here in case anyone gets the same ones.)

The following filesets are affected:
0       : 1/22/2021 6:00:00 AM
1       : 1/19/2021 6:00:02 AM
2       : 1/18/2021 12:48:17 AM
3       : 1/12/2021 6:00:00 AM
4       : 12/29/2020 6:00:02 AM
5       : 11/24/2020 6:00:00 AM
6       : 10/13/2020 7:00:00 AM
7       : 9/8/2020 7:00:00 AM

A total of 305 file(s) are affected:

Definitely some juicy, good files in there. The other 2 corrupt files are likely similar.

So I’m going to try and run through “recovering by purging files” and then switch to my Drive 3 as the duplicati backup location, and then run a regular backup. My assumption is that once the database is fixed and no longer has those 305+ files, it will rebuild those in new dblock files.

Yes. If the files are still around you should get a current copy, but some of the older versions were lost along with the dblock files that held them.
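The mechanics behind that re-upload: Duplicati deduplicates at the block level, so once the purge removes the lost blocks’ records from the database, any source data that still hashes to those blocks gets uploaded again on the next run. A minimal sketch of the idea (not Duplicati’s actual code; 100 KB is just its default block size):

```python
import hashlib

BLOCK_SIZE = 100 * 1024  # Duplicati's default block size

def split_blocks(data: bytes):
    """Split source data into fixed-size blocks."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]

def backup(data: bytes, known_hashes: set) -> list:
    """Return only the blocks whose hashes the database doesn't know.

    After purge-broken-files drops the records of lost blocks, those
    hashes are no longer 'known', so the next backup re-uploads them in
    fresh dblock volumes -- if the source files still hold that data.
    """
    to_upload = []
    for block in split_blocks(data):
        h = hashlib.sha256(block).digest()
        if h not in known_hashes:
            known_hashes.add(h)
            to_upload.append(block)
    return to_upload

known = set()
data = b"x" * (3 * BLOCK_SIZE)
first = backup(data, known)   # three identical blocks dedupe to one upload
second = backup(data, known)  # unchanged data: nothing new to upload
```

Blocks that exist only in older versions, though, are gone for good once their dblock is lost, which matches the missing back versions above.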

There were another 1200 files. Oddly enough, files concerning the SCSI controller were in that group. The output of affected is still a bit odd. Even with the --full-result option, it tells me:

Found 941 related log messages (use --full-result to see the data)

But one time it showed a single log message:
The following related log messages were found:
12/29/2020 12:05:09 PM: {"Size":524123614,"Hash":"jVh/IkWwvLYphyXLMeH7VU6C8v50pkwy0Di4c29M+xY="}

So 2 out of 3 times it told me to use --full-result even though I already had; one time it actually showed the messages.

Unfortunately, I didn’t see any difference in the output of list-broken-files and purge-broken-files. And none of my corrupt files were deleted. (I had put dummy text files in their place, so those files are fakes.)

C:\Program Files\Duplicati 2>Duplicati.CommandLine.exe purge-broken-files "file://K:\Dupe" --dbpath="K:\Duplicati\XXWCNKZDJT.sqlite" --no-encryption=true --full-result
  Listing remote folder ...
Backend quota is close to being exceeded: Using 2.22 TB of 4.55 TB (196.41 GB available)
remote file is listed as Verified with size 14 but should be 524268673, please verify the sha256 hash "Rnm8aXos5UztcR16TT4tnei9/C/A1jBgFB8wDdb9qpk="
remote file is listed as Verified with size 14 but should be 525152017, please verify the sha256 hash "l5RiWSZnKxwcBGFEP95u8fQuaQEs4WMSuuQLEiED97I="
remote file is listed as Verified with size 14 but should be 524123614, please verify the sha256 hash "jVh/IkWwvLYphyXLMeH7VU6C8v50pkwy0Di4c29M+xY="
Update "" detected

This does not look like the output on the Disaster Recovery page. Nothing in my K: drive’s Dupe folder gets touched. The database’s file date is updated, but its size remains the same.
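For what it’s worth, the hashes in those messages are just base64-encoded SHA-256 digests of the whole remote file, so you can check a suspect volume by hand (a sketch; the path argument is whichever file your message refers to):

```python
import base64
import hashlib

def duplicati_style_hash(path: str) -> str:
    """Return the base64-encoded SHA-256 digest of a file,
    in the same form Duplicati prints in its verification messages."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MB chunks so large dblock files don't fill memory.
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return base64.b64encode(h.digest()).decode("ascii")

# Compare the result against the hash quoted in the error message for
# that file; a mismatch confirms the volume really is corrupt.
```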

OK, this finally succeeded, but only after I deleted the corrupt files. The process took about an hour rather than a few minutes. Someone, please update the Disaster Recovery page to mention that you need to delete corrupted files for purge-broken-files to work. It implies that this command will fix corrupted files, but it won’t; just delete them. Either a bug in duplicati or in the documentation.
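The manual cleanup can be sketched like this: move aside (or delete) any volume whose on-disk size doesn’t match what the error messages say it should be, so purge-broken-files sees it as missing rather than corrupt. The file names and expected sizes below are made-up placeholders, and quarantining by moving is a gentler alternative to deleting outright:

```python
import os
import shutil

# Hypothetical map of dblock name -> size Duplicati expects, taken from
# the "listed as Verified with size 14 but should be ..." messages.
EXPECTED_SIZES = {
    "duplicati-b1.dblock.zip": 524268673,
    "duplicati-b2.dblock.zip": 525152017,
}

def quarantine_bad_volumes(backup_dir: str, quarantine_dir: str) -> list:
    """Move wrong-size volumes out of the backup folder and report them."""
    os.makedirs(quarantine_dir, exist_ok=True)
    moved = []
    for name, want in EXPECTED_SIZES.items():
        path = os.path.join(backup_dir, name)
        if os.path.exists(path) and os.path.getsize(path) != want:
            shutil.move(path, os.path.join(quarantine_dir, name))
            moved.append(name)
    return moved
```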

C:\Program Files\Duplicati 2>Duplicati.CommandLine.exe purge-broken-files "file://K:\Dupe" --dbpath="K:\Duplicati\XXWCNKZDJT.sqlite" --no-encryption=true --full-result --quota-warning-threshold=0
  Listing remote folder ...
  Uploading file (56.94 MB) ...
  Deleting file ...
  Uploading file (57.19 MB) ...
[... lots more ...]
  Uploading file (60.30 MB) ...
  Deleting file ...
Update "" detected

So deleting the bad files worked. I have 14 new files in my Dupe folder. Also, take note of my command line in case you experience errors about encryption or quotas, like these:

Backend quota is close to being exceeded: xxx

 Enter encryption passphrase:
 ErrorID: EmptyPassphrase
 Empty passphrases are not allowed

I’m going to run a new backup and see what happens.

First off, I renamed my C: drive’s database (to make it a backup), then I copied the K: drive’s renewed database to its original location.

I then went to the browser GUI page and made a slight modification or two: updated the destination to K: (Drive 3) because Drive 2 is toast, and changed the remote volume size to a measly 90 MB instead of 500 MB. I also excluded the Recycle Bin with the filter R:\$RECYCLE.BIN (my backup source is the R: drive).

I started the backup manually and it added around 6 GB to my K: drive in 10 minutes. Looks like things are backed up and I appreciate your input. I think my system is back on track… I’ve got 2 copies of everything. What could go wrong, right?

Agreed. It’ll probably take some doing to find the original intent. Even getting docs changed is hard.

The “is corrupted and marked for deletion” message from the docs seems correct (though I can’t always get that message), but I haven’t seen the file get deleted.

I’m thinking maybe a shorter path is to skip affected, and just delete the bad files, or hide them by renaming or moving them.

It’s not often done, but you might consider keeping the DB on the external drive, if it’s more reliable.
Your C: drive could only read 14 bytes (I wonder where the 14 came from) from your 500 MiB files?
Have you looked in event viewer to see what might have been behind that issue? Any similar ones?
SMART can sometimes say if bad sectors have been replaced with spares, likely with the data lost.

If you haven’t tested recreating the DB from backend files yet (e.g. move it to a temporary name then Repair – safer than the Recreate button in case something goes wrong), then that would be helpful, avoiding a surprise if you lose the DB. A backup copy of a DB that’s gone stale is less (or not) useful.

Alternatively, if you have another system, you could move USB drive to it and test Direct restore from backup files, which should approximate your recovery if your RAID and your C: drive fail completely…

Well, I had the three unreadable files during my copy from Drive 2 to Drive 3. I simply created 14-byte dummy files in their place. That turned out to just waste time. The best way to do disaster recovery, I found out, is to have missing files, not corrupt or truncated ones.

My C: drive was never any problem at all. It’s my reliable NVMe drive, I guess Drive 0. I was trying to ensure that my Drive 1 had a backup onto external Drive 3, since Drive 2 is starting to fail.

I did throw a database backup into my mix of backups. I wonder if/when someone might enlist duplicati to upload its own database as a final shot at redundancy? Mine is just a 2 GB file at this moment. Oh, and I can’t remember if I’ve been backing up the entire C: anyway. I should probably do that, huh?!

For true CLI (which does not use Duplicati-server.sqlite and in fact does not use the Duplicati server at all), there’s a dbconfig.json file that (I think) maps the destination URL to a DB, so you don’t need to name one.
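If that mapping exists, the lookup is simple to sketch; note that the field names and layout below are guesses for illustration, not the documented dbconfig.json format:

```python
import json

# Guessed shape: a list of entries pairing a backend URL with a database
# path, so the bare CLI can find a DB for a destination without --dbpath.
sample = json.loads("""
[
  {"uri": "file://K:\\\\Dupe", "path": "K:\\\\Duplicati\\\\XXWCNKZDJT.sqlite"}
]
""")

def db_for_destination(entries, uri):
    """Look up the database mapped to a destination URL, if any."""
    for entry in entries:
        if entry["uri"] == uri:
            return entry["path"]
    return None  # no match: the CLI would pick a fresh random DB name
```

That last branch would also explain the random-looking .sqlite names in the errors above: an unrecognized destination URL gets a newly generated database name.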

When trying to match a GUI job, GUI Commandline passes all the matching options (and potentially extras). Export As Command-line is another way to get a starter line similar to the GUI job, already hyphenated and quoted.

Why not backing up latest Database? is one such feature request, which I point to because it says how to do it yourself, if you really want it, and also why you might not want that. Databases can get huge, so a tiny incremental backup might upload very little actual file data but a huge database (which holds all the history).

One user who previously backed up the primary database with a second backup job (don’t try to back up a database in the job that’s also making that database) has now decided that Recreate is enough safeguard.

Don’t mix the above two. Recreate is fairly fast because ideally it just needs dlist and dindex files, which are pretty small; attaching a DB to each dlist would make for some big dlist downloads. That might not be so bad for your local drives, but it would hurt a lot for somebody with a slow Internet connection.

[Idea] Backup Duplicati database to avoid recreate is a feature request and some details on a two-job DIY.

My usual advice for redundancy is redundant backups, preferably to different places with different software.

So far, so good … I haven’t needed my duplicati backup. My RAID-5 is back working and getting a consistency check right now. I didn’t even cross my fingers.