Testing Restore from Glacier DA

ctroyp · June 25, 2021, 2:11pm

Hi folks, new here and testing my backups by attempting to restore.

I understand that there is limited support for S3-IA, Glacier, and Glacier DA, but I am committed to testing this use case for AWS. Also, this is not the primary backup; it is intended to sit for long periods and is only there for restoring after a significant disaster.

I also understand that objects in these AWS storage classes are not immediately available for restore, as they are with S3 Standard and other cloud targets. However, as mentioned, I still plan to test this out.

Thus far, my backups have worked as expected and the catalog appears accurate for all of my versions. As expected, there are errors (i.e. “not valid for the object’s storage class” and “Could not find file”) when attempting a restore via the Duplicati interface, because AWS requires its own limited restore request for the Glacier DA storage type.

While testing, I requested an AWS restore of the four affected Duplicati files that are needed to restore my file. Hours later, once the AWS restore had completed for the archived objects, I was able to restore my backup files and directories as expected.

So, since restores at this level will be infrequent at most, this may be a viable option for me. There is still one issue, significant for large archives, which leaves me with two questions:

If I need to restore “file.xyz”, is there a way to determine which dlist, dindex, and associated dblock(s) need to be manually restored from AWS in order to complete the restore without pulling down ALL of the backup files?
The AWS CLI or API can be used to request the restore of the needed backup files. If I can determine which Duplicati files are needed to complete my restore, it would be easy to script manually, which leads me to believe that it would be reasonably easy to integrate these restore requests into each of the backups. Any hope for something like this in future releases?
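For reference, the scripted part could look roughly like the sketch below. It assumes a boto3 S3 client is passed in (its `restore_object` call takes a `RestoreRequest` with `Days` and a `GlacierJobParameters` tier). The function name, bucket name, key names, and the defaults of 7 days and the `Bulk` tier are placeholders of mine, not anything Duplicati provides.

```python
def request_restores(s3_client, bucket, keys, days=7, tier="Bulk"):
    """Ask S3 to make each archived object temporarily readable.

    s3_client is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    Returns the list of keys for which a restore request was sent.
    """
    requested = []
    for key in keys:
        # The restore is asynchronous: this call only queues the job.
        s3_client.restore_object(
            Bucket=bucket,
            Key=key,
            RestoreRequest={
                "Days": days,  # how long the restored copy stays available
                "GlacierJobParameters": {"Tier": tier},
            },
        )
        requested.append(key)
    return requested


# Hypothetical usage (bucket and file names are made up):
#   import boto3
#   request_restores(
#       boto3.client("s3"),
#       "my-backup-bucket",
#       ["duplicati-b0123abc.dblock.zip.aes"],
#   )
```

The missing piece is still the list of keys to pass in, which is what the first question is about.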

ctroyp · July 2, 2021, 12:19pm

Does anyone know how to determine what backup files are needed to restore a file?

drwtsn32 · July 2, 2021, 4:53pm

I don’t think there’s a way to do this. There is an AFFECTED command but it does the opposite of what you’re asking.

My first idea was to do a “dry run” restore and see which dblocks would be requested, but I saw that the dblocks were still downloaded. All the dry run did was stop Duplicati from saving the restored files.

I think it would be a nice additional feature if it could do what you’re asking.

ts678 · July 3, 2021, 7:36pm

How committed are you to getting it to (maybe) work? It may take ugly methods or some SQL expertise.
I’m not really encouraging any of this, but for non-primary backup, it might be more reliable than nothing.

Probably no reasonable one, so I’m asking about unreasonable. The local database does have the info…

Downloads can be persuaded to fail. How does S3 actually behave while files are still in cold storage?
Presumably they’re all visible (or backup would complain), but if one tries to read a file, what does S3 do?

If failures can be made to happen fast enough, then the log file would probably give you the needed file list.
Faster failures can be encouraged with number-of-retries=0 (which also avoids the retry-delay slowness).
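If the log does end up naming the files that failed to download, pulling the list out could be as simple as the sketch below. It does not depend on the exact log line format. It only looks for names shaped like Duplicati remote volumes, assuming the default "duplicati" prefix, zip compression, and either no encryption or an .aes/.gpg suffix. The function name is made up.

```python
import re

# Matches names such as duplicati-b0123abc.dblock.zip.aes (assumed naming:
# default "duplicati" prefix, zip compression, optional .aes or .gpg suffix).
VOLUME_RE = re.compile(
    r"duplicati-[0-9A-Za-z]+\.(?P<kind>dlist|dindex|dblock)\.zip(?:\.(?:aes|gpg))?"
)


def needed_volumes(log_text):
    """Return the unique volume names found in log_text, grouped by type."""
    found = {"dlist": set(), "dindex": set(), "dblock": set()}
    for match in VOLUME_RE.finditer(log_text):
        found[match.group("kind")].add(match.group(0))
    # Sorted lists make the result stable and easy to feed to a script.
    return {kind: sorted(names) for kind, names in found.items()}


# Hypothetical usage:
#   with open("restore.log", encoding="utf-8") as fh:
#       print(needed_volumes(fh.read()))
```

Repeated mentions of the same file collapse into one entry, so retries in the log would not inflate the list.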

I don’t have Glacier, but I’m prototyping with a local folder backup where I give an empty folder to Restore.
Ordinarily it would notice the missing files, but I set no-backend-verification to make sure it goes in blind…

EDIT:

On the other hand

isn’t true if you lose the database, so is the need actually to find out what you need without a database?
That’s a tall order because it usually changes all the time as compacting (not Glacier-friendly) happens.