Using S3 Glacier for backups

mohak · August 18, 2017, 9:00am

Now that Duplicati is in beta, I’m planning to use it for my offsite backups to S3. As these backups are not going to be accessed unless I lose all of my local discs, I think it makes sense to use Glacier.

After going through Duplicati’s guide to use Glacier, I have the following questions:

Assuming Amazon does not corrupt or lose any of my files, what, if any, are the risks of running with --no-backend-verification?
What will I have to do if I wish to make a full restore of my data?

kenkendk · August 18, 2017, 9:07am

Glacier is a bit problematic because the restore process has not been fully automated, nor is it integrated into Duplicati. The setup works by using S3 lifecycle rules to move files to Glacier. Once the files are there, Duplicati can no longer verify that everything is working as expected, so you need to “trust” it, which is a bad idea.

To test the restore, you basically have to recall the files from Glacier. If you can get them moved back into S3, you can test the restore.
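As a rough illustration of the recall step, here is a minimal Python sketch. It assumes a boto3-style S3 client is passed in and uses its `list_objects_v2` paginator and `restore_object` call; the function name, bucket name, prefix, and the defaults for days and retrieval tier are all hypothetical choices, not anything Duplicati provides. It only requests the temporary copies. The recall itself can take hours, and Duplicati can read the files only once it has finished.

```python
def request_glacier_restores(client, bucket, prefix="", days=7, tier="Bulk"):
    """Ask S3 for temporary readable copies of objects stored in Glacier.

    `client` is assumed to be a boto3 S3 client (boto3.client("s3")).
    Returns the keys for which a restore was requested.
    """
    requested = []
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            # Objects still in a regular S3 storage class need no recall.
            if obj.get("StorageClass") != "GLACIER":
                continue
            # Note: S3 rejects this call if a restore of the same object
            # is already in progress; this sketch does not handle that.
            client.restore_object(
                Bucket=bucket,
                Key=obj["Key"],
                RestoreRequest={
                    "Days": days,  # how long the temporary copy stays readable
                    "GlacierJobParameters": {"Tier": tier},
                },
            )
            requested.append(obj["Key"])
    return requested


# Usage (hypothetical bucket and folder names):
#   import boto3
#   keys = request_glacier_restores(boto3.client("s3"), "my-backup-bucket", "duplicati/")
```

Passing the client in keeps the function easy to try against a stub before pointing it at a real bucket, where every recall is billed.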

I generally recommend not using Glacier with Duplicati for these reasons.

Also, there is a longer discussion on the topic here: Amazon Glacier support · Issue #701 · duplicati/duplicati · GitHub

mohak · August 18, 2017, 9:10am

Thanks for the quick reply! Would it then be OK to use Standard-Infrequent Access, or should I just stick to Standard?

kenkendk · August 18, 2017, 9:14am

If you use IA, there is a fee if you access the files too quickly. After each backup, Duplicati runs a test: it downloads a file that has been uploaded and checks that it is stored correctly. With IA, that download will incur a penalty fee.

You can use the advanced option --backup-test-samples=0 to skip the download, but still check that the list of remote files is correct.

mohak · August 18, 2017, 10:52am

But with a Lifecycle Policy that moves dblocks to IA after 30 days, that should not be a problem, right?
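For illustration, a rule of that kind could be sketched in Python as the configuration dictionary that boto3's `put_bucket_lifecycle_configuration` accepts. The function name and rule ID are hypothetical, and the `duplicati-b` prefix is an assumption about how dblock files are named; it would also need the backup folder path in front of it if the backup does not sit at the bucket root. Whether the result is actually cheaper depends on AWS pricing, which this sketch does not model.

```python
def dblock_to_ia_lifecycle(prefix="duplicati-b", days=30):
    """Build a lifecycle configuration that moves matching objects to Standard-IA.

    The prefix is an assumption about dblock file names; adjust it to
    match the names actually present in the bucket.
    """
    # S3 does not accept transitions to Standard-IA earlier than 30 days.
    if days < 30:
        raise ValueError("transition to STANDARD_IA requires at least 30 days")
    return {
        "Rules": [
            {
                "ID": "duplicati-dblocks-to-ia",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days, "StorageClass": "STANDARD_IA"},
                ],
            }
        ]
    }


# Usage (hypothetical bucket name):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-backup-bucket",
#       LifecycleConfiguration=dblock_to_ia_lifecycle(),
#   )
```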

kenkendk · August 18, 2017, 10:58am

I do not know the AWS pricing well enough to help with that.