Azure Blob - Using Archive Access Tier


#1

Hello,

Azure Blob Storage offers three access tiers for individual blobs, including a new Archive tier with very good pricing, which would be ideal for me for backups.

Does Duplicati support this access tier, or is it a feature that could be added in a future version?

I haven’t managed to find any information or setting relating to the Azure Blob access tiers, but I may have missed something.

Thanks for your time.


#2

I like the idea, but I haven’t seen any product do it, so I would be surprised if Duplicati does it now, or is even engineered in a way that can wait for backup data to become available.

Azure Archive tier is only accessible by demoting specific blobs from the hot or cool tiers, and once the data is there it is not readable. The blob has to be requested to be moved back up to a hot or cool tier, and that takes hours. (Reference: Azure hot, cool, and archive storage for blobs | Microsoft Docs, Archive Access Tier section)

My impression is (and please correct me if I’m wrong) that Duplicati is using cloud storage similar to backup to disk. Since it is an incremental forever backup product, being able to access all previous backup files/blobs is important. A typical archival/full+incremental backup product would be better suited to take advantage of archive tier, but not an incremental forever.

I also suspect that the amount of development required to move backup files up and down between tiers would be significant. For instance, restoring a file would require selecting the file, submitting a request to Azure to move the correct backup blobs from the Archive tier to the Cool tier, then queuing up the restore and periodically checking to see if the backup blobs are available in the Cool tier.

I haven’t looked at the structure of Duplicati internally enough to know whether this is a big leap, but given that I have yet to run into a paid backup product that includes this functionality, I’d be surprised.

Just my two cents.


#3

Technically, a destination that does not allow reading is probably possible. However, using it with Duplicati would involve disabling a bunch of default behaviors, including those that:

  • make sure no expected files are missing from the destination
  • verify destination files (by default, at least 1 archive file is downloaded and verified after every backup job)
  • clean up “expired” versions (and potentially consolidate sparse archive files afterwards)

Other things like reporting on backup contents or comparing versions won’t be usable without referencing the local .sqlite file (and maybe not even then).

And finally, doing a restore would (at the moment) involve manually shifting all necessary blobs back into a hot or cool tier, waiting for that to finish, then running the restore.

Assuming the Azure API exposes the functionality, it should be feasible to have Duplicati make the necessary blob tier changes, check back every X minutes to see if they are done, and finally (potentially hours later) do whatever task was initially requested.
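
To make the "request, then check back every X minutes" idea concrete, here is a minimal Python sketch. It is not Duplicati code and does not call Azure directly: `request_rehydrate` and `is_readable` are hypothetical callables that the caller would supply. With the Azure Python SDK they would presumably wrap `BlobClient.set_standard_blob_tier` and the `archive_status` blob property, but that mapping is an assumption and is not tested here.

```python
import time
from typing import Callable, Iterable


def rehydrate_and_wait(
    blob_names: Iterable[str],
    request_rehydrate: Callable[[str], None],  # hypothetical: asks the service to move one blob out of Archive
    is_readable: Callable[[str], bool],        # hypothetical: True once the blob can be downloaded
    poll_seconds: float = 600.0,               # the "every X minutes" interval
    timeout_seconds: float = 24 * 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Request a tier change for every blob, then poll until all are readable."""
    pending = set(blob_names)
    for name in pending:
        request_rehydrate(name)

    deadline = clock() + timeout_seconds
    while pending:
        # Keep only the blobs that are still not readable.
        pending = {name for name in pending if not is_readable(name)}
        if not pending:
            break
        if clock() >= deadline:
            raise TimeoutError(f"still archived: {sorted(pending)}")
        sleep(poll_seconds)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without actually waiting for hours; a real caller would leave them at their defaults and run the originally requested task once the function returns.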


#4

As far as I’m aware, Cloudberry Backup supports this, see this article: CloudBerry Backup Now Supports Azure Archival Storage Tier

However, they have a very odd 1 TB limit (even though they don’t provide storage) in all but the really expensive Ultimate version, so I’m looking at alternatives.

I understand there are certain limitations and some backup features wouldn’t be possible, but I think for certain workloads this would be ideal.

For example, I need to back up a large amount of photos and videos. The files pretty much never change and I would only ever need to restore them in an emergency, so waiting a few hours isn’t going to be a problem from a user perspective. Being able to leverage the Archive tier would cut the storage price significantly.


#5

If you’re asking for this functionality to be added, you might be waiting a while, depending on other user interest and developer load.

However, if you just want to see if you can get it to work with current functionality, I’d suggest a little testing of settings to see what does and doesn’t work.

Once that’s figured out, a nice #howto article could help others get started with it and possibly increase interest (and potentially development priority).


#6

There is currently no support for demoting the files to cold storage, but I don’t think it would require a big change in terms of backup.

Yes, you can add --no-backend-verification --no-auto-compact to have a setup that does not require that you can read remote files. I am not familiar enough with the feature to know if it still returns the cold-storage items in the file listing. If it does list the files, a better setup is --remote-samples=0 --no-auto-compact which will still do verifications based on the filelist, but not attempt to download any files.
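
As a rough illustration of the two setups described above, the following Python sketch builds a backup command line. The option names are copied verbatim from this post and have not been checked against any particular Duplicati version; the `duplicati-cli` executable name, the argument order, and the `listing_works` switch are assumptions made for the example.

```python
from typing import List


def archive_friendly_args(storage_url: str, source_path: str, listing_works: bool) -> List[str]:
    """Build a backup command that avoids reading remote files.

    listing_works: whether archived blobs still show up in the file listing
    (the open question in the post above).
    """
    # "duplicati-cli" is an assumed executable name; adjust for your install.
    args = ["duplicati-cli", "backup", storage_url, source_path, "--no-auto-compact"]
    if listing_works:
        # Verify against the file list only, without downloading samples.
        args.append("--remote-samples=0")
    else:
        # Skip backend verification entirely.
        args.append("--no-backend-verification")
    return args
```

The resulting list could be passed to `subprocess.run`, which keeps the storage URL out of shell quoting.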

The problem is not making the backup, that part is easy enough. The problem is the restore. We did some work for Glacier, but it was a mess since you need to figure out which files need to be moved out, wait a few hours, and then access them, and then put them back.

I think supporting this feature would introduce a similar restore setup.
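
The restore cycle described for Glacier (work out which files are needed, move them out, wait, read them, put them back) could be sketched roughly as below. This is a Python illustration only, not how Duplicati implements anything: `volumes_for_file`, `promote`, `run_restore`, and `demote` are hypothetical stand-ins for the local index lookup and the storage operations.

```python
from typing import Callable, Dict, Iterable, Set


def restore_from_archive(
    wanted_files: Iterable[str],
    volumes_for_file: Dict[str, Set[str]],     # hypothetical index: source file -> remote volumes holding its data
    promote: Callable[[Set[str]], None],       # hypothetical: move volumes out of archive and block until readable
    run_restore: Callable[[Set[str]], None],   # hypothetical: the actual restore, reading the promoted volumes
    demote: Callable[[Set[str]], None],        # hypothetical: move volumes back to archive
) -> Set[str]:
    """Promote only the volumes a restore needs, restore, then put them back."""
    needed: Set[str] = set()
    for path in wanted_files:
        needed |= volumes_for_file[path]

    promote(needed)  # may take hours on an archive tier
    try:
        run_restore(needed)
    finally:
        # Put the volumes back even if the restore fails.
        demote(needed)
    return needed
```

The `try`/`finally` reflects the "and then put them back" step: without it, a failed restore would leave the volumes in the more expensive tier.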


#7

I stand corrected. And impressed.

And found that I’m wrong about at least one other backup product that will support the Azure Archive Tier: Veeam V10 | view topic