I have set up a Duplicati backup on a server, and Duplicati sends the data to an Azure container. The problem is that there is only one big initial backup, and everything after that is incremental. I would like to know if it is possible to make a big backup every two weeks, for example. I would also like to know whether it is possible to archive the backups.
On Azure I see a lot of files named “duplicati-2165498451.dlist.zip”; how can I see what they correspond to, and can we archive them?
Duplicati performs a full backup initially. Afterwards, Duplicati updates the initial backup by adding the changed data only. That means, if only tiny parts of a huge file have changed, only those tiny parts are added to the backup. This saves time and space and the backup size usually grows slowly.
Why is this seen as a problem?
What do you mean by “archive”? If you mean moving data from Azure to elsewhere, that’s a do-it-yourself job, if it’s possible at all.
You should consider archiving the current local database if you’re going to archive the current backup.
I’m not sure what the goal is. Offline storage? Backup “redundancy” (consider multiple backup tools)?
The dlist files are not a backup (and are not nearly big enough). They list files from that backup and supply information on what blocks are needed to reassemble everything. Blocks are in the dblock files which also have an index in an associated dindex file. A given block can be referenced over and over, meaning there is what’s known as block-based deduplication. Any unchanged file uses existing blocks.
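To see concretely what a dlist file holds, note that an unencrypted `duplicati-*.dlist.zip` is a plain zip archive containing a `filelist.json` that lists every path in that backup version. Here is a minimal sketch: it builds a mock dlist archive in memory and reads it back, the way you would inspect a downloaded, unencrypted dlist file. The field names in the mock entries are assumptions for illustration, not the exact Duplicati schema.

```python
import io
import json
import zipfile

# Build a mock dlist archive in memory to illustrate the layout.
# A real duplicati-*.dlist.zip contains a filelist.json with one
# entry per backed-up path (field names here are assumptions).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("filelist.json", json.dumps([
        {"type": "File", "path": "/data/report.txt", "size": 1024},
        {"type": "Folder", "path": "/data/"},
    ]))

# Reading it back: listing the paths tells you which backup
# version this dlist file describes.
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    entries = json.loads(zf.read("filelist.json"))
    for e in entries:
        print(e["type"], e["path"])
```

If your backup is encrypted (the files end in `.zip.aes`), you would have to decrypt first; the safer route is to use Duplicati’s own restore and compare features rather than poking at the destination files.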
Let’s say I back up 20GB of data that is modified every day; in 3 years I will probably have 500GB of backup data, of which 2 years’ worth is useless. That is the part I’m asking about: can I archive part of the backup, so I’m not storing 500GB for nothing? That’s also why I’m asking whether it’s possible to make a non-incremental backup once every 2 weeks. My goal is to make one big backup every 2 weeks and do the rest of the backups incrementally. Then, after one year, I archive the old backups while keeping the full ones.
Duplicati is a backup tool, not a file replication tool, and you should never fiddle directly with the backed-up data; if you do, you will most probably break your backup. It’s also a continuous backup tool that removes old data according to the retention policy.
The idea of ‘a complete backup from time to time, with incrementals in between’ (known as a grandfather/father/son policy) is not appropriate for a tool like Duplicati; it’s not your grandfather’s backup.
The advantage of Duplicati’s design is that it allows deduplication, saving considerable backup disk space.
The disadvantage of it is that you can’t manage the backup data yourself.
If a tool does not fit your use case, don’t use it; don’t try to use it like another tool meant for another use case. If you buy a utility vehicle and try to drive it like a sports car, you may end up missing a curve somewhere. Conversely, if you try to drive a sports car over rough terrain, you may break it.
Not really. Any file could be needed for a given restore, because a new file may happen to share blocks with an older file, so restoring it will reference those older blocks. Deduplication achieves slow growth by referencing old data rather than uploading the same data again.
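A tiny sketch may make the block sharing clearer. This is not Duplicati code, just the general block-based deduplication idea: files are split into fixed-size blocks, each block is stored once under its hash, and a backup version is just a list of hash references.

```python
import hashlib

BLOCK_SIZE = 4  # tiny block size for illustration; Duplicati defaults to ~100KB

def split_into_blocks(data: bytes):
    """Cut the data into fixed-size blocks."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]

block_store = {}  # hash -> block bytes (the role the dblock files play)

def backup(data: bytes):
    """Store only unseen blocks; return the hash list (the role a dlist plays)."""
    refs = []
    for block in split_into_blocks(data):
        h = hashlib.sha256(block).hexdigest()
        block_store.setdefault(h, block)  # upload only if new
        refs.append(h)
    return refs

v1 = backup(b"AAAABBBBCCCC")  # first version: 3 new blocks stored
v2 = backup(b"AAAABBBBDDDD")  # changed file: only 1 new block stored
print(len(block_store))       # prints 4, not 6
```

Note that restoring version 2 still reads the `AAAA` and `BBBB` blocks uploaded with version 1, which is why you cannot archive or delete “old” destination files by age: an old dblock may hold blocks that every later version depends on.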
If this is unclear, the links that I provided earlier might help. Also worth reading is the manual Overview.
I don’t know if we got past this in the later explanations, but such terminology doesn’t fit Duplicati’s design. The destination has all your backup versions, with file usage as previously described. Old versions get deleted by the retention policy you chose on the Options screen, which leads to compacting files at the backend.
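For the thinning-out you described, the Options screen’s retention setting corresponds to the `--retention-policy` option. As I understand the format (check the docs for your Duplicati version; the values below are placeholders), a custom policy is a comma-separated list of timeframe:interval pairs:

```
--retention-policy="7D:0s,4W:1W,12M:1M"
```

This would keep every version from the last 7 days, one version per week for 4 weeks, and one version per month for 12 months. Duplicati then deletes versions outside those frames itself and compacts the dblock files, which gets you the “drop old useless data” effect without ever touching the destination files by hand.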