When you say “the backup gets deleted”, do you mean the backup for that day (let’s say “N”), or just revisions inside “N” (and of course “N” itself if it ends up empty)?
Yeah … I mean the whole backup gets deleted.
While the idea of deleting individual file versions from within a backup sounds great for the future, it was too big a change for me to introduce. See the second half of this comment: Issue 2084 (comment)
As a volunteer open-source project we find it very difficult to give time estimates, since we have “day jobs” to keep us fed. However, we are working to finalize the features for the next stable release, and I believe this (and hopefully the other #planned tagged items) will be included when it is released.
I have a rather big backup, 275 GB with 500 versions.
It is extremely slow at listing files; it takes several hours if something needs to be restored.
Getting the initial list takes about 40 minutes, and browsing into the next folder takes just as long …
Is it possible to apply a retention policy to this backup to reduce the number of versions, and then run a compact operation to make it faster?
This is a known problem. Fixing it will require a major change to the internal database structure and a rewrite of many parts of the source code.
Unfortunately, there is no fix or workaround for this issue at the moment. Fixing this problem is on the to-do list, but a release date is not known.
See this discussion for more info:
I only have 175 versions right now on one machine, and in the restore process it takes 45 minutes each time I drill down into a deeper folder. Is this time a function of version count, number of files backed up, or both?
Just for fun I tried opening two web interfaces to Duplicati, one to the Live Log (Profiling) and the other to a backup job in which I clicked Restore.
The sum of the “took” times reported in the profiler was definitely less than the time it took for the browser to display the results, so there may be a third slowdown: the UI itself.
I know historically I’ve run into issues with browser performance when dealing with very large or complex select lists. One way to test this could be to try @mr-flibble’s idea and run the same thing in the command line to see if it spits out results any faster.
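For reference, that command-line check could look something like this (just a sketch; the storage URL, folder filter, and database path are placeholders, and you may need to add --passphrase for an encrypted backup):

```
# Time the file listing for the latest version (version 0), bypassing the web UI.
# duplicati-cli is the Linux wrapper for Duplicati.CommandLine.exe.
time duplicati-cli list "ssh://example.com/backup" "*" --version=0 \
    --dbpath=/path/to/job-database.sqlite

# Narrow the filter to one folder to mimic drilling down in the restore tree:
time duplicati-cli list "ssh://example.com/backup" "/home/user/Documents/*" --version=0
```

If the CLI returns results much faster than the browser, that would point to the UI rather than the database queries.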
I had a similar approach to yours – backing up files every hour. I maxed out at around 289 versions. By that point the backup time had increased by an hour (to 1h15m), and rendering every click in the restore UI took around 29 minutes. I think the service’s CPU usage during backups had also increased.
At that point I decided that a 1-hour backup interval was not a practical configuration at this time. The volatile files are for work, and if a restore were needed it would probably be a time-sensitive issue, so the prospect of waiting hours to find and restore a file was not acceptable. In the end I reverted to a once-per-day interval.
Just keep in mind that if the retention policy you defined leads to deleting a lot of backups, then, depending on your settings, Duplicati might start compacting your backup files.
If your backup location is somewhere online and your download and upload speeds are rather slow, this might take a “while”.
--keep-time, --keep-versions and --retention-policy are part of the backup job configuration.
They are defined in step 5 of the Add/Edit Backup wizard. --keep-time and --keep-versions can be supplied under “General Options”.
--retention-policy is an advanced option.
The --keep-time, --keep-versions and --retention-policy options are applied after each backup job completes. Compacting is started automatically when needed; you can set the thresholds for compacting with the advanced options --small-file-max-count and --small-file-size.
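For anyone driving Duplicati from the command line instead of the wizard, the same options can be passed directly. A rough sketch, with a placeholder storage URL and source path:

```
# Thin out old versions after each backup: within 1 week keep one version per
# day, within 4 weeks one per week, within 12 months one per month.
duplicati-cli backup "ssh://example.com/backup" /home/user/data \
    --retention-policy="1W:1D,4W:1W,12M:1M"

# Or use the simpler options instead of --retention-policy:
#   --keep-versions=20   keep only the 20 most recent versions
#   --keep-time=3M       keep only versions newer than three months

# Compacting normally starts automatically (tunable via --small-file-max-count
# and --small-file-size), but it can also be triggered by hand:
duplicati-cli compact "ssh://example.com/backup"
```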
I have now run some tests on this backup set. The first backup was done in 2015.
I applied an acceptable retention policy and reduced the number of versions from 502 to 20. There are 130,000 files in the backup. The initial listing on restore now takes 20 seconds instead of 40 minutes, and listing the next folder takes about the same time, 20 seconds.
Wow, that’s a serious improvement!
Looks to me like the local database is the bottleneck for large backups with many source files.
You significantly reduced the number of snapshots from 502 to 20. Did the space used at the backend also decrease a lot? If not much changes between two backup operations, deleting backups should not have much effect on the amount of data stored at the backend.
If only new data is added between two backups (for example, if you back up a folder with pictures from your camera), deleting backups shouldn’t have any effect on the storage space used.
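Because blocks are deduplicated across versions, deleting a version only frees blocks that no remaining version references. If in doubt, a dry run should show what a deletion would do without changing anything on the backend (a sketch with a placeholder URL; --dry-run is a standard Duplicati option):

```
# Preview the effect of dropping one version; --dry-run makes no changes.
duplicati-cli delete "ssh://example.com/backup" --version=5 --dry-run
```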
If browsing through folders in Duplicati improves in situations like these, I suppose it’s safe to say that queries to the local DB are causing the performance issues, not the number of blocks or the number of files backed up.
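One way to get a feel for the scale involved is to peek at the job’s local database (a rough sketch; the database filename is randomly generated per job, and FilesetEntry is the table name in the current SQLite schema):

```
# FilesetEntry holds roughly one row per file per backup version, so 500
# versions of 130,000 files means on the order of 65 million rows to query.
sqlite3 ~/.config/Duplicati/XXXXXXXXXX.sqlite \
    "SELECT COUNT(*) FROM FilesetEntry;"
```

That row count shrinks in direct proportion to the number of versions, which would explain why thinning 502 versions down to 20 made browsing so much faster.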