My suggestion is a bit different from the prior one so let me know if I should start a new thread.
Could compaction have operational limits added? Specifically, I’m thinking of a download limit - say, download only 500MB during each compaction operation. There is already a limit on the file-verification downloads, but I’m not aware of any for compaction. And correct me if I’m wrong, but there’s no operational/performance benefit from compacting 25% down to 5% in one operation versus doing a little bit each day, right?
Now, the main reason I would like this is to avoid/reduce/control download charges. A limit would allow spreading the compaction operations over multiple days and keep me within my daily (free) download quota… So in my situation, I don’t really care whether compaction is performed daily or not, but I do care when I have to download 30GB of data in a day when I could have spread that load over multiple days to reduce my monthly costs - downloading is significantly more expensive than storing the same amount of data.
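To put rough numbers on that, here is a small back-of-the-envelope sketch in Python. It is not Duplicati code; `runs_needed` is a made-up helper, and it assumes decimal units and that no new wasted space accumulates while the backlog is being worked off.

```python
GB = 1000**3
MB = 1000**2


def runs_needed(backlog_bytes: int, per_run_limit_bytes: int) -> int:
    """Number of capped compaction runs needed to download a given backlog.

    Hypothetical helper for illustration only; it ignores any new wasted
    space that builds up between runs.
    """
    if per_run_limit_bytes <= 0:
        raise ValueError("per-run limit must be positive")
    # Integer ceiling division: a partial final run still counts as a run.
    return (backlog_bytes + per_run_limit_bytes - 1) // per_run_limit_bytes


# The 30GB download from the post, capped at the 500MB per run suggested above.
print(runs_needed(30 * GB, 500 * MB))  # 60
```

So with one capped run per day, that single 30GB day would instead be spread over roughly two months.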
Hi @stahr, welcome to the forum - and thanks for the suggestion!
I don’t see any reason bandwidth caps / “batching” couldn’t be put on compacting processes, and I like the idea, but some things need to be considered, such as:
- what to do if falling behind (each limited compact has more pending to-do items than the last)
- how to handle a single compact step needing more than the bandwidth limit (download what it can each run and sit on the files somewhere until they’re all local?)
- if the limit is set as a daily total, what happens when jobs run more frequently (do hourly runs have to keep track of the parts of a compact batch that were previously completed?)
- what if backup runs less frequently than daily (should the scheduler support compact-only runs?)
- what happens if a previously downloaded chunk targeted for compacting is modified at the destination (e.g. the retention policy decided it could be deleted)
I’m not saying it can’t be done or that it’s a bad idea (it’s not) - just that these good ideas are often much harder to design / implement than it might appear.
My responses are:
continually falling behind - stick to the download limit which the user has set. Eventually the data will fall off and will be deleted OR the disk usage will continue to grow. I would not see this as a default setting, so it would be up to the user to understand this before setting it. And in this scenario, the disk usage costs could outweigh the download costs, and it might be cheaper to run a full compaction every 3 months if you, say, get back 95% of your disk usage. But I don’t expect Duplicati to make that decision; it’s up to the individual to run those numbers and decide for themselves.
download limit too small - compacting is simply disabled if set to less than 2*upload volume size. IMHO, I wouldn’t store fragments, I’d simply disable the compaction run entirely if someone were to set the download limit to <= volume size.
daily/per-run quota - ideally, I would think this could be a global or per-storage-destination setting, but the initial implementation could be done on a per run/job basis and leave it up to the individual to figure out what percentage of their quota they would be comfortable using for each job. As things stand now, I can either turn off compaction altogether or raise the % so it runs less frequently, but I can’t control how thorough the jobs are when they do run or how long they will run. A download limit on each job run might not be the most effective tool for either of those situations, but it could be used that way.
compact-only runs - sure, on a per-job basis
So my proposal would be for a download-quota/threshold set on a per job basis only. I’m not opposed to having a more sophisticated approach, but feel as though it would be overkill at this point without knowing that users want/need such control/complexity.
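To make the per-job proposal concrete, here is a minimal sketch of how a capped compaction run might pick its work. It is written in Python purely for illustration and is not how Duplicati is implemented: `Volume`, `plan_compaction`, `download_limit` and `volume_size` are all hypothetical names, and taking the most wasteful volumes first is my own assumption rather than anything decided in this thread.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """A remote volume that is a candidate for compaction (hypothetical model)."""
    name: str
    size: int    # bytes that must be downloaded to compact this volume
    wasted: int  # bytes in the volume that are no longer referenced


def plan_compaction(candidates, download_limit, volume_size):
    """Choose which volumes to compact in this run.

    Returns (selected, deferred). Nothing is selected when download_limit
    is below 2 * volume_size, i.e. compaction is disabled for the run
    instead of storing partial downloads.
    """
    if download_limit < 2 * volume_size:
        return [], list(candidates)

    # Assumption: handle the most wasteful volumes first so that a small
    # download budget reclaims as much space as possible.
    ordered = sorted(
        candidates, key=lambda v: v.wasted / max(v.size, 1), reverse=True
    )

    selected, deferred, used = [], [], 0
    for vol in ordered:
        if used + vol.size <= download_limit:
            selected.append(vol)
            used += vol.size
        else:
            # Stays pending until a later run has budget for it.
            deferred.append(vol)
    return selected, deferred


MB = 1024 * 1024
vols = [
    Volume("a", 50 * MB, 40 * MB),
    Volume("b", 50 * MB, 10 * MB),
    Volume("c", 50 * MB, 25 * MB),
]
selected, deferred = plan_compaction(vols, download_limit=100 * MB, volume_size=50 * MB)
print([v.name for v in selected], [v.name for v in deferred])
```

Whatever ends up deferred simply waits for the next run, which matches the "stick to the limit and accept falling behind" answer above. A real implementation would still have to deal with the other points raised, such as volumes changing at the destination between runs.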
Sounds like a good way to start - I’ve gone ahead and moved this to its own thread in the Features group to see if others want to chime in. Let me know if you want a different title or group.