The compacting process is very dangerous!

First and foremost, Duplicati is amazing. I would like to wholeheartedly thank Duplicati devs and the community.

Now to business. This concerns Duplicati 2.0.5.1_beta_2020-01-18.

Yesterday I lost all versions of my backup because of the way Duplicati does compacting. I changed the “small-file-size” parameter to 90 MB for my 100 MB volume size, to finally make Duplicati consider compacting the 2.5 GB of smaller volumes, and clicked “Compact now” for the job. During the job I needed to stop it to restore files from another backup, since Duplicati can only run one job at a time and no other jobs can run alongside it. So I clicked “Stop running backup” and chose “Stop after current file”. I waited a minute and Duplicati stopped. I did other things, came back to this backup, clicked verify, and…

Error while running VeraCrypt to Vultr VPS (DAILY)
Unexpected difference in fileset version 15: 29.09.2020 22:25:20 (database id: 15), found 3 entries, but expected 4

BTW, clicking “Show” doesn’t do anything, since the test doesn’t even get logged properly.


(This screenshot is from the same job when I started deleting offending versions and got all the way to version 1 - see below)

I tried removing the offending version as suggested on the forum. Then…

Error while running VeraCrypt to Vultr VPS (DAILY)
Unexpected difference in fileset version 14: 29.09.2020 22:25:20 (database id: 15), found 3 entries, but expected 4

And so on, and so on… I had a single large 10 GB VeraCrypt volume backed up using Duplicati. I went back all the way to version 5 before realizing that everything was lost. I ran the PURGE-BROKEN-FILES command and, lo and behold, the backup went down to 4.5 GB, which obviously means I’ve lost all versions, since the VeraCrypt volume is 10 GB and completely incompressible (it’s AES inside). Luckily I use Duplicati to back up to another location as well, where I immediately added the “--no-auto-compact” parameter to the other backup job to make sure it doesn’t get lost due to a random connection error during compacting.

So yeah. Compacting is dangerous! And for me it’s such a bummer, since Duplicati can in most cases recover from errors and loss of connection, but here it can’t, because you lose actual volumes from the backup location. There is no way around this.

My suggestion:

Revise the compacting process so that Duplicati first writes the compacted file containing the smaller chunks, only then writes the metadata that points to the new volume, and only after that deletes the older archives. As far as I can see, it currently does the opposite. Here’s the screenshot I took yesterday right before “gracefully” stopping the compacting process, i.e. waiting for Duplicati to finish rather than forcing it to stop immediately (though the user should be able to do that as well!). Reading from the bottom up, I can see Duplicati first deleting the smaller files and then uploading the large compacted one. That’s not the best way to do this. Just like with disk defragmentation, data safety is key: defragmenting software first creates a duplicate of the parts of the fragmented file, only when that’s done does it update the file system records to point to the new location, and only then does it delete the old fragments.
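The ordering suggested above could be sketched roughly like this (a minimal, hypothetical Python sketch; the classes and names are illustrative, not Duplicati’s actual internals):

```python
# Illustrative sketch of an interruption-safe compact cycle.
# None of these classes or names are Duplicati's real internals.

class Backend:
    """Stand-in for remote storage: volume name -> list of blocks."""
    def __init__(self, volumes):
        self.volumes = dict(volumes)
    def get(self, name):
        return self.volumes[name]
    def put(self, name, blocks):
        self.volumes[name] = blocks
    def delete(self, name):
        del self.volumes[name]

def compact_safely(backend, small_names, new_name):
    # 1. Download the small volumes and repack their blocks.
    blocks = [b for name in small_names for b in backend.get(name)]
    # 2. Upload the new compacted volume FIRST.
    backend.put(new_name, blocks)
    # 3. (Real code would commit metadata pointing at the new volume here.)
    # 4. Only now delete the old volumes; an interruption at any earlier
    #    point leaves every block still present somewhere on the backend.
    for name in small_names:
        backend.delete(name)

backend = Backend({"vol-a": [1, 2], "vol-b": [3]})
compact_safely(backend, ["vol-a", "vol-b"], "vol-new")
```

The point of the ordering is that at no step does any block exist only in memory: stopping between steps leaves either the old volumes, the new one, or briefly both.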

I hope this helps. It was a bummer losing all those backups when I didn’t even push the software to its limits. You can easily replicate this, but make sure you back up the folder with your backup as well as the database!

  1. Go to the settings of a job with a good number of versions (say, 3) and at least ~300 MB of compactable data.
  2. Set the “small-file-size” parameter to 95% of your volume size (say, 95 MB for 100 MB volumes).
  3. Start compacting. After a minute or so, click “Stop” and choose “After current file”.
  4. Wait until it’s done. If you’re “lucky”, it will stop before writing the new compacted volume to the server, effectively removing functional backup volumes and calling it “done”.
  5. Verify the job to discover that you’re missing volumes and can’t recover, since actual file chunks are gone.

A side note: I did have “auto-cleanup” enabled for this backup, if that makes any difference. It obviously shouldn’t, but it might be relevant.

Here are all advanced options for the backup that was lost:

--disable-filetime-check=true [required to backup VeraCrypt containers]
--auto-vacuum=true
--auto-vacuum-interval=2W
--auto-compact-interval=1D
--auto-cleanup=true
--zip-compression-level=0
--quota-size=16GB
--quota-warning-threshold=10
--snapshot-policy=Required

I agree with you - the new, compacted dblock should be uploaded first before the old ones are deleted.

Duplicati doesn’t handle some stop/interruptions very gracefully. Improvements have been made since 2.0.5.1 (in the Canary channel) with interrupting backups, but not sure if they extend to interrupting a compaction process.

wow, really? That’s pretty surprising, and kinda a big problem, considering one of the top touted features of Duplicati is its hardening against this. “Fail-safe design” is one of the top bullet points in the manual:

Duplicati is designed to handle various kinds of issues: Network hiccups, interrupted backups, unavailable or corrupt storage systems. Even if a backup run was interrupted, it can be continued at a later time. Duplicati will then backup everything that was missed in the last backup. And even if remote files get corrupted, Duplicati can try to repair them if local data is still present or restore as much as possible.

it’s also mentioned in the “built for online” section on the homepage

this was literally one of the top reasons Duplicati drew me in, so I hope stability and fail-safe changes are a TOP priority in upcoming updates. That’s like the number one thing you want in a backup system.

Sorry, I meant specifically when users try to stop the backup process, not network interruptions; those are automatically retried, and continued failure would cause the backup job to end gracefully in an error state.

I wasn’t around during the early days (I’ve only been using Duplicati a few years), but I THINK it was the devs adding concurrency that caused the “stop now” function to produce database inconsistency in SOME situations. (Which has been improved in the Canary channel, as I mentioned.)

Welcome to the forum @serpantin

While I haven’t tested whether this is directly relevant (I think, though, that it’s just a triggering change), the COMPACT command documentation explains the default, and it sounds like you increased it substantially. Does increasing it make compacting more likely? This text sounds like it should make it less likely:

--small-file-size=<int>
Files smaller than this size are considered to be small and will be compacted with other small files as soon as there are <small-file-max-count> of them. --small-file-size=20 means 20% of <dblock-size> .

but I haven’t poked at compact configuration in awhile, and it’s entirely possible that I’m mistaken on above.
EDIT: Likely mistaken. I guess the goal was to force excess-small-file compact by making more look small.
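If that reading is right, the percentage semantics could be sketched like so (my interpretation only, not Duplicati’s actual code):

```python
def is_small(file_size, dblock_size, small_file_size_pct):
    """My reading of the documented semantics: a plain number is a
    percentage of dblock-size, so RAISING it makes more files count
    as 'small' and compacting MORE likely, not less."""
    return file_size < dblock_size * small_file_size_pct / 100

# With 100 MB volumes: at the default of 20, only files under 20 MB
# count as "small"; at 90, anything under 90 MB qualifies.
```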

You need to look at the whole sequence. I think you got a window into it. What you want (reading down):

get small files
put
delete small files

get small files
put
delete small files

A tiny window can cross cycles like this:

delete small files

get small files
put

which is exactly what the screenshot shows. Since the window doesn’t show any file name twice, one is left guessing.
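The window effect can be illustrated with a tiny event stream (illustrative tuples, not real log parsing):

```python
# Two full compact cycles; each cycle is: get..., put, del...
events = [
    ("get", "A"), ("get", "B"),
    ("put", "1"),
    ("del", "A"), ("del", "B"),   # end of cycle 1
    ("get", "C"), ("get", "D"),
    ("put", "2"),
    ("del", "C"), ("del", "D"),   # end of cycle 2
]

# A log window that opens just after cycle 1's put shows deletes first,
# then gets, then a put (the misleading pattern described above), even
# though in the full stream every del follows the put it belongs to.
window = events[3:8]
```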

If you can follow the code (I’m not an expert), compact looks like it downloads, uploads, then deletes:

Although I’m only showing dblock operations here, here’s a sample compact triggered by excess wasted space.
I’ve used letters of the alphabet for the get and subsequent del operations, and numbers for the put operations:

      2020-09-18 14:23:29 -04 - [Information-Duplicati.Library.Main.Database.LocalDeleteDatabase-CompactReason]: Compacting because there is 26.26% wasted space and the limit is 25%
A get 2020-09-18 14:23:34 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-bcea90319c484424d93186c8194ea4348.dblock.zip.aes (22.03 MB)
B get 2020-09-18 14:23:43 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-b55d1702df8f34227a66ee00f934d8cda.dblock.zip.aes (50.01 MB)
C get 2020-09-18 14:23:53 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-bf126022771e6486e8d1a94e6e1a141e6.dblock.zip.aes (50.02 MB)
D get 2020-09-18 14:24:00 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-bcc1b0a0f5c4d4659a2eedb34765ffd76.dblock.zip.aes (21.17 MB)
1 put 2020-09-18 14:25:17 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Completed: duplicati-bc5696ab8113d48aeb1c11109c7bb502e.dblock.zip.aes (50.04 MB)
A del 2020-09-18 14:25:19 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-bcea90319c484424d93186c8194ea4348.dblock.zip.aes (22.03 MB)
B del 2020-09-18 14:25:19 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-b55d1702df8f34227a66ee00f934d8cda.dblock.zip.aes (50.01 MB)
E get 2020-09-18 14:25:28 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-b2326a2292f0146b5a3590d6084ae204b.dblock.zip.aes (40.94 MB)
F get 2020-09-18 14:25:35 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-bd217d7359b984ef6ab51dd32f5aa8416.dblock.zip.aes (38.17 MB)
G get 2020-09-18 14:25:37 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-bba8fdb27c8654772aa0b6bf889af576f.dblock.zip.aes (7.19 MB)
H get 2020-09-18 14:25:38 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-b68ceed59d3b046728996a3d362c4e463.dblock.zip.aes (1.48 MB)
I get 2020-09-18 14:25:48 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-b2850a0f5c0064f03abffe36997388b52.dblock.zip.aes (49.93 MB)
J get 2020-09-18 14:25:56 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-b8d2cd10afc7b4cd3bc6acc7c51352e7c.dblock.zip.aes (49.93 MB)
K get 2020-09-18 14:26:04 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-b76688aa907f24efc90c2ea032b4a7984.dblock.zip.aes (46.71 MB)
L get 2020-09-18 14:26:09 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-b9c39959a7c404422aee140d9ef0ee923.dblock.zip.aes (29.63 MB)
M get 2020-09-18 14:26:18 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-b7cf753a0a047471b8e2e42b3b3990f6f.dblock.zip.aes (49.93 MB)
N get 2020-09-18 14:26:20 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-bc56b2d70123d4c47be2ece94af6d6f08.dblock.zip.aes (7.21 MB)
O get 2020-09-18 14:26:27 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-b46b5ffa44d7b4da08cca9a5d5e9444b2.dblock.zip.aes (42.15 MB)
P get 2020-09-18 14:26:36 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-b3986e04c610f47b89acf54dbf7342b26.dblock.zip.aes (45.93 MB)
Q get 2020-09-18 14:26:44 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-b3beabdad5713448fb83881febbf69238.dblock.zip.aes (49.97 MB)
R get 2020-09-18 14:26:53 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-bcb69a1e2349b4c059253ac3f24b830ca.dblock.zip.aes (47.13 MB)
S get 2020-09-18 14:26:54 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-b94419b40477c4feca26c8fbcec31ac20.dblock.zip.aes (1.30 MB)
T get 2020-09-18 14:26:55 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-bc5d0bc72c2674412915ad2b0b19252dd.dblock.zip.aes (1.45 MB)
U get 2020-09-18 14:26:55 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-bd2ae8fab2ccc4be8b9e49f75a4e280b7.dblock.zip.aes (1.63 MB)
V get 2020-09-18 14:26:56 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-b38d8fd47dd644674a44f964c58b78829.dblock.zip.aes (1,013.08 KB)
W get 2020-09-18 14:26:58 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Get - Completed: duplicati-b23ad328d0fd247e9b8add26256e055b6.dblock.zip.aes (7.34 MB)
2 put 2020-09-18 14:28:14 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Completed: duplicati-b56cce75472fe410d9ba9f6e4d00cdf2d.dblock.zip.aes (50.07 MB)
C del 2020-09-18 14:28:15 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-bf126022771e6486e8d1a94e6e1a141e6.dblock.zip.aes (50.02 MB)
D del 2020-09-18 14:28:15 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-bcc1b0a0f5c4d4659a2eedb34765ffd76.dblock.zip.aes (21.17 MB)
E del 2020-09-18 14:28:16 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-b2326a2292f0146b5a3590d6084ae204b.dblock.zip.aes (40.94 MB)
F del 2020-09-18 14:28:16 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-bd217d7359b984ef6ab51dd32f5aa8416.dblock.zip.aes (38.17 MB)
G del 2020-09-18 14:28:17 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-bba8fdb27c8654772aa0b6bf889af576f.dblock.zip.aes (7.19 MB)
H del 2020-09-18 14:28:17 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-b68ceed59d3b046728996a3d362c4e463.dblock.zip.aes (1.48 MB)
I del 2020-09-18 14:28:18 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-b2850a0f5c0064f03abffe36997388b52.dblock.zip.aes (49.93 MB)
J del 2020-09-18 14:28:18 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-b8d2cd10afc7b4cd3bc6acc7c51352e7c.dblock.zip.aes (49.93 MB)
K del 2020-09-18 14:28:19 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-b76688aa907f24efc90c2ea032b4a7984.dblock.zip.aes (46.71 MB)
L del 2020-09-18 14:28:19 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-b9c39959a7c404422aee140d9ef0ee923.dblock.zip.aes (29.63 MB)
M del 2020-09-18 14:28:19 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-b7cf753a0a047471b8e2e42b3b3990f6f.dblock.zip.aes (49.93 MB)
N del 2020-09-18 14:28:20 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-bc56b2d70123d4c47be2ece94af6d6f08.dblock.zip.aes (7.21 MB)
O del 2020-09-18 14:28:20 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-b46b5ffa44d7b4da08cca9a5d5e9444b2.dblock.zip.aes (42.15 MB)
P del 2020-09-18 14:28:21 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-b3986e04c610f47b89acf54dbf7342b26.dblock.zip.aes (45.93 MB)
Q del 2020-09-18 14:28:21 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-b3beabdad5713448fb83881febbf69238.dblock.zip.aes (49.97 MB)
R del 2020-09-18 14:28:21 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-bcb69a1e2349b4c059253ac3f24b830ca.dblock.zip.aes (47.13 MB)
S del 2020-09-18 14:28:22 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-b94419b40477c4feca26c8fbcec31ac20.dblock.zip.aes (1.30 MB)
T del 2020-09-18 14:28:23 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-bc5d0bc72c2674412915ad2b0b19252dd.dblock.zip.aes (1.45 MB)
U del 2020-09-18 14:28:23 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-bd2ae8fab2ccc4be8b9e49f75a4e280b7.dblock.zip.aes (1.63 MB)
V del 2020-09-18 14:28:24 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-b38d8fd47dd644674a44f964c58b78829.dblock.zip.aes (1,013.08 KB)
3 put 2020-09-18 14:28:30 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Put - Completed: duplicati-b8795c3ad6bc74acc9e86a126befa72ae.dblock.zip.aes (3.28 MB)
      2020-09-18 14:28:31 -04 - [Information-Duplicati.Library.Main.Operation.CompactHandler-CompactResults]: Downloaded 23 file(s) with a total size of 662.24 MB, deleted 46 file(s) with a total size of 662.97 MB, and compacted to 6 file(s) with a size of 103.43 MB, which reduced storage by 40 file(s) and 559.54 MB
W del 2020-09-18 14:28:31 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: Delete - Completed: duplicati-b23ad328d0fd247e9b8add26256e055b6.dblock.zip.aes (7.34 MB)

Although one can’t see exactly which get volumes went into a given put, the log excerpt above shows the activity following a filled-to-capacity dblock: it gets put, and the volumes that it fully consumed get a del.

This should be what it does now. If anyone sees otherwise after looking at a compact in its entirety, including matching of file get, put, and del operations (e.g. from log-file and log-file-log-level=Information), please file an Issue, including steps to reproduce it. That will put it in a tracking queue (the forum is not one) for development work.
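For anyone who wants to check their own log, here’s a rough sketch of such matching (it assumes the “Backend event” line format shown above; the checker itself is hypothetical, not an official tool):

```python
import re

# Hypothetical checker for the "Backend event" log lines shown above:
# flag any dblock Delete that completes before the first Put has completed.
EVENT = re.compile(r"Backend event: (Get|Put|Delete) - Completed: (\S+\.dblock\.\S+)")

def deletes_before_first_put(log_lines):
    suspicious, put_seen = [], False
    for line in log_lines:
        m = EVENT.search(line)
        if not m:
            continue
        op, name = m.groups()
        if op == "Put":
            put_seen = True
        elif op == "Delete" and not put_seen:
            suspicious.append(name)
    return suspicious

sample = [
    "Backend event: Get - Completed: duplicati-b1.dblock.zip.aes (22.03 MB)",
    "Backend event: Put - Completed: duplicati-b2.dblock.zip.aes (50.04 MB)",
    "Backend event: Delete - Completed: duplicati-b1.dblock.zip.aes (22.03 MB)",
]
```

On a healthy compact (like the log above) this returns an empty list; a delete that precedes any completed upload would be flagged.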

Basically, I’m challenging the suggestion “To revise the compacting process…” because I think it already works that way, and that seemed like the primary suggestion. Improving the stopping process is certainly worthwhile, and it has been improved for backups. And clearly the backup should not break, but that’s not really what the idea of a “feature” is. Duplicati’s 2.0.5.1 Beta concentrated on robustness (not features) to a large degree; however, there’s always room to improve. It’s usually not bad design intent, but rather bugs to be removed…

403 error during compact forgot a dindex file deletion, getting Missing file error next run. #4129
would be an example of that (but you do need to exhaust all the retries first). I haven’t seen it produce your exact error, though, even after throwing lots of simulated errors at it. Other errors during compact also cause issues.
But there’s an issue filed (one among too many), so perhaps some day someone will figure out the cure.

Thanks for digging more deeply than I had time to do. Glad it is being done correctly.

This is not really my skill either, but is there any chance that a delete-before-upload-finishes issue was introduced with the release of Duplicati v2.0.3.6, where concurrent processing was added as a new feature?

I can imagine remote files being deleted before the upload of the related merged data completes, if these processes are executed in separate threads. If that’s true, setting --concurrency-max-threads=1 could theoretically work around this behaviour.
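Whether or not concurrency is the culprit, the needed ordering constraint is a classic synchronization problem. A minimal sketch (illustrative only, not Duplicati’s code) of making a delete wait for the upload:

```python
import threading

# Illustrative only: enforce "delete waits for upload" across threads.
upload_done = threading.Event()
order = []

def uploader():
    order.append("put")      # upload the compacted volume
    upload_done.set()        # signal that it is safely stored

def deleter():
    upload_done.wait()       # block until the upload has finished
    order.append("delete")   # only then remove the old volumes

t1 = threading.Thread(target=deleter)
t2 = threading.Thread(target=uploader)
t1.start(); t2.start()
t1.join(); t2.join()
```

With the event in place, the delete can never be observed before the upload, no matter how the threads are scheduled.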

Correctly designed programs don’t care what the reason for the interruption is. It can be a power outage, an instant reboot, a disk being disconnected in the middle of writing data, etc. Stuff like this doesn’t sound bad until you’re working with big production systems.

The problem is that one “small and short” power outage can take several months of manual work to clean up, with weeks just to get systems running again, unless the systems can recover from it automatically. And that’s how it should be: Fail -> Recovery -> Running. No manual intervention needed whatsoever.

Totally agreeing with @serpantin and @starwarswii


You get no argument from anyone here… there has been some nice progress this year in the Canary channel to fix interruptions causing issues.

I think I once proposed more negative testing, perhaps on dedicated systems that induce failures.
There are infrastructure limits, due to (I think) no project-owned systems for this. Staff shortage too.
While perfect design would be great, it would also help to have a test team to beat the system hard. :wink:
Another advantage of controlled testing is that you get debug logs rather than a meaningless mess.

Any testers out there?

Typically the plan seems to be to (maybe) work on reproducible GitHub issues (somewhere to start).
There might already be some there, but if you know exact steps to trigger trouble, please file an issue.
