Duplicati temp dup-xxxx files not being deleted

bug

#41

I’ll see if I can get a before/after script going to log temp files around runs.


#42

I mentioned this in another thread, but I apparently have the same issue as others here as it relates to a compact operation as well as my regular backups (which were generally solved with the shell script above). I just ran a 6 hour compact operation today. I haven’t run any other jobs, and my tmp directory has 114 files in it. Here are a few lines:

-rw-r--r-- 1 root root 52487238 Jan 15 08:34 dup-22d2bb53-e5b1-4582-8b33-5a17a5478666
-rw-r--r-- 1 root root 52520492 Jan 15 12:07 dup-234b0e98-c0c0-465b-8e0a-2fad3c778e85
-rw-r--r-- 1 root root 52438628 Jan 15 09:17 dup-2b6ad41c-aae5-4692-8775-87939fe26a5f
-rw-r--r-- 1 root root    47328 Jan 15 08:58 dup-2c81ae18-6c9e-45a7-a579-24c385ee473d

and

root@server:/tmp# du -ch /tmp/dup* | grep total
2.4G    total

I was using a ramdrive for tempdir; I’ve got plenty of ram on this system so I may try just making it huge and hoping I don’t come up on this limit too often since it seems to break things. Another idea might be ramfs instead of tmpfs, which I’ve read won’t use swap (fine) but can grow dynamically.

Edit: this is on 2.0.4.5_beta_2018-11-28, ubuntu.


#43

Also seeing problems with orphaned temp files on prune/compact with 2.0.4.10_canary_2018_12_29. I’m running the commandline instance on Win10 x64 (10.0.17134.0). Snapshots are set to required.

In my case, I had an old, large fileset to delete but that filled up 25GB spare space in my %TEMP% folder. Duplicati aborted after disk-full and left unexpected files requiring a repair but, after leaving that to run for days, I decided to just blow the backup away completely and re-create it.


#44

Any progress on finding the cause of this? Can we provide any further information to help?

I’m still seeing this happening on all my Windows machines, all running 2.0.4.12_canary_2019-01-16 without any other problems that I can detect.


#45

Yep, it seems to happen every now and then, probably after some kind of application crash or process kill. I would recommend generic temp path clearing by filename prefix, potentially with an age filter, so that stale temp files get deleted a bit later if this situation persists. The check could be done when the process is started or so.

At least this is something I’m typically doing with my projects and temp files. Another option is to create a sub-directory in temp path and completely clear it when starting / finishing process.
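
For anyone who wants to try the prefix plus age filter idea, here is a minimal sketch in Python. It is a workaround, not how Duplicati itself cleans up. The function name, the one-week default and the `now` parameter are my own assumptions for illustration; the only thing taken from this thread is the `dup-` prefix. Files that cannot be deleted (for example, because a running process has them locked) are simply skipped.

```python
import time
from pathlib import Path


def remove_stale_temp_files(temp_dir, prefix="dup-", max_age_seconds=7 * 24 * 3600, now=None):
    """Delete files in temp_dir whose name starts with prefix and whose
    modification time is at least max_age_seconds in the past.

    Returns the list of paths that were deleted.
    """
    now = time.time() if now is None else now
    deleted = []
    for entry in Path(temp_dir).iterdir():
        # Only plain files with the expected prefix are candidates.
        if not entry.is_file() or not entry.name.startswith(prefix):
            continue
        try:
            if now - entry.stat().st_mtime < max_age_seconds:
                continue  # too recent, may still be in use by a running job
            entry.unlink()
        except OSError:
            # Locked, already removed, or not permitted: leave it alone.
            continue
        deleted.append(entry)
    return deleted
```

Calling `remove_stale_temp_files("/tmp", max_age_seconds=600)` would apply a 10 minute threshold instead of a week. Pick the age conservatively, since a long-running job may legitimately keep its temp files for hours.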


#46

The last time I tried clearing down temp files over a week old, a week later backups started failing as it could not find files.


#47

That shouldn’t happen. If it complains about files it is probably on the remote side.
I’ve never had any issues deleting the duplicati temp files on my local machine.


#48

I think in my case, the problem was specifically caused by temporary files accumulating during the cleanup process. Due to the size of the fileset to be deleted, this eventually led to running out of disk space. As I mentioned in my post above, that was >25GB of accumulated temporary files for that particular run. It appears that temporary files are being cleaned up once the backup process is completed so, as long as you have enough free space to cover the size of the fileset to be deleted, you should be good. If not, well, BOOM!

I should add that I realise many users won’t be seeing filesets that are as big as the ones I’m generating (I’m backing up VM images for the curious) and, I guess, many users will have a lot more than 25GB free on their %TEMP% drive, but it still illustrates a problem.

I think a review of how temporary files are handled during cleanup is warranted for any/all of the following:

  • check whether the space available to the temporary directory is likely to be exhausted
  • delete temporary files immediately after use to prevent accumulation of dead files
  • improve handling of out-of-disk-space conditions to prevent ungraceful abort and unexpected files requiring a rebuild.

Edited to add that I run --auto-cleanup=true and --auto-vacuum=true on all my backups.
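
As a rough illustration of the first bullet (checking temp space before starting), here is a small Python sketch. It is not Duplicati code. The function name, the `required_bytes` estimate and the 10% safety margin are assumptions made up for the example; how to estimate the space a given cleanup actually needs is left open.

```python
import shutil


def has_enough_temp_space(temp_dir, required_bytes, safety_margin=0.10):
    """Return True if the filesystem holding temp_dir has at least
    required_bytes free, plus a fractional safety margin."""
    free = shutil.disk_usage(temp_dir).free
    return free >= required_bytes * (1 + safety_margin)
```

A caller could refuse to start a compact (or at least log a warning) when this returns False, rather than aborting halfway with a full disk.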


#49

I’m not sure what’s in these dup- files during the cleanup phase, but in my experience, I can delete them if they’re over 10 minutes old and it does not seem to hurt anything. I needed this as I had a backup that had probably not gone successfully through the cleanup for a while, and with only a few gigs available in /tmp/ it needed a lot more. I think I easily deleted over 20GB of temporary files during one run before it would complete.


#50

Neither option will delete stale dup-* files in your local Temp folder.


#51

To avoid deleting something that Duplicati is using in the middle of its run, the cleanup could be hooked to one of the options below. I’m not sure which one manages that better; possibly either one will stay out of trouble.

--run-script-before
--run-script-after

If you have multiple jobs, putting this in any one of them might be sufficient to clear space for the other jobs. Settings in Duplicati can put them in all jobs if you prefer. This is a workaround (not an actual fix) either way.
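
If the goal is only to log what a run leaves behind, rather than delete anything, a before/after pair could look roughly like the sketch below. Both function names are hypothetical, and how the "before" snapshot is handed to the "after" script (for example, via a small JSON file) is left out and would be an assumption on top of this.

```python
from pathlib import Path


def snapshot_temp_files(temp_dir, prefix="dup-"):
    """Map each matching file name in temp_dir to its size in bytes."""
    sizes = {}
    for entry in Path(temp_dir).iterdir():
        try:
            if entry.is_file() and entry.name.startswith(prefix):
                sizes[entry.name] = entry.stat().st_size
        except OSError:
            # The file disappeared between listing and stat; ignore it.
            continue
    return sizes


def leftover_files(before, after):
    """Return the files present after the run that were not there before it."""
    return {name: size for name, size in after.items() if name not in before}
```

Taking one snapshot before the job and one after it, then logging `leftover_files(before, after)`, gives the names and sizes of the temp files the run did not clean up.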


#52

Firstly, yes - I can confirm that I’ve got left over dup* files in my /tmp folder for 2.0.4.14 with the official Docker container.

Most likely these are the temp storage for files before they are encrypted then uploaded to the destination. This means that they are probably zip files (just add a .zip and try opening them).

I don’t have any data changes currently, so for me the smaller files (the minimum is 339k) seem to be the dlist files (with filelist.json and manifest contents).

Note that I am NOT finding a related PUT entry in the job remote log leading me to wonder if there are maybe multiple issues going on.

For example, I don’t have my job set to upload if there are no changes, but perhaps the dlist temp files are being created anyway and then not being cleaned up since they were never uploaded. (Just a guess - I have not looked at any code yet.)


#53

Issue is being tracked here:


#54

I’ve been having difficulty trying to reproduce this. As far as I can tell, none of my temp files created during backup or compact operations are named like dup-xxxx.

Is there a simple description of how to reproduce this from scratch?

EDIT: I found a few, except that they are really small in size, unlike some of the cases above. I have a fix that appears to remove these, but it’s not clear that these small files are resulting from the same issue as others are describing.


#55

I checked my Temp folder just now and noticed another ~100 dup* temp files. I checked Duplicati and confirmed that no backup jobs were in progress. So I selected all files and deleted them.

With 2 of the files, Explorer told me that Duplicati.GUI.TrayIcon.exe had them locked (in use). Strange - no backups are running.

Maybe this will help point to the problem… some temp files still remain locked/undeleted after backup jobs finish.


#56

@warwickmm: I don’t have a methodology for the problem but I can tell you that my system accumulates a new one of these files after each backup run. The file has no extension (“dup-” + GUID) but it’s a zip archive containing 2 files: filelist.json and manifest. The filelist.json file invariably contains an empty array ("") and an example of the manifest contents is:

{"Version":2,"Created":"20190213T233759Z","Encoding":"utf8","Blocksize":102400,"BlockHash":"SHA256","FileHash":"SHA256","AppVersion":"2.0.4.12"}

I don’t know if these files being orphaned is related to the larger problem I reported above and, frankly, at <1K each, these don’t bother me in the least, but if it leads to the identification and rectification of a bug then it’s worth reporting.
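
For others who want to check what their own leftover files contain without renaming them to .zip, a minimal Python sketch follows. It assumes the temp file is a plain (not yet encrypted) zip archive with a member named `manifest`, as described in this post. The function name is made up for the example, and files that are not zip archives just return None.

```python
import json
import zipfile


def inspect_temp_archive(path):
    """Return (member names, parsed manifest or None) for a zip-format
    temp file, or None if the file is not a readable zip archive."""
    if not zipfile.is_zipfile(path):
        return None
    with zipfile.ZipFile(path) as archive:
        names = archive.namelist()
        manifest = None
        if "manifest" in names:
            # utf-8-sig tolerates an optional byte order mark.
            manifest = json.loads(archive.read("manifest").decode("utf-8-sig"))
    return names, manifest
```

Running this over every `dup-*` file in the temp folder should show quickly whether the leftovers are the small filelist/manifest archives or something larger and different.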


#57

Thanks @aureliandevel. I think I have a fix that removes the small dup-xxxx files. I have another fix that involves removing files during a compact process, possibly related to the large dup-xxxx files that people are finding. However, I haven’t been able to do much testing for the latter case yet.


#58

Are these fixes actually resolving the underlying cause of them not being cleaned up (my guess is still threading) or is it a final “delete dup-* from temp folder” type step?

I’ve found the small files (I believe they end up being dindex files at the destination) are created whether or not there is anything to back up.

I still suspect that the existing cleanup code is tied to the upload code so if there are no changes to be uploaded, the temp files never get deleted.


#59

The fixes address the underlying issue of not cleaning up the temporary files. One fix removes the temp file containing the filelist.json that currently remains whether or not any files are uploaded. The second fix removes temporary files remaining from a compact operation. The latter one needs more testing, as I’ve been successful at finding the cause of some of the remaining files, but not all of them (i.e., a few of them still remain).


#60

Awesome, thanks!