Out of space on /tmp/, 74 gb free

V_V · December 5, 2022, 6:47pm

My main Linux desktop rig has 96 GB of RAM. My backup is roughly 200 GB in size.

Unless I set asynchronous-concurrent-upload-limit and asynchronous-upload-limit both to 2, I cannot complete a backup to Google Drive. If I try the default values or raise either of them above 2, Duplicati aborts the backup claiming there is no space left on the /tmp/ drive.

df reports that I’m using around 10 GB of space on /tmp. free reports that I’m using 22 GB of RAM altogether and have 74 GB of RAM free. I’ve monitored the amount of free space whilst the backup is running, and usage never exceeds these values. Nor does the RAM usage of mono/Duplicati ever exceed a few GB.

I’ve tried manually setting the size of the tmpfs in fstab to 64 GB. That doesn’t help; it still claims there is no space left on /tmp/ even though there is plenty.

Any ideas how to fix this?

JimboJones · December 5, 2022, 8:24pm

Did you by chance change the --blocksize option? I seem to recall this happening when the blocksize is set to too large a value.

You could also try the --use-block-cache option to use the temp folder less, at the cost of RAM, which should not be an issue for you.

V_V · December 5, 2022, 8:42pm

Yep, I increased it.

Exactly my point. I’d understand if /tmp/ was 99% full and/or I had less than 1 gb of free ram available. But it’s nowhere near its limits.

gpatel-fr · December 5, 2022, 9:56pm

@V_V

I remember dimly (I have no personal interest in having a separate /tmp) seeing a poster having exactly this problem a few weeks (or months? time flies…) ago. IIRC the key was that when one deletes an open file on Linux, it can be kept open for some time, and while its size is added to the ‘free’ space, in fact it is not yet available, leading to perplexing results.

Hmm, I did not find it (did not search very hard) but there is a reference a long time ago already:

JimboJones · December 5, 2022, 10:12pm

Did you try it with the default size?

Out of curiosity, what did you set it to?

ts678 · December 5, 2022, 10:18pm

Choosing sizes in Duplicati explains sizes. blocksize won’t eat space as much as dblock-size a.k.a.

ts678 · December 5, 2022, 10:46pm

More importantly, what does df say is available? Is it supporting the idea that it ran out of space?
What does ls -lrt show passing through? Duplicati files typically have names starting with dup.

SQLite temporary files can apparently be invisible, so ls might not spot them, but I expect df will.

What is reporting the lack of space? I see nothing posted. How exactly does Duplicati claim it, e.g. what messages?
Possibly you’ll need to look in About → Show log → Stored. A backup failure lacks its normal log.

If you’re changing any sizes away from default, please post the new values. Changes can matter.

If you have a different large folder that’s actually on disk, you could use tempdir to use it. Manual information appears to be a bit stale. Currently tempdir changes both Duplicati and SQLite uses.

But if you’ve got something misconfigured, it would be better to figure out what that is to correct it.

V_V · December 6, 2022, 9:16am

Not yet, I’ll set up a new backup for it since it’s not possible to change blocksize when a backup is already made.

blocksize = 1 MByte
dblock-size = 100 MByte

ts678 · December 6, 2022, 5:59pm

Those seem reasonable. For 200 GB backup, default 100 KB blocksize is a bit small. Overhead builds.
100 MB for dblock-size is just a slight multiple from 50 MB default. What is Remote volume size set to? Typically that’s the one that gets set huge by accident, so huge files try to go through /tmp and don’t fit.

You could look in your destination to see if you have any files bigger than you expect from that 100 MB.

Is this by any chance NUMA, probably implying multiple CPU sockets? tmpfs limitations there may vary, meaning the free amount of RAM might not all be available, apparently depending on memory zone use, although this sounds pretty obscure.

https://www.kernel.org/doc/html/latest/filesystems/tmpfs.html

calls that oversize and risking deadlock. I’m not a tmpfs expert.

V_V · January 6, 2023, 4:10pm

I actually have NUMA turned off by a kernel parameter, so nope, that can’t be it.

I got that one under control too.

For what it’s worth, my issue is gone now. I don’t know why or what I did that could possibly change this, other than updating to kernel 6.1.x. But I’m pretty sure that it’s not related to any kernel changes. I just hope the error won’t come back again.

V_V · January 6, 2023, 6:29pm

Well, fudge. I spoke too soon.

The error is only gone as long as the backup doesn’t grow in size beyond some unknown limitation being set. I suspect it’s mono that has some limit set up somewhere, but I know just about nothing about mono or .net so I don’t know where to start.

I’ve checked and made sure it’s not a limitation of inodes, open files and stuff like that. And yep, I still have over 80 GB of free RAM and over 40 GB free on /tmp/.

ts678 · January 6, 2023, 8:57pm

If that’s with df and fairly steady (so as to not miss temporary fullness) as in original post, it’s odd…
Fixing “Insertion failed because the database is full database or disk is full” would be worth reading.

I suppose you could try filling /tmp with a big dd from /dev/zero to see if it can hold what it claims.
Per Tmpfs (kernel.org), a bad config may deadlock, if you use it – you originally spoke about tmpfs in fstab.

Do you have enough real drive space to see if moving tempdir away from /tmp solves space error?
This is especially important if /tmp is tmpfs now, although I can’t find any df oddity documented…
I did test it with a deleted-but-still-open file, and it wasn’t fooled by that. Free space remained down.

dd if=/dev/zero bs=1024 count=1000 of=fill; tail -f fill in one window
df -k . and rm fill in another. Control-C the tail to see the free space increase
I did this on my /tmp which seems to be on the regular VM drive space, same as / is.
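For anyone who would rather script that check than juggle two terminal windows, here is a rough Python sketch of the same deleted-but-still-open test. It is only an illustration: the function name and the 1 MiB size are made up here, it assumes POSIX unlink semantics (Linux), and it inspects the open descriptor rather than df output.

```python
import os
import tempfile


def deleted_but_open_demo(directory=None, size=1024 * 1024):
    """Write a file, unlink it while it is still open, and report its state.

    Returns (link_count, size_in_bytes) as seen through the open descriptor.
    Assumes POSIX unlink semantics (Linux); the name and default size are
    illustrative only.
    """
    fd, path = tempfile.mkstemp(prefix="fill-", dir=directory)
    try:
        os.write(fd, b"\0" * size)   # same idea as dd from /dev/zero
        os.unlink(path)              # same idea as rm fill
        info = os.fstat(fd)          # the data is still reachable via fd
        return info.st_nlink, info.st_size
    finally:
        os.close(fd)                 # only now can the space be released


links, size = deleted_but_open_demo()
print(f"links={links} size={size} bytes still held by the open descriptor")
```

A link count of zero with a non-zero size means the name is gone but the blocks are still allocated, which is the situation where ls shows nothing while df still counts the space as used.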

EDIT:

Was there an actual error message or stack trace (even better) posted here to see how it got to that?
Possibly you’ll have to look in About → Show log → Stored (and click line) to get needed information.

EDIT 2:

How are you sure that it’s size? Other things run only occasionally, such as the compact (as needed).
This is in the regular job log, but log-file=<path> and log-file-log-level=information gives a better view.
That would also make it easier to see when in the process it fails. Do you have any descriptions now?

It’s also odd that you use so much /tmp space. Do you see more 100 MB files in there than expected?
This ignores any invisible files, but if you want you could probably see those with lsof per other topic.

V_V · January 7, 2023, 10:15am

Hmm, the original poster wrote it didn’t help moving the temp directory to a place different than /tmp/. However, I’ll try that out if (when?) this error occurs again. I wrote in the bottom of this post what I did to hopefully work around this error.

Yes, I tried that and also copied a few large files to /tmp/ almost filling it up. Everything looks to be correct even if I use up 60 out of 48 gb of space (I got a 32 gb swapfile setup), and I can keep using the computer as well with no issues nor crashes.

This isn’t a deadlock issue. I had a faulty gfx card years ago which caused a deadlock under heavy load. A deadlock means the computer completely locks up, and you cannot shut it down or reboot it by any means other than the physical reset button or power cycling. It’s one degree worse than a kernel panic, where you can actually do a reboot and get an error displayed.

Thank you for pointing me to this, it was a wee bit more comprehensive:

System.IO.IOException: Disk full. Path /tmp/dup-406eb4a4-3dcf-488e-aec5-cf0729d2d345
at System.IO.FileStream.FlushBuffer () [0x00081] in <282c4228012f4f3d96bdf0f2b2dea837>:0
at System.IO.FileStream.Flush () [0x00018] in <282c4228012f4f3d96bdf0f2b2dea837>:0
at System.IO.StreamWriter.Flush (System.Boolean flushStream, System.Boolean flushEncoder) [0x00096] in <282c4228012f4f3d96bdf0f2b2dea837>:0
at System.IO.StreamWriter.Flush () [0x00006] in <282c4228012f4f3d96bdf0f2b2dea837>:0
at Newtonsoft.Json.JsonTextWriter.Flush () [0x00000] in :0
at Duplicati.Library.Main.Volumes.FilesetVolumeWriter.AddFilelistFile () [0x0000b] in :0
at Duplicati.Library.Main.Volumes.FilesetVolumeWriter.Close () [0x00008] in :0
at Duplicati.Library.Main.Volumes.FilesetVolumeWriter.Dispose () [0x00000] in :0
at Duplicati.Library.Main.Operation.BackupHandler.RunAsync (System.String sources, Duplicati.Library.Utility.IFilter filter, System.Threading.CancellationToken token) [0x01048] in :0
at CoCoL.ChannelExtensions.WaitForTaskOrThrow (System.Threading.Tasks.Task task) [0x00050] in <9a758ff4db6c48d6b3d4d0e5c2adf6d1>:0
at Duplicati.Library.Main.Operation.BackupHandler.Run (System.String sources, Duplicati.Library.Utility.IFilter filter, System.Threading.CancellationToken token) [0x00009] in :0
at Duplicati.Library.Main.Controller+<>c__DisplayClass14_0.b__0 (Duplicati.Library.Main.BackupResults result) [0x0004b] in :0
at Duplicati.Library.Main.Controller.RunAction[T] (T result, System.String& paths, Duplicati.Library.Utility.IFilter& filter, System.Action`1[T] method) [0x0026f] in :0
at Duplicati.Library.Main.Controller.Backup (System.String inputsources, Duplicati.Library.Utility.IFilter filter) [0x00074] in :0
at Duplicati.Server.Runner.Run (Duplicati.Server.Runner+IRunnerData data, System.Boolean fromQueue) [0x00349] in <156011ea63b34859b4073abdbf0b1573>:0

Though I’ve no clue what any of this means other than the top line “Disk full” message

That’s the thing, I’m not using so much /tmp space. I’ve carefully monitored memory and /tmp/ usage right up to the moment the error appears and it’s never even close to using up all the space available.

I restarted the backup, it’s been running all night and so far - no errors. What I changed is:
asynchronous-concurrent-upload-limit, from 4 to 2
asynchronous-upload-limit, from 4 to 2
concurrency-compressors, from 6 to 2

Right now duplicati uses around 8 gb:

COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
mono-sgen 45943 1000 15r REG 0,41 4294330109 19612 /tmp/dup-9f83e287-36f6-4d88-9cac-aa2e1923f261
mono-sgen 45943 1000 19w REG 0,41 525897799 19736 /tmp/dup-96989d31-8697-4f0f-b172-3ab3d3980071
mono-sgen 45943 1000 21w REG 0,41 4258396500 19592 /tmp/dup-23829714-7aca-4e28-aeca-c0238653eeb2
mono-sgen 45943 1000 23w REG 0,41 157 14703 /tmp/dup-e0c1bcb5-1606-405a-9c2e-cba515cdb800
mono-sgen 45943 1000 24u REG 0,41 0 14704 /tmp/dup-5e7297e2-a773-4c7f-b7ca-e3af1a3de5e7
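A quick way to check the "around 8 gb" figure is to add up the SIZE/OFF column. The Python sketch below does that for the listing above. It assumes lsof's default column layout as shown, and the helper name is made up for illustration.

```python
LSOF_OUTPUT = """\
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
mono-sgen 45943 1000 15r REG 0,41 4294330109 19612 /tmp/dup-9f83e287-36f6-4d88-9cac-aa2e1923f261
mono-sgen 45943 1000 19w REG 0,41 525897799 19736 /tmp/dup-96989d31-8697-4f0f-b172-3ab3d3980071
mono-sgen 45943 1000 21w REG 0,41 4258396500 19592 /tmp/dup-23829714-7aca-4e28-aeca-c0238653eeb2
mono-sgen 45943 1000 23w REG 0,41 157 14703 /tmp/dup-e0c1bcb5-1606-405a-9c2e-cba515cdb800
mono-sgen 45943 1000 24u REG 0,41 0 14704 /tmp/dup-5e7297e2-a773-4c7f-b7ca-e3af1a3de5e7
"""


def total_open_bytes(lsof_text, prefix="/tmp/dup-"):
    """Sum SIZE/OFF (7th column) for regular files whose NAME starts with prefix."""
    total = 0
    for line in lsof_text.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a regular dup-* file.
        if len(fields) >= 9 and fields[4] == "REG" and fields[8].startswith(prefix):
            total += int(fields[6])
    return total


total = total_open_bytes(LSOF_OUTPUT)
print(f"{total} bytes = {total / 2**30:.2f} GiB")
```

This only counts files that mono currently holds open. Finished volumes sitting closed in the upload queue would not show up in lsof, so the real /tmp usage can be higher than this sum.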

The downside is that my upload speed is now half of what it was, around 25 MB/s.

I can understand how these options force Duplicati to use less /tmp/ storage, but why am I somehow prevented from fully using all my available /tmp/ space and RAM? If 2 concurrent threads use 8 GB of storage space in /tmp/, I should be able to use 10 times as many if I want to, since I’ve got 96 GB of RAM. And yet something prevents me from using more than 2 threads, hence my suspicion that it’s a configurable limitation in mono.

ts678 · January 7, 2023, 1:44pm

The line you quoted was not for the original issue but for the space test stress of the previous quote, because it might have stopped your system (assuming deadlock only is a risk under memory stress).

It seemed like it would be worth warning about the risk of the suggestion.

Does that file still exist? If Duplicati didn’t clean it up, how big is it and what’s inside, e.g. using less?

How the backup process works refers to a filelist.json which eventually goes into a dlist file to describe source files from some backup version. The Fileset term also means a set of source files.

https://github.com/duplicati/duplicati/blob/666b2281032460254839fdc3b6e1055fdf7ce1db/Duplicati/Library/Main/Volumes/FilesetVolumeWriter.cs#L153-L174

    private void AddFilelistFile()
    {
        m_writer.WriteEndArray();
        m_writer.Flush();
        m_streamwriter.Flush();

        try
        {
            using (Stream sr = m_compression.CreateFile(FILELIST, CompressionHint.Compressible, DateTime.UtcNow))
            {
                m_tempStream.Seek(0, SeekOrigin.Begin);
                m_tempStream.CopyTo(sr);
                sr.Flush();
            }
        }
        finally
        {
            m_writer.Close();
            m_streamwriter.Dispose();
            m_streamwriter = null;

This file has been truncated.

That code is (I think – any C# developers care to look?) trying to copy a /tmp/dup-* file into the filelist.json that’s inside a .zip file when the mono library gets told (presumably by Linux) that the disk became full.

Below that, one could strace to look for the system call that reported the fill, but that’s deep looking…

Because a copy is happening, there might be two copies there at once, which adds to the space stress. How large is a typical dlist file in your destination? If small, having several might not add much stress.

I’m still interested in /tmp/dup-* file behavior and leftovers. Maybe even poll ls -lhrt /tmp/dup*.
Poll speed for this and df depends on how fast you think a file copy could maybe temporarily fill /tmp.
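If polling by hand gets tedious, something like the following Python sketch could log both numbers together. It is a rough illustration, not a Duplicati tool: the function names, sample count and interval are all made up, and it only sees files that still have a name, so deleted-but-open files will show up as lost free space without a matching dup-* entry.

```python
import glob
import os
import shutil
import tempfile
import time


def snapshot(tmp_dir):
    """Return (free_bytes, dup_file_count, dup_file_bytes) for tmp_dir."""
    free = shutil.disk_usage(tmp_dir).free
    count = 0
    used = 0
    for path in glob.glob(os.path.join(tmp_dir, "dup-*")):
        try:
            used += os.path.getsize(path)
            count += 1
        except OSError:
            pass  # file vanished between listing and stat; likely during a backup
    return free, count, used


def poll(tmp_dir, samples, interval):
    """Take `samples` snapshots `interval` seconds apart; return the lowest free space seen."""
    lowest = None
    for _ in range(samples):
        free, count, used = snapshot(tmp_dir)
        print(f"{time.strftime('%H:%M:%S')} free={free / 2**30:.2f} GiB "
              f"dup files={count} ({used / 2**30:.2f} GiB)")
        lowest = free if lowest is None else min(lowest, free)
        time.sleep(interval)
    return lowest


if __name__ == "__main__":
    # Short demo run; for a real backup point it at /tmp and use many more samples.
    poll(tempfile.gettempdir(), samples=3, interval=0.5)
```

The lowest free value is the interesting one, since a brief dip to near zero is easy to miss when watching df by eye.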

V_V · January 8, 2023, 10:45am

Once again I let it run overnight, this time raising the number of threads just to reproduce the error. And this time it’s more legit, as /tmp/ now “only” has 8 GB of free space left.

asynchronous-concurrent-upload-limit: 3
asynchronous-upload-limit: 3
concurrency-compressors: 3

COMMAND    PID     USER   FD   TYPE DEVICE   SIZE/OFF   NODE NAME
mono-sgen 3251     1000   17r   REG   0,40 4294942492 324547 /tmp/dup-6a7054a5-81cf-44a5-81f6-a80d4091987a
mono-sgen 3251     1000   18r   REG   0,40 4294189957 324524 /tmp/dup-302e2abd-2ed3-42a1-8a9e-3f5dc44ca4b6
mono-sgen 3251     1000   20u   REG   0,40 1071972620 324568 /tmp/dup-a7018257-bfb4-43a3-8dbc-0197262731ab
mono-sgen 3251     1000   21w   REG   0,40        157 323826 /tmp/dup-ffa39589-be7e-4169-8438-1be84388b24a
mono-sgen 3251     1000   22u   REG   0,40          0 323827 /tmp/dup-39cae202-85a4-44a1-8323-0ab6729d2317
mono-sgen 3251     1000   24u   REG   0,40 3154182412 324543 /tmp/dup-e192cb67-b8f0-40a2-a922-5574ec607176
mono-sgen 3251     1000   26r   REG   0,40 4294263693 324179 /tmp/dup-f21461d1-3b45-4a31-b8b6-fd330fcf411b
mono-sgen 3251     1000   28r   REG   0,40 4293997645 324412 /tmp/dup-4050d319-50cc-4174-ac5a-94531fff93ec

Maybe I’m not doing the math correctly? How do you calculate the theoretical memory usage when the remote volume size is set to 4 gb with the number of threads set to 3, 3 and 3?

What’s new compared to yesterday is that the backup still continues despite the disk full error. Now it’s using 42 gb out of 48 in /tmp/, perhaps I managed to time it perfectly this time to watch it crash completely within a few minutes? I think you were right in that it only fills up for mere fractions of a second and then deletes all the dup-files making space available again.

Thank you @ts678 for all your help and support, it’s VERY appreciated and I realize my issue is far more complicated than what most others would indulge themselves in trying to solve.

ts678 · January 8, 2023, 8:26pm

So the remote volume size is 4 GB, up from the default 50 MB. That’s where a lot of /tmp space is going…

What threw me off was your earlier dblock-size = 100 MByte, because a dblock is a remote volume. If you set both, apparently the GUI remote volume size rules.
Choosing sizes in Duplicati section Increasing the Remote Volume Size talks about the implications.

Duplicati is so far removed from that that it’s probably not possible. It’s all automatic as mono wishes.

Since tmpfs lives completely in the page cache and on swap

according to the kernel doc, there’s a very fuzzy line between memory and drive that I hope doesn’t worsen the puzzle here, but if you mean /tmp usage, I think it’s supposed to be as in Duplicati docs.

Earlier on, math was simpler because there was not a concurrent uploader, and just one queue per:

  --asynchronous-upload-limit (Integer): The number of volumes to create ahead
    of time
    When performing asynchronous uploads, Duplicati will create volumes that
    can be uploaded. To prevent Duplicati from generating too many volumes,
    this option limits the number of pending uploads. Set to zero to disable
    the limit
    * default value: 4

The initial block collection is an unencrypted .zip file, and encryption is done as a separate step.
I’m not sure if the queue is encrypted or not, but you can run file on any dup-* to get a guess.

Later came concurrent uploads, adding upload speed, and muddying the queue question somewhat:

  --asynchronous-concurrent-upload-limit (Integer): The number of concurrent
    uploads allowed
    When performing asynchronous uploads, the maximum number of concurrent
    uploads allowed. Set to zero to disable the limit.
    * default value: 4

I certainly hope each of the uploaders doesn’t have its own queue of default size 4. You can test it.
This is why I had suggested ls -lhrt /tmp/dup*. You can watch future dblocks flowing through.

  --concurrency-compressors (Integer): Specify the number of concurrent
    compression processes
    Use this option to set the number of processes that perform compression of
    output data.
    * default value: 2

https://github.com/duplicati/duplicati/blob/666b2281032460254839fdc3b6e1055fdf7ce1db/Duplicati/Library/Main/Operation/BackupHandler.cs#L209-L213

    // Spawn additional compressors
    .Union(
        Enumerable.Range(0, options.ConcurrencyCompressors - 1).Select(x =>
            Backup.DataBlockProcessor.Run(database, options, taskreader))
    )

https://github.com/duplicati/duplicati/blob/666b2281032460254839fdc3b6e1055fdf7ce1db/Duplicati/Library/Main/Operation/Backup/DataBlockProcessor.cs#L26-L31

    /// <summary>
    /// This class receives data blocks, registers then in the database.
    /// New blocks are added to a compressed archive and sent
    /// to the uploader
    /// </summary>
    internal static class DataBlockProcessor

It “looks” like AsynchronousUploadLimit is handled by the backend code, which is after the above.

One more wrinkle you might run into (and even take advantage of…) is asynchronous-upload-folder

The pre-generated volumes will be placed into the temporary folder by default, this option can set a different folder for placing the temporary volumes, despite the name, this also works for synchronous runs.

which is another control beyond tempdir to either help in your analysis, or maybe help space usage.

I’m pretty sure the upload is done from the encrypted version of the dblock file. I’m not sure when the unencrypted version is deleted though, but you might have both at once (encrypted a little bit larger). There won’t be any tmp file with a dblock name. That name is set by giving the name to the remote.

file can probably identify .zip files. Programs like less can get clues too, usually near the file start.
If you really want to count and trace files, have at it, and please tell us what files are flowing where…
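As a scripted stand-in for running file on each one, a Python sketch like this could sort the dup-* files by their first few bytes. The .zip signature is standard. The "AES" prefix assumes the encrypted copies are written in AES Crypt container format, which has not been verified in this thread, and the function name is made up.

```python
import glob
import os
import tempfile

ZIP_MAGIC = b"PK\x03\x04"  # local file header at the start of a non-empty .zip
AES_MAGIC = b"AES"         # AES Crypt container header (assumed encryption format)


def guess_kind(path):
    """Classify a temp file by its first bytes: 'zip', 'aescrypt', 'empty' or 'unknown'."""
    with open(path, "rb") as handle:
        head = handle.read(4)
    if not head:
        return "empty"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    if head.startswith(AES_MAGIC):
        return "aescrypt"
    return "unknown"


if __name__ == "__main__":
    for path in sorted(glob.glob(os.path.join(tempfile.gettempdir(), "dup-*"))):
        try:
            print(path, guess_kind(path))
        except OSError as error:  # file vanished or is not readable
            print(path, "unreadable:", error)
```

Counting how many files fall into each bucket at a given moment would hint at whether unencrypted and encrypted copies of the same volume coexist.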

Alternatively, using smaller remote volumes would probably go a long way towards solving space lack.
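To put rough numbers on that, here is a back-of-the-envelope Python sketch. The model is an assumption, not something taken from Duplicati's source: it supposes each compressor is building one volume, up to asynchronous-upload-limit finished volumes wait in the queue, and each concurrent upload holds one more. The function name is made up. The 4 GB volume size and 48 GB /tmp come from this thread and are treated as GiB here.

```python
GIB = 2**30
MIB = 2**20


def estimate_tmp_bytes(volume_bytes, compressors, upload_queue, concurrent_uploads):
    """Rough guess at peak /tmp use: one volume per compressor, queue slot and upload.

    Assumed model only; encrypted copies that briefly coexist with their
    unencrypted originals would push the real peak higher.
    """
    return volume_bytes * (compressors + upload_queue + concurrent_uploads)


TMP_SIZE = 48 * GIB  # size of /tmp reported in the thread

# Order: concurrency-compressors, asynchronous-upload-limit,
# asynchronous-concurrent-upload-limit
for label, settings in [("2/2/2", (2, 2, 2)), ("3/3/3", (3, 3, 3)), ("6/4/4", (6, 4, 4))]:
    need = estimate_tmp_bytes(4 * GIB, *settings)
    verdict = "fits" if need <= TMP_SIZE else "does not fit"
    print(f"4 GiB volumes, {label}: ~{need / GIB:.0f} GiB, {verdict} in 48 GiB")

need_default = estimate_tmp_bytes(50 * MIB, 6, 4, 4)
print(f"50 MiB volumes, 6/4/4: ~{need_default / GIB:.2f} GiB")
```

Under that assumption, 2/2/2 needs about 24 GiB and 3/3/3 about 36 GiB, while 6 compressors with 4/4 would need about 56 GiB, more than the whole /tmp. The 42 GB observed at 3/3/3 is above the 36 GiB estimate, so the model is evidently incomplete (coexisting encrypted copies are one candidate), but it does suggest why dropping the volume size helps far more than trimming thread counts.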