Duplicati crashes and disappears

Some time back I had a problem where Duplicati crashed and disappeared. That time it was because one Duplicati file on the backup storage Minio server wasn’t readable due to disk corruption. Now I have the same crash again, but I’ve verified that all files stored on the Minio server are readable.

Three other backup jobs storing their Duplicati files on the exact same Minio server work without a hitch.

Duplicati 2.0.4.23_beta_2019-07-14 running on Windows Server 2016

I do get this email below, just like earlier! So the server manages to send this out before it goes down.

Any idea where to start??

Failed: A WebException with status ConnectFailure was thrown.
Details: Amazon.Runtime.AmazonServiceException: A WebException with status ConnectFailure was thrown. ---> System.Net.WebException: Unable to connect to the remote server ---> System.Net.Sockets.SocketException: No connection could be made because the target machine actively refused it 192.168.13.220:31210
   at System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress)
   at System.Net.ServicePoint.ConnectSocketInternal(Boolean connectFailure, Socket s4, Socket s6, Socket& socket, IPAddress& address, ConnectSocketState state, IAsyncResult asyncResult, Exception& exception)
   --- End of inner exception stack trace ---
   at System.Net.HttpWebRequest.GetResponse()
   at Amazon.Runtime.Internal.HttpRequest.GetResponse()
   at Amazon.Runtime.Internal.HttpHandler`1.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.RedirectHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.Unmarshaller.InvokeSync(IExecutionContext executionContext)
   at Amazon.S3.Internal.AmazonS3ResponseHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorHandler.InvokeSync(IExecutionContext executionContext)
   --- End of inner exception stack trace ---
   at Duplicati.Library.Main.AsyncDownloader.AsyncDownloaderEnumerator.AsyncDownloadedFile.get_TempFile()
   at Duplicati.Library.Main.Operation.CompactHandler.DoCompact(LocalDeleteDatabase db, Boolean hasVerifiedBackend, IDbTransaction& transaction, BackendManager sharedBackend)
   at Duplicati.Library.Main.Operation.DeleteHandler.DoRun(LocalDeleteDatabase db, IDbTransaction& transaction, Boolean hasVerifiedBacked, Boolean forceCompact, BackendManager sharedManager)
   at Duplicati.Library.Main.Operation.BackupHandler.CompactIfRequired(BackendManager backend, Int64 lastVolumeSize)
   at Duplicati.Library.Main.Operation.BackupHandler.<RunAsync>d__19.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at CoCoL.ChannelExtensions.WaitForTaskOrThrow(Task task)
   at Duplicati.Library.Main.Controller.<>c__DisplayClass13_0.<Backup>b__0(BackupResults result)
   at Duplicati.Library.Main.Controller.RunAction[T](T result, String[]& paths, IFilter& filter, Action`1 method)

Log data:
2019-08-10 21:25:36 +02 - [Error-Duplicati.Library.Main.Operation.BackupHandler-FatalError]: Fatal error
Amazon.Runtime.AmazonServiceException: A WebException with status ConnectFailure was thrown. ---> System.Net.WebException: Unable to connect to the remote server ---> System.Net.Sockets.SocketException: No connection could be made because the target machine actively refused it 192.168.13.220:31210
   at System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress)
   at System.Net.ServicePoint.ConnectSocketInternal(Boolean connectFailure, Socket s4, Socket s6, Socket& socket, IPAddress& address, ConnectSocketState state, IAsyncResult asyncResult, Exception& exception)
   --- End of inner exception stack trace ---
   at System.Net.HttpWebRequest.GetResponse()
   at Amazon.Runtime.Internal.HttpRequest.GetResponse()
   at Amazon.Runtime.Internal.HttpHandler`1.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.RedirectHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.Unmarshaller.InvokeSync(IExecutionContext executionContext)
   at Amazon.S3.Internal.AmazonS3ResponseHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorHandler.InvokeSync(IExecutionContext executionContext)
   --- End of inner exception stack trace ---
   at Duplicati.Library.Main.AsyncDownloader.AsyncDownloaderEnumerator.AsyncDownloadedFile.get_TempFile()
   at Duplicati.Library.Main.Operation.CompactHandler.DoCompact(LocalDeleteDatabase db, Boolean hasVerifiedBackend, IDbTransaction& transaction, BackendManager sharedBackend)
   at Duplicati.Library.Main.Operation.DeleteHandler.DoRun(LocalDeleteDatabase db, IDbTransaction& transaction, Boolean hasVerifiedBacked, Boolean forceCompact, BackendManager sharedManager)
   at Duplicati.Library.Main.Operation.BackupHandler.CompactIfRequired(BackendManager backend, Int64 lastVolumeSize)
   at Duplicati.Library.Main.Operation.BackupHandler.<RunAsync>d__19.MoveNext()

https://www.google.com/search?q=“AmazonServiceException%3A+A+WebException+with+status+ConnectFailure”

I haven’t read through all of the results, but at least one asked for network traces, which is what I’d do, partly because I’ve had some practice. Some of the names shown in the stack trace are probably internal, but one API is documented here:

Socket.Connect Method

which basically suggests that a TCP connection could not be established. Was this a temporary or permanent issue?

Running netstat can give you an idea of your TCP interaction with Minio. Depending on the remote OS, there might be fancier tools, but there most likely will at least be netstat (linking to the version on Linux).

The telnet program available on Windows, Linux, etc. can be used to test connections, e.g. to port 443. Servers can have limits on how many simultaneous connections they take, or how fast they take them.
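If telnet isn’t installed, the same check can be scripted. Below is a minimal Python sketch (the function name `probe` is made up for illustration) that attempts one TCP connection and reports whether it was accepted, actively refused, or timed out. The distinction matters here because the error in the email is of the “actively refused” kind.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a single TCP connection and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <details>".
    Nothing is sent to the server; the socket is closed right away.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The target answered, but nothing accepted the connection
        # (typically a TCP RST): the "actively refused" case.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. no route to host or a DNS failure.
        return f"error: {exc}"
```

For the setup in this thread the call would be something like `probe("192.168.13.220", 31210)`, run from the Windows machine that runs Duplicati.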

TCP connection backlog - a struggling server gets into some of this from a netstat or packet viewpoint.

Back to simpler things, does Minio keep any logs to reveal what might have been happening at failure?

I would assume either the IP or port number is wrong, or the Minio service is not running.

In either case it’s a bug that it causes Duplicati to crash - it should definitely handle this more gracefully!

Got to agree with that. If this isn’t reproducible, maybe whoever takes it on could dummy up the exception?
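To illustrate what “more gracefully” could look like, here is a generic retry-with-backoff sketch in Python. It is not Duplicati’s code (Duplicati is written in C#), and the names `with_retries`, `attempts`, and `base_delay` are hypothetical. The idea is simply that a refused connection is treated as possibly transient and retried a few times before the error is passed on to the caller instead of taking the process down.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    operation: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation(), retrying on connection errors with exponential backoff.

    ConnectionError covers refused, reset, and aborted connections.
    After the last attempt the original exception is re-raised so the
    caller can report it in a controlled way.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise
            # Wait 1x, 2x, 4x, ... base_delay between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable, so anyone who wants to dummy up the exception can test the behaviour without actually waiting.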

Thanks for quoting further. I found my word wrap wasn’t on, so I missed it. “actively refused” may mean an RST or something other than the usual TCP three-way handshake. You can telnet to a given port to see if a connection can be made. Don’t type anything to the server; just exit however your client exits, e.g. on Linux, Control-] then quit.

Unfortunately I don’t know the S3 protocol well enough to know if port 31210 is anything special. A web search doesn’t pick up much, and some sites specialize in documenting special ports (which are generally lower). Sometimes ports are dynamically assigned, even in protocols as ancient as FTP. One side tells the other where to connect, then hopefully sets up to receive the connection before the other side tries to make it…

Actually no. The Minio server is up and replying all the time. I have another similar Duplicati job running on the same computer backing up to the same Minio server; it’s just a different set of files. That one works flawlessly. And two more Duplicati jobs on a totally different computer back up to the exact same Minio server without any problems ever. So out of the four jobs, the only difference being four different sets of files, one crashes Duplicati and the other three work fine.

EDIT: added a comment below. It might be that Minio isn’t replying for a few seconds while it’s waiting for, for example, a disk read error to time out… or something.

Good idea!!! Will check. Last time this problem scenario occurred, it was due to a disk error: when you tried to open one particular file (a Duplicati file), you just got an error as if it were protected by user rights. But it was just not possible to open the file regardless of rights, and regardless of what tool you tried to open it with, Linux command line tools or whatever. So it’ll be interesting to see if the Minio logs show what kind of problem Minio is hitting this time. Still very, very odd that Duplicati crashed last time due to a “can’t open file” error on the disk Minio was using. The Duplicati error message that time also said it was not able to connect. Which might be true, come to think of it now: Minio might hang for a short time if it’s waiting for a disk read to time out, and maybe not reply to network calls during that time.

That port number is my own choice, it’s the port Minio is using on my server.

It was just my initial assumption (guess) at the problem. I saw your messages about other backups working against the same Minio back end, but I was thinking that possibly there was a typo in the config on this particular backup job.

I don’t believe so. The error “actively refused it” indicates the connection was terminated during the initial handshake - not that the connection timed out due to some sort of latency.

Does Minio have a concurrent connection limit put in place? I could see a service deciding to actively refuse connections if more than a certain number of active connections currently exist.
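One rough way to check for a connection limit from the client side is to open several connections at once and see how many get through. The Python sketch below does that; the function name `count_accepted` and its defaults are my own invention. Note that it only counts successful TCP connects. A server could still accept at the TCP level and then drop the connection at the application level, which this would not detect. Only point it at a server you run yourself.

```python
import socket
from typing import List


def count_accepted(host: str, port: int, wanted: int = 20,
                   timeout: float = 3.0) -> int:
    """Open up to `wanted` simultaneous TCP connections and count successes.

    Connections are held open until the count is finished, so a limit on
    concurrent connections (if there is one) has a chance to show up.
    Stops at the first refused, timed-out, or otherwise failed attempt.
    """
    held: List[socket.socket] = []
    try:
        for _ in range(wanted):
            try:
                held.append(
                    socket.create_connection((host, port), timeout=timeout)
                )
            except OSError:
                break
        return len(held)
    finally:
        # Always release the connections we are holding.
        for conn in held:
            conn.close()
```

If `count_accepted("192.168.13.220", 31210)` comes back lower than `wanted`, that would at least hint at a limit or a struggling listener; a full count does not prove there is none.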

So all the separate jobs have been running fine for a long time. But then this particular job, with its file set to back up, suddenly just started crashing like this a few weeks ago.

I’ve made no change to the job or the server other than turning on the option “upload-verification-file”. But two days ago the job ran again, and this time it ran for many, many hours, staying at job status “deleting files” (can’t remember the exact phrasing, sorry). And it finished without crashing! And now it runs fine every time I run it (so far…)

Does that give anyone any idea? I’m currently happy so I’ll not dig more into this. If anyone wants me to test something to try to understand the crashing just give me a holler!

How many computers do you have backing up to this Minio back end?

Also, is the back end on the same LAN as the computer you are backing up? Or is it going over the internet?

I have two computers I’m backing up. A bunch of virtual machine disks on one of them, work documents and a lot of music on the other one.

I then have two identical backup storage machines that I back up everything to, so everything is backed up twice, once locally and once off site. Every backup job is therefore set up twice, once for each storage destination.

The job crashing is the virtual machine disks to the off site server. The identical job to the local backup storage server has run fine all the time. And all other jobs run fine both to local and off site storage.