When I start a backup, it runs for 10 to 15 minutes before “something” happens which leads to the docker hypervisor going down, taking all other services running on that docker host with it.
It seems like every time, shortly before shutting down (crashing?), the kernel log of the docker host shows:
br-b8f6e33cb802: port 1(veth81ceb62) entered disabled state
vethedc97a1: renamed from eth0
br-b8f6e33cb802: port 1(veth81ceb62) entered disabled state
device veth81ceb62 left promiscuous mode
br-b8f6e33cb802: port 1(veth81ceb62) entered disabled state
But I’m not really sure if it is related.
The source and target folders for the backup each reside on a nfs share on different hos
I don’t use Docker, but I’d suggest trying it with this official image to see if anything changes. Not sure who lscr.io is, but it seems to be down at the moment.
For reference, this appears to be the image’s page: Docker
I DO use Docker (on a Synology NAS) but I am using the official Docker image.
I poked around looking for anything that might sound like your issue, but couldn’t find anything and have never experienced anything like that before. You could try opening an issue on GitHub if you want to pursue using the linuxserver.io version: Issues · linuxserver/docker-duplicati · GitHub
Do any files manage to get uploaded over NFS from this run? Is it the initial backup or an update?
Duplicati is just scanning source files, packaging blocks that are not yet in the backup, and writing files.
About → Show log → Live → Verbose will show both parts. What happens with a smaller backup?
Watching some performance monitoring tool on the host might help. Maybe something exhausted?
You can also probably poke NFS (if that’s a suspect) from inside the container to see if it can break.
If you know your Duplicati files, you can run mono Duplicati.CommandLine.BackendTester.exe
giving it a URL from Export As Command-line then modified to an empty folder to use for some test.
After getting a better idea of what provokes the crash, maybe try Internet search to get further ideas.
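For the host-monitoring suggestion above, here is a minimal sketch of watching for memory exhaustion on a Linux host, assuming `/proc/meminfo` is readable there. The function names `parse_meminfo`, `available_fraction`, and `sample_once` are made up for illustration, and memory is only one of several resources that could be running out.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict mapping field name to its number.

    Most fields are reported in kiB; a few (e.g. HugePages_Total) are plain counts.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Name: value" shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def available_fraction(meminfo):
    """Fraction of total RAM that the kernel estimates is still available."""
    return meminfo["MemAvailable"] / meminfo["MemTotal"]


def sample_once(path="/proc/meminfo"):
    """Read the live file once and return the available fraction (Linux only)."""
    with open(path) as f:
        return available_fraction(parse_meminfo(f.read()))
```

Calling `sample_once()` every few seconds while the backup runs, on both the Docker host and whatever hosts it, would show whether available memory trends toward zero before the crash.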
I’ve switched to the official image duplicati/duplicati as you suggested but the problem persisted.
Also, I couldn’t find anything suspicious in the Duplicati log as suggested by @ts678.
So I did some more testing and was able to reproduce a crash by just using rsync to copy between those shares, so duplicati has to be innocent and I had to look elsewhere.
I previously checked the docker machine’s logs, the docker host’s logs, and the hypervisor’s libvirt logs, but I missed checking the hypervisor’s kernel log.
Turns out @ts678 was right with his suspicion: I had over-committed memory, and the hypervisor randomly killed VM processes.
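For anyone hitting something similar, here is a rough sketch of scanning kernel log text (for example, saved output of `dmesg` or `journalctl -k` on the hypervisor) for signs of the Linux OOM killer. The patterns are based on the kernel's usual "invoked oom-killer" and "Killed process" messages, whose exact wording varies between kernel versions, so treat them as an assumption. The function names are hypothetical, and this is not the poster's actual log.

```python
import re

# Assumed message shapes; the wording differs slightly across kernel versions.
OOM_INVOKED = re.compile(r"invoked oom-killer")
OOM_KILLED = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_lines(log_text):
    """Return every log line that mentions the OOM killer being invoked or killing a process."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if OOM_INVOKED.search(line) or OOM_KILLED.search(line)
    ]


def killed_processes(log_text):
    """Return (pid, process name) pairs for each 'Killed process' message."""
    return [(int(m.group(1)), m.group(2)) for m in OOM_KILLED.finditer(log_text)]
```

If `killed_processes` reports the VM's own process (often a `qemu` process under libvirt), the "crash" is the hypervisor reclaiming memory rather than anything the backup tool did.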