Based on what I am seeing with my backups recently, it seems to me that Duplicati does not like datasets consisting of over 1 million files, and/or files with strange names such as extremely long strings of randomised numbers and letters that happen to look like the files Duplicati itself creates on the remote backend.
I had exactly such datasets, and as a result Duplicati either ran slowly or simply crashed out badly; we are talking about error messages that exceed the character limit of any post here! I decided to zip up some of the folders and Duplicati then worked smoothly (except that I lost some of its deduplication benefits, which is offset by the upfront space savings from zipping the folders).
Are you all having similar issues with such datasets, or rather, are you having issues even though your datasets are not like the ones described?
I’m not sure what the comment about “ok” means. I don’t know their rules, but I’ve seen people post.
Public posting is often good enough for anybody interested enough to take a look at what’s wrong.
Redacting private information is good. Most of it is obvious, but I’ve seen people have accidents before.
Generally the most useful part of the message is at the top, with the details of the situation below it.
Was there a filename in there that’s relevant? I couldn’t spot one, but two of those were indeed extremely large.
Maybe you can describe “containing strange names such as extremely long strings of randomised numbers and letters” a bit further. There’s an OS limit (what OS is this?) on both the length between slashes and the total path length. Maybe if a reproducible test case can be found, it can be looked at. Also, does the failure happen with or without the GUI Commandline?
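For reference, here is a rough sketch (Python, with a placeholder folder and assumed thresholds) of how one could scan a source tree for names that bump into typical limits; ext4 caps each name at 255 bytes and the whole path around 4096 bytes, while Windows traditionally caps full paths at 260 characters:

```python
import os

SOURCE = "/path/to/backup/source"   # placeholder: point this at the real source folder
NAME_LIMIT = 255                    # per-name limit in bytes (typical for ext4)
PATH_LIMIT = 4096                   # total path limit in bytes (typical for Linux)

for root, dirs, files in os.walk(SOURCE):
    for name in dirs + files:
        full = os.path.join(root, name)
        if len(name.encode("utf-8")) > NAME_LIMIT:
            print("name too long:", full)
        if len(full.encode("utf-8")) > PATH_LIMIT:
            print("full path too long:", full)
```

Anything it prints would be a good candidate for a minimal reproducible test.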
Sorry, I forgot to omit something from that filename. I think that’s the problem; that line is the full path of the file, and each backslash is a folder. So to break it down:
Thanks for testing! This is quite the conundrum. I am quite unsure what else could be the trigger other than possibly a lack of temporary storage, but I can’t recall whether I set an HDD/SSD as the temp storage instead of using /tmp/
In addition to disk space, are you monitoring for memory exhaustion? For example, you could run top.
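If you want something more hands-off than watching top, a small sketch like this (assuming the third-party psutil package is installed and that /tmp really is the temp location) could log free RAM and temp space while a backup runs:

```python
import time
import psutil  # third-party: pip install psutil

TEMP_PATH = "/tmp"        # assumption: change this if Duplicati's temp folder was moved
INTERVAL_SECONDS = 30

while True:
    mem = psutil.virtual_memory()
    disk = psutil.disk_usage(TEMP_PATH)
    print(f"free RAM: {mem.available / 2**20:.0f} MiB, "
          f"free temp space: {disk.free / 2**20:.0f} MiB")
    time.sleep(INTERVAL_SECONDS)
```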
You could possibly see if fewer files with strange names work better. My testing used only a single file.
Duplicati keeps every file path of every backup version in its database. This can slow things down over time. Removing older versions using a retention policy is one way to keep the database from growing forever.
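From memory, the command-line options for this look roughly like the following; double-check the option names and values against the documentation for your Duplicati version before relying on them:

```
duplicati-cli backup <target-url> /path/to/source --retention-policy="1W:1D,4W:1W,12M:1M"
duplicati-cli backup <target-url> /path/to/source --keep-versions=10
```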
No problem! For the time being, I am going through all these files to see what I don’t need anyway, which will hopefully shrink the database to something more manageable. I was backing them up to Gsuite via Duplicati first in case I accidentally deleted something that I needed.
In the meantime, zipping up these files is a good strategy, seeing how it helped me work around this issue and also reduced disk usage on both the server and client sides.
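For anyone wanting to automate that workaround, here is a minimal sketch (Python, with placeholder paths) that zips each subfolder of a dataset before it gets handed to Duplicati:

```python
import os
import shutil

PARENT = "/path/to/dataset"         # placeholder: folder whose subfolders should be zipped
OUTPUT = "/path/to/zipped-dataset"  # placeholder: where the .zip files should land

os.makedirs(OUTPUT, exist_ok=True)
for entry in os.listdir(PARENT):
    folder = os.path.join(PARENT, entry)
    if os.path.isdir(folder):
        # make_archive takes the archive path without extension, the format, and the root dir
        shutil.make_archive(os.path.join(OUTPUT, entry), "zip", folder)
```

One .zip per subfolder keeps the file count (and the database) small without turning the whole dataset into a single huge archive.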