I know S3 has versioning. Unfortunately, I cannot change any option of the storage.
The second backup run stops almost immediately with an error:
Found remote files reported as duplicates, either the backend module is broken or you need to manually remove the extra copies.
The following files were found multiple times:
Is there any way to ignore duplicates or to avoid creating them?
Can you say anything more about the WebDAV implementation, e.g. its name or how it connects to S3? According to some sources, S3 doesn’t do WebDAV by itself. Whatever you use, does it do file logging?
Without that, logging could be done at the Duplicati level, e.g. `--log-file` and `--log-file-log-level=Retry` might be good for looking at this, assuming it’s reproducible. A reliable test case is the key to having hopes for a fix.
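For example, a run with those options might look like this (a sketch only: the storage URL and paths are made-up placeholders, so adjust them to your setup):

```shell
# Hypothetical invocation: URL and source path are placeholders.
duplicati-cli backup "webdavs://example.com/backup" /home/data \
  --log-file=/tmp/duplicati.log \
  --log-file-log-level=Retry
```

At the `Retry` level, failed and retried uploads show up in the log file, which is what you’d want to correlate with any duplicate names on the remote side.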
You can get a rough idea of whether you’re getting retries (maybe done improperly) from the backup job log, under “RetryAttempts” in “BackendStatistics”. You should look through the runs BEFORE the error is noticed. Your case might be easy, because it sounds like the first run perhaps left damage that the second run found.
Do you have a fine-grained timestamp (e.g. down to the second) on duplicates? How far apart are they?
Do you have sizes? Are the sizes the same? How about the file content? That’s the issue with ignoring duplicates: how do you know the one you ignored wasn’t the only one that was good? One “could” test that, however figuring out where the duplicates come from, and how to keep them from happening, seems better.
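As a starting point, duplicate names in a remote listing are easy to spot with standard tools. A minimal sketch, assuming you can export the listing as one name per line (the file names below are made up for illustration):

```shell
# Sample listing standing in for the real remote file list,
# one name per line; duplicati-b2.dblock.zip.aes appears twice.
cat > /tmp/listing.txt <<'EOF'
duplicati-b1.dblock.zip.aes
duplicati-b2.dblock.zip.aes
duplicati-b2.dblock.zip.aes
duplicati-i1.dindex.zip.aes
EOF

# Print each duplicated name once, prefixed with its count.
sort /tmp/listing.txt | uniq -cd
```

From there you can check the sizes and timestamps of just the names this reports, instead of eyeballing the whole listing.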
I started investigating. It looks like the WebDAV implementation is very bad. A cloned file is displayed only once in the web UI. If I move a file that shows up on two lines over WebDAV into another folder using the website UI, I get only one name entry, not two. When I move the file back, I have two entries over WebDAV again.
Using your suggestions, I am going to investigate further, so I need a few days for that.
Some storage systems (Google Drive being a notable example) can easily store duplicate file names, however I’m not sure if your WebDAV is supposed to do that. A Google Drive name is just an attribute.
Your WebDAV supplier should say how theirs is supposed to work. Better proof would be if you have a Duplicati list of operations that shows the file put to storage only once. A retry should use a new name.
Finally, the company that owns the WebDAV server fixed the implementation. Now I use the WebDAV disk without problems. It was definitely a bug in the service’s WebDAV protocol implementation. A funny bug, actually: if I uploaded many files into one directory, every file from the 501st directory entry onward had two entries instead of one.
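That symptom is easy to picture with a toy listing. The sketch below just simulates the described behavior (an assumption about the bug, not the provider’s actual code): entries past the 500th are emitted twice, and the same `sort | uniq` check flags them:

```shell
# Build a 502-entry listing where every entry after the 500th
# is (wrongly) listed twice, mimicking the reported bug.
for i in $(seq 1 502); do
  echo "file-$i"
  [ "$i" -gt 500 ] && echo "file-$i"
done > /tmp/buggy-listing.txt

# Report the duplicated names: file-501 and file-502.
sort /tmp/buggy-listing.txt | uniq -d
```

With only 500 or fewer files per directory, the same check would report nothing, which matches why the problem only appeared on larger backups.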
Now it’s fixed, Duplicati works fine!
Thanks!