I know S3 has versioning. Unfortunately, I cannot change any option of the storage.
The second backup run stops almost immediately with an error:
Found remote files reported as duplicates, either the backend module is broken or you need to manually remove the extra copies.
The following files were found multiple times:
Is there any way to ignore duplicates or to avoid creating them?
Can you say anything more about the WebDAV implementation, e.g. its name or how it connects to S3? According to some sources, S3 doesn’t do WebDAV by itself. Whatever you use, does it do file logging?
Without that, logging could be done at the Duplicati level, e.g. `--log-file` and `--log-file-log-level=Retry` might be good for looking at this, assuming it’s reproducible. A reliable test case is the key to having hopes for a fix.
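For example, a run with those options might look like this (a sketch only: the storage URL and paths are made-up placeholders, so adjust them to your setup):

```shell
# Hypothetical invocation: URL and source path are placeholders.
duplicati-cli backup "webdavs://example.com/backup" /home/data \
  --log-file=/tmp/duplicati.log \
  --log-file-log-level=Retry
```

At the `Retry` level, failed and retried uploads show up in the log file, which is what you’d want to correlate with any duplicate names on the remote side.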
You can get a rough idea of whether you’re getting retries (maybe done improperly) from the backup job log, under “RetryAttempts” in “BackendStatistics”. You should look through the runs BEFORE the error is noticed. Your case might be easy, because it sounds like the first run perhaps left damage that the second run found.
Do you have a fine-grained timestamp (e.g. down to the second) on duplicates? How far apart are they?
Do you have sizes? Are the sizes the same? How about the file content? That’s the issue with ignoring duplicates: how do you know the one you ignored wasn’t the only one that was good? One “could” test that, however figuring out where the duplicates come from, and how to keep them from happening, seems better.
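As a starting point, duplicate names in a remote listing are easy to spot with standard tools. A minimal sketch, assuming you can export the listing as one name per line (the file names below are made up for illustration):

```shell
# Sample listing standing in for the real remote file list,
# one name per line; duplicati-b2.dblock.zip.aes appears twice.
cat > /tmp/listing.txt <<'EOF'
duplicati-b1.dblock.zip.aes
duplicati-b2.dblock.zip.aes
duplicati-b2.dblock.zip.aes
duplicati-i1.dindex.zip.aes
EOF

# Print each duplicated name once, prefixed with its count.
sort /tmp/listing.txt | uniq -cd
```

From there you can check the sizes and timestamps of just the names this reports, instead of eyeballing the whole listing.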
I started investigating. It looks like the WebDAV implementation is very bad. A cloned file is displayed only once in the web UI. If I move a file that shows up on two lines over WebDAV into another folder using the website UI, I get only one name entry, not two. When I move the file back, I have two entries over WebDAV again.
Using your suggestions, I am going to investigate further, so I need a few days for that.
Some storage systems (Google Drive being a notable example) can easily store duplicate file names, however I’m not sure if your WebDAV is supposed to do that. A Google Drive name is just an attribute.
Your WebDAV supplier should say how theirs is supposed to work. Better proof would be if you have a Duplicati list of operations that shows the file put to storage only once. A retry should use a new name.
Finally, the company that owns the WebDAV server fixed the implementation. Now I use the WebDAV disk without problems. It was definitely a bug in the service’s WebDAV protocol implementation. A funny bug, actually: if I uploaded many files into one directory, every file from the 501st directory entry onward had two entries instead of one.
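That symptom is easy to picture with a toy listing. The sketch below just simulates the described behavior (an assumption about the bug, not the provider’s actual code): entries past the 500th are emitted twice, and the same `sort | uniq` check flags them:

```shell
# Build a 502-entry listing where every entry after the 500th
# is (wrongly) listed twice, mimicking the reported bug.
for i in $(seq 1 502); do
  echo "file-$i"
  [ "$i" -gt 500 ] && echo "file-$i"
done > /tmp/buggy-listing.txt

# Report the duplicated names: file-501 and file-502.
sort /tmp/buggy-listing.txt | uniq -d
```

With only 500 or fewer files per directory, the same check would report nothing, which matches why the problem only appeared on larger backups.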
Now it’s fixed, Duplicati works fine!
Thanks!