Duplicati detects spurious timestamp changes

While investigating a different issue (see this post) I noticed that Duplicati 2.0.5.1 (running on Windows 10) detects a timestamp change even when the file hasn’t been modified in days or months. See the examples below:

Feb 18, 2020 5:07 PM: Checking file for changes D:\Users\Roberto\Pictures\Trips\Malaysia\LX5\20170809 - Gaya\P1040518.xmp, new: False, timestamp changed: True, size changed: False, metadatachanged: False, 12.02.2020 20:15:03 vs 12.02.2020 20:15:03

Feb 18, 2020 5:07 PM: Checking file for changes D:\Users\Roberto\Pictures\Trips\Malaysia\LX5\20170809 - Gaya\P1040518.JPG, new: False, timestamp changed: True, size changed: False, metadatachanged: False, 11.09.2017 16:59:51 vs 11.09.2017 16:59:51

Feb 18, 2020 5:06 PM: Checking file for changes D:\Users\Roberto\Pictures\Trips\South Africa & Mauritius\D810\20190722 - Panorama Route\_DSC0436.xmp, new: False, timestamp changed: True, size changed: False, metadatachanged: False, 25.09.2019 15:13:23 vs 25.09.2019 15:13:23

This always seems to happen with the same files. Deleting and recreating the DB does not help.

This issue, combined with the fact that USN does not seem to work for some backups, makes for pointlessly long backup times.

This can make things worse because the remote has one-second resolution, whereas NTFS has 100-nanosecond resolution (other filesystems have other resolutions). So the first backup may do lots of reads, attempting to make sure that the false-alarm timestamp changes (not visible in the log due to its one-second display resolution) are truly nothing to worry about, and that file contents are as expected… What’s odd, though, is this part:
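The resolution mismatch is easy to demonstrate. This is a hypothetical sketch (not Duplicati’s actual code) of how two timestamps can differ internally while printing identically in a second-resolution log line:

```python
from datetime import datetime, timezone

# NTFS stores mtimes with 100 ns resolution; here we use microseconds
# to stand in for the sub-second part the filesystem keeps.
ntfs_mtime = datetime(2020, 2, 12, 20, 15, 3, 123456, tzinfo=timezone.utc)

# A second-resolution store (or log line) drops everything below one second.
stored_mtime = ntfs_mtime.replace(microsecond=0)

# A strict equality check flags a "change" even though the file is untouched...
print(ntfs_mtime == stored_mtime)  # False: the timestamps differ below 1 s

# ...while a log that prints only whole seconds shows identical values.
fmt = "%d.%m.%Y %H:%M:%S"
print(ntfs_mtime.strftime(fmt), "vs", stored_mtime.strftime(fmt))
# 12.02.2020 20:15:03 vs 12.02.2020 20:15:03
```

That would match the log lines above exactly: “timestamp changed: True” followed by two apparently identical times.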

Above link shows where to look in the DB for time records if you like, e.g. using DB Browser for SQLite.
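If you want to script the inspection instead of browsing by hand, something like the sketch below could dump the stored times at full resolution. The table and column names (`FilesetEntry.Lastmodified` joined to the `File` view) are assumptions about the local DB schema, which can vary by Duplicati version, so verify them in DB Browser for SQLite first; the tick conversion itself is standard .NET (100 ns intervals since 0001-01-01):

```python
import sqlite3
from datetime import datetime, timedelta

def ticks_to_datetime(ticks):
    """Convert .NET ticks (100 ns intervals since 0001-01-01) to a datetime."""
    return datetime(1, 1, 1) + timedelta(microseconds=ticks // 10)

def dump_timestamps(db_path, pattern):
    """Print stored mtimes with the sub-second detail the job log truncates.

    Assumes FilesetEntry.Lastmodified holds .NET ticks and joins to the
    File view -- check your own DB's schema before relying on this.
    """
    con = sqlite3.connect(db_path)
    rows = con.execute(
        """SELECT f.Path, fe.Lastmodified
           FROM FilesetEntry fe JOIN File f ON f.ID = fe.FileID
           WHERE f.Path LIKE ?""",
        (pattern,),
    )
    for path, ticks in rows:
        print(path, ticks_to_datetime(ticks).isoformat())
    con.close()
```

Comparing that output against `dir /T:W` (or PowerShell’s `(Get-Item $path).LastWriteTime.Ticks`) for the same files would show whether the stored value really is truncated.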

If you can cut this down into reproducible test steps for repeatable time mismatches, it’s worth an issue. Support requests in this forum are not tracked, and the developers tend to work from the GitHub issues.

When coming up with steps, it would also be worth testing whether USN use has an effect on this issue; however, I’m not seeing this on backups of files from Windows 10 NTFS with --snapshot-policy=required.

I think you are onto something. If indeed there is a time resolution mismatch AND the value in the DB is not updated when the file is scanned, then the same files will be scanned over and over.

Look at the screenshot below: the same number of files is scanned at every backup, even if there are no changes. This seems to be a clear bug.

“Clear bug” doesn’t help unless someone can reproduce it. Possibly something else in your settings is contributing? Do you have any Advanced options set besides --snapshot-policy? Below is my result, running a backup twice with --snapshot-policy=required, then again with --snapshot-policy=off. It’s fine.

Can you come up with an exact setup of a new backup that does this? If so, please file an issue on it.

EDIT 1:

Creating a bug report is another option if you want someone to look at the DB timestamps for you; actual file names are obscured for privacy. That might not matter much: since you got about 30% extra opens, sampling a handful of records might be enough to say whether some of them have time issues. Because you can see the pathnames in your own log, you can also check whether there’s a pattern, though a first glance didn’t seem to find one.

EDIT 2:

The topic “Lots of files seem to be considered ‘changed’” points to some tools to use if you look in the DB yourself.
Since you’re on Windows, I assume this isn’t the mono issue some users hit (where mono started showing full time resolution).

The only other option I use in this backup is --snapshot-policy=required, and the destination is another disk on the same machine (this is a local backup).

I will try to create a scenario that reproduces the issue. However, I cannot believe that what’s shown above is the expected behavior of the software. In other words: it’s a bug.

Now I’ve deleted the backup (data and DB, but I kept the configuration) and run it again. I’ll lose the history, but that’s not an issue. Let’s see if it keeps doing the same thing.

I will create a bug report anyway, even if I don’t find a way to reproduce the issue. Maybe what I found could already point someone in the right direction.

Not arguing that, but bugs that can’t be reproduced tend not to be fixed until someone works out how.

Sure, somebody could take up the challenge, but a reproducible issue tends to lead to a quicker fix…