Metadata was reported as not changed, but still requires being added?

My experience with backing up a Thunderbird Portable folder on Windows 7 with VSS:
Sometimes it is OK, sometimes there is a warning; the ratio may be 50:50.

So in my case:

  • No long paths
  • No permissions changes/limitations

It looks like some element of chance is involved.

This is a sample result for the problematic backup job; no changes were made to the backup job or the OS:
[image]

@JonMikelV I set it to write logs to a file, looked through the file to see which files were coming back with the various errors, and 99% of them were due to permission errors. The permission issues were caused by the items I mentioned previously:
With these specific files (the ones that displayed the errors), permissions had been set by the user (via Citrix, Linux servers, and other means), so only that specific user (and the local admin) had permissions. This meant the newer PCs, servers, and domain settings could not override the old permissions, so I had to log in to the local server (Windows 2003). Once there I tracked down all the files and updated the permissions to the more modern setup with local and domain admin access. After that, no more errors relating to metadata.

I have the same problems, and no old permissions are involved. :confused:

Example of one I see with my latest backup, which I learned was not actually permissions related:

2018-07-06 00:21:32 -05 - [Warning-Duplicati.Library.Main.Operation.Backup.MetadataPreProcess.FileEntry-TimestampReadFailed]: Failed to read timestamp on “Z:\ACCTG\HUD\zzzzz\Backup\mydoc\cdoc\doc.ico”,
2018-07-06 00:21:32 -05 - [Warning-Duplicati.Library.Main.Operation.Backup.MetadataGenerator.Metadata-TimestampReadFailed]: Failed to read timestamp on “Z:\ACCTG\HUD\zzzzz\Backup\mydoc\cdoc\doc.ico”,
2018-07-06 00:21:32 -05 - [Warning-Duplicati.Library.Main.Operation.Backup.FileBlockProcessor.FileEntry-UnexpextedMetadataLookup]: Metadata was reported as not changed, but still requires being added? Hash: c8M3XXXXVi/HJe3v29xxxxcGBMtVucpGZxbbM/SP5ZE=, Length: 106, ID: 382696

The actual log file gives more information:
2018-07-06 00:21:32 -05 - [Warning-Duplicati.Library.Main.Operation.Backup.MetadataPreProcess.FileEntry-TimestampReadFailed]: Failed to read timestamp on “Z:\ACCTG\HUD\zzzzz\Backup\mydoc\cdoc\doc.ico”
System.ArgumentOutOfRangeException: Not a valid Win32 FileTime.
Parameter name: fileTime
   at System.DateTime.FromFileTimeUtc(Int64 fileTime)
   at Duplicati.Library.Snapshots.SystemIOWindows.FileGetLastWriteTimeUtc(String path)
   at Duplicati.Library.Main.Operation.Backup.MetadataPreProcess.<>c__DisplayClass2_0.<b__0>d.MoveNext()

In my case, the extended error message shows the reason:
Not a valid Win32 FileTime

In this case I checked the file, and it shows “Modified: Thursday, October 3, 24943 20:55:54”. Yes, you read that right: something set the file’s modification year to 24943, hence this error.

I could see this happening with virus-infected files, other files with faulty timestamps, or any number of similar file-related problems, including permissions.
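For the curious, here is a minimal stand-alone sketch of why a year like 24943 breaks the backup rather than just looking odd: .NET’s DateTime only covers years 1 through 9999, so converting such a raw FILETIME throws the same kind of exception shown in the log above (the tick values below are made up for illustration):

```csharp
using System;

class FileTimeCheck
{
    static void Main()
    {
        // A FILETIME is the number of 100-nanosecond ticks since 1601-01-01 UTC.
        // This value corresponds to a date in 2018 and converts without trouble.
        Console.WriteLine(DateTime.FromFileTimeUtc(131741712000000000L));

        // DateTime can only represent years 0001-9999, so a FILETIME far in the
        // future (like the year-24943 stamp above) is rejected with the
        // "Not a valid Win32 FileTime" ArgumentOutOfRangeException.
        try
        {
            Console.WriteLine(DateTime.FromFileTimeUtc(long.MaxValue));
        }
        catch (ArgumentOutOfRangeException ex)
        {
            Console.WriteLine($"Conversion failed: {ex.Message}");
        }
    }
}
```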

I scanned all my files and none have a date that differs by more than a couple of years from now, so there must be yet another cause of the issue.

This sounds like what is described here (where the “fix” was to manually adjust the timestamps to something more sensible):

I agree, since if it were only the timestamp then the same files should cause the error on every run, which isn’t the case for me, mr-fibble, etc.

What it does potentially tell us is that there are either multiple causes for that error or (more likely) that there are multiple failure points in the code that aren’t being caught until the metadata error check is reached.

Hopefully once your pull request reaches a canary version we’ll get enough additional detail to help narrow it down.

In that case I suggest going into the Duplicati settings and enabling the log file, pointing it at something like C:\DuplicatiTemp\log.txt or /sda/home/DuplicatiTemp/log.txt, so it logs the full errors rather than the truncated, limited info shown on the web page. That will tell you the full errors and why it is kicking things back.

Yes, the error is definitely tied to concurrent processing (C# async). Multiple things run in parallel in Duplicati, which is awesome in terms of speed. The downside is that it is much harder to predict how the code behaves, which is why the causes look different.

More tests: 185 of the 197 files in the backup are affected.

I took the same backup data, the same Duplicati, OS, backend, and DB, and put it all on my virtual dev machine for debugging. Guess what: the backup went fine, no warnings.

Thanks for the tips to enable logging to a file. Didn’t realize that would have more detail.

After that, I see the following that looks a bit suspicious but I haven’t been able to glean a cause yet:

$ grep foobar.mp4
2018-07-09 19:48:14 -04 - [Verbose-Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess-IncludingPath]: Including path as no filters matched: /foobar.mp4
2018-07-09 19:48:14 -04 - [Verbose-Duplicati.Library.Main.Operation.Backup.FilePreFilterProcess.FileEntry-CheckFileForChanges]: Checking file for changes /foobar.mp4, new: False, timestamp changed: False, size changed: True, metadatachanged: False, 2/9/2016 1:05:41 AM vs 2/9/2016 1:05:41 AM
2018-07-09 19:50:16 -04 - [Verbose-Duplicati.Library.Main.Operation.Backup.FileBlockProcessor.FileEntry-FileMetadataChanged]: File has only metadata changes /foobar.mp4

In between the “Checking file for changes” and the “File has only metadata changes” there’s an associated “Metadata was reported as not changed, but still requires being added?” warning.

So it looks like it looks up the metadata for the file. Finds that everything is the same except the file size(!?). Determines that the metadata has not changed and then queries the database for the metadata id. That seems to fail to find a matching entry in the db, so it then reprocesses the file and finds that actually, nothing has changed.

Given this, I’m not quite sure what’s wrong. The file size mismatch is perhaps a clue, or it could be a red herring. Looking at the code, intuition tells me that if the metadata hash mismatches, it would be updated in the database so that the next time it runs it would not mismatch. Unless some value in the metadata is uninitialized, random, or constantly changing, it should be able to recover from these mismatches. Yet for me, every file reliably has these metadata hash mismatches.

I’m using xfs for a filesystem, and stat reports everything fine about these files:

$ stat /foobar.mp4
  File: /foobar.mp4
  Size: 2268649487      Blocks: 4430960    IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 8613937208  Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2018-07-09 19:48:14.675413810 -0400
Modify: 2016-02-08 20:05:41.318177515 -0500
Change: 2016-02-08 20:07:02.310180112 -0500
 Birth: -

I suspect the real hint here is that both @ethauvin and I are using check-filetime-only. I see in code like duplicati/MetadataPreProcess.cs at 4b9c0ab9a223334d52b0b9ad6c4fb0bbb9199f29 · duplicati/duplicati · GitHub that the file size is set to -1 if that setting is selected. Yet the logic at duplicati/FilePreFilterProcess.cs at 914b17055ce244e2b336843c6afc6ffc8963d9c5 · duplicati/duplicati · GitHub seems to unconditionally check the file size of the given file and compare it, and according to the logic on line 71 there, a -1 value would always indicate it has changed.
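If that reading of the code is right, the mismatch is easy to reproduce in a few lines. This is only a simplified stand-in with made-up names, not the actual Duplicati classes:

```csharp
using System;
using System.IO;

class FileTimeOnlyBug
{
    // Simplified stand-in for what MetadataPreProcess appears to do:
    // with check-filetime-only the stored size is never looked up,
    // so a sentinel of -1 is passed along instead.
    static long GetStoredSize(bool checkFiletimeOnly, long actualStoredSize)
        => checkFiletimeOnly ? -1L : actualStoredSize;

    // Simplified stand-in for the FilePreFilterProcess comparison,
    // which compares sizes unconditionally.
    static bool SizeChanged(string path, long storedSize)
        => new FileInfo(path).Length != storedSize;

    static void Main()
    {
        var path = Path.GetTempFileName();
        File.WriteAllText(path, "hello");
        long realSize = new FileInfo(path).Length;

        // Without the option the sizes match and nothing is flagged.
        Console.WriteLine(SizeChanged(path, GetStoredSize(false, realSize))); // False

        // With check-filetime-only the sentinel -1 never equals the real size,
        // so "size changed: True" is reported for every single file.
        Console.WriteLine(SizeChanged(path, GetStoredSize(true, realSize)));  // True
    }
}
```

That would match the log output above: every other field compares equal and only size changed: True is reported.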


That might explain why you get it on every file… I don’t use filetime-only and I used to get it occasionally, but the last few days it seems to happen every run, though only once.

I’ll try to update once logging catches it.

@Chipmaster nailed it. :+1:

For kicks, I removed check-filetime-only from all of my backups, and every single one of them is working just fine now.

I am getting a lot of 503 errors on HubiC, but I am assuming that it is on their end. I’ll give it a few days before I start hollering. :grin:

@Chipmaster, FYI, my backups are faster now than they were before the metadata issues. Something that used to take 30 minutes takes 8 now on Windows, although I have not seen much improvement on Linux.

Thanks for letting us know that improved things for you.

Guess we’ll have to review the --check-filetime-only code to figure out where things went wrong. :blush:

It might not be the only problem. I started a new backup on pCloud using WebDav, and on the second run the metadata errors happened for all files in the backup.

I’m also getting some error about not being able to load the Microsoft Azure dll, which seems to correlate with the backup pausing (no activity) for at least 30 minutes. Not sure if that’s related or not though.

Jul 16, 2018 2:31 AM: Failed to load assembly C:\ProgramData\Duplicati\updates\2.0.3.9\Microsoft.WindowsAzure.Storage.dll, error message: Could not load file or assembly 'Microsoft.Data.Services.Client, Version=5.8.1.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35' or one of its dependencies. The located assembly's manifest definition does not match the assembly reference. (Exception from HRESULT: 0x80131040)

I have other backups starting on pCloud, we’ll see soon enough if the same is happening or not.

I’ve also seen performance improvements by disabling check-filetime-only, on the order of 13 hours down to between 30 minutes and 1 hour.

FWIW, I strangely have this issue on most of my backups every Monday (I keep them for 30 days, so I’m unsure why). When I back up again, the warnings go away.

Still to try the options mentioned in this thread though!

TL;DR:
Experiencing the same issue after migrating from User to Service on Windows 10 for VSS.
Problem appears both with check-filetime-only enabled (basically every file) and disabled (a few files).
All mismatches occur with size changed: True in the log. All other metadata is the same, but the files actually haven’t changed at all.
Disabling VSS reduces the number of Warnings but still doesn’t eliminate them.
The issue persists on new backup jobs.


I have the same issue with “Metadata was reported as not changed, but still requires being added?” warnings on 2.0.3.9_canary_2018-06-30

I am doing a backup of a lot of files on Windows 10 (with the most current updates) with VSS activated and set to required. Duplicati was recently migrated from user to service to use VSS, and that’s when this problem started to appear. The backup happens completely locally, from an SSD (C:) and HDD (D:) to another local HDD (E:).

I am seeing the issue with check-filetime-only either enabled or disabled:
check-filetime-only=true
It happens with basically every file… I got more than 100k Warnings on my backup.
check-filetime-only=false
It happens with a few files, generally <100 files affected.

All metadata mismatches show up with

new: False, timestamp changed: False, size changed: True, metadatachanged: False

in the log at Verbose log level. Unfortunately I haven’t found a way for the log to show me the actual file sizes it compares (neither the stored nor the newly seen ones) so I could do a sanity check on them.

Finally, I tried disabling VSS, which reduced the number of warnings but didn’t eliminate them.

So my guess would be that something screws with the file sizes Duplicati sees when running it as a service (perhaps being an admin is also involved?), because none of these problems happened with a user installation.

Edit:
I just made a new backup job to eliminate any weird behaviour that might have come from the switch from User to Service, and the result is the same.
The initial backup runs through without an issue, but all following passes have the “Metadata was reported as not changed, but still requires being added?” warnings with size changed: True.

Yes, that turned out to be the problem. I made a much faster query in the database that only fetches the LastModified value. But the logic then checked BOTH LastModified and LastSize, causing it to think the size had changed (current size vs. missing size), and then submitted the file for full scanning.

Also, since the metadata was not previously fetched, it would think the metadata was changed as well and instruct Duplicati to re-store the metadata, causing the warning message about unchanged metadata.
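Roughly, the broken versus fixed comparison looks something like this (a simplified sketch with illustrative names, not the actual patch):

```csharp
using System;

class PartialLookupFix
{
    // What the fast database query returns: only LastModified is fetched,
    // so the stored size is left as a "missing" sentinel.
    const long MissingSize = -1L;

    // Broken comparison: treats the missing stored size as a real value,
    // so every file appears to have changed size.
    static bool ChangedBroken(DateTime lastModified, long lastSize,
                              DateTime currentModified, long currentSize)
        => lastModified != currentModified || lastSize != currentSize;

    // Fixed comparison: only compare sizes when the stored size was actually fetched.
    static bool ChangedFixed(DateTime lastModified, long lastSize,
                             DateTime currentModified, long currentSize)
        => lastModified != currentModified
           || (lastSize >= 0 && lastSize != currentSize);

    static void Main()
    {
        var ts = new DateTime(2016, 2, 9, 1, 5, 41, DateTimeKind.Utc);

        // Unchanged file, but the stored size was never queried.
        Console.WriteLine(ChangedBroken(ts, MissingSize, ts, 2268649487L)); // True  -> rescanned
        Console.WriteLine(ChangedFixed(ts, MissingSize, ts, 2268649487L));  // False -> skipped
    }
}
```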

Additionally, there was a wrong flag check that caused the OP warning to be triggered on most (all?) files.

For this error, at least the “… but still requires being added?” part should go away.

Not sure how to deal with the “invalid” time. The problem is that we cannot read it into .NET, so we are stuck with a missing “current” time. This repeatedly causes Duplicati to think the metadata has changed; it then tries to generate new metadata, only to find that it can’t. On the next run we still have not stored a valid timestamp, so it repeats.

With the changes from @mnaiman, we can at least see the problematic path. Is it acceptable to require users to fix the timestamps?


If I understood @mikaitech correctly, his files really did have a completely invalid timestamp, and he also mentioned that this was probably caused by an external factor:

So IMHO this has nothing to do with Duplicati and can’t really be fixed by it in any way. Therefore I would argue that if the user has a corrupted timestamp like this, he should get an appropriate warning informing him of the exact issue, but he has to resolve it himself.

I agree. If Duplicati finds an invalid timestamp (it has never happened to me) or a missing “current” time, it should log a warning along the lines of “Correct your file’s timestamp - current time missing”, but not touch the files at all. The user should correct the files.
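Something along these lines, purely as a sketch of that behaviour (the helper and its names are hypothetical, not Duplicati’s actual API):

```csharp
using System;

static class TimestampProbe
{
    // readRawFileTime stands in for however the raw Win32 FILETIME is obtained
    // (in Duplicati that happens inside SystemIOWindows.FileGetLastWriteTimeUtc).
    static DateTime? GetLastWriteTimeOrWarn(
        string path, Func<string, long> readRawFileTime, Action<string> warn)
    {
        try
        {
            return DateTime.FromFileTimeUtc(readRawFileTime(path));
        }
        catch (ArgumentOutOfRangeException)
        {
            // Warn with the offending path and leave the file itself alone;
            // the user is expected to fix the timestamp.
            warn($"Correct your file's timestamp - current time missing: {path}");
            return null;
        }
    }

    static void Main()
    {
        // Simulate a file whose on-disk FILETIME is far outside DateTime's range.
        var result = GetLastWriteTimeOrWarn(
            @"C:\example\doc.ico", _ => long.MaxValue, Console.WriteLine);
        Console.WriteLine(result?.ToString() ?? "timestamp unreadable");
    }
}
```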