Case sensitivity regarding filenames

Another try, same result.
Still Win11. New backup, only option set: USN.
One directory added to the backup with one file: dAILY nEWS.txt. Backup. All right.
Rename to Daily News.txt. Backup. Nothing happens. Rename to dailynews.txt. Backup.
Result: restore of version 0 shows both versions.
The difference is that I rename to another name, not one differing only in case, and I don’t turn off USN. In the previous case I turned off USN only to verify that the problem is fixed without USN.

Thanks for the detail. That reproduces it.

select parent folder
USN on
database deleted
destination deleted
create dAILY nEWS.txt
backup (versions: 1)
rename Daily News.txt
backup (versions: 1)
rename dailynews.txt
backup (versions: 2)
restore version 0, see dAILY nEWS.txt dailynews.txt
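
The file-system side of these steps can be scripted so the renames are identical on every run. This is only a sketch: `make_source_folder` and `rename_in` are hypothetical helper names, and the backup runs themselves still have to be started by hand between the calls.

```python
import os
import tempfile


def make_source_folder() -> str:
    """Create a fresh folder holding the single test file."""
    folder = tempfile.mkdtemp(prefix="usn-case-test-")
    with open(os.path.join(folder, "dAILY nEWS.txt"), "w") as handle:
        handle.write("test\n")
    return folder


def rename_in(folder: str, old: str, new: str) -> list:
    """Rename old to new inside folder and return the sorted directory listing."""
    os.rename(os.path.join(folder, old), os.path.join(folder, new))
    return sorted(os.listdir(folder))
```

Usage would be: call `make_source_folder()`, run the first backup, call `rename_in(folder, "dAILY nEWS.txt", "Daily News.txt")`, run the second backup, then `rename_in(folder, "Daily News.txt", "dailynews.txt")` and run the third.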

Database views: (two screenshots)

A fileset corresponds to a backup version (except for numbering). It’s a complete list, like dlist file has.
Most of the previous list is copied forward for unchanged files, but USN may have missed the change:

probably should have seen the old name as Deleted, but the statistics show it didn’t. It did see Added.

Underneath that is lots more code, but that’s one theory about how the now-gone file name got carried.
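
That theory can be shown with plain sets. This is a sketch in Python rather than Duplicati's C#, and `next_fileset` is a made-up name, not a function from the code base; it only models "copy the previous list forward, minus what was seen as deleted, plus what was seen as added".

```python
def next_fileset(previous: set, added: set, deleted: set) -> set:
    """Carry forward everything not reported as deleted, then add the new paths."""
    return (previous - deleted) | added


previous = {"dAILY nEWS.txt"}

# What the statistics suggest happened: an Added was seen, but no Deleted.
observed = next_fileset(previous, added={"dailynews.txt"}, deleted=set())

# What a fully detected rename would have produced.
expected = next_fileset(previous, added={"dailynews.txt"}, deleted={"dAILY nEWS.txt"})

print(sorted(observed))
print(sorted(expected))
```

With the delete missing, the old name stays in the list next to the new one, which matches the two names seen on restore.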

that’s an interesting problem, even if it’s not the original one in this thread.
I know that I could browse the 1000 issues in the project, and I will, but if by any chance your memory (longer than mine) holds some trace of a similar problem already seen there, it would be nice to know :slight_smile:

From a short search, this sounds similar:

A problem occurs when a folder is removed from the source selection list. Duplicati doesn’t see or process those files any more during backup. It mistakenly interprets this as unscanned (unchanged) files and brings them forward in the fileset, per the previous paragraph.

Unless I’m missing something, it does not look to be the same problem since no source has been removed in this case.

Not claiming it is. It’s also fixed. You asked:

and if the cause here is similar, it might help.

Regardless, it probably merits a new issue…

It wasn’t clear to me whether the USN journal gives enough information, especially on the case-only rename. Testing with Everything, which monitors the USN journal, even those renames update the result list quickly, suggesting it’s possible but maybe not done in Duplicati yet. I don’t know USN, but did spot this code:

which looks like it’s trying to equate new name of a rename as a file add, and its old name as a delete.
Searching for documentation on USN_REASON_RENAME_OLD_NAME found some Microsoft pages on that:

READ_USN_JOURNAL_DATA_V1 structure
USN_RECORD_V2 structure (winioctl.h)

and neither one appears to say if it’s case sensitive or case insensitive. Everything figures it out though.
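
The mapping described above (new name of a rename counts as an add, old name as a delete) could be sketched like this. The two flag values are the ones defined in winioctl.h; `classify` and the `(reason, path)` tuples are assumptions for illustration, not Duplicati's code.

```python
# Reason flags from winioctl.h.
USN_REASON_RENAME_OLD_NAME = 0x00001000
USN_REASON_RENAME_NEW_NAME = 0x00002000


def classify(records):
    """records: iterable of (reason_mask, path). Returns (added, deleted) lists."""
    added, deleted = [], []
    for reason, path in records:
        if reason & USN_REASON_RENAME_OLD_NAME:
            deleted.append(path)  # the name the file had before the rename
        if reason & USN_REASON_RENAME_NEW_NAME:
            added.append(path)    # the name the file has after the rename
    return added, deleted


# A case-only rename still yields two distinct strings here.
added, deleted = classify([
    (USN_REASON_RENAME_OLD_NAME, "dAILY nEWS.txt"),
    (USN_REASON_RENAME_NEW_NAME, "Daily News.txt"),
])
```

As long as the names are compared as exact strings, the case-only rename is visible at this level; whether it stays visible depends on how the paths are compared later.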


Yes, I have looked at the USN journal with fsutil and did not find any obvious difference between the ‘rename with just case change’ and ‘rename to a completely different name’ cases that could explain the bug. But it’s an interesting problem, even if complicated. USN seems a very good feature to have when one wants to back up big setups very frequently (say, hourly).


Wow, a lot has happened since my last visit. Sorry for my late reply, I was at my day job, and I’ve also been reading up on the basics of the USN journal, but I am only scratching the surface so far.

You hit the bullseye: after removing usn-policy from the advanced options, case-only renames were recognized by Duplicati (no need to deliver more details after your extensive tests). I can confirm that this solved my problem. usn-policy was the difference in our settings I was looking for. I will leave the USN option off until the code gets developed further:

From reading through articles and skimming the code (far from understanding it), I spotted some hints that I want to share (you decide whether they are useful or not):

From the source code comments, it looks as if the code still relies on the USN_RECORD_V2 structure
(refer to code line # 242 //TODO: add support for V3 records),
while Win 10 delivers USN_RECORD_V3 at minimum and Win 11 USN_RECORD_V4 for sure. Microsoft points out design principles for staying compatible with changing USN journal versions:

USN_RECORD_V3 structure (winioctl.h)
USN_RECORD_V4 structure (winioctl.h)

Although the USN_REASON_RENAME_OLD_NAME and
USN_REASON_RENAME_NEW_NAME reason flags stayed unchanged, one difference is that the USN_RECORD_V4 structure lacks a file name member. But a USN_RECORD_V4 is followed by at least one USN_RECORD_V3, from which the file name can be retrieved.

This could also be the reason for multiple file names appearing in a single version, and USN structure changes would explain why the situation is worse with Win11.

Without going deep into how the code currently retrieves the file names, USN version compatibility would be a good place to start.
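
To make the version point concrete, here is a sketch of version-aware record reading in Python. The field order of the fixed part follows the documented USN_RECORD_V2 layout; `parse_record` and `build_v2` are hypothetical helpers, and the records are synthetic bytes, not data read from a real journal.

```python
import struct

# Fixed-size part of USN_RECORD_V2 (60 bytes, little-endian): RecordLength,
# MajorVersion, MinorVersion, FileReferenceNumber, ParentFileReferenceNumber,
# Usn, TimeStamp, Reason, SourceInfo, SecurityId, FileAttributes,
# FileNameLength, FileNameOffset.
V2_FIXED = struct.Struct("<IHHQQqqIIIIHH")


def parse_record(buf: bytes, offset: int = 0):
    """Return (record_length, major_version, file_name or None) for one record."""
    record_length, major, _minor = struct.unpack_from("<IHH", buf, offset)
    if major != 2:
        # Other versions have a different layout; the caller can still
        # step over them using record_length.
        return record_length, major, None
    fields = V2_FIXED.unpack_from(buf, offset)
    name_length, name_offset = fields[-2], fields[-1]
    start = offset + name_offset
    name = buf[start:start + name_length].decode("utf-16-le")
    return record_length, major, name


def build_v2(name: str, reason: int) -> bytes:
    """Build a synthetic V2 record for experiments (no padding added)."""
    raw = name.encode("utf-16-le")
    length = V2_FIXED.size + len(raw)
    fixed = V2_FIXED.pack(length, 2, 0, 1, 2, 0, 0, reason, 0, 0, 0,
                          len(raw), V2_FIXED.size)
    return fixed + raw
```

Checking the major version from the common header before touching version-specific fields is the compatibility principle the Microsoft pages describe; the sketch simply skips anything that is not V2.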


@runmode

thanks for taking the time to look up all this information, it may be useful. I’ll keep it in mind for sure.

USN is not the culprit: even with USN_RECORD_V2 it provides all the information necessary to handle the case change. In both cases (rename with only a case change, complete rename), the USN journal puts the same info into the Duplicati pipeline (the previous name).

The different handling comes from here:

Since Windows is not case sensitive, passing the previous name succeeds in finding the file with GetFileEntryAsync, and the entry gets an OldId defined. When the name changes completely, OldId = -1. In the next stage of the Duplicati pipeline, in FilePreFilterProcess.cs, the file is passed on to the following stage or not depending on that (if OldId != -1, the file is triaged as ‘not modified’).
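
A toy model of that triage, as described in this post (the next post revises the diagnosis). `lookup_old_id` and `triage` are hypothetical stand-ins, not the real GetFileEntryAsync or FilePreFilterProcess logic; the case-insensitive comparison is the assumption being illustrated.

```python
def lookup_old_id(previous: dict, path: str) -> int:
    """Case-insensitive lookup of a path in the previous file list; -1 if absent."""
    wanted = path.casefold()
    for known_path, file_id in previous.items():
        if known_path.casefold() == wanted:
            return file_id
    return -1


def triage(previous: dict, path: str) -> str:
    """Files with a known previous entry are treated as not modified."""
    return "not modified" if lookup_old_id(previous, path) != -1 else "process"


previous = {"dAILY nEWS.txt": 7}
case_only = triage(previous, "Daily News.txt")   # matches the old entry
full_rename = triage(previous, "dailynews.txt")  # no previous entry
```

Under this model the case-only rename is matched to the old entry and skipped, while the full rename is passed on.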

Now my next step is finding a fix :slight_smile:

After looking further into the case, that’s not exactly what is happening either.
USN itself is indeed not problematic with V2. However, what fails is what happens afterwards, still within the USN handling. First, when a file is renamed, the code is supposed to store and handle both names. However, a hash table is used with case insensitivity:

so only the old version is kept. In normal handling, the old file name is removed from further handling here:

in spite of the comment that says the opposite :frowning:

When the rename is a case-only change, the where (m_Snapshot.FileExists) condition does not remove the old file name, because Windows finds it (the lookup is case-insensitive). So Duplicati basically sees nothing changed.
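
The two effects together can be modelled in a few lines of Python. This is a simplified sketch: `changed_paths` and `windows_like_exists` are invented names, the dictionary keyed on the case-folded path stands in for the case-insensitive hash table, and the `exists` callback stands in for the m_Snapshot.FileExists check.

```python
def changed_paths(renames, exists):
    """renames: list of (old, new). Keep one spelling per case-folded path,
    then drop every path that no longer exists."""
    seen = {}
    for old, new in renames:
        for path in (old, new):
            # Case-insensitive key: the first spelling (the old name) wins.
            seen.setdefault(path.casefold(), path)
    return [path for path in seen.values() if exists(path)]


def windows_like_exists(on_disk):
    """Return an exists() that ignores case, like a default Windows volume."""
    folded = {path.casefold() for path in on_disk}
    return lambda path: path.casefold() in folded


case_only = changed_paths([("dAILY nEWS.txt", "Daily News.txt")],
                          windows_like_exists({"Daily News.txt"}))
full_rename = changed_paths([("Daily News.txt", "dailynews.txt")],
                            windows_like_exists({"dailynews.txt"}))
```

For the case-only rename only the old spelling survives and it still "exists", so nothing looks changed; for the full rename the old name is filtered out and the new one is kept.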

The other problem I found is only a follow-up of this first problem (because the database still holds the initial version of the file name).

Fixing the hash is easy; however, fixing the file-exists condition is a bit more involved.
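
One possible shape for an exact-case existence check, sketched in Python and not taken from any actual fix: compare the requested name against the directory listing, which reports names as stored on disk. `exists_exact_case` is a hypothetical helper and only checks the last path component.

```python
import os


def exists_exact_case(path: str) -> bool:
    """True only if the final path component exists with exactly this spelling."""
    folder, name = os.path.split(os.path.abspath(path))
    try:
        return name in os.listdir(folder)
    except OSError:
        # Parent folder missing or unreadable: treat as not existing.
        return False
```

Listing the parent directory for every check costs more than a plain exists call, which is one reason a real fix is more involved than swapping the comparison.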


Key findings, hep! :+1:
with cheers :raised_hands:

Issue:

Preliminary pre-alpha tentative attempt at a fix:
