Processing files which are not in backup sources

I think some nasty bug lured to Duplicati…

I think is that it reads files which is not supposed, because they are not in backup set.

Second is, that reading some system files (again not in backup set) are causing UnauthorizedAccessExceptions.

System.UnauthorizedAccessException: (5) Access denied: [\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy21\System Volume Information\21{3808876b-c176-4e48-b7ae-04046e6cc752}]
v Alphaleonis.Win32.NativeError.ThrowException(UInt32 errorCode, String readPath, String writePath)
v Alphaleonis.Win32.Filesystem.File.CreateFileCore(KernelTransaction transaction, String path, ExtendedFileAttributes attributes, FileSecurity fileSecurity, FileMode fileMode, FileSystemRights fileSystemRights, FileShare fileShare, Boolean checkPath, PathFormat pathFormat)
v Alphaleonis.Win32.Filesystem.File.OpenCore(KernelTransaction transaction, String path, FileMode mode, FileSystemRights rights, FileShare share, ExtendedFileAttributes attributes, Nullable`1 bufferSize, FileSecurity security, PathFormat pathFormat)
v Duplicati.Library.Snapshots.SystemIOWindows.FileOpenRead(String path)
v Duplicati.Library.Main.Operation.Backup.FileBlockProcessor.<>c__DisplayClass1_0.<b__0>d.MoveNext()

I‘m using Duplicati as a windows service under Windows 7 and have updated from 2.0.3.5 to 2.0.3.8 on 30.6.2018 and to 2.0.3.9 on 2.7.2018. The backup job is performed with --snapshot-policy=on.

Since then I have seen the phenomenon of negative numbers at the progress bar described here:

What is worse, however, is that data has been backed up that is not in backup sources. Now there are tons of useless GB in the backup set. :frowning:

Questions:

  1. How can I safely rollback to 2.0.3.5?
  2. Can the data that should not be backed up be deleted from the backup set? The file paths are known.

Thanks for reporting what you are seeing - and entering the GitHub issue!

I’m using 2.0.3.9 and not running into this issue, is it possible you’ve got a symlink or something in one of your included folders that points to your drive root?

This How-To topic should cover downgrading:

Yes, but there are some things you should be aware of such as unless you adjust other settings it’s likely a large compacting will happen (meaning lots of downloads, re-compress, uploads) so if you’re on a bandwidth limited connection or provider you may not want to do that.

Also, if you have a retention policy in place that might (depending on settings) automatically take care of the cleanup once the wrongly included files are no longer being included.

I’m curious, can you share some of what paths were included unexpectedly with 2.0.3.8 / 2.0.3.9 but NOT with 2.0.3.5? It’s possible some default exclude code got screwed up at some point so identifying that would be good.

Thanks for the link. Unfortunately the database version has changed, so the downgrade fails. :frowning: I’m back to 2.0.3.9 and have deactivated the backup until the problem is solved.

My --retention-policy is “7D:1h,3M:1D,1Y:1W,10Y:1M”. So I think the unwanted data will stay in the backup for 10 years, right?

My Backup-Paths are:
“C:\Users\Jens\”
“D:\Users\Jens\”
“C:\totalcmd\”
“D:\ablage\”
“D:\old_d\”

I have several -exclude parameters for filtering files and directories. But the hole backup configuration has remained unchanged for weeks. Now the backup has processed directories like “c:\windows” “d:\virtual machines” and so on, which are not included in the above backup paths.

@JonMikelV - Look at my github issue, there is screenshot, which clearly show, which function is failing.
Bad thing is that this is not predictable, sometime it happens, sometimes not.

Thanks for the details!

I can confirm that my 2.0.3.9 running on Windows 10 has also started including items not listed in the source. In my case it’s just C:\Users\ but like you I have a bunch of excludes and nothing has changed on my system.

@mnaiman, I have looked at the GitHub issue but am not familiar with that piece of code so didn’t really follow it other than you saying it was VSS related.

For me the issue started with the first backup after updating from 2.0.3.8 to 2.0.3.9, so I suspect the issue started with 2.0.3.9. If I can pin down a repeatable failure I’ll try downgrading to 2.0.3.8 and see if the issue goes away.

In the mean time I’ll flag this post as a #bug and be sure to let @kenkendk and @renestach know it’s confirmed on multiple machines now and should probably be reviewed ASAP since it silently effects existing backups.

Edit: I can confirm that using the test-filters command DOES show files NOT under any Source folders (either directly or through links) as being included:
Including path as no filters matched: C:\Windows\SystemApps\Microsoft.Windows.Cortana_cw5n1h2txyewy\Cortana.Internal.Search.winmd.

Will try running it again with VSS disabled to see if they go away.

I’ve been able to replicate the issue on two machines using the test-filters command so @kenkendk I’m flagging this as a #bug and have updated the @mnaiman GitHub issue.

Here’s what I did to verify the issue:

  • created a test backup that has C:\Users\ as the source
  • went to the job Commandline menu and selected a “Command” of test-filters
  • removed the contents of the “Target URL >” field
  • added following “Commandline arguments”
    • --snapshot-policy=Off
    • --log-file=C:\_Backups\Duplicati\VSSFilterTest-SnapshotOff.log
    • --log-file-log-filter=+*.Controller*;+*.FileEnumerationProcess*;-*
  • clicked ‘Run "test-filters’ command now’ button

I then repeated the above but changed the first 2 of the “Commandline arguments” to be:

  • --snapshot-policy=On
  • --log-file=C:\_Backups\Duplicati\VSSFilterTest-SnapshotOn.log

The resulting log files are HUGELY different and with snapshot-policy=off work as expected, but with snapshot-policy=on seems to be including the entire hard drive (even though C:\Users\ was the only Source location).

One possible clue about why the snapshot setting is causing the issue is both runs reported an access denied error to a particular path, but the REFERENCE to the file differed as follows:

with snapshot-policy=off, a regular drive path was used:

2018-07-15 23:22:24 -05 - [Warning-Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess-FileAccessError]: Error reported while accessing file: C:\Users\Public\Documents\My Videos\
System.UnauthorizedAccessException: Access to the path 'C:\Users\Public\Documents\My Videos' is denied.
   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
   at System.IO.FileSystemEnumerableIterator`1.CommonInit()
   at System.IO.Directory.GetDirectories(String path)
   at Duplicati.Library.Snapshots.NoSnapshotWindows.ListFolders(String localFolderPath)
   at Duplicati.Library.Utility.Utility.<EnumerateFileSystemEntries>d__23.MoveNext()

However with snapshot-policy=on a GLOBALROOT path was used:

2018-07-16 00:21:59 -05 - [Warning-Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess-FileAccessError]: Error reported while accessing file: C:\Users\Public\Documents\My Videos\
System.UnauthorizedAccessException: (5) Access is denied: [\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy10\Users\Public\Documents\My Videos\*]
   at Alphaleonis.Win32.NativeError.ThrowException(UInt32 errorCode, String readPath, String writePath)
   at Alphaleonis.Win32.Filesystem.FindFileSystemEntryInfo.FindFirstFile(String pathLp, WIN32_FIND_DATA& win32FindData)
   at Alphaleonis.Win32.Filesystem.FindFileSystemEntryInfo.<Enumerate>d__3`1.MoveNext()
   at System.Linq.Buffer`1..ctor(IEnumerable`1 source)
   at System.Linq.Enumerable.ToArray[TSource](IEnumerable`1 source)
   at Duplicati.Library.Snapshots.WindowsSnapshot.ListFolders(String localFolderPath)
   at Duplicati.Library.Utility.Utility.<EnumerateFileSystemEntries>d__23.MoveNext()

GLOBALROOT paths are ok, they are representing snapshot.
Whats not ok is that if in backup set is D:\software even D:\VirtualMachine is processed (not backup-ed).

I figured that was what was going on, but thanks for confirming it. :slight_smile:

Since the source and filters seem to work correctly with snapshots turned OFF, I’m assuming somewhere along the way we’re not correctly converting between standard paths (as used for Source and filter paths) and the GLOBALROOT paths (as used with VSS snapshots) thus we’re getting the entire drive (possibly ALL drives?) included in the backup.

This seems to have appeared in 2.0.3.9, though I haven’t rolled anything back yet to test if it happens in 2.0.3.8 now that I’ve got a repeatable success/failure scenario.

Edit:
I can confirm the issue does NOT happen in 2.0.3.8.

I wanted to work on removing the unwanted files from the backup. For this I have listed all files with Duplicati-CommandLine find ... –all-versions=true. Strangely, paths like c:\windows or d:\vitual machines are not included in the list, although they were definitely listed as files currently being backed up in the above backup job. Also, the backup files are larger than usual.

Therefore I used Duplicati-CommandLine find ... -version=X *.* >filelist-versionX.txt to create and compare different file lists before and after the update. There are several files which suddenly have a different size but have to be unchanged in any case! This can be easily traced, for example, by looking at text files (logs) or images. I randomly restored and looked at some of these files with wrong file sizes. The content is from other files! - usually from the same directory.

So, the backup sets are no longer trustworthy for me since the update to 2.0.3.9 (or 2.0.3.8?), when --snapshot-policy=On is used.

I am not yet very firm with the Duplicati.CommandLine. With the parameter purge --version=X *.* I can probably remove all files of a backup set. Is there a better method to remove a complete backup set?

While you could use the delete command (which flags versions as ready for deletion but may not ACTUALLY remove the files immediately) the purge command can do the same thing AND trigger a compact run which will immediately remove the flagged-for-deletion files / versions.

I investigated, that files ale processed (listed, read end2end) but they are NOT included in backup set.
But to be sure, I deleted last backup sets and downgraded.

I don’t like the unpredictable part…

> set AUTOUPDATER_Duplicati_SKIP_UPDATE=1
> C:\2.0.3.9\Duplicati.CommandLine.exe test-filters C:\Users\ --snapshot-policy=required --log-file=c:\withsnapshot-2.0.3.9.txt --log-file-log-filter=+*.Controller*;*.FileEnumerationProcess*;-*

... matched files, all in C:\Users\ ...

Matched 1414 files (247,97 MB)

Then I tried the same with --snapshot-policy=off and then repeat with 2.0.3.8, all giving the exact same result. I never get files included from any other folders …

Oh! That is very bad! I think this happens because the mapping fails, so suddenly similarly named files get a wrong path prefix.

Yes. The GLOBALROOT string is from the AlphaFS library and it reveals the real mapping. In the example from @JonMikelV, the mapping is happening correctly.

In the screenshot from @mnaiman, I can see the problem. It shows that the localFolderPath variable is being mapped to the root of the snapshot volume (same as when root is mapped to volumePath). This is wrong, because the local path should be simply have the C:\ replaced with volumePath.

I have examined the code many times, but still not found the reason for this.
My best bet is that this line is causing issues, by keeping a cache for the most recently constructed path, and for some unknown reason, it returns the previously constructed value.

Even though I do not see why this would happen, it would explain why @JensK suddenly gets random file names.

For lack of a better solution, I have removed the cache (I doubt it makes much difference anyway) and rewritten the code to use more straightforward string operations.

Edit: thinking some more, I think this is because I consider the assignment to be atomic, but in fact it is not. Since KeyValuePair<,> is a struct, it has value semantics. The initial copy was intended to make sure we did not do a check on one value, only to have it change afterwards. However, when assigning m_lastLookup the assignment is actually non-atomic due to the value semantics, so we could actually copy one value from KeyValuePair while the other was changed. It is a very narrow window, so it should not be as reproducible as reported.

My small daily Duplicati 2.0.3.9 backup (with snapshots on, but no actual filtering or exclude options) just announced it was backing up C:\pagefile.sys even though Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\BackupRestore\FilesNotToBackup includes \Pagefile.sys. I assumed registry excludes were always on (I’m not certain what’s on and what’s off), but I also know Duplicati is wandering.

I’m not sure backup would have finished. Process Monitor saw Duplicati reading C:\$ConvertToNonresident sequentially out to about 50 MB, repeatedly, and getting EOFs, while putting dblock and dindex files to the backend. After I stopped it (or it died), leftovers were in C:\Windows\Temp, but more backups fixed this all. Getting my backend use back down from over 10 times its proper size without even a repair was pleasant.

Thanks for letting us know. It’s on the “to do” list for the next release, but hasn’t been fixed yet:

Thx, problem is fixed.

1 Like