Processing files which are not in backup sources

bug

#1

I think some nasty bug lured to Duplicati…
https://github.com/duplicati/duplicati/issues/3307

I think is that it reads files which is not supposed, because they are not in backup set.

Second is, that reading some system files (again not in backup set) are causing UnauthorizedAccessExceptions.

System.UnauthorizedAccessException: (5) Access denied: [\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy21\System Volume Information\21{3808876b-c176-4e48-b7ae-04046e6cc752}]
v Alphaleonis.Win32.NativeError.ThrowException(UInt32 errorCode, String readPath, String writePath)
v Alphaleonis.Win32.Filesystem.File.CreateFileCore(KernelTransaction transaction, String path, ExtendedFileAttributes attributes, FileSecurity fileSecurity, FileMode fileMode, FileSystemRights fileSystemRights, FileShare fileShare, Boolean checkPath, PathFormat pathFormat)
v Alphaleonis.Win32.Filesystem.File.OpenCore(KernelTransaction transaction, String path, FileMode mode, FileSystemRights rights, FileShare share, ExtendedFileAttributes attributes, Nullable`1 bufferSize, FileSecurity security, PathFormat pathFormat)
v Duplicati.Library.Snapshots.SystemIOWindows.FileOpenRead(String path)
v Duplicati.Library.Main.Operation.Backup.FileBlockProcessor.<>c__DisplayClass1_0.<b__0>d.MoveNext()


Release: 2.0.3.9 (canary) 2018-06-30
#2

I‘m using Duplicati as a windows service under Windows 7 and have updated from 2.0.3.5 to 2.0.3.8 on 30.6.2018 and to 2.0.3.9 on 2.7.2018. The backup job is performed with --snapshot-policy=on.

Since then I have seen the phenomenon of negative numbers at the progress bar described here:

What is worse, however, is that data has been backed up that is not in backup sources. Now there are tons of useless GB in the backup set. :frowning:

Questions:

  1. How can I safely rollback to 2.0.3.5?
  2. Can the data that should not be backed up be deleted from the backup set? The file paths are known.

#3

Thanks for reporting what you are seeing - and entering the GitHub issue!

I’m using 2.0.3.9 and not running into this issue, is it possible you’ve got a symlink or something in one of your included folders that points to your drive root?

This #howto topic should cover downgrading:

Yes, but there are some things you should be aware of such as unless you adjust other settings it’s likely a large compacting will happen (meaning lots of downloads, re-compress, uploads) so if you’re on a bandwidth limited connection or provider you may not want to do that.

Also, if you have a retention policy in place that might (depending on settings) automatically take care of the cleanup once the wrongly included files are no longer being included.

I’m curious, can you share some of what paths were included unexpectedly with 2.0.3.8 / 2.0.3.9 but NOT with 2.0.3.5? It’s possible some default exclude code got screwed up at some point so identifying that would be good.


#4

Thanks for the link. Unfortunately the database version has changed, so the downgrade fails. :frowning: I’m back to 2.0.3.9 and have deactivated the backup until the problem is solved.

My --retention-policy is “7D:1h,3M:1D,1Y:1W,10Y:1M”. So I think the unwanted data will stay in the backup for 10 years, right?

My Backup-Paths are:
“C:\Users\Jens\”
“D:\Users\Jens\”
“C:\totalcmd\”
“D:\ablage\”
“D:\old_d\”

I have several -exclude parameters for filtering files and directories. But the hole backup configuration has remained unchanged for weeks. Now the backup has processed directories like “c:\windows” “d:\virtual machines” and so on, which are not included in the above backup paths.


#5

@JonMikelV - Look at my github issue, there is screenshot, which clearly show, which function is failing.
Bad thing is that this is not predictable, sometime it happens, sometimes not.


#6

Thanks for the details!

I can confirm that my 2.0.3.9 running on Windows 10 has also started including items not listed in the source. In my case it’s just C:\Users\ but like you I have a bunch of excludes and nothing has changed on my system.

@mnaiman, I have looked at the GitHub issue but am not familiar with that piece of code so didn’t really follow it other than you saying it was VSS related.

For me the issue started with the first backup after updating from 2.0.3.8 to 2.0.3.9, so I suspect the issue started with 2.0.3.9. If I can pin down a repeatable failure I’ll try downgrading to 2.0.3.8 and see if the issue goes away.

In the mean time I’ll flag this post as a #bug and be sure to let @kenkendk and @renestach know it’s confirmed on multiple machines now and should probably be reviewed ASAP since it silently effects existing backups.

Edit: I can confirm that using the test-filters command DOES show files NOT under any Source folders (either directly or through links) as being included:
Including path as no filters matched: C:\Windows\SystemApps\Microsoft.Windows.Cortana_cw5n1h2txyewy\Cortana.Internal.Search.winmd.

Will try running it again with VSS disabled to see if they go away.


#8

I’ve been able to replicate the issue on two machines using the test-filters command so @kenkendk I’m flagging this as a #bug and have updated the @mnaiman GitHub issue.

Here’s what I did to verify the issue:

  • created a test backup that has C:\Users\ as the source
  • went to the job Commandline menu and selected a “Command” of test-filters
  • removed the contents of the “Target URL >” field
  • added following “Commandline arguments”
    • --snapshot-policy=Off
    • --log-file=C:\_Backups\Duplicati\VSSFilterTest-SnapshotOff.log
    • --log-file-log-filter=+*.Controller*;+*.FileEnumerationProcess*;-*
  • clicked ‘Run "test-filters’ command now’ button

I then repeated the above but changed the first 2 of the “Commandline arguments” to be:

  • --snapshot-policy=On
  • --log-file=C:\_Backups\Duplicati\VSSFilterTest-SnapshotOn.log

The resulting log files are HUGELY different and with snapshot-policy=off work as expected, but with snapshot-policy=on seems to be including the entire hard drive (even though C:\Users\ was the only Source location).

One possible clue about why the snapshot setting is causing the issue is both runs reported an access denied error to a particular path, but the REFERENCE to the file differed as follows:

with snapshot-policy=off, a regular drive path was used:

2018-07-15 23:22:24 -05 - [Warning-Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess-FileAccessError]: Error reported while accessing file: C:\Users\Public\Documents\My Videos\
System.UnauthorizedAccessException: Access to the path 'C:\Users\Public\Documents\My Videos' is denied.
   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
   at System.IO.FileSystemEnumerableIterator`1.CommonInit()
   at System.IO.Directory.GetDirectories(String path)
   at Duplicati.Library.Snapshots.NoSnapshotWindows.ListFolders(String localFolderPath)
   at Duplicati.Library.Utility.Utility.<EnumerateFileSystemEntries>d__23.MoveNext()

However with snapshot-policy=on a GLOBALROOT path was used:

2018-07-16 00:21:59 -05 - [Warning-Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess-FileAccessError]: Error reported while accessing file: C:\Users\Public\Documents\My Videos\
System.UnauthorizedAccessException: (5) Access is denied: [\\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy10\Users\Public\Documents\My Videos\*]
   at Alphaleonis.Win32.NativeError.ThrowException(UInt32 errorCode, String readPath, String writePath)
   at Alphaleonis.Win32.Filesystem.FindFileSystemEntryInfo.FindFirstFile(String pathLp, WIN32_FIND_DATA& win32FindData)
   at Alphaleonis.Win32.Filesystem.FindFileSystemEntryInfo.<Enumerate>d__3`1.MoveNext()
   at System.Linq.Buffer`1..ctor(IEnumerable`1 source)
   at System.Linq.Enumerable.ToArray[TSource](IEnumerable`1 source)
   at Duplicati.Library.Snapshots.WindowsSnapshot.ListFolders(String localFolderPath)
   at Duplicati.Library.Utility.Utility.<EnumerateFileSystemEntries>d__23.MoveNext()


#9

GLOBALROOT paths are ok, they are representing snapshot.
Whats not ok is that if in backup set is D:\software even D:\VirtualMachine is processed (not backup-ed).


#10

I figured that was what was going on, but thanks for confirming it. :slight_smile:

Since the source and filters seem to work correctly with snapshots turned OFF, I’m assuming somewhere along the way we’re not correctly converting between standard paths (as used for Source and filter paths) and the GLOBALROOT paths (as used with VSS snapshots) thus we’re getting the entire drive (possibly ALL drives?) included in the backup.

This seems to have appeared in 2.0.3.9, though I haven’t rolled anything back yet to test if it happens in 2.0.3.8 now that I’ve got a repeatable success/failure scenario.

Edit:
I can confirm the issue does NOT happen in 2.0.3.8.