Ignored folders are not really ignored

Running 2.0.5.104_canary_2020-03-25 on Ubuntu 20.04.

I have a folder in my backup that is marked as being ignored, yet Duplicati keeps on trying to check it.

I tried every kind of rule I could think of, regex, etc. No matter what, it always attempts to read the folder. Since the folder is not accessible, it results in multiple warnings per backup.

Frankly, I’m surprised that this has not been dealt with a long time ago. With NTFS, SMB, or cloud mounted volumes, there are many instances where visible files or directories are not directly accessible.

Here’s the warning I’m getting:

2020-05-11 20:00:00 -07 - [Warning-Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess-FileAccessError]: Error reported while accessing file: /home/erik/pCloudDrive/

It is obviously trying to scan the folder, even though I have -/home/erik/pCloudDrive/ explicitly set.

Thanks,

E.

Edit your backup job and go to the filters on page 3. Click the three-dot menu to the right of the Filters section and select “Edit as Text”. Can you show us exactly what it contains?

I already posted what it contains:

-/home/erik/pCloudDrive/

Ok I saw it in your original post but wanted to make sure it was what shown the same when you “edit as text”

You have no other filters?

Just other ignores:

-/home/erik/Android/
-/home/erik/snap/
-/home/erik/Dropbox/
-/home/erik/Downloads/
-/home/erik/pCloudDrive/
-/home/erik/.AndroidStudio3.6/system/
-/home/erik/.dropbox-dist/
-/home/erik/.filebot/tmp/
-/home/erik/.filebot/cache/
-/home/erik/.dropbox/
-/home/erik/.pcloud/
-/home/erik/Videos/series/
-/home/erik/Videos/movies/
-/home/erik/pCloudDrive/
-/home/erik/.cache/
-/home/erik/.groovy/
-/home/erik/.mozilla/
-/home/erik/.gnome/
-/home/erik/.local/
-/home/erik/.AndroidStudio3.6/
-/home/erik/.xsession-errors.old
-/home/erik/.sudo_as_admin_successful
-/home/erik/.filebot/
-/home/erik/.x2goclient/
-/home/erik/.x2go/
-/home/erik/.android/
-/home/erik/.xsession-errors
-/home/erik/.config/
-/home/erik/.java/fonts/
-{CacheFiles}

I tried it with no other filters at first, and it makes no difference.

I think I read elsewhere on the forum that the rules are applied after the scanning, which would explain it.

I’d like to try and reproduce this. Do you know why Duplicati can’t access this folder? Is it just because of the permissions?

It’s not an actual directory on the disk, it points to a directory on pCloud. Kinda like Dropbox.

What happens if trying that to ignore an ordinary on-disk folder that gives you no access per ls -ld?
I tried a chmod 0 on one, and it was ignored, except for the lstat that’s done to exclude block devices.
This check is not logged, but the filtering is, e.g. at About --> Show log --> Live --> Verbose. Example:

$ ls -l /home/erik
total 8
-rw-r--r-- 1 root root    5 May 13 20:04 ordinary_file
d--------- 2 root root 4096 May 13 20:06 pCloudDrive
$ ls -ld /home/erik/pCloudDrive
d--------- 2 root root 4096 May 13 20:06 /home/erik/pCloudDrive
$ stat /home/erik/pCloudDrive
  File: '/home/erik/pCloudDrive'
  Size: 4096      	Blocks: 8          IO Block: 4096   directory
Device: 801h/2049d	Inode: 1106553     Links: 2
Access: (0000/d---------)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2020-05-13 20:06:51.014037935 -0400
Modify: 2020-05-13 20:06:51.014037935 -0400
Change: 2020-05-13 20:06:51.014037935 -0400
 Birth: -
$ ls /home/erik/pCloudDrive
ls: cannot open directory '/home/erik/pCloudDrive': Permission denied
$ 
From live log of a backup of /home/erik with an
Edit as text filter of -/home/erik/pCloudDrive/
(the live log uses reverse-chronological order)

May 15, 2020 11:42 AM: Backend event: Put - Completed: duplicati-20200515T154249Z.dlist.zip (707 bytes)
May 15, 2020 11:42 AM: Backend event: Put - Started: duplicati-20200515T154249Z.dlist.zip (707 bytes)
May 15, 2020 11:42 AM: Backend event: Put - Completed: duplicati-ib65103ee0d37485db0a94929c2a982f0.dindex.zip (652 bytes)
May 15, 2020 11:42 AM: Backend event: Put - Started: duplicati-ib65103ee0d37485db0a94929c2a982f0.dindex.zip (652 bytes)
May 15, 2020 11:42 AM: Backend event: Put - Completed: duplicati-bce9635ccfddc4e33850f96b2bee5ba4f.dblock.zip (998 bytes)
May 15, 2020 11:42 AM: Backend event: Put - Started: duplicati-bce9635ccfddc4e33850f96b2bee5ba4f.dblock.zip (998 bytes)
May 15, 2020 11:42 AM: New file /home/erik/ordinary_file
May 15, 2020 11:42 AM: Checking file for changes /home/erik/ordinary_file, new: True, timestamp changed: True, size changed: True, metadatachanged: True, 5/14/2020 12:04:42 AM vs 1/1/0001 12:00:00 AM
May 15, 2020 11:42 AM: Adding directory /home/erik/
May 15, 2020 11:42 AM: Including path as no filters matched: /home/erik/ordinary_file
May 15, 2020 11:42 AM: Excluding path due to filter: /home/erik/pCloudDrive/ => (/home/erik/pCloudDrive/)
May 15, 2020 11:42 AM: Including source path: /home/erik/ 

The low-level way to see what’s going on at Linux system call level is with strace. That may be next.
Stepping up a level, knowing your results on the ls and stat tests I showed above might be useful.
Is Duplicati running as root or as the /home/erik user? Please test access with whatever Duplicati is.

The live log can give you a better view of the Duplicati-level error too, by clicking on the line of error.
Here’s the error if I stop filtering out /home/erik/pCloudDrive. After the one-line summary are details:

May 15, 2020 12:00 PM: Error reported while accessing file: /home/erik/pCloudDrive/
{"ClassName":"System.UnauthorizedAccessException","Message":"Access to the path '/home/erik/pCloudDrive' is denied.","Data":null,"InnerException":{"ClassName":"System.IO.IOException","Message":"Permission denied","Data":null,"InnerException":null,"HelpURL":null,"StackTraceString":null,"RemoteStackTraceString":null,"RemoteStackIndex":0,"ExceptionMethod":null,"HResult":13,"Source":null},"HelpURL":null,"StackTraceString":"  at System.IO.Enumeration.FileSystemEnumerator`1[TResult].CreateDirectoryHandle (System.String path, System.Boolean ignoreNotFound) <0x7f31d22b7490 + 0x000bb> in <285579f54af44a2ca048dad6be20e190>:0 \n  at System.IO.Enumeration.FileSystemEnumerator`1[TResult]..ctor (System.String directory, System.IO.EnumerationOptions options) <0x7f31d22b70e0 + 0x00101> in <285579f54af44a2ca048dad6be20e190>:0 \n  at System.IO.Enumeration.FileSystemEnumerable`1+DelegateEnumerator[TResult]..ctor (System.IO.Enumeration.FileSystemEnumerable`1[TResult] enumerable) <0x7f31d22b5740 + 0x00045> in <285579f54af44a2ca048dad6be20e190>:0 \n  at System.IO.Enumeration.FileSystemEnumerable`1[TResult]..ctor (System.String directory, System.IO.Enumeration.FileSystemEnumerable`1+FindTransform[TResult] transform, System.IO.EnumerationOptions options) <0x7f31d22b5480 + 0x000e3> in <285579f54af44a2ca048dad6be20e190>:0 \n  at System.IO.Enumeration.FileSystemEnumerableFactory.UserDirectories (System.String directory, System.String expression, System.IO.EnumerationOptions options) <0x7f31d22b61d0 + 0x00116> in <285579f54af44a2ca048dad6be20e190>:0 \n  at System.IO.Directory.InternalEnumeratePaths (System.String path, System.String searchPattern, System.IO.SearchTarget searchTarget, System.IO.EnumerationOptions options) <0x7f31d2286360 + 0x00084> in <285579f54af44a2ca048dad6be20e190>:0 \n  at System.IO.Directory.GetDirectories (System.String path, System.String searchPattern, System.IO.EnumerationOptions enumerationOptions) <0x7f31d2286160 + 0x00040> in <285579f54af44a2ca048dad6be20e190>:0 \n  at System.IO.Directory.GetDirectories (System.String path) <0x7f31d2286040 + 0x00052> in <285579f54af44a2ca048dad6be20e190>:0 \n  at Duplicati.Library.Snapshots.SnapshotBase.ListFolders (System.String localFolderPath) [0x00000] in <e590a2a7b22945658e3565a2bb0865c2>:0 \n  at Duplicati.Library.Utility.Utility+<EnumerateFileSystemEntries>d__20.MoveNext () [0x00113] in <3a02d8196bff4b0fabc92721476cc6a0>:0 ","RemoteStackTraceString":null,"RemoteStackIndex":0,"ExceptionMethod":null,"HResult":-2147024891,"Source":"mscorlib"}

Is this pCloud Drive for Linux? If not, what is it? That may impact how hard it is for someone to test it.

@ts678 Yes, pCloud for Linux on Ubuntu 20.04.

Some results…

The chmod 0 behavior is the same for me as what you’ve described. Duplicati is running as a service (root).

As far as stat

As the user running pCloud:

erik@havok 11:09 ~
$ stat pCloudDrive/
  File: pCloudDrive/
  Size: 4096      	Blocks: 1          IO Block: 4096   directory
Device: 42h/66d	Inode: 1           Links: 11
Access: (0755/drwxr-xr-x)  Uid: ( 1000/    erik)   Gid: ( 1000/    erik)
Access: 2020-05-08 23:34:26.000000000 -0700
Modify: 2020-05-08 23:34:26.000000000 -0700
Change: 2020-05-08 23:34:26.000000000 -0700
 Birth: -

As root:

root@havok 11:09 ~
# stat /home/erik/pCloudDrive 
stat: cannot stat '/home/erik/pCloudDrive': Permission denied

The log data:

2020-05-15 11:28:37 -07 - [Warning-Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess-FileAccessError]: Error reported while accessing file: /home/erik/pCloudDrive/
System.UnauthorizedAccessException: Access to the path '/home/erik/pCloudDrive' is denied. ---> System.IO.IOException: Permission denied
   --- End of inner exception stack trace ---
  at System.IO.FileStatus.EnsureStatInitialized (System.ReadOnlySpan`1[T] path, System.Boolean continueOnError) [0x00044] in <12b418a7818c4ca0893feeaaf67f1e7f>:0
  at System.IO.FileStatus.GetAttributes (System.ReadOnlySpan`1[T] path, System.ReadOnlySpan`1[T] fileName) [0x00000] in <12b418a7818c4ca0893feeaaf67f1e7f>:0
  at System.IO.FileSystemInfo.get_Attributes () [0x0001c] in <12b418a7818c4ca0893feeaaf67f1e7f>:0
  at System.IO.FileSystem.GetAttributes (System.String fullPath) [0x00009] in <12b418a7818c4ca0893feeaaf67f1e7f>:0
  at System.IO.File.GetAttributes (System.String path) [0x00007] in <12b418a7818c4ca0893feeaaf67f1e7f>:0
  at Duplicati.Library.Snapshots.SnapshotBase.GetAttributes (System.String localPath) [0x00007] in <e590a2a7b22945658e3565a2bb0865c2>:0
  at Duplicati.Library.Utility.Utility+<EnumerateFileSystemEntries>d__20.MoveNext () [0x00144] in <3a02d8196bff4b0fabc92721476cc6a0>:0
2020-05-15 11:28:37 -07 - [Warning-Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess-PathProcessingError]: Failed to process path: /home/erik/pCloudDrive/
System.UnauthorizedAccessException: Permission denied ---> Mono.Unix.UnixIOException: Permission denied [EACCES].
   --- End of inner exception stack trace ---
  at Mono.Unix.UnixMarshal.ThrowExceptionForLastError () [0x00005] in <4ff7dabffd4741b3888f0180ff944c2e>:0
  at Mono.Unix.UnixFileSystemInfo.GetFileSystemEntry (System.String path) [0x0000f] in <4ff7dabffd4741b3888f0180ff944c2e>:0
  at UnixSupport.File.GetFileType (System.String path) [0x00001] in <3a2c307381104a73ba105ad3712d666f>:0
  at Duplicati.Library.Snapshots.NoSnapshotLinux.IsBlockDevice (System.String localPath) [0x00006] in <e590a2a7b22945658e3565a2bb0865c2>:0
  at Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess.AttributeFilter (System.String path, System.IO.FileAttributes attributes, Duplicati.Library.Snapshots.ISnapshotService snapshot, Duplicati.Library.Utility.IFilter sourcefilter, Duplicati.Library.Main.Options+HardlinkStrategy hardlinkPolicy, Duplicati.Library.Main.Options+SymlinkStrategy symlinkPolicy, System.Collections.Generic.Dictionary`2[TKey,TValue] hardlinkmap, System.IO.FileAttributes fileAttributes, Duplicati.Library.Utility.IFilter enumeratefilter, System.String[] ignorenames, System.Collections.Generic.Queue`1[T] mixinqueue) [0x00000] in <d35eee4c56ca4ec6b30033967e3dac7d>:0
2020-05-15 11:28:38 -07 - [Warning-Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess-FileAccessError]: Error reported while accessing file: /home/erik/pCloudDrive/
System.UnauthorizedAccessException: Access to the path '/home/erik/pCloudDrive' is denied. ---> System.IO.IOException: Permission denied
   --- End of inner exception stack trace ---
  at System.IO.FileStatus.EnsureStatInitialized (System.ReadOnlySpan`1[T] path, System.Boolean continueOnError) [0x00044] in <12b418a7818c4ca0893feeaaf67f1e7f>:0
  at System.IO.FileStatus.GetAttributes (System.ReadOnlySpan`1[T] path, System.ReadOnlySpan`1[T] fileName) [0x00000] in <12b418a7818c4ca0893feeaaf67f1e7f>:0
  at System.IO.FileSystemInfo.get_Attributes () [0x0001c] in <12b418a7818c4ca0893feeaaf67f1e7f>:0
  at System.IO.FileSystem.GetAttributes (System.String fullPath) [0x00009] in <12b418a7818c4ca0893feeaaf67f1e7f>:0
  at System.IO.File.GetAttributes (System.String path) [0x00007] in <12b418a7818c4ca0893feeaaf67f1e7f>:0
  at Duplicati.Library.Snapshots.SnapshotBase.GetAttributes (System.String localPath) [0x00007] in <e590a2a7b22945658e3565a2bb0865c2>:0
  at Duplicati.Library.Utility.Utility+<EnumerateFileSystemEntries>d__20.MoveNext () [0x00144] in <3a02d8196bff4b0fabc92721476cc6a0>:0
2020-05-15 11:28:38 -07 - [Warning-Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess-PathProcessingError]: Failed to process path: /home/erik/pCloudDrive/
System.UnauthorizedAccessException: Permission denied ---> Mono.Unix.UnixIOException: Permission denied [EACCES].
   --- End of inner exception stack trace ---
  at Mono.Unix.UnixMarshal.ThrowExceptionForLastError () [0x00005] in <4ff7dabffd4741b3888f0180ff944c2e>:0
  at Mono.Unix.UnixFileSystemInfo.GetFileSystemEntry (System.String path) [0x0000f] in <4ff7dabffd4741b3888f0180ff944c2e>:0
  at UnixSupport.File.GetFileType (System.String path) [0x00001] in <3a2c307381104a73ba105ad3712d666f>:0
  at Duplicati.Library.Snapshots.NoSnapshotLinux.IsBlockDevice (System.String localPath) [0x00006] in <e590a2a7b22945658e3565a2bb0865c2>:0
  at Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess.AttributeFilter (System.String path, System.IO.FileAttributes attributes, Duplicati.Library.Snapshots.ISnapshotService snapshot, Duplicati.Library.Utility.IFilter sourcefilter, Duplicati.Library.Main.Options+HardlinkStrategy hardlinkPolicy, Duplicati.Library.Main.Options+SymlinkStrategy symlinkPolicy, System.Collections.Generic.Dictionary`2[TKey,TValue] hardlinkmap, System.IO.FileAttributes fileAttributes, Duplicati.Library.Utility.IFilter enumeratefilter, System.String[] ignorenames, System.Collections.Generic.Queue`1[T] mixinqueue) [0x00000] in <d35eee4c56ca4ec6b30033967e3dac7d>:0

One thing I didn’t mention before is that it’s trying to access the folder 4 times, so I get 4 warnings per backups.

Let me know if there’s anything else I can try out.

Thanks,

E.

The problem appears to come from pCloudDrive (and possibly other FUSE file systems) not following:

stat, fstat, lstat, fstatat - get file status

       These  functions return information about a file, in the buffer pointed to by statbuf.  No
       permissions are required on the file itself, but—in the case  of  stat(),  fstatat(),  and
       lstat()—execute (search) permission is required on all of the directories in pathname that
       lead to the file.

and I’m pretty sure an lstat done by the IsBlockDevice is where the error is, and this is before the filter is applied. I’m not certain the order has to be that way, but it’s been that way for a very long time.

What’s interesting is that many Linux systems do have a FUSE file system, specifically GVfs, creating /run/user/1000/gvfs for my user 1000. This denies root access, and one can see the error exactly:

# ls --color=never /run/user/1000
dconf  gvfs  keyring  pulse  systemd
# ls /run/user/1000
ls: cannot access '/run/user/1000/gvfs': Permission denied
dconf  gvfs  keyring  pulse  systemd
# stat /run/user/1000/gvfs
stat: cannot stat '/run/user/1000/gvfs': Permission denied
# strace -f -e trace=file -P /run/user/1000/gvfs stat /run/user/1000/gvfs
lstat("/run/user/1000/gvfs", 0x7ffe62aa0440) = -1 EACCES (Permission denied)
stat: cannot stat '/run/user/1000/gvfs': Permission denied
+++ exited with 1 +++
# 

User 1000 has an easy time, although there’s not really anything to see (possibly unlike pCloudDrive):

$ ls -ln /run/user/1000
total 0
drwx------ 2 1000 1000  60 Mar 29 19:36 dconf
dr-x------ 2 1000 1000   0 Feb 29 20:06 gvfs
drwx------ 2 1000 1000 100 Feb 29 20:06 keyring
drwx------ 2 1000 1000  80 Feb 29 20:06 pulse
drwxr-xr-x 2 1000 1000  80 Feb 29 20:06 systemd
$ ls -l /run/user/1000/gvfs
total 0
$ 

pCloud has a console client at https://github.com/pcloudcom/console-client which has an option for

-m [ --mountpoint ] arg Mount point where drive to be mounted.

Does pCloudDrive provide a similar option? That would allow the pCloudDrive folder to be below a folder on disk whose purpose was to allow filter at that level, where file checking by root is allowed.

Alternatively, if Duplicati is run as you rather than as root, it will probably be able to filter “as usual”.

Alternatively, it’s possible to grant root access to FUSE filesystems but it needs application support.
fuse - format and options for the fuse file systems describes the allow_root and allow_other options.

Maybe pCloud support could answer these questions. All you need is enough root access to lstat. Further access to files as root is not needed, I assume, because you’re trying to exclude that folder.

@ts678 Nice detective work. Thanks.

As far as I can tell the pCloudDrive directory is not relocatable, that’s the first thing I looked at. Unfortunately, the pCloud CLI client does not provide some of the functionality that I need.

Running Duplicati as me is not an option either. I need it to backup root-only accessible directories too. I could run two instances, but I’d have to split the backups, which is not ideal either.

I understand the historic aspect of how it works internally, but I have to ask myself how efficient that is. If Duplicati is ignoring filters when scanning directories/files (as indicated elsewhere on the forum), it is potentially scanning thousands of items that it does not need to.

I think we’ve entered “not ideal” territory. If ideal, lstat would work as Linux documents it should work.
Duplicati volunteer levels are also far from ideal, and the focus is on bugs that have the most impact.

Somewhat surprisingly, this is the first forum or GitHub issues report like this I find (feel free to look).
Although pCloudDrive is possibly rather rare, GVfs is not. Maybe people filtered it at a higher folder?

What’s also not ideal is getting too risky about breaking block devices which tend to be very precious:

Got a link? I think it’s wrong, otherwise the normal-subholder workaround for FUSE bug wouldn’t work because the theory says (right?) that it would keep going down, and eventually fail lstat. But it doesn’t, tested with -/run/user/1000/ where it’s also very clear in –console-log-level=Verbose log where it is:

2020-05-17 08:17:26 -04 - [Information-Duplicati.Library.Main.Controller-StartingOperation]: The operation Backup has started
2020-05-17 08:17:26 -04 - [Verbose-Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess-IncludingSourcePath]: Including source path: /run/user/
2020-05-17 08:17:26 -04 - [Verbose-Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess-ExcludingPathFromFilter]: Excluding path due to filter: /run/user/1000/ => (/run/user/1000/)
2020-05-17 08:17:26 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Started:  ()
2020-05-17 08:17:26 -04 - [Information-Duplicati.Library.Main.BasicResults-BackendEvent]: Backend event: List - Completed:  (13 bytes)
2020-05-17 08:17:26 -04 - [Verbose-Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess-IncludingSourcePath]: Including source path: /run/user/
2020-05-17 08:17:26 -04 - [Verbose-Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess-ExcludingPathFromFilter]: Excluding path due to filter: /run/user/1000/ => (/run/user/1000/)
2020-05-17 08:17:26 -04 - [Verbose-Duplicati.Library.Main.Operation.Backup.MetadataPreProcess.FileEntry-AddDirectory]: Adding directory /run/user/

With a more normal folder, you should easily be able to examine for yourself whether this claim is true:

When I tested that with Duplicati as root, backing up /home and excluding the main user folder, it said:

Excluding path due to filter

at the excluded folder and went no further down. BTW the descent likely uses stat to recognize folders.

Filters which I consider more authoritative than a random forum post (though still far from perfect) says:

Duplicati’s filter engine processes folders first then files. The reason for that behavior is performance. If a folder is excluded from a backup, the files inside that folder don’t have to be processed anymore.