Filters not working

Shane · March 12, 2021, 4:00am

I'm on a Windows VM. I manually add folders to the filter using the text editor.

[Screenshot: duplicati filters]

but looking at the config the folder names are incorrect.

"Filters": [
  {
    "Order": 0,
    "Include": false,
    "Expression": "C:\\vzquota\\"
  },
  {
    "Order": 1,
    "Include": false,
    "Expression": "C:\\Sphinx\\"
  }
]

Any thoughts?

Thanks

Shane

ts678 · March 15, 2021, 2:52pm

How so, and according to what?

JSON String Escape / Unescape may be helpful here.
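As a rough sketch of what unescaping does, the fragment from the original post can be parsed with any JSON library. Python is used here purely for illustration, and the surrounding braces are an assumption added so the fragment parses on its own; this is not how Duplicati itself reads its configuration.

```python
import json

# The fragment from the post, wrapped in braces (an assumption) so it
# parses as a standalone JSON object.
config_text = r'''
{
  "Filters": [
    {"Order": 0, "Include": false, "Expression": "C:\\vzquota\\"},
    {"Order": 1, "Include": false, "Expression": "C:\\Sphinx\\"}
  ]
}
'''

config = json.loads(config_text)
expressions = [f["Expression"] for f in config["Filters"]]

# Each \\ in the JSON text decodes to a single backslash in the value.
for expression in expressions:
    print(expression)
```

The printed values contain single backslashes, which suggests the doubled backslashes in the exported config are only JSON escaping and not part of the folder names.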

Shane · March 15, 2021, 10:15pm

Hmm, I thought the double backslash was invalid, but it seems to be OK. Any idea why the folders are still being backed up?

Thanks

Shane

ts678 · March 16, 2021, 1:43pm

Are there any include (+) lines in the filter that can include them?
Mixing includes and excludes in a set of filters can get confusing.
The TEST-FILTERS command may be helpful to explore filtering.
Note that in Windows Command Prompt, paths may need to go inside double quotes, and any trailing backslash must be doubled to be read as a backslash. Otherwise, it escapes the closing quote that follows it, and that quote is taken as a literal quote character.
Although the GUI dropdown’s filter-builder is buggy in 2.0.5.1, you could click-exclude folders in the tree.
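The quoting rule for trailing backslashes can be sketched as a small helper. This assumes the usual Windows convention in which backslashes directly before a double quote act as escape characters; `quote_for_cmd` is a hypothetical name, and it only handles plain paths with no embedded quotes.

```python
def quote_for_cmd(path: str) -> str:
    """Wrap a Windows path in double quotes, doubling trailing backslashes.

    Hypothetical helper for illustration: handles plain paths only
    (no embedded double quotes).
    """
    stripped = path.rstrip("\\")
    trailing = len(path) - len(stripped)
    # Double the trailing backslashes so the closing quote is not escaped.
    return '"' + stripped + "\\" * (2 * trailing) + '"'


print(quote_for_cmd("C:\\backup source\\exclude test\\"))
print(quote_for_cmd("C:\\vzquota"))
```

Backslashes in the middle of the path are left alone; only the ones that sit right before the closing quote need doubling.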

Danny_Pike · March 18, 2021, 9:50am

I’m new to Duplicati but I had endless problems with filtering named directories. In the end, I gave up and swapped to using regular expressions to match against the full path names, using wildcards for the filename part.

That seemed to go a lot better, here’s one of my filters:

+[D:\\root\\[^\\]*$]
+[D:\\root\\((addresses)|(Administration)|(info)|(personal)|(serial)|(voicemail))\\.*$]

This makes Duplicati back up all the files in the “d:\root” folder itself and all of the files in the sub-folders “addresses”, “Administration”, “info”, “personal”, “serial” and “voicemail”.

It ignores all of the other sub-folders in “d:\root” on the machine (there are 22 of them).

Hope that helps,

Dan
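The two patterns above can be tried outside Duplicati before putting them in a job. The sketch below uses Python's `re` module as a stand-in, which is an assumption: Duplicati runs on .NET, and although these particular patterns use syntax common to both engines, details such as case sensitivity may differ. The sample paths are made up.

```python
import re

# The two include patterns from the post, without the +[ ] filter wrapper.
INCLUDE_PATTERNS = [
    r"D:\\root\\[^\\]*$",
    r"D:\\root\\((addresses)|(Administration)|(info)|(personal)|(serial)|(voicemail))\\.*$",
]


def is_included(path: str) -> bool:
    """Return True if any include pattern matches from the start of the path."""
    return any(re.match(pattern, path) for pattern in INCLUDE_PATTERNS)


# Made-up sample paths, for illustration only.
samples = [
    r"D:\root\notes.txt",             # file directly in D:\root
    r"D:\root\info\2021\report.txt",  # file under a listed sub-folder
    r"D:\root\other\data.bin",        # file under an unlisted sub-folder
]
for sample in samples:
    print(sample, "->", is_included(sample))
```

The first two samples match and the third does not, which is consistent with the behaviour described in the post.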

ts678 · March 18, 2021, 1:34pm

Welcome to the forum @Danny_Pike and thanks for your help. The filters can definitely be hard to write.

Another note I’ll mention is that in 2.0.5.1 the GUI filter builder puts out some bad ones, so using the three-dot Edit as text option and reading Filters may help, but I think everyone here so far is using text edit.

Another caveat is that people who are used to how wildcard * works in something like a Linux shell might be surprised to find that filter * (or the regular expression equivalent) doesn’t stop at slash as they expect.
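A comparable difference exists in Python, which makes for a convenient illustration. This is only an analogy and says nothing about how Duplicati implements its filters: Python's `fnmatch` also lets `*` run across path separators, while a shell-style `*` behaves more like the regular expression `[^/]*`. The path is made up.

```python
import fnmatch
import re

path = "/home/user/docs/sub/file.txt"

# fnmatch's * is not separator-aware, so it spans "docs/sub/file" here.
spans_slashes = fnmatch.fnmatchcase(path, "/home/user/*.txt")

# A shell-like * stops at the separator, written here as [^/]*.
stops_at_slash = re.fullmatch(r"/home/user/[^/]*\.txt", path) is not None

print(spans_slashes, stops_at_slash)
```

A filter written with the shell behaviour in mind can therefore match far more than intended when `*` crosses folder boundaries.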

Having said that, I’m having success with test-filters like in original post, so would encourage testing. Possibly knowing the rest of the context (what’s being backed up, are there any include filters?) may help.

Although it may be easier to use test-filters, one can also use logging at Verbose level to see backup filtering decisions. Look for output with Including and Excluding. Here’s an example of a folder exclude:

2021-03-16 20:03:31 -04 - [Verbose-Duplicati.Library.Main.Operation.Backup.FileEnumerationProcess-ExcludingPathFromFilter]: Excluding path due to filter: C:\backup source\exclude test\sub2\ => (@C:\backup source\exclude test\sub2\)

That was from a log-file with suitable log-file-log-level, but one might also watch About → Show log → Live.
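For a long log file, lines like the one above could be pulled out with a short script. This is a sketch that assumes every exclusion line has the same shape as that single example; `parse_exclusion` is a hypothetical helper, and other message formats would need their own patterns.

```python
import re

# Pattern modelled on the one example line above (an assumption about the
# general format, not a documented log specification).
EXCLUDE_RE = re.compile(
    r"-ExcludingPathFromFilter\]: Excluding path due to filter: "
    r"(?P<path>.*?) => \((?P<filter>.*)\)$"
)

sample_line = (
    r"2021-03-16 20:03:31 -04 - [Verbose-Duplicati.Library.Main.Operation."
    r"Backup.FileEnumerationProcess-ExcludingPathFromFilter]: Excluding path "
    r"due to filter: C:\backup source\exclude test\sub2\ => "
    r"(@C:\backup source\exclude test\sub2\)"
)


def parse_exclusion(line: str):
    """Return (path, filter) for an exclusion log line, or None otherwise."""
    match = EXCLUDE_RE.search(line)
    if match is None:
        return None
    return match.group("path"), match.group("filter")


print(parse_exclusion(sample_line))
```

Running the same function over each line of a log file would give a list of excluded paths together with the filter that excluded them.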

Danny_Pike · March 18, 2021, 4:25pm

Thank you. Getting file name handling working properly is not as trivial as it may seem at first sight. I think that Duplicati is missing some “degrees of freedom” in the way that it attempts to do it. This causes it to get in a mess because it becomes impossible to distinguish between equally valid but logically different situations.

For example, I would separate the concept of the whole path from a set of sub-folders, because sometimes I want to filter on the whole thing and other times I would like to filter on a subtree.

I would like Duplicati to allow me to say whether the file name should be case-sensitive (Unix) or insensitive (Windows) when matching. That also should apply to the Restore window. It is confusing to me that two files can be listed far apart when, in fact, they should be listed next to each other, simply because one begins with an upper-case letter and the other with a lower-case letter. Duplicati is correct in that it must always use the case exactly as it is supplied by the operating system but, for Windows users, the list should be case-insensitive.

This occasionally causes problems for me because, in my world, the backup destination is a Unix file system and the Source is a Windows file system. So there is room for a lot of errors when the same file can end up having two names by accident and it’s far from obvious when those two files are listed a long way apart in the Restore window. Duplicati can’t fix the underlying problem (it’s obviously my error) but it could make my life a lot easier by listing the two files right next to each other. I can then fix the underlying problem myself (case-sensitivity typos in two different scripts, for example) and ensure that I restore the one that I really want (typically the most recent) and delete the other one from the backup image.
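The effect of case on list order can be shown with a few made-up file names. This sketch compares a plain code-point sort with a case-folded one in Python; it is not Duplicati's sorting code, and it ignores culture-specific collation rules.

```python
# Made-up file names that differ mainly in the case of the first letter.
names = ["Zebra.txt", "apple.txt", "Apple.txt", "banana.txt"]

# Code-point order: all upper-case initials sort before lower-case ones.
ordinal = sorted(names)

# Case-folded order: names that differ only in case end up adjacent.
folded = sorted(names, key=str.casefold)

print(ordinal)
print(folded)
```

In the first list "Apple.txt" and "apple.txt" are separated by "Zebra.txt"; in the second they sit next to each other.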

Another thing that I noticed is that Duplicati’s regular expression matcher appears to match against the whole path. I think that this is wrong. If I give a regular expression, it should match any substring in the path. If I wanted to match the whole path, then I would have to include the ‘^’ and ‘$’ tokens, as appropriate.
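The two behaviours can be compared in Python, where `re.fullmatch` requires the pattern to cover the whole string and `re.search` accepts a match anywhere. This only illustrates the distinction; it is not a statement about Duplicati's matcher, and the path is made up.

```python
import re

path = r"D:\root\info\report.txt"
pattern = r"\\info\\"

# Whole-path semantics: the pattern must cover the entire string.
whole_path = re.fullmatch(pattern, path) is not None

# Substring semantics: the pattern may match anywhere in the string.
substring = re.search(pattern, path) is not None

# With substring semantics, ^ and $ restore whole-path matching on demand.
anchored = re.search(r"^D:\\root\\info\\.*$", path) is not None

print(whole_path, substring, anchored)
```

Under whole-path semantics the short pattern fails, so it has to be padded with `.*` on both sides to get the same effect as a substring match.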

This is only the start on a huge and complex topic, but this reply is already very long. The above ones are those that stuck in my mind as being the most confusing when I first tried to get it to work.

Perhaps you could start a whole topic / test suite for matching filters and I’d be happy to contribute to that? I have a little bit of experience in coding this sort of thing over the years (and not a little bit of OCD for supporting all of the various possibilities!)

ts678 · March 19, 2021, 12:49am

Sort file and folder listing in web UI #4150 forces a sort order, whereas it seems it used to be whatever the underlying API did. Whether or not it’s what you like is TBD. Windows File Explorer keeps upper and lower cases together, and that’s what I’m seeing at least one Duplicati screen doing. It’s in Canary and next Beta.

I don’t know what you mean. Anyone can start a forum topic. How exactly would tests get run, and by who?

I would love it if somebody could enhance the Filters article and Creating a new backup job which has only brief information. It would be good to get a bit more in-depth and tie the two sections together to help users.

A problem is that I’m not a filters expert either, so somebody may have to figure out how some things work. Some of the hints that I put in this topic could be verified, then put somewhere in the formal documentation.

Anyone can also submit enhancement requests to GitHub Issues or Forum Features category for visibility. Actually getting enhancements done requires an available volunteer who can do it, and those are scarce…

EDIT:
https://github.com/duplicati/duplicati/blob/master/Duplicati/UnitTest/FilterTest.cs and other tests exist, and I believe they are run automatically when code change requests go in. Would you like to work on improving the tests?

warwickmm · March 19, 2021, 4:36am

Different users may have different expectations. I believe the sort order in the Restore window is respecting culture-sensitive rules defined by the current culture of your system. For me, on Linux with Duplicati 2.0.5.114 and mono 6.12.0, the sort order for en-US appears to be case-insensitive.

Danny_Pike · March 19, 2021, 5:34am

Your points are reasonable but please be assured that I’m not trying to criticise Duplicati for the “wrong” reasons. It’s easily the best backup system that I have found since Crashplan made the mistake of trying to force 2FA on me (that so-called “security” system is a real PITA to use).

I’m very new here - I’ve been using Duplicati for less than a week. So my experience is that of someone who is very familiar with Windows but not at all familiar with Duplicati. And my comments should be taken with that in mind. We needed a backup system quickly, so I’m in the middle of a research-review cycle (Duplicati is, so far, winning hands-down).

As for setting up a topic myself, and helping out with documentation, coding etc, I’d be very happy to help once I know how it all works. That takes a little bit more time than I have yet had to spend using it. Give me a month or two of living with it, and I’ll see what I can help out with.

Danny_Pike · March 19, 2021, 5:47am

Yes, absolutely. Creating a mechanism that works for both Unix and Windows is most definitely not trivial.

As a coder, I like to have fine control, with loads of options to choose between sorting methods, but I know that annoys a lot of people. The best technique seems to be to make it use the same config as the operating system and allow the user to override that. But, as I said above, I’m still too new to Duplicati to state that that would be the right way in this case. I was only saying how it felt to this particular “new user”.