Bug Regular Expression Filter operator '|' (or) does not work inside parenthesis ()

(x|y) should match x OR y. The ‘|’ does not work in Duplicati on Ubuntu (I run the linuxserver Docker image).

According to the docs, regular expressions should follow the .NET syntax:
https://docs.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-language-quick-reference

example with
alias duplicati='docker exec -it duplicati mono /app/duplicati/Duplicati.CommandLine.exe'

This excludes everything: duplicati test-filters /mnt/ --exclude='[/(m)nt/.*]'

This excludes nothing: duplicati test-filters /mnt/ --exclude='[/(m|)nt/.*]'
nor does this: duplicati test-filters /mnt/ --exclude='[/(m|aaa)nt/.*]'

Shouldn’t that work?

I have two layers of (x|y|z) and (a|b|c) in my expression so that will explode to 9 cases which is a pain.
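As a stopgap while the alternation is being mangled, the nine literal variants could be generated rather than typed by hand. This is only a minimal sketch: the folder names (x, y, z, a, b, c) and the /mnt/<first>/<second>/ layout are hypothetical placeholders, and only the --exclude=[...] regex filter form is taken from the commands above.

```python
from itertools import product

def expand_filters(first, second):
    """Build one regex filter per combination so that no '|' is needed.

    The /mnt/<first>/<second>/ path layout is a hypothetical example.
    """
    return [f"--exclude=[/mnt/{a}/{b}/.*]" for a, b in product(first, second)]

filters = expand_filters(["x", "y", "z"], ["a", "b", "c"])
for f in filters:
    print(f)
```

Names containing regex metacharacters would need escaping before being placed in the expression.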

Best Alex

PS: Edit after ts678’s response: The * is missing in the last two examples, but it’s there in the raw post. I guess the forum also messes with regexps to confuse the user :slight_smile:
PPS: Now with proper markdown (thx kenkendk)

Welcome to the forum @arberg

I think you dropped an essential * at the end of the RE in the last two, but even after a fix, it’s acting odd.
Here’s a test I made by touch of three files, one with a vertical bar embedded in it. Notice how it found it:

$ ls -ln /tmp/sub
total 0
-rw-r--r-- 1 1000 1000 0 Sep  4 21:09 1
-rw-r--r-- 1 1000 1000 0 Sep  4 21:09 1|2
-rw-r--r-- 1 1000 1000 0 Sep  4 21:09 2
$ duplicati-cli test-filters /tmp/sub --exclude='[.*(1|2).*]'
Including source path: /tmp/sub/
Including path as no filters matched: /tmp/sub/1
Including path as no filters matched: /tmp/sub/2
Excluding path due to filter: /tmp/sub/1|2 => ([.*(1\|2).*])
Matched 2 files (0 bytes)
$ duplicati-cli test-filters /tmp/sub --include='[.*(1|2).*]'
Including source path: /tmp/sub/
Excluding path due to filter: /tmp/sub/1 => null
Excluding path due to filter: /tmp/sub/2 => null
Including path due to filter: /tmp/sub/1|2 => ([.*(1\|2).*]) || (*/)
Matched 1 files (0 bytes)
$ 

Windows doesn’t backslash-escape the vertical bar. I wonder if something is trying to avoid Linux pipes?
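The transcript above is consistent with the vertical bar being treated as a literal character once it is backslash-escaped. A small sketch of that difference, using Python's re module as a stand-in for the .NET engine (an assumption; the two agree on these simple constructs) and assuming the expression has to match the whole name:

```python
import re

names = ["1", "1|2", "2"]

intended = re.compile(r".*(1|2).*")   # as typed on the command line
escaped = re.compile(r".*(1\|2).*")   # as reported in the filter output

def matching(pattern):
    """Return the names that the pattern matches in full."""
    return [n for n in names if pattern.fullmatch(n)]

print("intended:", matching(intended))
print("escaped: ", matching(escaped))
```

With the escaped form, only the file whose name contains a literal vertical bar matches, which is the behaviour seen in the test-filters output.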

That is the Markdown that eats it. Best way to fix it is to use triple backticks to format a code block.

Ah, thank you, that is a likely problem, that something messes with the pipe char.

I did a test that tells me you are right:

docker exec -it duplicati mono /app/duplicati/Duplicati.CommandLine.exe test-filters

/mnt/user/media/TestDuplicati --exclude='[/mnt/user/media/TestDuplicati/(Test1\|Test2)/.*]'
Including source path: /mnt/user/media/TestDuplicati/
Excluding path due to filter: /mnt/user/media/TestDuplicati/Test2/ => ([/mnt/user/media/TestDuplicati/(Test1\\|Test2)/.*])

The input filter has only one backslash on the pipe \| but the output has two \\|

I tried directly in the UI, and there the | worked.

I have created an issue on GitHub.

Docs
I have updated help here in section Syntax, plus added an advanced section How do filters work? · duplicati/duplicati Wiki · GitHub

I think the info under Syntax in the section ‘Folder names always end with a slash’ is incorrect. Are they talking about the source set or filters? It’s certainly not relevant for regular expressions.

I rarely use filters, and may be wrong, but the following indicates slash is relevant for regular expressions:

The best rule for “but nothing else” is a regular expression that excludes all files. It is -[.*[^/]] on Linux or Mac, and on Windows the rule is -[.*[^\\]]. The rule says “exclude everything that is not a folder”.
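The quoted Linux/Mac rule can be checked against a few paths. This is a sketch only: the paths are made up, Python's re stands in for the .NET engine, and it assumes the expression must match the entire path (which is why fullmatch is used).

```python
import re

# Linux/Mac form of the rule quoted above: "everything that is not a folder".
exclude_files = re.compile(r".*[^/]")

# Hypothetical paths; folders carry a trailing slash, files do not.
paths = ["/mnt/data/", "/mnt/data/file.txt", "/mnt/data/sub/"]

excluded = [p for p in paths if exclude_files.fullmatch(p)]
kept = [p for p in paths if not exclude_files.fullmatch(p)]

print("excluded:", excluded)
print("kept:    ", kept)
```

Under that assumption the trailing slash is what separates folders from files, so it does matter for regular expressions.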

Being somewhat more familiar with internals such as the database File table, the Path there follows this rule, with a trailing slash (in the proper direction for the OS) only applied if it’s a folder, so filter rule syntax might be using the same mechanism to distinguish even if you don’t see the slash.

I have a good understanding of how the filters work in Duplicati by now, though it took some time, especially because I suck at taking the few minutes to actually read the docs properly :slight_smile: I wrote the Advanced section under filters help because it gives a much better understanding (in my view) of how the filtering works in Duplicati, and it also illustrates exactly what you pointed to by excluding all remaining files. Well, almost, because I exclude all files in top dirs.

I also just added a link in the docs to an online regexp builder, which helps enormously when RegExp semantics and my brain are out of sync.

Can you try adding --debug-output to the commandline? This should show what input reaches Duplicati, so we know if the substitution happens before Duplicati is invoked.

Backslash is early and generic. --parameters-file looked for the backslashed version of the file name, and --log-file wrote to one. The backslash showed up both when passing a vertical bar in an option and in the non-option portion of a command. Finally, I did the test below, which appears to suggest that the backslash is added when starting the child Duplicati:

$ duplicati-cli help '1|2'         
Topic not found: 1\|2


See duplicati.commandline.exe help <topic> for more information.
  General: example, changelog
  Commands: backup, find, restore, delete, compact, test, compare, purge, vacuum
  Repair: repair, affected, list-broken-files, purge-broken-files
  Debug: debug, logging, create-report, test-filters, system-info, send-mail
  Targets: tahoe, amzcd, aftp, hubic, googledrive, gcs, rclone, jottacloud, mega, ftp, s3, openstack, b2, cloudfiles, webdav, dropbox, azure, od4b, mssp, box, file, ssh, msgroup, onedrive, onedrivev2, sharepoint, sia
  Modules: aes, gpg, zip, 7z, console-password-input, mssql-options, hyperv-options, http-options, sendhttp, sendmail, runscript, sendxmpp, check-mono-ssl
  Formats: date, time, size, encryption, compression
  Advanced: mail, advanced, returncodes, filter, filter-groups, <option>

http://www.duplicati.com/              Version:  - 2.0.4.5_beta_2018-11-28




$ AUTOUPDATER_Duplicati_SKIP_UPDATE=true duplicati-cli help '1|2'
Topic not found: 1|2


See duplicati.commandline.exe help <topic> for more information.
  General: example, changelog
  Commands: backup, find, restore, delete, compact, test, compare, purge, vacuum
  Repair: repair, affected, list-broken-files, purge-broken-files
  Debug: debug, logging, create-report, test-filters, system-info, send-mail
  Targets: tahoe, amzcd, aftp, hubic, googledrive, gcs, rclone, jottacloud, mega, ftp, s3, openstack, b2, cloudfiles, webdav, dropbox, azure, od4b, mssp, box, file, ssh, msgroup, onedrive, onedrivev2, sharepoint, sia
  Modules: aes, gpg, zip, 7z, console-password-input, mssql-options, hyperv-options, http-options, sendhttp, sendmail, runscript, sendxmpp, check-mono-ssl
  Formats: date, time, size, encryption, compression
  Advanced: mail, advanced, returncodes, filter, filter-groups, <option>

http://www.duplicati.com/              Version:  - 2.0.4.5_beta_2018-11-28




$ 

EDIT: Maybe a sequence like the one below is happening. I don’t have a Linux debugger, so I can’t test this well:

EDIT 2:

The correct handling isn’t obvious. The escaping was added as a bug fix. Right parties can look at these:

Fixed the updater to use the argument escaping code from utilities.

Autoupdater process spawning error arguments #2891

I tried --debug-output, but it seems it doesn’t include the filters.

$ docker exec duplicati mono /app/duplicati/Duplicati.CommandLine.exe test-filters /mnt/user/media/TestDuplicati/A/AA --exclude='[/mnt/user/media/TestDuplicati/A]'  --debug-output
Input command: test-filters
Input arguments:
        /mnt/user/media/TestDuplicati/A/AA

Input options:
debug-output:

Including source path: /mnt/user/media/TestDuplicati/A/AA/
Including path as no filters matched: /mnt/user/media/TestDuplicati/A/AA/.disk3.md5
Including path as no filters matched: /mnt/user/media/TestDuplicati/A/AA/.user.md5
Including path as no filters matched: /mnt/user/media/TestDuplicati/A/AA/@Minecraft Install.lnk
Matched 3 files (868 bytes)

Ah, interesting with the --parameters-file. However, I couldn’t get it to work for me; none of my magic incantations below worked (the cat shows that the file is there and readable from within the Docker container):

alex@Tower: /mnt/user/home/alex/duplicati
$  docker exec duplicati mono /app/duplicati/Duplicati.CommandLine.exe test-filters --parameters-file=/mnt/user/home/alex/duplicati/testFilterParamFile /mnt/user/media/TestDuplicati/A
Including source path: /mnt/user/media/TestDuplicati/A/
Including path as no filters matched: /mnt/user/media/TestDuplicati/A/AA/
Including path as no filters matched: /mnt/user/media/TestDuplicati/A/AB/
Including path as no filters matched: /mnt/user/media/TestDuplicati/A/AB/.disk3.md5
Including path as no filters matched: /mnt/user/media/TestDuplicati/A/AB/.user.md5
Including path as no filters matched: /mnt/user/media/TestDuplicati/A/AB/@Minecraft Install.lnk
Including path as no filters matched: /mnt/user/media/TestDuplicati/A/AA/.disk3.md5
Including path as no filters matched: /mnt/user/media/TestDuplicati/A/AA/.user.md5
Including path as no filters matched: /mnt/user/media/TestDuplicati/A/AA/@Minecraft Install.lnk
Matched 6 files (1.70 KB)

$  docker exec duplicati cat /mnt/user/home/alex/duplicati/testFilterParamFile
--source=/mnt/user/media/TestDuplicati/
--exclude='[/mnt/user/media/TestDuplicati/(A|Test1)/]'alex@Tower: /mnt/user/home/alex/duplicati

$  docker exec duplicati mono /app/duplicati/Duplicati.CommandLine.exe test-filters --parameters-file=/mnt/user/home/alex/duplicati/testFilterParamFile
No source paths given

$  docker exec duplicati mono /app/duplicati/Duplicati.CommandLine.exe --parameters-file=/mnt/user/home/alex/duplicati/testFilterParamFile test-filters
No source paths given

also --log-file=/mnt/user/home/alex/duplicati/dup.log gives me an empty log file.

I didn’t quite understand your post. Maybe you just referred to a previous issue; I thought I could get a log.

My --parameters-file test and --log-file tests were just diagnostics to see what file was used. Testing with --parameters-file would show which file (backslashed or not) would be used for input, --log-file for output.

Actually, using --parameters-file to supply the filter works, but you don’t quote, and you need a source path:

$ cat no_mnt
--exclude=[/(m|)nt/.*]
$ duplicati-cli test-filters /mnt/ --parameters-file=no_mnt
Including source path: /mnt/
Excluding path due to filter: /mnt/root-filesystem/ => ([/(m|)nt/.*])
Excluding path due to filter: /mnt/Shared_with_VM/ => ([/(m|)nt/.*])
Excluding path due to filter: /mnt/ramdisk/ => ([/(m|)nt/.*])
Excluding path due to filter: /mnt/dev-sda1/ => ([/(m|)nt/.*])
Excluding path due to filter: /mnt/tmp/ => ([/(m|)nt/.*])
Matched 0 files (0 bytes)
$ AUTOUPDATER_Duplicati_SKIP_UPDATE=true duplicati-cli test-filters /mnt/ --exclude='[/(m|)nt/.*]'
Including source path: /mnt/
Excluding path due to filter: /mnt/root-filesystem/ => ([/(m|)nt/.*])
Excluding path due to filter: /mnt/Shared_with_VM/ => ([/(m|)nt/.*])
Excluding path due to filter: /mnt/ramdisk/ => ([/(m|)nt/.*])
Excluding path due to filter: /mnt/dev-sda1/ => ([/(m|)nt/.*])
Excluding path due to filter: /mnt/tmp/ => ([/(m|)nt/.*])
Matched 0 files (0 bytes)
$ 
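For scripted setups, the parameters file from the transcript above could be written programmatically. This is a rough sketch: write_parameters_file is a hypothetical helper, and the only format assumptions are the ones visible in the thread (one option per line, no shell quotes around the value).

```python
import tempfile
from pathlib import Path

def write_parameters_file(path, options):
    """Hypothetical helper: write one option per line, without shell quoting."""
    Path(path).write_text("\n".join(options) + "\n", encoding="utf-8")

with tempfile.TemporaryDirectory() as tmp:
    params = Path(tmp) / "no_mnt"
    # The filter is stored exactly as it should reach Duplicati, unquoted.
    write_parameters_file(params, ["--exclude=[/(m|)nt/.*]"])
    content = params.read_text(encoding="utf-8")
    print(content, end="")
```

Because the filter never travels through a command line, nothing gets a chance to escape the vertical bar.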

The prefix form sets the environment variable for a single command. To make it last for the shell session (if you prefer it that way), use export AUTOUPDATER_Duplicati_SKIP_UPDATE=true, but you need export -n if you change your mind.

Arh, that gets parsed earlier because that is the only option that can be supplied more than once.

If you try:

docker exec duplicati mono /app/duplicati/Duplicati.CommandLine.exe test-filters /mnt/user/media/TestDuplicati/A/AA --invalid-name='[/mnt/user/media/TestDuplicati/(A|B)]'

Then it should show it under Input options (and warn that the option is not supported).

@kenkendk

Right, got it.

$ docker exec duplicati mono /app/duplicati/Duplicati.CommandLine.exe test-filters /mnt/user/media/TestDuplicati/A/AA --anyWrongName='[/mnt/user/media/TestDuplicati/(A|A)]' --debug-output
Input command: test-filters
Input arguments:
        /mnt/user/media/TestDuplicati/A/AA

Input options:
anywrongname: [/mnt/user/media/TestDuplicati/(A\|A)]
debug-output:

The supplied option --anywrongname is not supported and will be ignored
Including source path: /mnt/user/media/TestDuplicati/A/AA/
Including path as no filters matched: /mnt/user/media/TestDuplicati/A/AA/.disk3.md5
Including path as no filters matched: /mnt/user/media/TestDuplicati/A/AA/.user.md5
Including path as no filters matched: /mnt/user/media/TestDuplicati/A/AA/@Minecraft Install.lnk
Matched 3 files (868 bytes)
  # --log-file=/mnt/user/home/alex/duplicati/dup.log

So indeed, the pipe got backslashed before reaching duplicati debug printer.

On a side note, are you aware Win10 has a linux subsystem? That might make it a lot easier to test such stuff, if you are on Windows.

@ts678

Great, thanks. Now --parameters-file works for me. I suppose --log-file is for log output when a backup is made; it’s still empty with just test-filters.

alex@Tower: /mnt/user/home/alex/duplicati
$ cat ./testFilterParamFile
--exclude=[/mnt/user/media/TestDuplicati/(A|Test1)/]
alex@Tower: /mnt/user/home/alex/duplicati
$ duplicati test-filters /mnt/user/media/TestDuplicati --parameters-file=/mnt/user/home/alex/duplicati/testFilterParamFile
Including source path: /mnt/user/media/TestDuplicati/
Excluding path due to filter: /mnt/user/media/TestDuplicati/A/ => ([/mnt/user/media/TestDuplicati/(A|Test1)/])
Excluding path due to filter: /mnt/user/media/TestDuplicati/Test1/ => ([/mnt/user/media/TestDuplicati/(A|Test1)/])
Including path as no filters matched: /mnt/user/media/TestDuplicati/Test2/
Including path as no filters matched: /mnt/user/media/TestDuplicati/Test22/
Including path as no filters matched: /mnt/user/media/TestDuplicati/TestA/

I think it makes sense to add an example to the page Advanced Options - Duplicati 2 User's Manual, though I cannot do that. I’ll probably add the --parameters-file info to the docs on filters tomorrow; it’s late here now, and my kids are pulling on me to be put to bed…

Best Alex

Getting log file created took a few tries. You can get a two-for-one test of the backslash escape with:

$ duplicati-cli restore '/tmp/no|thing' --log-file='1|2.log'
Restore started at 9/6/2019 4:52:10 PM

Enter encryption passphrase: 
  Listing remote folder ...
  Listing remote folder ...
  Listing remote folder ...
  Listing remote folder ...
  Listing remote folder ...

ErrorID: FolderMissing
The folder /tmp/no\|thing does not exist
Update "2.0.4.23_beta_2019-07-14" detected
$ ls -ln 1*.log
-rw-r--r-- 1 1000 1000 0 Sep  6 16:52 1\|2.log
$ 

That suggests that it is actually the shell/terminal that is doing the escaping. That means that we cannot really fix it inside Duplicati. I think @ts678 had the correct guess:

I am aware, but I am on macOS.

You can send PRs to the manual from here:

Thx for the links, and info.

I don’t believe it’s the shell.
$ docker exec duplicati echo 'hi | hi'
hi | hi
There’s no \| there. Linux doesn’t escape pipes (unless perhaps told to).
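The same check can be made without a shell in the picture at all. A minimal sketch, assuming only a Python interpreter: it starts a child process with an argument list and prints what the child actually received.

```python
import subprocess
import sys

# The child prints its first argument exactly as it arrived.
child_code = "import sys; print(sys.argv[1])"

result = subprocess.run(
    [sys.executable, "-c", child_code, "1|2"],  # argument list, no shell involved
    capture_output=True,
    text=True,
    check=True,
)
received = result.stdout.strip()
print(received)
```

If the bar arrives unescaped here, as expected, ordinary process spawning does not add backslashes on its own, so whatever adds them has to sit somewhere in the Duplicati launch path.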

Could it be mono (whatever that is?)

Not that it’s really that relevant for me, so it’s more up to you to decide if it’s worth investigating for the sake of Duplicati.

Hmm, I just realized this is pretty much identical to the filter test. It just indicates that the pipe reaches the Duplicati option handler escaped. It doesn’t say anything about the cause. I wonder what a mono hello-world example that just echoes its input would output.
https://www.mono-project.com/docs/getting-started/mono-basics/

[minor] Synology prepends backslash (\) to pipe (|) in (command line) password #3135 is a similar issue.

Is anyone noticing my “Backslash is early and generic” finding that this happens at the first instant after start? Duplicati starts a child process that does all the actual work (except figuring out which update to launch); however, the child can be avoided with the environment variable AUTOUPDATER_Duplicati_SKIP_UPDATE=true.

Don’t look at the later code analysis too hard. Testing shows somehow the end escaping isn’t happening however I’m kind of confused by the Regex. If that’s a character class, the vertical bars aren’t alternation. Tests, however, show the only character in the Regex that’s escaped is vertical bar. Needs more study…

That said, this still seems like a relatively not-super-critical bug compared to others. It may need an issue filed to keep track of it until whenever the time comes that the less critical bugs get their turn for attention.

Those two GitHub issues are definitely the same underlying cause (this and the one you just linked). I have linked the github issues with a comment.

Wow, that’s a huge indicator. I didn’t see the point before; too much text in the output spammed my mind.

Here’s some more.

Here the problem is seen. In the second command, the variable is set to the empty string (or null, whatever that means in bash):

$ ANY_DUMMY=1 mono Duplicati.CommandLine.exe help '1|2'
Topic not found: 1\|2
$  AUTOUPDATER_Duplicati_SKIP_UPDATE= mono Duplicati.CommandLine.exe help '1|2'
Topic not found: 1\|2

On this code-path, the problem is avoided:

$ AUTOUPDATER_Duplicati_SKIP_UPDATE=true mono Duplicati.CommandLine.exe help '1|2'
Topic not found: 1|2
$  AUTOUPDATER_Duplicati_SKIP_UPDATE=1 mono Duplicati.CommandLine.exe help '1|2'
Topic not found: 1|2

So this is definitely a Duplicati issue, not a bash or mono issue. Something in Duplicati mangles the arguments when AUTOUPDATER_Duplicati_SKIP_UPDATE isn’t set to a non-empty value.

Aha, nice detective work!

Then this is the problematic call:

Since the updater spawns a new executable (to allow unloading and restarting on an update), it needs to pass the commandline arguments to the new process.
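How such a hand-off can go wrong is easy to model. The sketch below is NOT Duplicati's code: naive_escape and relaunch_args are hypothetical, and the set of escaped characters is an assumption chosen for illustration. It only shows that escaping arguments for a child that never unescapes them reproduces the symptom seen in the tests above.

```python
import re

def naive_escape(arg):
    """HYPOTHETICAL: backslash-escape a few shell metacharacters in one argument."""
    return re.sub(r"([|&;<>])", r"\\\1", arg)

def relaunch_args(args, escape=None):
    """Model the argv a child sees when it does not undo the parent's escaping."""
    return [escape(a) for a in args] if escape else list(args)

original = ["help", "1|2"]

with_escaping = relaunch_args(original, naive_escape)
without_escaping = relaunch_args(original)

print(" ".join(with_escaping))
print(" ".join(without_escaping))
```

In this model the escaped variant corresponds to the normal launch through the updater, and the unescaped one to running with AUTOUPDATER_Duplicati_SKIP_UPDATE=true.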

The escaping code is here:

Ah, you found it, great! Then comes the next troubling question of what it was trying to solve, and how to avoid breaking that part. The fun life of a programmer, all that detective work :slight_smile:

Best Alex