
Problem

The behaviour of

!(pattern-list)

does not work the way I would expect when used in parameter expansion, specifically

${parameter/pattern/string}

Input

a="1 2 3 4 5 6 7 8 9 10"

Test cases

$ printf "%s\n" "${a/!([0-9])/}"
[blank]
#expected 12 3 4 5 6 7 8 9 10

$ printf "%s\n" "${a/!(2)/}"
[blank]
#expected  2 3 4 5 6 7 8 9 10

$ printf "%s\n" "${a/!(*2*)/}"
2 3 4 5 6 7 8 9 10
#Produces the behaviour expected in previous one, not sure why though

$ printf "%s\n" "${a/!(*2*)/,}"
,2 3 4 5 6 7 8 9 10
#Expected after previous worked

$ printf "%s\n" "${a//!(*2*)/}"
2
#Expected again, since the previous one worked

$ printf "%s\n" "${a//!(*2*)/,}"
,,2,
#Why are there 3 commas???

Specs

GNU bash, version 4.2.46(1)-release (x86_64-redhat-linux-gnu)

Notes

These are very basic examples, so if it is possible to include more complex examples with explanations in the answer then please do.

Any more info or examples needed let me know in the comments.

I have already looked at "How does extglob work with shell parameter expansion?", and have even commented on what the issue is in that particular question, so please don't mark this as a dupe.

123
  • I think I can explain all of those except the last one (which looks like a bug) – Leon May 29 '17 at 12:29
  • @123, I generally use it like `ls !(*.txt)` (other than files ending with .txt) or `ls !(*.log|*.sh)` (other than files ending with .log or .sh) etc – Sundeep May 29 '17 at 12:33
  • @Leon These are only basic examples, the whole thing seems super buggy but I don't think it is, I reckon it's just that it isn't doing what I thought it did. Feel free to post an answer though! – 123 May 29 '17 at 12:36
  • @Sundeep Yes, it works as expected when matching filenames. – 123 May 29 '17 at 12:36
  • @123 but I don't get why `ls *.!(log|sh)` or `ls foo*!(bar)` (starting with foo but not ending with bar) doesn't do what I expect... – Sundeep May 29 '17 at 12:40
  • @Sundeep That is because `foo*` will match `foobar` and `!(bar)` will match nothing/null at the end, so the match will still be successful. – 123 May 29 '17 at 12:47

1 Answer


Parameter expansion of the form ${parameter/pattern/string} (where pattern doesn't start with a /) works by finding the leftmost longest substring of the value of parameter that matches pattern and replacing it with string. In other words, $parameter is decomposed into three parts, prefix, match, and suffix, such that

  1. $parameter == "${prefix}${match}${suffix}"
  2. $prefix is the shortest possible string enabling the other requirements to be fulfilled (i.e. the match, if at all possible, occurs in the leftmost position)
  3. $match matches pattern and is as long as possible
  4. any of $prefix, $match and/or $suffix can be empty

and the result of ${parameter/pattern/string} is "${prefix}string${suffix}".
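The decomposition above can be sketched as a brute-force search in bash itself. This is only an illustration of the rules, not bash's actual implementation; the function name decompose and the global variables prefix, match and suffix it sets are made up for this example.

```bash
shopt -s extglob

# decompose STRING PATTERN  (hypothetical helper, for illustration only)
# Tries every start position from left to right and, for each one, every
# length from longest to shortest. The first hit is therefore the leftmost
# match and, at that position, the longest one.
# Sets the globals prefix, match and suffix; returns 1 if nothing matches.
decompose() {
    local s=$1 pat=$2 start len
    for (( start = 0; start <= ${#s}; start++ )); do
        for (( len = ${#s} - start; len >= 0; len-- )); do
            # Unquoted $pat on the right-hand side is treated as a pattern.
            if [[ ${s:start:len} == $pat ]]; then
                prefix=${s:0:start}
                match=${s:start:len}
                suffix=${s:start+len}
                return 0
            fi
        done
    done
    return 1
}

a="1 2 3 4 5 6 7 8 9 10"
decompose "$a" '!(*2*)'
printf 'prefix=[%s] match=[%s] suffix=[%s]\n' "$prefix" "$match" "$suffix"
# prints: prefix=[] match=[1 ] suffix=[2 3 4 5 6 7 8 9 10]
```

Note that the pattern is tested against each candidate substring as a whole, which is why a negated pattern such as !(2) can match a very long substring.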

For the global replacement form (${parameter//pattern/string}) of this type of parameter expansion, the same process is applied recursively to the suffix part. A zero-length match, however, is handled as a special case (in order to prevent infinite recursion):

  • if "${prefix}${match}" != ""

    "${parameter//pattern/string}" = "${prefix}string${suffix//pattern/string}"
    

    else suffix=${parameter:1} and

    "${parameter//pattern/string}" = "string${parameter:0:1}${suffix//pattern/string}"
    

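The global form, including the zero-length special case, can be sketched as a loop. Again this is a model of the rules described above under the assumption that they are accurate, not bash's real code; find_match and global_replace are hypothetical helper names.

```bash
shopt -s extglob

# find_match STRING PATTERN  (hypothetical helper)
# Leftmost, then longest, match. Sets m_start and m_len in the caller's scope.
find_match() {
    local s=$1 pat=$2
    for (( m_start = 0; m_start <= ${#s}; m_start++ )); do
        for (( m_len = ${#s} - m_start; m_len >= 0; m_len-- )); do
            [[ ${s:m_start:m_len} == $pat ]] && return 0
        done
    done
    return 1
}

# global_replace STRING PATTERN REPLACEMENT  (hypothetical helper)
# Prints STRING with every match replaced, one match at a time.
global_replace() {
    local s=$1 pat=$2 rep=$3 out='' m_start m_len
    while [[ -n $s ]] && find_match "$s" "$pat"; do
        out+=${s:0:m_start}$rep      # unmatched prefix, then the replacement
        s=${s:m_start+m_len}         # carry on with the suffix
        if (( m_len == 0 )) && [[ -n $s ]]; then
            out+=${s:0:1}            # zero-length match: copy one character
            s=${s:1}                 # and step past it so the loop terminates
        fi
    done
    printf '%s\n' "$out$s"
}

a="1 2 3 4 5 6 7 8 9 10"
global_replace "$a" '!(*2*)' ','    # prints ,,2,
global_replace "$a" '!(*2*)' ''     # prints 2
```

Both results agree with the output reported in the question for the corresponding ${a//!(*2*)/...} expansions.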
Now let's analyze the cases individually:

  • "${a/!([0-9])/}" --> prefix='' match='1 2 3 4 5 6 7 8 9 10' suffix=''. Indeed, '1 2 3 4 5 6 7 8 9 10' is not a string consisting of a single digit, and therefore it matches the pattern !([0-9]). Hence the empty result of expansion.

  • "${a/!(2)/}" --> prefix='' match='1 2 3 4 5 6 7 8 9 10' suffix=''. Similar to the above, '1 2 3 4 5 6 7 8 9 10' is not a string consisting of the single character '2', and therefore it matches the pattern !(2). Hence the empty result of expansion.

  • "${a/!(*2*)/}" --> prefix='' match='1 ' suffix='2 3 4 5 6 7 8 9 10'. The substring '1 ' doesn't match the pattern *2*, and therefore it matches the pattern !(*2*).

  • "${a/!(*2*)/,}". There were no surprises here, so no need to elaborate.

  • "${a//!(*2*)/}". There were no surprises here, so no need to elaborate.

  • "${a//!(*2*)/,}" --> prefix='' match='1 ' suffix='2 3 4 5 6 7 8 9 10'. Then ${suffix//!(*2*)/,} expands to ",2," as follows. The empty string in the beginning of suffix matches the pattern !(*2*), producing an extra comma in the result. Since the zero-length match special case (described above) was triggered, the first character of suffix is forcibly consumed, leaving us with ' 3 4 5 6 7 8 9 10', which matches the !(*2*) pattern in its entirety and is replaced with the last comma that we see in the final result of the expansion.
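The individual matching claims in the analysis above can be checked directly with [[ ]], which tests a whole string against a pattern. A minimal sketch; the helper name matches is made up for this example.

```bash
shopt -s extglob
a="1 2 3 4 5 6 7 8 9 10"

# matches STRING PATTERN  (hypothetical helper)
# Prints yes if STRING as a whole matches PATTERN, otherwise no.
matches() {
    if [[ $1 == $2 ]]; then echo yes; else echo no; fi
}

matches "$a"  '!([0-9])'   # yes: the whole string is not a single digit
matches "$a"  '!(2)'       # yes: the whole string is not the string "2"
matches '1 '  '!(*2*)'     # yes: "1 " contains no 2
matches '1 2' '!(*2*)'     # no:  "1 2" contains a 2, so the match stops before it
matches ''    '!(*2*)'     # yes: the empty string contains no 2 either
```

The last line is the zero-length match responsible for the extra comma in the final test case.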

Leon
  • This seems on the right track but I'm still not entirely convinced, for example following this logic say we have `${a/!(*2)/,}` then `1 ` would not match that and would be the longest possible string from the left that doesn't, so surely the output should be `,2 3 4 5 6 7 8 9 10` yet it just leaves a single comma, meaning it matched the entire string. Not saying anything in your answer is incorrect, just that it still isn't entirely clear to me what is happening. – 123 May 29 '17 at 21:24
  • The pattern `*2` means *a string ending with the character "2"*, and `!(*2)` means *a string NOT ending with the character "2"*, that's why the longest possible substring of `'1 2 3 4 5 6 7 8 9 10'` that matches the pattern `!(*2)` is the entire string (as it **doesn't end with 2**). – Leon May 30 '17 at 06:24
  • Makes sense, so basically unless you have `*string*` it's gonna eat the entire string. – 123 May 30 '17 at 13:53