I need to write a bash script that will be executed on an Android device. Among other things, the script needs to count occurrences of a particular character in a string. Since the wc (word count) utility is not available in the Android shell, I am doing it like this:

my_string="oneX two threeX"; x_amount="${my_string//[^X]}"; echo $x_amount; echo "${#x_amount}"

When I run the above command on desktop, it returns (just as I expect):

XX
2

But if I execute the same command on my Android device (via adb shell), the result, to my amazement, is:

one two three
13

I have figured out (just by guessing) that if I substitute ! for ^, so that the command becomes:

my_string="oneX two threeX"; x_amount="${my_string//[!X]}"; echo $x_amount; echo "${#x_amount}";

then, on Android, it produces the result I expect:

XX
2

However, the same command fails on the desktop with the following message:

event not found: X]

Even though I have figured out how to "make it work", I would like to understand the following points:

  1. Where else, besides the Android shell, is the [!X] notation used instead of [^X]?

  2. Does this notation have a special name?

  3. Is there any specific reason [^X] is not supported on Android?

P.S.: The device I need to run the script on has a pretty old version of Android (4.4), so this 'issue' might be specific to that Android version. Even if that is the case, the questions above remain.

  • You are using `${my_string//[!X]}`, and this is a **parameter expansion** that does not use regex, only bracket expressions and POSIX character classes. It is used in *NIX shells, e.g. in Bash. Probably there was a syntax change in Android. – Wiktor Stribiżew Aug 18 '19 at 15:21

2 Answers

5

Android's shell is mksh, which uses a different pattern (glob) dialect than Bash.

See: File name patterns in mksh's man-page:

    File name patterns
...
     [!...]  Like [...], except it matches any octet not inside the brackets.

Let's test the compatibility of some shells with string substitution and the negative character class pattern [!...] syntax:

#!/usr/bin/env bash

shells=( ash bash dash ksh93 mksh tcsh zsh )
compat=()
not_compat=()
for shell in "${shells[@]}"; do
  if [ "$(
    "$shell" <<'EOF' 2>/dev/null
my_string="oneX two threeX"
x_amount="${my_string//[!X]}"; echo "$x_amount${#x_amount}"
EOF
  )" = "XX2" ]; then
    compat+=("$shell")
  else
    not_compat+=("$shell")
  fi
done
echo "Shells that understand the [!...] negative class syntax:"
printf '%s\n' "${compat[@]}"
echo
echo "Shells that don't understand string substitution:"
printf '%s\n' "${not_compat[@]}"

Output:

Shells that understand the [!...] negative class syntax:
bash
ksh93
mksh
zsh

Shells that don't understand string substitution:
ash
dash
tcsh
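For the Bourne-style shells in the second list (ash and dash), which lack the ${var//pattern} substitution, the count could be done with only POSIX parameter expansions. This is a rough sketch rather than a tested Android solution: the function name count_char_posix and its variable names are made up for illustration, it assumes a single-byte character to search for, and it does not apply to tcsh, which is not a Bourne-style shell.

```shell
#!/bin/sh
# count_char_posix STRING CHAR  (hypothetical helper name)
# Walks the string one character at a time using only ${var#pattern}
# and ${var%pattern}, which every POSIX shell provides.
count_char_posix() {
  ccp_str=$1
  ccp_count=0
  while [ -n "$ccp_str" ]; do
    ccp_rest=${ccp_str#?}               # string without its first character
    ccp_first=${ccp_str%"$ccp_rest"}    # the first character on its own
    if [ "$ccp_first" = "$2" ]; then
      ccp_count=$((ccp_count + 1))
    fi
    ccp_str=$ccp_rest
  done
  printf '%s\n' "$ccp_count"
}

count_char_posix "oneX two threeX" X    # prints 2
```

The loop is slower than a single substitution, so it only makes sense where ${var//pattern} is not available.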

Also note that sed, which works with regular expressions rather than shell patterns, does not understand the POSIX negative bracket notation [!...], even with its GNU extensions disabled:

sed --posix 's/[!X]//g' <<<'oneX two threeX'
one two three

but

sed --posix 's/[^X]//g' <<<'oneX two threeX'
XX
Léa Gris
2

First: there are a bunch of different notations for pattern matching; what the shell uses here isn't a regular expression, it's a "glob" (or "wildcard") pattern — similar to an RE in some ways, very different in others (like the meaning of "*"). And there are variations on those basic pattern types, both different variations on the glob syntax (especially bash's "extended glob" syntax), and many variations on regular expression syntax ("basic" RE, "extended" RE, Perl-compatible RE, etc etc etc...).

It's important in general to know what syntax the tool you're using takes, and adapt your patterns appropriately.
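To make the difference concrete, here is a small sketch that feeds the string from the question to a glob (via case) and to regular expressions (via grep and sed). It assumes grep and sed are installed, which may not be true on a stripped-down Android shell; the variable names are only illustrative.

```shell
#!/bin/sh
s="oneX two threeX"

# Glob (shell pattern): "*" on its own matches any string.
case $s in
  one*) glob_star="match" ;;
  *)    glob_star="no match" ;;
esac

# Regular expression: "*" repeats the preceding item, so "^one*$" means
# "on" followed by zero or more "e" and then the end of the line.
if printf '%s\n' "$s" | grep -q '^one*$'; then
  re_star="match"
else
  re_star="no match"
fi

# Negated bracket expression in a regular expression uses "^".
re_neg=$(printf '%s\n' "$s" | sed 's/[^X]//g')

echo "glob  one*    : $glob_star"
echo "regex ^one*\$  : $re_star"
echo "sed   [^X]    : $re_neg"
```

The same characters "one*" match the string as a glob but not as an anchored regular expression, which is why it matters which syntax a given tool expects.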

Now, for the case of negated bracket expressions, here's what the POSIX standard from 2004 says:

The description of basic regular expression bracket expressions in the Base Definitions volume of IEEE Std 1003.1-2001, Section 9.3.5, RE Bracket Expression shall also apply to the pattern bracket expression, except that the exclamation mark character ( '!' ) shall replace the circumflex character ( '^' ) in its role in a "non-matching list" in the regular expression notation. A bracket expression starting with an unquoted circumflex character produces unspecified results.

(The 2018 version is similar, but slightly garbled; not sure what happened there.)

So, ! is actually the standard thing to accept here. But bash and zsh both use ! to introduce history expansions, so apparently have decided it's better to accept ^ as well to avoid conflicts with the history mechanism.

bash accepts both "${my_string//[^X]}" and "${my_string//[!X]}", but zsh mistakes the latter for an attempt to reference an earlier command that included X], giving the error you saw.
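Putting this together, a version of the count from the question that sticks to the standard [!...] form might look like the sketch below. The function name count_char is hypothetical, and the sketch assumes a shell that supports ${var//pattern} (bash, ksh93, mksh, zsh) and a plain single character with no special meaning inside brackets. Inside a script file history expansion is normally off, so the ! should not trigger the "event not found" error that appears at an interactive prompt.

```shell
#!/usr/bin/env bash
# count_char STRING CHAR  (hypothetical helper name)
# Deletes every character that is not CHAR, then prints the length of
# what is left. [!...] is the negation form specified by POSIX for
# shell patterns.
count_char() {
  cc_only=${1//[!$2]}
  printf '%s\n' "${#cc_only}"
}

count_char "oneX two threeX" X    # prints 2
```

When typing the command interactively instead, the [^X] form from the question avoids the history mechanism in bash and zsh, at the cost of relying on behavior POSIX leaves unspecified.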

Gordon Davisson
  • Thanks for explaining the error in [zsh](https://www.zsh.org/). It might be quite confusing, taking into account that it supports `[!...]` syntax. –  Aug 19 '19 at 07:17