For the past two days I've been trying to learn Bash. I've understood a lot about substitution, but I'm still struggling to fully understand how escaping works in different scenarios. This is the example I'm practising with:

set -x
string='mo$ney'
echo ^mo\$ney   # the shell escapes the $ sign and the backslash disappears
if [[ $string =~ ^mo\$ney$ ]]   # the shell keeps the backslash and the string is matched as having a dollar sign.
then
  echo "matched"
else
  echo "no"
fi

This is what I would expect to have to write for it to work, but it doesn't. My thought process is to escape the backslash to keep a real backslash, and then escape the dollar sign:

if [[ $string =~ ^mo\\\$ney$ ]]
then
  echo "matched"
else
  echo "no"
fi

I use regular expressions in general, and if I were creating a regex normally, the dollar sign would match the end of a string. So if I wanted to match an actual dollar sign, I would escape it in the context of the regex, like this: /\$/. That's the actual regex. If I had to create a string that has a regex inside it, I would have to escape the backslash, because most languages use the backslash to escape special characters in strings. So my string would look like this: '\\$'

To me, it looks like Bash skips a step: it starts out expecting a parameter ($ney), and when I escape it (\$ney) it goes directly to being a literal dollar sign.

In my mind, I expect the parameter expansion to be escaped in the shell context. Then, in the regex context, the pattern should contain a $ sign, which represents the end of a string, so I expect that to need escaping as well, to get to \$ in the context of regular expressions.

Without this process (the one in my head), I'm not sure how you would match using the regex dollar sign (that is, the end-of-string token). Strangely enough, if I place a $ sign at the end of the matching pattern, it works and is taken as an "end-of-line" regex token ($). This can also be seen outside conditionals:

echo money$ # in this situation, if the dollar sign is at the end, the shell doesn't even need it escaped; it understands that this is a real dollar sign.
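
A quick way to check what a simple command actually receives is to print each word between delimiters. This is only a sketch: the angle brackets and the `unset` line are there for illustration, and it assumes `ney` is not otherwise set in the current shell.

```bash
# Print each word exactly as the command receives it after the shell's processing.
printf '<%s>\n' money$     # a trailing $ with no name after it stays literal: <money$>
printf '<%s>\n' mo\$ney    # the backslash is dropped by quote removal: <mo$ney>
printf '<%s>\n' 'mo$ney'   # single quotes keep the $ literal as well: <mo$ney>

unset -v ney               # make sure the variable is really unset for the demo
printf '<%s>\n' mo$ney     # $ney is expanded (to nothing here): <mo>
```

So in a simple command the backslash only stops the expansion; it never reaches the command itself.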

Can anyone offer some clarification on what is going on in conditionals? I've read about how [[ ... ]] prevents word splitting, and how using quotes inside it while matching makes it match against a literal string and not a pattern (so using a quoted string is not even an option).

masonCherry
  • How does it not work? `string='mo$ney' && [[ $string =~ ^mo\$ney$ ]] && echo matched` outputs `matched` for me. Your next attempt adds a literal backslash to the regular expression that is not present in the string, so the match fails. – chepner Sep 03 '22 at 15:43
  • Note that `\$` in the `=~` operand is *not* subject to quote removal like in a simple command. The `=~` operator still sees a literal backslash and a literal `$` in the regular expression, which is *interpreted by it* as a regular expression that matches a literal dollar sign. In short, compound commands can and often do have evaluation rules that differ from simple commands. (That's why you don't need to quote parameter expansions inside `[[ ... ]]` to avoid word-splitting as well.) – chepner Sep 03 '22 at 15:46
  • Hey, thanks for the comments. I think your second comment is getting to the issue. Like I said in the post, I've read about word splitting and quoting parameter expansions inside `[[ ... ]]`. I think the thing you mentioned about quote removal is at the core of the issue. – masonCherry Sep 03 '22 at 17:45
  • But for me specifically, the issue I'm having is: if I used `$var`, it would try to do parameter expansion, right? It would try to find a variable called `var`? – masonCherry Sep 03 '22 at 17:46
  • Yes. Dollar sign as a regex metacharacter is only meaningful at the end, or immediately before a pipe or a closing parenthesis (`foo$`, `foo$|bar`, `(foo$)`), and in none of those positions can it be confused for a variable expansion. – oguz ismail Sep 03 '22 at 17:56
  • I understand, thanks. I guess it never makes sense to search a single string for an end of line in the middle. But to be honest, I would rather have that type of consistency. – masonCherry Sep 03 '22 at 18:42
  • Using literal regular expressions with `=~` in Bash is tricky, and doesn't work consistently across different Bash versions. The best practice is to put the regular expression in a variable (e.g. `pattern='^mo\$ney$'`) and use the variable in the test (`[[ $string =~ $pattern ]]`). See [bash regex with quotes?](https://stackoverflow.com/q/218156/4154375) and the [\[\[...\]\]](https://www.gnu.org/software/bash/manual/bash.html#index-_005b_005b) section of the [Bash Reference Manual](https://www.gnu.org/software/bash/manual/bash.html). – pjh Sep 03 '22 at 19:33
  • Also see [Bash Pitfalls #35 (if \[\[ $foo =~ 'some RE' \]\])](https://mywiki.wooledge.org/BashPitfalls#if_.5B.5B_.24foo_.3D.2BAH4_.27some_RE.27_.5D.5D). – pjh Sep 03 '22 at 19:44
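
The points made in the comments above can be checked with a minimal sketch like the one below. It assumes a reasonably recent Bash; `string` and `pattern` come from the question and the comments, while `with_backslash` is a hypothetical name added here only for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: compares the three ways of writing the pattern discussed above.

string='mo$ney'

# 1. Inline \$ : the net effect is a regex that matches a literal dollar sign;
#    the unescaped $ at the very end is the end-of-string anchor.
if [[ $string =~ ^mo\$ney$ ]]; then echo "inline: matched"; else echo "inline: no"; fi
# prints "inline: matched"

# 2. Inline \\\$ : \\ asks for a literal backslash, \$ for a literal dollar sign,
#    so the string itself would have to contain a backslash.
if [[ $string =~ ^mo\\\$ney$ ]]; then echo "triple: matched"; else echo "triple: no"; fi
# prints "triple: no"

with_backslash='mo\$ney'   # hypothetical string that really contains a backslash
if [[ $with_backslash =~ ^mo\\\$ney$ ]]; then echo "backslash: matched"; else echo "backslash: no"; fi
# prints "backslash: matched"

# 3. Pattern stored in a variable, as suggested in the comments: single quotes
#    keep the regex text intact, and the expansion is left unquoted.
pattern='^mo\$ney$'
if [[ $string =~ $pattern ]]; then echo "variable: matched"; else echo "variable: no"; fi
# prints "variable: matched"
```

Keeping the regex in a single-quoted variable means it is written exactly as it would be in any other regex tool, which sidesteps the question of how the shell treats each backslash.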

0 Answers