3
echo "$(expr "title: Purple Haze       artist: Jimi Hendrix" : 'title:\s*\(.*\?\)\s*artist.*' )"

prints

Purple Haze             

The output keeps the trailing whitespace, even though I am using the ? lazy operator.

I've tested this on https://regex101.com/ and it works as expected. What's different about bash?

Tom Fenech
texasflood

2 Answers

6

You aren't using bash's regexp matching, you're using expr. expr does not have a “? lazy operator”: it only implements basic regular expressions (with a few extensions in the GNU version found on Linux, such as \s for whitespace, but these don't include Perl-like lazy operators). In GNU basic regular expressions, \? means "zero or one of the preceding item", so .*\? is still a greedy match. (Bash doesn't have lazy operators either, for that matter.)

If you don't want .* to include trailing space, specify that it must end with a character that isn't a space:

'title:\s*\(.*\S\)\s*artist.*'
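
A minimal sketch of the same fix, written with POSIX bracket classes instead of the GNU-only \s and \S so that it should also work with a non-GNU expr. The variable names details and title are illustrative and do not come from the question.

```shell
details='title: Purple Haze       artist: Jimi Hendrix'

# expr anchors the pattern at the start of the string and prints
# the text captured by \( ... \).
# [^[:space:]] forces the capture to end on a non-whitespace character,
# so the greedy .* cannot swallow the trailing spaces.
title=$(expr "$details" : 'title:[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*artist.*')

# The brackets make any stray whitespace visible.
printf '[%s]\n' "$title"
```

This should print [Purple Haze], with no spaces before the closing bracket.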
Gilles 'SO- stop being evil'
  • Perhaps also mention that `expr` should generally be avoided. If you are using Bash, it already has built-in replacements for everything `expr` does, which are often also more readable, efficient, versatile, and robust. – tripleee May 03 '23 at 08:29
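
One way to follow that advice without any regular expression at all is plain parameter expansion. This is only a sketch, and it assumes the input always has the "title: ... artist: ..." layout shown in the question. The variable names are illustrative.

```shell
details='title: Purple Haze       artist: Jimi Hendrix'

title=${details#title:}      # drop the leading "title:" label
title=${title%%artist:*}     # drop everything from "artist:" onwards

# Trim leading whitespace: the inner expansion yields the leading
# run of whitespace, which the outer expansion removes as a prefix.
title=${title#"${title%%[![:space:]]*}"}

# Trim trailing whitespace the same way, as a suffix.
title=${title%"${title##*[![:space:]]}"}

printf '[%s]\n' "$title"
```

These expansions are part of POSIX sh, so the snippet is not tied to Bash.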
2

As Gilles points out, you're not using bash regular expressions. To do so, you could use the regex match operator =~ like this:

re='title:[[:space:]]*(.*[^[:space:]])[[:space:]]*artist.*'
details='title: Purple Haze       artist: Jimi Hendrix'
[[ $details =~ $re ]] && echo "${BASH_REMATCH[1]}"

Rather than using a lazy match, this requires a non-space character at the end of the capture group, so the trailing whitespace is left out of the capture. The first capture group is stored in ${BASH_REMATCH[1]}.
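
The same approach extends to more than one capture group. The sketch below also extracts the artist; the second group and the variable names title and artist are additions for illustration, not part of the question.

```shell
#!/usr/bin/env bash
details='title: Purple Haze       artist: Jimi Hendrix'

# Group 1 is the title, group 2 is the artist. Each group must end on a
# non-whitespace character, so neither capture keeps trailing spaces.
re='title:[[:space:]]*(.*[^[:space:]])[[:space:]]*artist:[[:space:]]*(.*[^[:space:]])[[:space:]]*$'

if [[ $details =~ $re ]]; then
    title=${BASH_REMATCH[1]}
    artist=${BASH_REMATCH[2]}
    printf 'title=[%s] artist=[%s]\n' "$title" "$artist"
else
    echo 'no match' >&2
fi
```

Keeping the pattern in a variable and expanding it unquoted, as in the answer, avoids the quoting problems that come with writing the regex inline after =~.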

At the expense of cross-platform portability, it is also possible to use the shorthand \s and \S instead of [[:space:]] and [^[:space:]]:

re='title:\s*(.*\S)\s*artist.*'
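
Because support for \s and \S depends on the platform's regex library, a script that wants the shorthand can fall back to the bracket classes when the short form does not match. This is a rough sketch under that assumption; re_short and re_posix are hypothetical names.

```shell
#!/usr/bin/env bash
details='title: Purple Haze       artist: Jimi Hendrix'

re_short='title:\s*(.*\S)\s*artist.*'
re_posix='title:[[:space:]]*(.*[^[:space:]])[[:space:]]*artist.*'

# Try the shorthand first. If the regex library rejects it or treats
# \s as something else, the match fails and the POSIX form is tried.
if [[ $details =~ $re_short ]] || [[ $details =~ $re_posix ]]; then
    title=${BASH_REMATCH[1]}
    printf '[%s]\n' "$title"
fi
```

In practice it is simpler to use the [[:space:]] form everywhere; the fallback only shows that both patterns are meant to capture the same text.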
Tom Fenech