145

I run several substitution commands as the core of a colorize script for Maven. One of the sed commands uses a regular expression which works fine in the shell, as discussed here. The current (not working) implementation can be found here.

When I include one of the variants of the command in the script, different behavior occurs:

Variant 1:

$ sed -re "s/([a-zA-Z0-9./\\ :-]+)/\1/g"

Adapted to the script:

-re "s/WARNING: ([a-zA-Z0-9./\\ :-]+)/${warn}WARNING: \1${c_end}/g" \

Error: The shell prints the same information as if I had typed $ sed on its own. Strange!?


Variant 2:

$ sed -e "s/\([a-zA-Z0-9./\\ :-]\+\)/\1/g"

Adapted to the script:

-e "s/WARNING: \([a-zA-Z0-9./\\ :-]\+\)/${warn}WARNING: \1${c_end}/g" \

Error:

sed: -e expression #7, char 59: invalid reference \1 on `s' command's RHS

JJD
  • In my case I had combined a `-i` (edit in place option) with `-re`, resulting in `-ire` (so that `-i` was consuming the `re` fragment as its `SUFFIX` argument and hence the extended regex mode was not being enabled); changing it to `-i -re` fixed the issue. – Janaka Bandara Mar 18 '17 at 06:18
  • It's also worth noting that single quotes `'` and double quotes `"` are treated differently, especially when expanding `$vars`. For example: `sudo sh -c "sed -r -i 's/(^.+_supplicant.conf)/\1${MTXT}/' /etc/network/interfaces"` works, but: `sudo sh -c 'sed -r -i "s/(^.+_supplicant.conf)/\1${MTXT}/" /etc/network/interfaces'` does not. – not2qubit Jan 02 '18 at 09:26
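
The `-ire` pitfall from the first comment can be reproduced with a small sketch. It assumes GNU sed; the temporary file and the `glued` and `result` variable names are hypothetical and exist only for this illustration.

```shell
# Assumes GNU sed; the temporary file stands in for a real input file.
dir=$(mktemp -d)
file="$dir/demo.txt"
printf 'WARNING: foo\n' > "$file"

# -ire: "re" is taken as the backup SUFFIX of -i, so -r is never enabled.
# The script is parsed as a basic regex, where ( ) are literal and \1 is invalid.
if sed -ire 's/(foo)/[\1]/' "$file" 2>/dev/null; then
  glued=ok
else
  glued=failed
fi

# -i -re: -i gets no suffix, and -r enables extended regular expressions.
sed -i -re 's/(foo)/[\1]/' "$file"
result=$(cat "$file")

printf '%s\n%s\n' "$glued" "$result"
rm -rf "$dir"
```

Keeping `-i` as a separate argument (or writing `-ri`) avoids the suffix being swallowed.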

4 Answers

101

Don't you need to actually capture for that to work? i.e. for variant #2:

-r -e "s/WARNING: ([a-zA-Z0-9./\\ :-]+)/${warn}WARNING: \1${c_end}/g" \

(Note: untested)

Without the -r option, back-references (like \1) only work if each capturing parenthesis is escaped with a \ character.

With -r, back-references (like \1) only work if the parentheses are NOT escaped.
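
A minimal sketch of the two forms side by side, assuming GNU sed. The sample line, the simplified character class, and the `<...>` markers are made up for illustration and are not taken from the original script.

```shell
line='WARNING: deprecated API in App.java'

# Basic regular expressions (no -r): group with \( \), repeat with \+ (a GNU extension).
bre=$(printf '%s\n' "$line" | sed -e 's/WARNING: \([a-zA-Z0-9. :-]\+\)/WARNING: <\1>/')

# Extended regular expressions (-r, or -E): group with ( ), repeat with +.
ere=$(printf '%s\n' "$line" | sed -r -e 's/WARNING: ([a-zA-Z0-9. :-]+)/WARNING: <\1>/')

printf '%s\n%s\n' "$bre" "$ere"
```

Both commands should print the same line, with the captured text wrapped in `<` and `>`.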

SebMa
Denis de Bernardy
  • The `-r` option to sed appears to be necessary for the back-reference to work. E.g. `sed -e 's/([[:digit:]])/is a digit/'` works but `sed -e 's/([[:digit:]])/\1 is a digit/'` produces the original error without `-r` to sed. **NOTE:** the first invocation of sed searches for a literal `()` and **is not** a capture group. – Andrew Falanga Mar 24 '16 at 14:58
  • The comment below the answer is actually an answer. Maybe you can edit your answer to reflect it. – miroxlav Sep 03 '17 at 10:21
  • @AndrewFalanga you should have posted your comment as an answer – sanmai Feb 08 '18 at 06:58
  • Never mind, my mistake was using `-ire` instead of `-ri`. Order matters :-) – m3nda Jun 10 '18 at 15:22
  • `-r, --regexp-extended` = `use extended regular expressions in the script.` In most current versions both `-E` and `-r` can be used. Extended as opposed to basic. – sastorsl May 12 '22 at 15:38
60

This error is common for parentheses that are not escaped. Escape them and try again.


For example:

/^$/b
:loop
$!{
N
/\n$/!b loop
}
s/\n(.)/\1/g

Each parenthesis should be escaped with a backslash:

/^$/b
:loop
$!{
N
/\n$/!b loop
}
s/\n\(.\)/\1/g
Dave Jarvis
e18r
34

If the -r/--regexp-extended option is not provided, then the capturing parentheses must be escaped.

OrangeDog
7

You need to escape the / after the .

sed -e "s/\([a-zA-Z0-9.\/\\ :-]\+\)/\1/g"

Or if you don't want to worry about escaping, use | as the delimiter

sed -e "s|\([a-zA-Z0-9./\\ :-]\+\)|\1|g"

EDIT:

sed -e "s|WARNING: \([a-zA-Z0-9./\\ :-]\+\)|${warn}WARNING: \1${c_end}|g"
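
As a rough check of the `|` delimiter form, here is a sketch assuming GNU sed. The `warn` and `c_end` values are hypothetical placeholders standing in for the script's color codes, and the sample line is invented.

```shell
# Hypothetical stand-ins for the color escape sequences used in the real script.
warn='<<'
c_end='>>'
line='WARNING: src/main/App.java'

# With | as the delimiter, the / inside the bracket expression needs no escaping.
# Double quotes let the shell expand ${warn} and ${c_end}; \( \) and \1 pass through unchanged.
out=$(printf '%s\n' "$line" | sed -e "s|WARNING: \([a-zA-Z0-9./ :-]\+\)|${warn}WARNING: \1${c_end}|g")

printf '%s\n' "$out"
```

Any character that does not occur in the pattern or the replacement can serve as the delimiter; `|` is convenient here because the paths contain `/`.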
AndyG
slackmart