2

First of all, sorry for my bad English. I'm a German guy.

The code below works fine in PHP:

$string = preg_replace('/href="(.*?)(\.|\,)"/i','href="$1"',$string);

Now I need the same for sed. I thought it should be:

sed 's/href="(.*?)(\.|\,)"/href="{$\1}"/g' test.htm

But that gives me this error:

sed: -e expression #1, char 36: invalid reference \1 on `s' command's RHS

Gumbo
Seblon

6 Answers

3

sed does not support non-greedy (lazy) regex matching, so .*? does not work there as it does in PHP.
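
Because POSIX regular expressions have no lazy quantifiers, a common workaround is a negated character class: [^"]* cannot run past the closing quote, so it can stand in for .*? in this pattern. A minimal sketch, assuming a sed with standard basic REs; the sample line and its URLs are made up for illustration:

```shell
# Hypothetical sample line; the URLs are made up for illustration.
line='<a href="http://example.com/page.">one</a> <a href="http://example.com/other,">two</a>'

# [^"]* stops at the closing quote, so no lazy quantifier is needed.
# [.,] matches the single trailing dot or comma that should be dropped.
result=$(printf '%s\n' "$line" | sed 's/href="\([^"]*\)[.,]"/href="\1"/g')

printf '%s\n' "$result"
```

Both href values come out without their trailing punctuation, and the rest of the line is left alone.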

Dyno Fu
2

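You need a backslash in front of the parentheses you want to reference. The replacement refers to the group as plain \1 (not {$\1}), and because sed has no lazy .*?, a negated character class takes its place:

sed 's/href="\([^"]*\)[.,]"/href="\1"/g' test.htm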
user231967
2
sed -e 's|href=\"\(.[^"][^>]*\)\([.,]\)\">|href="\1">|g' file
ghostdog74
0

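You have to escape the grouping characters ( and ) as follows. In basic REs the | is a literal character as well, so the dot-or-comma part is written as the bracket expression [.,], and [^"]* replaces the unsupported .*?:

sed 's/href="\([^"]*\)\([.,]\)"/href="\1"/g' test.htm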
Didier Trosset
0

If you want to match a literal ".", you need to escape it or put it in a character class. As an alternative to backslash-escaping the capturing parentheses (which you need to do with basic REs), you can use the -E option to tell sed to use extended REs. Lastly, the REs used by sed use \N to refer to subpatterns, where N is a digit.

sed -E "s/href=([\"'])([^\"']*)[.,]\1/href=\1\2\1/i"

This has its own limitation: it will not match href attributes whose value contains the other type of quote.

man sed and man re_format will give more information on REs as used in sed.
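
To make the difference between the two syntaxes concrete, here is a small side-by-side sketch. It assumes a sed that accepts -E (older GNU sed releases spell the option -r), and the sample line is made up for illustration:

```shell
# Hypothetical sample line; the URL is made up for illustration.
line='<a href="http://example.com/page,">link</a>'

# Basic REs (the default): grouping parentheses are written \( and \).
bre=$(printf '%s\n' "$line" | sed 's/href="\([^"]*\)[.,]"/href="\1"/')

# Extended REs (-E): plain ( and ) group; \1 still refers to the first group.
ere=$(printf '%s\n' "$line" | sed -E 's/href="([^"]*)[.,]"/href="\1"/')

printf '%s\n%s\n' "$bre" "$ere"
```

Both commands should print the same line with the trailing comma removed; only the way the group is written differs.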

outis
  • 1
    In my version of sed, it uses `-r` to specify extended regular expressions (which do not require escaping parentheses) instead of `-E`. – tomlogic Mar 27 '12 at 15:16
0

Here is a solution. It is not perfect: it only handles a single extra "," or "." at the end of the value.


sed -r -e 's/href="([^"]*)([.,]+)"/href="\1"/g' test.htm
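
If several trailing characters need to go, one possible extension (not part of the answer above) is to repeat the substitution with a sed label and the t command until nothing is left to strip. A rough sketch, assuming a sed that accepts -E in place of -r; the sample line is made up for illustration:

```shell
# Hypothetical sample line with several trailing punctuation characters.
line='<a href="http://example.com/page.,.">link</a>'

# :a defines a label; the s command removes one trailing . or , per pass;
# ta jumps back to the label whenever a substitution was made.
result=$(printf '%s\n' "$line" | sed -E -e ':a' -e 's/(href="[^"]*)[.,]"/\1"/' -e 'ta')

printf '%s\n' "$result"
```

Each pass drops exactly one character, so the loop runs once per trailing "." or "," and stops as soon as the value no longer ends in punctuation.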
Dyno Fu