sed replace regex correct syntax

Question

<div data-name="abc" data-location="abcd"><div>Something</div></div>

sed 's/data-name="*"/data-name="helloworld"/g'

I am trying to replace the abc in data-name="abc" with helloworld.

However, my result returns

data-name="helloworld"helloworld"abc"

I expect it to be data-name="helloworld"

score 1 · Accepted Answer · answered Sep 17 '20 at 06:42

* matches zero or more of the preceding character, so "* matches zero or more " characters. In data-name="*", the "* matched zero characters and the final " matched the opening quote, so only data-name=" was matched and replaced.
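A quick way to see this is to run the original pattern against a shortened input. This is only a sketch: the one-attribute input string and the variable name broken are made up for illustration and are not from the question.

```shell
# The pattern data-name="*" stops right after the opening quote,
# so the old value and its closing quote are left in place.
broken=$(printf '%s\n' 'data-name="abc"' | sed 's/data-name="*"/data-name="helloworld"/g')
printf '%s\n' "$broken"
# prints: data-name="helloworld"abc"
```

The leftover abc" shows that the value itself was never part of the match.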

You seem to want to match the text inside the quotes, so match everything up to the next ".

sed 's/data-name="[^"]*"/data-name="helloworld"/g'
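As a rough check, the command can be run on the line from the question. The variable names html and fixed are just illustrative.

```shell
# Sample line taken from the question.
html='<div data-name="abc" data-location="abcd"><div>Something</div></div>'

# [^"]* consumes every character that is not a quote, so the match
# ends at the closing quote of data-name and leaves data-location alone.
fixed=$(printf '%s\n' "$html" | sed 's/data-name="[^"]*"/data-name="helloworld"/g')
printf '%s\n' "$fixed"
# prints: <div data-name="helloworld" data-location="abcd"><div>Something</div></div>
```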

Note that regular expressions can't parse HTML reliably, and it's better to use XML-aware tools to edit HTML. Regex crosswords are a fun way to learn regexes.

RavinderSingh13 · Answer 2 · 2020-09-17T06:44:51.143

Although HTML files should be parsed with a proper HTML parser, the OP is already using sed, so here is a solution with awk.

awk -v str="helloworld" '
match($0,/"[^"]*/){
  print substr($0,1,RSTART) str substr($0,RSTART+RLENGTH)
  next
}
1
' Input_file

With sed:

sed 's/\(^[^"]*\)\("[^"]*"\)\(.*\)/\1"helloworld"\3/' Input_file

If you only want to change lines that contain the string data-name, change the match($0,/"[^"]*/){ line to /data-name/ && match($0,/"[^"]*/){ in the awk solution, and prefix the sed command with the address /data-name/, so that s/ becomes /data-name/s/.
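Here is a sketch of both restricted variants on a two-line input. The input lines and the names input, awk_out and sed_out are invented for the example; the first line has a quoted attribute but no data-name, so it should pass through unchanged.

```shell
# Hypothetical two-line input: only the second line contains data-name.
input=$(printf '%s\n' '<p class="intro">Hi</p>' '<div data-name="abc" data-location="abcd"></div>')

# awk: only lines matching /data-name/ are rewritten; the rest fall through to "1".
awk_out=$(printf '%s\n' "$input" | awk -v str="helloworld" '
/data-name/ && match($0,/"[^"]*/){
  print substr($0,1,RSTART) str substr($0,RSTART+RLENGTH)
  next
}
1
')

# sed: the /data-name/ address limits the substitution to matching lines.
sed_out=$(printf '%s\n' "$input" | sed '/data-name/s/\(^[^"]*\)\("[^"]*"\)\(.*\)/\1"helloworld"\3/')

printf '%s\n' "$awk_out" "$sed_out"
```

Both variants still replace the first quoted value on a matching line, so they depend on data-name being the first attribute on that line.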
