I have a stream that looks like this (except with more stuff):

<ret:EditUse>Broadcast</ret:EditUse>
<EditUse>Movie</EditUse>

and I'm trying to strip the XML tags from it using sed:

sed "s_</?(ret:)?EditUse>__"

I've tested the regular expression using RegexPal but it doesn't seem to work in sed. Any ideas as to what's wrong?

Alex Bliskovsky
    [The pony he comes...](http://stackoverflow.com/a/1732454/554546) –  Dec 29 '11 at 16:40
    I'm not trying to parse xml, I'm trying to strip it. I believe regex is perfectly suitable for this specific task, especially because EditUse is the only tag that shows up. – Alex Bliskovsky Dec 29 '11 at 16:43

1 Answer

This is the regex that works with sed:

sed "s_</\?\(ret:\)\?EditUse>__g"
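  1. Escape `?`, `(` and `)` with backslashes. By default sed uses basic regular expressions, in which these characters are literal unless escaped.
  2. Use the `g` flag to replace every match on a line, not just the first one.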

Result:

Broadcast
Movie
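
A minimal way to reproduce this, assuming GNU sed and a POSIX shell (the variable names are only illustrative). The second command is a variant that uses `\{0,1\}` instead of `\?`, since `\?` in a basic regular expression is itself a GNU extension and may not work with other sed implementations:

```shell
# Sample input taken from the question.
input='<ret:EditUse>Broadcast</ret:EditUse>
<EditUse>Movie</EditUse>'

# Basic regular expression with GNU-style escapes: \? and \( \).
stripped=$(printf '%s\n' "$input" | sed 's_</\?\(ret:\)\?EditUse>__g')

# Same pattern using the POSIX interval \{0,1\} in place of \?.
stripped_portable=$(printf '%s\n' "$input" | sed 's_</\{0,1\}\(ret:\)\{0,1\}EditUse>__g')

printf '%s\n' "$stripped"
printf '%s\n' "$stripped_portable"
```

Single quotes are used around the sed script so the shell leaves the backslashes untouched.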
Birei
    It works without escaping if you use `-r` option (it enables extended regular expressions). – KL-7 Dec 29 '11 at 16:53
    @KL-7: Yes, you are right. It's good to know that command-line option, but also that it is a GNU extension and less portable. – Birei Dec 29 '11 at 17:00
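
The extended-regex variant mentioned in the comments can be sketched like this. It assumes a sed that accepts `-E` (GNU sed and BSD sed both do, as far as I know; `-r` is the older GNU spelling of the same option). With extended regular expressions the pattern from the question works unescaped, and only the `g` flag needs to be added:

```shell
# Sample input taken from the question.
input='<ret:EditUse>Broadcast</ret:EditUse>
<EditUse>Movie</EditUse>'

# -E enables extended regular expressions, so ?, ( and ) need no backslashes.
stripped_ere=$(printf '%s\n' "$input" | sed -E 's_</?(ret:)?EditUse>__g')

printf '%s\n' "$stripped_ere"
```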