We have an XML file in the following format:

<mailBox>
...
</mailBox>
<mailBox>demon</mailBox>
<tz>16385</tz>
<Contact>
....
</Contact>
</mailBox>
<mailBox>
...
</mailBox>

Is there a way to extract a particular node from this XML using a sed/awk/grep one-liner?
I was looking for something in the format:

`sed -n 'mailBox\>demon,......p`
IUnknown
  • Don't attempt to parse XML using regex. `xmlstarlet` might help. – devnull Mar 11 '14 at 09:39
  • Yes, trying to parse XML with regexp is a *bad* idea: http://stackoverflow.com/questions/8577060/why-is-it-such-a-bad-idea-to-parse-xml-with-regex ... – MarcoS Mar 11 '14 at 09:40
  • Thanks, but it's a pretty simple XML for me (without nested tags), and I needed a quick hit for troubleshooting. – IUnknown Mar 11 '14 at 09:42
  • Or, similar answer for HTML: http://stackoverflow.com/questions/4231382/regular-expression-pattern-not-matching-anywhere-in-string/4234491#4234491 – choroba Mar 11 '14 at 09:42

2 Answers

In your sample, the demon mailBox tag is opened and closed on the same line, unlike the other entries in the file. Is that an error, or is it specific to your data?

If it's an error (so the opening and closing mailBox tags are on separate lines):

sed -n '1h;1!H;${x
s/.*\(<mailBox>demon.*\)/\1/;s|</mailBox>.*||;p
}' YourFile
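
For line-oriented input, a plain address range close to what the question sketched may also be enough. This is a minimal sketch, assuming the line with `<mailBox>demon` and the closing `</mailBox>` line are different lines and mailBox entries are not nested; the sample file and the `alpha`/`omega` entries are hypothetical, made up only for illustration:

```shell
# Build a small, hypothetical sample file (not the asker's real data).
sample=$(mktemp)
cat > "$sample" <<'EOF'
<mailBox>alpha
<tz>1</tz>
</mailBox>
<mailBox>demon
<tz>16385</tz>
<Contact>
....
</Contact>
</mailBox>
<mailBox>omega
<tz>2</tz>
</mailBox>
EOF

# Print from the line containing <mailBox>demon through the next
# line that contains </mailBox>; -n suppresses all other output.
demon_block=$(sed -n '/<mailBox>demon/,/<\/mailBox>/p' "$sample")
printf '%s\n' "$demon_block"

rm -f "$sample"
```

Unlike the hold-space version above, this keeps the closing `</mailBox>` line in the output, and it never loads the whole file into memory.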
jaypal singh
NeronLeVelu

You could try:

perl -0777 -nE 'foreach (/<mailBox>(.*?)<\/mailBox>/sg) {say $_}' file
Håkon Hægland