I have a file containing XML. Each line holds one root element with a couple of child elements inside it. The structure looks like this:

<document><title>some title1</title><abstract>Some abstract1</abstract></document>
<document><title>some title2</title><abstract>Some abstract2</abstract></document>
<document><title>some title3</title><abstract>Some abstract3</abstract></document>
<document><title>some title4</title><abstract>Some abstract4</abstract></document>

Now I need to find all lines where a given tag contains a particular word, e.g. all lines that contain abstract1 inside the <abstract> tag.

How can I do this with grep, awk, or sed?

Sudar

2 Answers

Using sed:

sed -n '/<abstract>[^<]*abstract1/p' input
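
As a rough illustration of why the `[^<]*` part matters, here is a minimal sketch run against a throwaway sample file. The sample data and the variable names (`tmp`, `sed_matches`, `awk_matches`) are made up for this example. Because `[^<]*` cannot cross a `<`, the match has to stay inside the text of the `<abstract>` element, so a line that only mentions the word in its title is skipped. The same regular expression should also work as an awk pattern.

```shell
# Hypothetical sample data in the same one-document-per-line format.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<document><title>abstract1 in the title</title><abstract>Some other text</abstract></document>
<document><title>some title1</title><abstract>Some abstract1</abstract></document>
<document><title>some title2</title><abstract>Some abstract2</abstract></document>
EOF

# -n suppresses default output; /p prints only lines matching the pattern.
# [^<]* matches any run of characters that contains no "<".
sed_matches=$(sed -n '/<abstract>[^<]*abstract1/p' "$tmp")

# awk prints a line by default when a bare pattern matches it.
awk_matches=$(awk '/<abstract>[^<]*abstract1/' "$tmp")

printf 'sed: %s\n' "$sed_matches"
printf 'awk: %s\n' "$awk_matches"
rm -f "$tmp"
```

Only the second sample line should be printed by both commands. Note that the pattern also matches longer words such as abstract10, since it does not check word boundaries.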
perreal

Update:

    grep  -nir  "<abstract>.*word.*</abstract>" filename
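
For reference, `-n` prefixes each match with its line number, `-i` makes the match case-insensitive, and `-r` recurses into directories, which is not needed for a single file. A slightly tighter variant, sketched below on made-up sample data (the names `sample` and `grep_matches` are hypothetical), replaces `.*` with `[^<]*` so the match cannot spill out of one `<abstract>` element if a line ever holds more than one.

```shell
# Hypothetical sample data; the third line has the word only in the title.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<document><title>some title1</title><abstract>Some abstract1</abstract></document>
<document><title>some title2</title><abstract>Some abstract2</abstract></document>
<document><title>Some ABSTRACT1</title><abstract>unrelated</abstract></document>
EOF

# -n: show line numbers; -i: ignore case.
# [^<]* keeps the whole match inside a single <abstract>...</abstract> pair.
grep_matches=$(grep -n -i '<abstract>[^<]*abstract1[^<]*</abstract>' "$sample")

printf '%s\n' "$grep_matches"
rm -f "$sample"
```

With this sample, only the first line should be reported, prefixed with `1:`.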
bhab
  • This will give me all lines that contain "your word". But I want to find only the lines that contain "your word" inside a particular tag, like <abstract>. – Sudar Mar 20 '13 at 05:23
  • The updated code works, but I had already accepted the other answer, so I could only upvote this one. – Sudar Mar 20 '13 at 15:45