0

I have to find some patterns in an XML file, but I am unable to do it.

<field>
<uniqueid>account
</uniqueod>
<tableid>afs</tableid>
</field>
<field>
<uniqueid>address</uniqueod>
<tableid>afs</tableid>
</field>

What I have to do is extract the entries inside each <field> element and redirect them to file.txt. The output should look like this:

uniqueid  tableid
uniqueid  tableid

That is, for each uniqueid, its tableid should be printed alongside it. The entries can be the same or different. Please help me out.

mathematical.coffee
Nishant
  • `<uniqueid>` and `</uniqueod>` tag names don't match? – kev Apr 02 '12 at 05:30
  • possible duplicate of [RegEx match open tags except XHTML self-contained tags](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – jrturton Apr 02 '12 at 09:14

5 Answers

5

That's because you shouldn't be using grep for this. Try XSLT or XMLStarlet instead.

Ignacio Vazquez-Abrams
  • If you want to search the file, you should be using XPath. If you want to transform the file, you should be using XSLT. "grep", "sed" and friends might be good enough for quick'n'dirty one-offs, but for anything more, you'll run into a wall real fast. IMHO... – paulsm4 Apr 02 '12 at 05:34
3
$ xmlstarlet sel -t -m '//field' -v 'concat(normalize-space(uniqueid), " ", normalize-space(tableid))' -n input.xml
account afs
address afs
kev
2

I agree that grep (and other "standard" text tools like awk, sed and friends) is not the best solution to this problem.

However, something like what you want can be done with awk: https://stackoverflow.com/a/9881009/857132
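
For illustration, here is a minimal awk sketch of that idea. It is not the code from the linked answer. It assumes the closing tags are corrected to `</uniqueid>`, that each opening tag starts its own line, and that every `<field>` holds exactly one uniqueid and one tableid. The file names `sample.xml` and `file.txt` are only placeholders.

```shell
#!/bin/sh
# Hypothetical sample input: the data from the question with the closing tags corrected.
cat > sample.xml <<'EOF'
<field>
<uniqueid>account
</uniqueid>
<tableid>afs</tableid>
</field>
<field>
<uniqueid>address</uniqueid>
<tableid>afs</tableid>
</field>
EOF

awk '
  /<uniqueid>/ { want = "u" }   # the next non-empty text is a uniqueid value
  /<tableid>/  { want = "t" }   # the next non-empty text is a tableid value
  {
    line = $0
    gsub(/<[^>]*>/, "", line)           # strip the tags
    gsub(/^[ \t]+|[ \t]+$/, "", line)   # trim surrounding blanks
    if (line != "" && want == "u")      { uid = line; want = "" }
    else if (line != "" && want == "t") { tid = line; want = "" }
  }
  /<\/field>/ { print uid, tid; uid = tid = "" }   # one output line per field
' sample.xml > file.txt

cat file.txt
```

Because the pair is printed only when `</field>` is seen, a value that sits on a different line from its closing tag (as with `account` above) is still picked up. Anything less regular than this layout really does call for an XML-aware tool.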

Community
John3136
0

@ignacio is right. But if you still want to try a dirty hack, here is one specific to your file:

 grep -e "uniqueid" -e "tableid" sample.xml | sed -e 's/<[^>]*>//g' | sed -e '/^$/d' | sed 'N; s/\n/ /'

 account afs12
 address afs34

Here is your file "sample.xml" with corrected tags (uniqueod was incorrect) and some sample data:

<field>
<uniqueid>account
</uniqueid>
<tableid>afs12</tableid>
</field>
<field>
<uniqueid>address</uniqueid>
<tableid>afs34</tableid>
</field>

Explained:

grep -e "uniqueid" -e "tableid" sample.xml  -> keep only the lines containing the two tags and their data
sed -e 's/<[^>]*>//g'                       -> remove the tags, so only the data remains
sed -e '/^$/d'                              -> delete empty lines, i.e. those left behind by closing tags on their own line
sed 'N; s/\n/ /'                            -> join each pair of consecutive lines into one
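
As a rough sketch of the same four steps in a shorter form, the two clean-up `sed` calls can be merged into one, and `paste` can do the pairing. The output is redirected to `file.txt` as the question asks. This assumes the same corrected `sample.xml` as above and is just as fragile as the original pipeline: it relies on uniqueid and tableid values strictly alternating.

```shell
#!/bin/sh
# Hypothetical sample input, matching the corrected file shown in this answer.
cat > sample.xml <<'EOF'
<field>
<uniqueid>account
</uniqueid>
<tableid>afs12</tableid>
</field>
<field>
<uniqueid>address</uniqueid>
<tableid>afs34</tableid>
</field>
EOF

# 1. grep keeps the lines that mention either tag.
# 2. sed strips the tags, then deletes lines that are empty or blank.
# 3. paste joins every two lines with a single space.
grep -e "uniqueid" -e "tableid" sample.xml \
  | sed -e 's/<[^>]*>//g' -e '/^[[:space:]]*$/d' \
  | paste -d' ' - - > file.txt

cat file.txt
```

Deleting whitespace-only lines, not just empty ones, makes the pairing a little more tolerant of indented closing tags.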

There may be better ways, but my knowledge of sed and awk is at a beginner level.

DhruvPathak
0

This might work for you:

sed ':a;$!N;/^<uniqueid>/!D;/^<[^>]*>\n*\([^\n<]*\)\n*<[^>]*>\n*<[^>]*>\n*\([^\n<]*\)\n*<[^>]*>/!ba;s//\1 \2\n/;P;D' XML
potong