
I posted this question to a thread from 2011 (Get text inside xml tag using grep). I tried to get the final answer to work using command-line arguments ($1=filename, $2=tagname) instead of the fixed names:

grep -E -m 1 -o "<$2>(.*)</$2>" ./private/$1.xml | sed -e 's,.*<$2>\([^<]*\)</$2>.*,\1,g'

Apparently this does not work, because the part after the pipe doesn't get the argument $2. I am a total Linux noob, but my hunch is that the pipe starts a new process that does not get the parent's arguments. I searched Google for quite some time, but only got more confused. Is there a simple workaround for this?

Pieter

1 Answer


Your command does not work because the shell does not expand variables inside single quotes ('...'), so sed receives the literal text $2. The pipe is not the cause.

This should work:

grep -E -m 1 -o "<$2>(.*)</$2>" ./private/$1.xml | sed -e "s,.*<$2>\\([^<]*\\)</$2>.*,\\1,g"
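To see the quoting difference in isolation, here is a minimal sketch. It assumes a POSIX shell and uses `set --` to fake the script's arguments; the values `data` and `title` and the sample line `<title>Hello</title>` are made up for illustration.

```shell
#!/bin/sh
# Simulate the script's positional parameters ($1=data, $2=title).
# These values are hypothetical, chosen only for the demonstration.
set -- data title

# Single quotes: the shell leaves $2 untouched.
single='<$2>'
# Double quotes: the shell substitutes the value of $2.
double="<$2>"

printf '%s\n' "$single"   # prints <$2>
printf '%s\n' "$double"   # prints <title>

# The sed expression in double quotes, applied to a sample line.
# Inside double quotes, \( and \1 still reach sed unchanged.
result=$(printf '%s\n' '<title>Hello</title>' |
    sed -e "s,.*<$2>\([^<]*\)</$2>.*,\1,g")
printf '%s\n' "$result"   # prints Hello
```

The shell expands `$2` before it starts either side of the pipeline, so both grep and sed see the substituted text as long as the variable sits outside single quotes.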

That said, using grep for such a task is not really a good idea. It is better to use an actual XML/HTML processor like my Xidel (or xpath/xmlstarlet/...). Then you can just write:

xidel ./private/$1.xml -e //$2
BeniBela
  • Thanks, works like a charm. I fully agree, this is indeed a bad idea, but I am awaiting a proper XML parser in the software package I use. I'll have a look at Xidel – Pieter Sep 02 '13 at 16:24