I have a regex question. I am not particularly familiar with regex.

I have a large XML file from which I am trying to extract the tags, in order to check that each possible tag is handled correctly by the code which processes them.

Just as an experiment, I have written this one-line test file, regex.txt:

grep should match this <word> and this <otherword> but should not <match> everything <between> end of line

I tried this grep command on it:

grep -o "<.*>" regex.txt

However, it returned this:

<word> and this <otherword> but should not <match> everything <between>

I used the -o switch to print only the matched text rather than the entire line. grep appears to match from the first < on the line to the last >, rather than matching each tag individually. It also prints the angle brackets themselves. Is there a way to strip the brackets and to prevent this "spanning match" behaviour (sorry, I am not sure how to describe it)?
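
For reference, the spanning behaviour described above is usually called greedy matching: `.*` takes as many characters as it can, including any `>` in between. Below is a minimal sketch of one common workaround, assuming GNU or BSD grep and that no tag contains a literal `>` inside it. It replaces `.*` with the negated bracket expression `[^>]*` and strips the brackets with `tr`. The variable names (`greedy`, `tags`, `names`) are only illustrative.

```shell
# Recreate the one-line test file from the question.
printf '%s\n' 'grep should match this <word> and this <otherword> but should not <match> everything <between> end of line' > regex.txt

# Greedy: ".*" runs on to the last ">" on the line, so there is one long match.
greedy=$(grep -o "<.*>" regex.txt)

# Negated class: "[^>]*" cannot cross a ">", so each tag is matched on its own.
tags=$(grep -o "<[^>]*>" regex.txt)

# Strip the angle brackets afterwards by deleting every "<" and ">" character.
names=$(grep -o "<[^>]*>" regex.txt | tr -d '<>')

# Prints one tag name per line: word, otherword, match, between.
printf '%s\n' "$names"
```

This is not a real XML parser: on an actual XML file it would also pick up closing tags, attributes, comments and processing instructions, so the output would likely need further filtering.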

FreelanceConsultant