
I need to select any text that is not part of a span tag.

Input -

the <span class='ptc-highlightedSearchResult'>PISTON</span> has their <span class='ptc-highlightedSearchResult'>ROD</span> ring

Regex which selects the <span> tag and its content -

(<span[^>]+class\s*=\s*("|')ptc-highlightedSearchResult\2[^>]*>)[^<]*(</span>)

I'm able to select the span tags and their content, but not the text outside them. Any help with a NOT operation would be appreciated.

Rohit Jain
Enigma
  • Why not use a library that already handles XML or HTML? Regular expressions are not best suited for markup like this, as you can find throughout SO. – BLaZuRE Oct 01 '13 at 06:27
  • @BLaZuRE - I can't use them. I have some limitations on what I can edit and have to do it with regex only. Please pass on your suggestions. – Enigma Oct 01 '13 at 06:32
  • I'm assuming then that http://docs.oracle.com/javase/1.4.2/docs/api/javax/xml/parsers/package-summary.html won't help you? Try this for more on the not operator: http://stackoverflow.com/questions/7317043/regex-not-operator – BLaZuRE Oct 01 '13 at 06:39
  • @DevendraW You can use this regex in a replace to remove all the spans, and you'll be left with what's outside. – Jerry Oct 01 '13 at 06:47
  • @Jerry - Sorry, but I can't remove the spans or their content. I just want to select the remaining text and highlight it as the user enters input, so tags will keep being added to the page. Please suggest a regex that avoids the spans and their contents and selects the rest of the string. – Enigma Oct 01 '13 at 06:56
  • @DevendraW And you want to 'unselect' only the parts inside span tags and no other tags? You could have something like [this](http://regex101.com/r/aI0iE0), but this avoids other tags as well. – Jerry Oct 01 '13 at 07:16
  • You could use `negative lookaround` here. Replace `` with `(?!span)`. That should select all tags but spans. But it won't be able to handle nested tags. – Vince Oct 01 '13 at 08:04
  • @Jerry Your suggestion is close, but it selects the content between the tags as well. In the result at the link you pasted, PISTON and ROD should not be selected, while 'the', 'has their' and 'ring' should be. Please suggest a correction. Thanks – Enigma Oct 01 '13 at 08:32

1 Answer


You can use this:

((?:(?![^<>]*(?:>))[^<](?![^<>]*</))+)

regex101 demo

It will match any text not inside or between opening and closing tags. There's a breakdown of the regex in the demo.
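As a quick check, here is a minimal sketch that runs the pattern above against the sample input from the question. Python is used only for illustration, since the question does not say which regex engine is in play; the names `OUTSIDE_TAGS` and `sample` are made up for this example.

```python
import re

# The pattern from the answer, unchanged.
# (?![^<>]*(?:>))  -> the current character is not inside a tag such as <span ...>
# [^<]             -> consume one character that does not open a tag
# (?![^<>]*</)     -> the text that follows is not directly terminated by a closing tag
OUTSIDE_TAGS = re.compile(r"((?:(?![^<>]*(?:>))[^<](?![^<>]*</))+)")

# Sample input from the question (hypothetical variable name).
sample = (
    "the <span class='ptc-highlightedSearchResult'>PISTON</span> has their "
    "<span class='ptc-highlightedSearchResult'>ROD</span> ring"
)

# Collect every run of text that lies outside the tags and their content.
matches = [m.group(1) for m in OUTSIDE_TAGS.finditer(sample)]
print(matches)  # ['the ', ' has their ', ' ring']
```

Note that the pattern does not look for `span` specifically. It skips the text directly before any closing tag, so it would need adjusting if other or nested tags appear in the input.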

Jerry