
I'm trying to construct a regex identifier for the following...

<w:p>
    Some Other XML
        <w:p someatribute="something">
        HERE
        </w:p>
    Some Other XML
</w:p>

The Identifier needs to select just the following part...

        <w:p someatribute="something">
        HERE
        </w:p>

And leave everything else in place.

My current attempt... <w:p(.*?)Test(.*?):p>

Is selecting everything from the above sample. Can regex help me here to identify just the closest match and any text in between?

Many thanks!

TR

  • Why would you do this with regex as opposed to standard DOM traversal techniques? – Mike Brant Oct 07 '14 at 14:02
  • Well, if you know that there is only text in between, then `<w:p[^>]*?>([^<>]*?)Test([^<>]*?)<\/w:p>` might do - but [in general, **NO**](http://stackoverflow.com/a/1732454/1048572). – Bergi Oct 07 '14 at 14:02
  • Okay, thanks chaps. Perhaps I'll open another question on how to revise what I am doing, then. Cheers for your help. – Tom Rebbettes Oct 07 '14 at 14:25
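For readers wondering what the DOM-traversal route suggested in the comments might look like, here is a minimal sketch in Python rather than the asker's own toolchain. It assumes the `w:` prefix is bound to the WordprocessingML namespace (the question's sample omits the declaration, so it is added here), and `innermost_paragraphs` is a hypothetical helper name, not part of any library.

```python
import xml.etree.ElementTree as ET

# Assumption: "w" is the WordprocessingML namespace used by .docx files.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# The question's sample, with the namespace declaration added so it parses.
SAMPLE = f"""<w:p xmlns:w="{W_NS}">
    Some Other XML
        <w:p someatribute="something">
        HERE
        </w:p>
    Some Other XML
</w:p>"""


def innermost_paragraphs(root):
    """Return the w:p elements that contain no nested w:p element."""
    tag = f"{{{W_NS}}}p"
    return [
        el for el in root.iter(tag)
        # el.iter(tag) yields el itself first, so skip it when checking.
        if not any(child is not el for child in el.iter(tag))
    ]


root = ET.fromstring(SAMPLE)
matches = innermost_paragraphs(root)
for el in matches:
    print(el.attrib, el.text.strip())
```

Because the parser tracks nesting, the outer paragraph is excluded by structure rather than by pattern matching, which is the main reason this approach tends to be more robust than a regex.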

1 Answer

(<w:p[^>]+>(?:(?!<\/w:p>).)+?<\/w:p>)

Try this. Set the `s` (dot-matches-newline) flag. See the demo.

http://regex101.com/r/hQ1rP0/40
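As a rough illustration of the same idea outside regex101, the sketch below applies a tempered-dot pattern to the question's sample in Python. It is a variant, not the exact pattern from this answer: the `\b` after `<w:p` and the extra `<w:p\b` alternative in the lookahead are assumptions added so that an attribute-less outer `<w:p>` and tags such as `<w:pPr>` are not mistaken for the innermost paragraph. `re.DOTALL` plays the role of the `s` flag.

```python
import re

SAMPLE = """<w:p>
    Some Other XML
        <w:p someatribute="something">
        HERE
        </w:p>
    Some Other XML
</w:p>"""

# Opening <w:p ...>, then any characters that do not start another
# opening <w:p or a closing </w:p>, then the closing tag.
INNER_P = re.compile(
    r"<w:p\b[^>]*>(?:(?!<w:p\b|</w:p>).)*?</w:p>",
    re.DOTALL,  # let "." match newlines, like the "s" flag
)

found = INNER_P.findall(SAMPLE)

# Removing the match leaves the surrounding markup in place.
remaining = INNER_P.sub("", SAMPLE)

print(found[0])
print(remaining)
```

This only holds up while the inner paragraph contains no nested `w:p` of its own; for anything deeper, a parser is the safer choice.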

vks
  • Worked great, thanks; at least it did what I requested. It didn't solve my issue, but it helped me narrow it down, thanks a bunch. Turns out you can't delete a paragraph using the XML SDK if doing so leaves a parent of type TableCell empty. I guess TableCells need at least one paragraph in them. I have gone back to using the standard OpenXML SDK, comparing the types of the parents. – Tom Rebbettes Oct 07 '14 at 15:53