
Given this example text:

<abr:rules>
<abr:ruleTypeDefinition>
<abr:code>ABB</abr:code>
<abr:ownership>
<abr:owner organization="NT" application="DCS" subapplication="FM"/>
...lines...
...........
</abr:rules>
<abr:rules>
<abr:ruleTypeDefinition>
<abr:code>ADE</abr:code>
<abr:ownership>
<abr:owner organization="NT" application="DCS" subapplication="CM"/>
...lines...
...........
</abr:rules> (end of group)

I would like to find and remove everything from <abr:rules> to </abr:rules> whenever subapplication is NOT "CM". Organization and application are always the same, and <abr:code> can be any string.

What I tried so far is

<abr:rules>\n<abr:ruleTypeDefinition>\n<abr:code>[a-zA-Z0-9]{3,}<\/abr:code>\n<abr:ownership>\n<.*"(FM|PSD|SSC)"\/>\n(?s).*?\n<\/abr:rules>\n

which works, but only because I know the other subapplication names.

Is there any way to do this with regex only?

farbiondriven

2 Answers


Try the following find and replace:

Find:

<abr:rules>((?!subapplication=).)*subapplication="(?!CM")[^"]+"((?!</abr:rules>).)*</abr:rules>

Replace:

(empty string)


Note: The above pattern only works if you enable the ". matches newline" option in Notepad++. If you don't want to do that, you can use [\S\s] in place of each dot.
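For anyone who wants to check the pattern outside Notepad++, here is a minimal Python sketch. It assumes that `re.DOTALL` plays the role of ". matches newline". The sample text is illustrative: the closing tags inside each block are filled in only for the demo, since the question elides those lines.

```python
import re

# Pattern copied from the answer above. re.DOTALL makes "." match newlines,
# which is assumed here to mirror Notepad++'s ". matches newline" option.
PATTERN = re.compile(
    r'<abr:rules>((?!subapplication=).)*'   # up to the first subapplication=
    r'subapplication="(?!CM")[^"]+"'        # any value except exactly CM
    r'((?!</abr:rules>).)*</abr:rules>',    # up to the nearest closing tag
    re.DOTALL,
)

# Illustrative input; the inner closing tags are assumptions for the demo.
SAMPLE = """<abr:rules>
<abr:ruleTypeDefinition>
<abr:code>ABB</abr:code>
<abr:ownership>
<abr:owner organization="NT" application="DCS" subapplication="FM"/>
</abr:ownership>
</abr:ruleTypeDefinition>
</abr:rules>
<abr:rules>
<abr:ruleTypeDefinition>
<abr:code>ADE</abr:code>
<abr:ownership>
<abr:owner organization="NT" application="DCS" subapplication="CM"/>
</abr:ownership>
</abr:ruleTypeDefinition>
</abr:rules>
"""


def remove_non_cm_rules(text: str) -> str:
    """Delete every <abr:rules> block whose subapplication is not "CM"."""
    return PATTERN.sub("", text)


cleaned = remove_non_cm_rules(SAMPLE)
print(cleaned)
```

The negative lookahead `(?!CM")` is what removes the need to list FM, PSD, SSC and so on: it rejects only the exact value CM and lets any other value through. The replacement leaves an empty line where each removed block used to be.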

Tim Biegeleisen

You should not use regex for XML; you can read why here: https://stackoverflow.com/a/1732454/3763374

Instead you can use some parser like XPath
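As a rough sketch of what the parser route could look like, here is a version using Python's standard `xml.etree.ElementTree`. Several things are assumptions, not taken from the question: the wrapper element `abr:ruleSet`, the namespace URI `urn:example:abr` (the `abr:` prefix has to be declared somewhere for the text to parse as XML), and the closing tags inside each block.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; the real document must declare abr: somewhere.
NS = {"abr": "urn:example:abr"}

# Illustrative input. The abr:ruleSet wrapper and the inner closing tags
# are assumptions made so that the sample is well-formed XML.
SAMPLE = """<abr:ruleSet xmlns:abr="urn:example:abr">
  <abr:rules>
    <abr:ruleTypeDefinition>
      <abr:code>ABB</abr:code>
      <abr:ownership>
        <abr:owner organization="NT" application="DCS" subapplication="FM"/>
      </abr:ownership>
    </abr:ruleTypeDefinition>
  </abr:rules>
  <abr:rules>
    <abr:ruleTypeDefinition>
      <abr:code>ADE</abr:code>
      <abr:ownership>
        <abr:owner organization="NT" application="DCS" subapplication="CM"/>
      </abr:ownership>
    </abr:ruleTypeDefinition>
  </abr:rules>
</abr:ruleSet>"""


def drop_non_cm_rules(xml_text: str) -> ET.Element:
    """Remove each abr:rules child whose owner has a subapplication other than CM."""
    root = ET.fromstring(xml_text)
    # findall returns a list, so removing children while looping is safe.
    for rules in root.findall("abr:rules", NS):
        owner = rules.find(".//abr:owner", NS)
        if owner is None:
            continue  # no owner element: leave the block alone
        if owner.get("subapplication") not in (None, "CM"):
            root.remove(rules)
    return root


root = drop_non_cm_rules(SAMPLE)
remaining_codes = [code.text for code in root.findall(".//abr:code", NS)]
print(remaining_codes)  # prints ['ADE']
```

Unlike the regex, this does not depend on line breaks or on the order of attributes inside `<abr:owner>`, but it does require the input to be well-formed XML.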

jixbo
  • True, for this specific case an XML parser would have been simpler. The accepted answer can be applied to any text. – farbiondriven Apr 13 '18 at 16:13
  • I won't upvote or downvote your answer because you are right, but: 1) it should be posted as a comment (since it isn't really an answer, more a piece of advice), 2) even if the linked question has many upvotes, the accepted answer is more a joke than a useful answer, 3) XPath isn't a parser, it's a query language. – Casimir et Hippolyte Apr 13 '18 at 21:02