I have XML similar to this:

<Level1Node>
.
.
    <Level2Node val="Retain"/>
.
.
</Level1Node>
<Level1Node>
.
.
    <Level2Node val="Replace"/>
.
.
</Level1Node>
<Level1Node>
.
.
    <Level2Node val="Retain"/>
.
.
</Level1Node>

I need to remove only the node below:

<Level1Node>
.
.
    <Level2Node val="Replace"/>
.
.
</Level1Node>

To match it non-greedily, I used the regex below:

perl -0 -pe 's|<Level1Node>.*?<Level2Node val="Replace"/>.*?</Level1Node>||gs' myxmlfile

But non-greedy matching only ends the match at the earliest possible point; it does not move the start of the match. How do I make the match start at the nearest preceding <Level1Node>?

Kannan Ramamoorthy
  • Please post your actual input. Your regex mentions `qpulse-hl7-par`, but there is no `qpulse-hl7-par` in your input. Also, `level2node` has an attribute and is self-closed in the input, but you wrote it as `` in your regex. – mkHun Aug 11 '17 at 06:39
  • @mkHun Updated the regex appropriately bro. – Kannan Ramamoorthy Aug 11 '17 at 07:55

2 Answers

You will need to use a negative lookahead to make sure you do not match closing Level1Node tags where you don't want to:

perl -0 -pe 's|<Level1Node>(?:(?!<\/Level1Node>).)*<Level2Node val="Replace"\/>(?:(?!<\/Level1Node>).)*<\/Level1Node>||gs' tmp.txt

Details:

<Level1Node>
(?:(?!<\/Level1Node>).)* # Everything except </Level1Node>
<Level2Node val="Replace"\/>
(?:(?!<\/Level1Node>).)* # Everything except </Level1Node>
<\/Level1Node>

?: is only there so that the parentheses are not interpreted as a capturing group.

If you plan to run this on a large file, you should probably measure the cost of the negative lookahead first. It is evaluated before every character the pattern consumes, so it might be high.
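To see the difference between the lazy pattern and the lookahead pattern, here is a minimal sketch on a made-up three-node sample (the dots from the question are left out). It assumes the node to remove is the one containing val="Replace", as stated in the question. Two details are my own additions and not part of the commands above: -0777 is used instead of -0 so that the whole input is slurped regardless of its content, and a trailing \n? is added so the removed node does not leave an empty line behind.

```shell
# Hypothetical sample mirroring the structure in the question.
xml='<Level1Node>
    <Level2Node val="Retain"/>
</Level1Node>
<Level1Node>
    <Level2Node val="Replace"/>
</Level1Node>
<Level1Node>
    <Level2Node val="Retain"/>
</Level1Node>
'

# Lazy version: the match starts at the FIRST <Level1Node> and stretches
# until it finds the "Replace" node, so the first node is removed as well.
lazy=$(printf '%s' "$xml" |
  perl -0777 -pe 's|<Level1Node>.*?<Level2Node val="Replace"/>.*?</Level1Node>\n?||gs')

# Lookahead version: (?:(?!</Level1Node>).)* refuses to step over a closing
# tag, so a match can never span more than one Level1Node.
tempered=$(printf '%s' "$xml" |
  perl -0777 -pe 's|<Level1Node>(?:(?!</Level1Node>).)*<Level2Node val="Replace"/>(?:(?!</Level1Node>).)*</Level1Node>\n?||gs')

printf 'lazy:\n%s\n\ntempered:\n%s\n' "$lazy" "$tempered"
```

With this sample, the lazy version leaves only the last node, while the lookahead version keeps both "Retain" nodes and drops only the "Replace" one.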

pchaigno

Use a proper parser! It's way simpler.

perl -MXML::LibXML -e'
   my $doc = XML::LibXML->new->parse_file($ARGV[0]);
   $_->unbindNode() for $doc->findnodes(q{//Level1Node[Level2Node[@val!="Retain"]]});
   $doc->toFH(\*STDOUT);
' tmp.txt
ikegami