
I want to use Perl to remove parts of an XML file (elements of a JMeter test plan) that look like this:

              <hashTree>
                <JSONPathAssertion guiclass="JSONPathAssertionGui" testclass="JSONPathAssertion" testname="SON-ASS" enabled="true">
                  <stringProp name="JSON_PATH">$.status</stringProp>
                  <stringProp name="EXPECTED_VALUE">ok</stringProp>
                  <boolProp name="JSONVALIDATION">true</boolProp>
                  <boolProp name="EXPECT_NULL">false</boolProp>
                  <boolProp name="INVERT">false</boolProp>
                  <boolProp name="ISREGEX">false</boolProp>
                </JSONPathAssertion>
                <hashTree/>
              </hashTree>

The first occurrence of JSONPathAssertion in test7.jmx is:

grep -A 10 JSONPath test7.jmx | head -n 10:

</HTTPSamplerProxy>
          <hashTree>
            <JSONPathAssertion guiclass="JSONPathAssertionGui" testclass="JSONPathAssertion" testname="SON-ASS" enabled="true">
              <stringProp name="JSON_PATH">$.status</stringProp>
              <stringProp name="EXPECTED_VALUE">ok</stringProp>
              <boolProp name="JSONVALIDATION">true</boolProp>
              <boolProp name="EXPECT_NULL">false</boolProp>
              <boolProp name="INVERT">false</boolProp>
              <boolProp name="ISREGEX">false</boolProp>
            </JSONPathAssertion>
            <hashTree/>
            <JSR223PreProcessor guiclass="TestBeanGUI" testclass="JSR223PreProcessor" testname="JSR223 PreProcessor" enabled="true">

For that first occurrence, <hashTree/> is not followed by </hashTree> with only a line break and spaces in between.

I then run:

cat test7.jmx | perl -0777pe 's` *<hashTree>. *<JSONPathAssertion.*?</JSONPathAssertion>. *<hashTree/>. *</hashTree>.``gs' > test7_1.jmx

Then I run grep -A 10 JSONPath test7_1.jmx | head -n 10 and get empty output: the resulting file does not contain JSONPathAssertion at all. Why was that first occurrence removed as well?

P.S. This may deserve a separate question, but I could not find how to match a single newline character in Perl on its own, only as part of larger patterns:

How to match a newline \n in a perl regex?

Regex to match any character including new lines
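
A single line feed can be matched on its own with \n. The following is a minimal sketch of how it behaves and how it differs from `.`; the sample string and the variable names are made up for illustration and are not taken from the .jmx file.

```perl
use strict;
use warnings;

# Hypothetical two-line sample, not taken from the .jmx file.
my $text = "first\nsecond\n";

# \n matches exactly one line feed character; count how many there are.
my $line_feeds = () = $text =~ /\n/g;

# Without /s, '.' does not match a line feed, so the match stops at the end of line 1.
my ($no_s) = $text =~ /(f.*)/;

# With /s, '.' also matches line feeds, so the match runs to the end of the string.
my ($with_s) = $text =~ /(f.*)/s;

printf "line feeds: %d\n", $line_feeds;
printf "without /s: %d chars, with /s: %d chars\n", length $no_s, length $with_s;
```

In the one-liner from the question, the /s modifier is what allows each bare `.` to stand in for the line feed at the end of a line.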

ADDED after comments:

The full test7.jmx file is below (I tested again by copying the contents from SO and pasting them into a new file in vi). Everything was done on macOS Mojave; one iteration was also confirmed on CentOS 7:

<?xml version="1.0" encoding="UTF-8"?>
<jmeterTestPlan version="1.2" properties="5.0" jmeter="5.1.1 r1855137">
  <hashTree>
    <TestPlan guiclass="TestPlanGui" testclass="TestPlan" testname="test one" enabled="true">
      <stringProp name="TestPlan.comments"></stringProp>
      <boolProp name="TestPlan.functional_mode">false</boolProp>
      <boolProp name="TestPlan.tearDown_on_shutdown">true</boolProp>
      <boolProp name="TestPlan.serialize_threadgroups">false</boolProp>
      <elementProp name="TestPlan.user_defined_variables" elementType="Arguments" guiclass="ArgumentsPanel" testclass="Arguments" testname="User Defined Variables" enabled="true">
        <collectionProp name="Arguments.arguments"/>
      </elementProp>
      <stringProp name="TestPlan.user_define_classpath"></stringProp>
    </TestPlan>
    <hashTree>
      <com.blazemeter.jmeter.threads.concurrency.ConcurrencyThreadGroup guiclass="com.blazemeter.jmeter.threads.concurrency.ConcurrencyThreadGroupGui" testclass="com.blazemeter.jmeter.threads.concurrency.ConcurrencyThreadGroup" testname="Thread - main" enabled="true">
        <elementProp name="ThreadGroup.main_controller" elementType="com.blazemeter.jmeter.control.VirtualUserController"/>
        <stringProp name="ThreadGroup.on_sample_error">continue</stringProp>
        <stringProp name="TargetLevel">${__P(threads, 1)}</stringProp>
        <stringProp name="RampUp">${__P(time, 1)}</stringProp>
        <stringProp name="Steps">${__P(steps, 1)}</stringProp>
        <stringProp name="Hold">${__P(hold, 3)}</stringProp>
        <stringProp name="LogFilename"></stringProp>
        <stringProp name="Iterations"></stringProp>
        <stringProp name="Unit">M</stringProp>
      </com.blazemeter.jmeter.threads.concurrency.ConcurrencyThreadGroup>
      <hashTree>
        <GenericController guiclass="LogicControllerGui" testclass="GenericController" testname="Simple Controller - main" enabled="true"/>
        <hashTree>
          <HTTPSamplerProxy guiclass="HttpTestSampleGui" testclass="HTTPSamplerProxy" testname="REQUEST1" enabled="true">
            <boolProp name="HTTPSampler.postBodyRaw">true</boolProp>
            <elementProp name="HTTPsampler.Arguments" elementType="Arguments">
              <collectionProp name="Arguments.arguments">
                <elementProp name="" elementType="HTTPArgument">
                  <boolProp name="HTTPArgument.always_encode">false</boolProp>
                  <stringProp name="Argument.value">{&#xd;
    &quot;version&quot;: &quot;1.1&quot;,&#xd;
    &quot;test&quot;: {&#xd;
}</stringProp>
                  <stringProp name="Argument.metadata">=</stringProp>
                </elementProp>
              </collectionProp>
            </elementProp>
            <stringProp name="HTTPSampler.domain"></stringProp>
            <stringProp name="HTTPSampler.port"></stringProp>
            <stringProp name="HTTPSampler.protocol"></stringProp>
            <stringProp name="HTTPSampler.contentEncoding"></stringProp>
            <stringProp name="HTTPSampler.path"></stringProp>
            <stringProp name="HTTPSampler.method">POST</stringProp>
            <boolProp name="HTTPSampler.follow_redirects">false</boolProp>
            <boolProp name="HTTPSampler.auto_redirects">false</boolProp>
            <boolProp name="HTTPSampler.use_keepalive">true</boolProp>
            <boolProp name="HTTPSampler.DO_MULTIPART_POST">false</boolProp>
            <stringProp name="HTTPSampler.embedded_url_re"></stringProp>
            <stringProp name="HTTPSampler.connect_timeout"></stringProp>
            <stringProp name="HTTPSampler.response_timeout"></stringProp>
          </HTTPSamplerProxy>
          <hashTree>
            <JSONPathAssertion guiclass="JSONPathAssertionGui" testclass="JSONPathAssertion" testname="JSON Assertion" enabled="true">
              <stringProp name="JSON_PATH">$.status</stringProp>
              <stringProp name="EXPECTED_VALUE">ok</stringProp>
              <boolProp name="JSONVALIDATION">true</boolProp>
              <boolProp name="EXPECT_NULL">false</boolProp>
              <boolProp name="INVERT">false</boolProp>
              <boolProp name="ISREGEX">false</boolProp>
            </JSONPathAssertion>
            <hashTree/>
            <JSR223PreProcessor guiclass="TestBeanGUI" testclass="JSR223PreProcessor" testname="JSR223 PreProcessor" enabled="true">
              <stringProp name="cacheKey">true</stringProp>
              <stringProp name="filename"></stringProp>
              <stringProp name="parameters"></stringProp>
              <stringProp name="script">// period in the past - year-month-day, set from properties in User Defined Variables

import java.time.Instant;
import java.time.temporal.ChronoUnit;
import groovy.json.JsonOutput;
import org.apache.commons.lang.RandomStringUtils;</stringProp>
              <stringProp name="scriptLanguage">groovy</stringProp>
            </JSR223PreProcessor>
            <hashTree/>
            <JSR223PostProcessor guiclass="TestBeanGUI" testclass="JSR223PostProcessor" testname="JSR223 PostProcessor" enabled="true">
              <stringProp name="cacheKey">true</stringProp>
              <stringProp name="filename"></stringProp>
              <stringProp name="parameters"></stringProp>
              <stringProp name="script">import java.time.Instant;
import java.time.temporal.ChronoUnit;</stringProp>
              <stringProp name="scriptLanguage">groovy</stringProp>
            </JSR223PostProcessor>
            <hashTree/>
          </hashTree>
          <IfController guiclass="IfControllerPanel" testclass="IfController" testname="If Controller" enabled="true">
            <stringProp name="IfController.condition">${__groovy(${random_variable} == 1)}</stringProp>
            <boolProp name="IfController.evaluateAll">false</boolProp>
            <boolProp name="IfController.useExpression">true</boolProp>
          </IfController>
          <hashTree>
            <RandomController guiclass="RandomControlGui" testclass="RandomController" testname="Random Controller" enabled="true">
              <intProp name="InterleaveControl.style">1</intProp>
            </RandomController>
            <hashTree>
              <HTTPSamplerProxy guiclass="HttpTestSampleGui" testclass="HTTPSamplerProxy" testname="REQUEST2" enabled="true">
                <boolProp name="HTTPSampler.postBodyRaw">true</boolProp>
                <elementProp name="HTTPsampler.Arguments" elementType="Arguments">
                  <collectionProp name="Arguments.arguments">
                    <elementProp name="" elementType="HTTPArgument">
                      <boolProp name="HTTPArgument.always_encode">false</boolProp>
                      <stringProp name="Argument.value">{&#xd;
    &quot;version&quot;: &quot;1.1&quot;,&#xd;
    &quot;test&quot;: {&#xd;
}</stringProp>
                      <stringProp name="Argument.metadata">=</stringProp>
                    </elementProp>
                  </collectionProp>
                </elementProp>
                <stringProp name="HTTPSampler.domain"></stringProp>
                <stringProp name="HTTPSampler.port"></stringProp>
                <stringProp name="HTTPSampler.protocol"></stringProp>
                <stringProp name="HTTPSampler.contentEncoding"></stringProp>
                <stringProp name="HTTPSampler.path"></stringProp>
                <stringProp name="HTTPSampler.method">POST</stringProp>
                <boolProp name="HTTPSampler.follow_redirects">false</boolProp>
                <boolProp name="HTTPSampler.auto_redirects">false</boolProp>
                <boolProp name="HTTPSampler.use_keepalive">true</boolProp>
                <boolProp name="HTTPSampler.DO_MULTIPART_POST">false</boolProp>
                <stringProp name="HTTPSampler.embedded_url_re"></stringProp>
                <stringProp name="HTTPSampler.connect_timeout"></stringProp>
                <stringProp name="HTTPSampler.response_timeout"></stringProp>
              </HTTPSamplerProxy>
              <hashTree>
                <JSONPathAssertion guiclass="JSONPathAssertionGui" testclass="JSONPathAssertion" testname="JSON Assertion" enabled="true">
                  <stringProp name="JSON_PATH">$.status</stringProp>
                  <stringProp name="EXPECTED_VALUE">ok</stringProp>
                  <boolProp name="JSONVALIDATION">true</boolProp>
                  <boolProp name="EXPECT_NULL">false</boolProp>
                  <boolProp name="INVERT">false</boolProp>
                  <boolProp name="ISREGEX">false</boolProp>
                </JSONPathAssertion>
                <hashTree/>
              </hashTree>
            </hashTree>
          </hashTree>
        </hashTree>
      </hashTree>
    </hashTree>
  </hashTree>
</jmeterTestPlan>
Alex Martian
  • Contrary to what you claim, the code you posted doesn't perform any substitutions when provided the text you posted. – ikegami Nov 18 '19 at 16:16
  • Re "*I could not find how to match single newline character in perl*", There's no "new line" character. I presume you mean a line feed character? `\n` – ikegami Nov 18 '19 at 16:18
  • @ikegami good to hear my code looks correct. I will try to obfuscate full file to produce sample for reperformance. – Alex Martian Nov 18 '19 at 16:18
  • 1
    I didn't say the code "looks correct". You didn't even specify what you are trying to do, so I couldn't possibly know if it's correct or not. I'd be more inclined to say it's NOT correct, seeing as you claim it doesn't do what you want (but you also claimed it performed a substitution it doesn't perform). – ikegami Nov 18 '19 at 16:19
  • @ikegami, I updated the answer. Producing some smaller files with part of content up to now results in "correct" non removal of that occurrence. I added exact commands run how I got my results and what I want to achieve. – Alex Martian Nov 18 '19 at 16:39
  • If I read that correctly, you still only provide data for which the code *works*, so you still have yet to demonstrate your problem – ikegami Nov 18 '19 at 16:41
  • @ikegami I deleted extra and sensitive info, file is 145 lines, pasted in full and tested by copying and pasting back to vi. I appreciate your help, please have a look. – Alex Martian Nov 18 '19 at 17:17
  • [This](https://regex101.com/r/R0U8YF/2) shows what is being matched. That's all one match. The `.*?` is matching more than you intend it to. As always, it's far more robust and even easier with a proper XML parser – ikegami Nov 18 '19 at 19:06

2 Answers


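This regex101 demo (https://regex101.com/r/R0U8YF/2) shows what is being matched. The entire highlighted portion of the text is just one match. The .*? is matching more than you intend it to: a non-greedy quantifier still expands as far as it must for the rest of the pattern to succeed. After the first </JSONPathAssertion>, the <hashTree/> is followed by <JSR223PreProcessor rather than </hashTree>, so the regex engine keeps extending .*? until it reaches the second </JSONPathAssertion>, where the remainder of the pattern finally matches.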
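
The overmatch can be reproduced on a reduced input. The following is a minimal sketch, not the original file: the XML sample and the variable names are made up for illustration, and the pattern is the one from the question, rewritten with qr{} delimiters.

```perl
use strict;
use warnings;

# Hypothetical reduced sample: in the first branch the assertion has a sibling,
# in the second branch it is the only element.
my $xml = <<'XML';
<root>
  <hashTree>
    <JSONPathAssertion testname="first">
    </JSONPathAssertion>
    <hashTree/>
    <Other/>
    <hashTree/>
  </hashTree>
  <hashTree>
    <JSONPathAssertion testname="second">
    </JSONPathAssertion>
    <hashTree/>
  </hashTree>
</root>
XML

# The pattern from the question; /s lets '.' match line feeds.
my $re = qr{ *<hashTree>. *<JSONPathAssertion.*?</JSONPathAssertion>. *<hashTree/>. *</hashTree>.}s;

# Collect every match of the whole pattern.
my @matches = $xml =~ /($re)/g;

printf "number of matches: %d\n", scalar @matches;
printf "match spans both assertions: %s\n",
    ($matches[0] =~ /first/ && $matches[0] =~ /second/) ? "yes" : "no";
```

Here the pattern matches once, starting at the first <hashTree> and ending after the second branch, so a substitution would delete everything in between.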

I would use a proper XML parser.

use XML::LibXML qw( );

my $xpath = "
   //hashTree[
      count(*)=2 and
      *[position()=1 and name()='JSONPathAssertion'] and
      *[position()=2 and name()='hashTree' and count(*)=0]
   ]
";

my $doc = XML::LibXML->new->parse_file("a.xml");
$_->unlink for $doc->findnodes($xpath);
$doc->toFile("b.xml");

or

use XML::LibXML qw( );

my $doc = XML::LibXML->new->parse_file("a.xml");
for my $node ($doc->findnodes("//hashTree")) {
   my @child_eles = $node->findnodes("*");
   $node->unlink
      if @child_eles == 2
      && $child_eles[0]->nodeName eq "JSONPathAssertion"
      && $child_eles[1]->nodeName eq "hashTree"
      && $child_eles[1]->findnodes("*")->size == 0;
}

$doc->toFile("b.xml");
ikegami

ikegami found the reason the Perl one-liner in the question misbehaved. For future reference, here is a corrected Perl one-liner:

cat test.jmx | perl -0777pe 's` *<hashTree>\n( *)<JSONPathAssertion.*\n.*\n.*\n.*\n.*\n.*\n.*\n *</JSONPathAssertion>\n *(<hashTree/>)\n *</hashTree>\n`$1$2`g' > test_.jmx

I only noticed today that, as far as I can tell, every element in a JMeter test plan's XML tree must be followed by a <hashTree> holding its child elements. If an element has no children, an empty branch is written instead: <hashTree/>. Therefore, when deleting an element that is the only one in its branch, the branch should be replaced with the code for an empty branch, <hashTree/>.
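
The same replacement rule can be sketched without hard-coding the number of property lines. This is a variant under stated assumptions, not the one-liner above: it assumes the pretty-printed layout shown in the question, the sample input and variable names are hypothetical, and it uses a negative lookahead so the body can never run past the first closing tag. A proper XML parser remains the more robust choice.

```perl
use strict;
use warnings;

# Hypothetical reduced input; element names follow the file in the question.
my $xml = <<'XML';
<hashTree>
  <Sampler/>
  <hashTree>
    <JSONPathAssertion testname="first">
      <stringProp name="JSON_PATH">$.status</stringProp>
    </JSONPathAssertion>
    <hashTree/>
    <Other/>
    <hashTree/>
  </hashTree>
  <Sampler/>
  <hashTree>
    <JSONPathAssertion testname="second">
      <stringProp name="JSON_PATH">$.status</stringProp>
    </JSONPathAssertion>
    <hashTree/>
  </hashTree>
</hashTree>
XML

# A branch whose only content is one JSONPathAssertion and its empty hashTree.
my $branch = qr{
    ^ ([ ]*) <hashTree> \n                     # capture the indentation of the branch
    [ ]* <JSONPathAssertion\b
        (?: (?! </JSONPathAssertion> ) . )*    # body, never crossing the closing tag
    </JSONPathAssertion> \n
    [ ]* <hashTree/> \n
    [ ]* </hashTree> \n
}xms;

# Replace each such branch with an empty branch at the same indentation.
(my $out = $xml) =~ s{$branch}{$1<hashTree/>\n}g;

print $out;
```

Only the second branch is rewritten to <hashTree/>; the first one is left alone because its assertion has sibling elements.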

Alex Martian