I have an XML file like this (about 10k lines):

<?xml version="1.0" encoding="UTF-8"?>
<Translations xmlns="http:...">
    <customApplications>
        <label><!-- Pricing Notifications --></label>
        <name>TEAM_Tesla</name>
    </customApplications>
    <customApplications>
        <label><!-- CRM --></label>
        <name>TEAM_Tender</name>
    </customApplications>
    <customApplications>
        <label>Actualization Portal</label>
        <name>Actualization_Portal</name>
    </customApplications>

I want to remove every block that contains a comment (the whole block, not just the commented part).


Desired output:

<?xml version="1.0" encoding="UTF-8"?>
<Translations xmlns="http:...">
    <customApplications>
        <label>Actualization Portal</label>
        <name>Actualization_Portal</name>
    </customApplications>
Andy
Mamed
  • And what language? Java _or_ JavaScript? – Andy Aug 30 '21 at 15:48
  • I suspect the original poster wants to make the edits right in the editor (VS Code). [Regex Visual Studio Code](https://stackoverflow.com/q/42179046/418950) is probably the right direction. – ScottWelker Aug 30 '21 at 17:17
  • @Andy Java language – Mamed Aug 30 '21 at 17:46
  • How do you define "block"? Will that always be an element named `customApplications`? Or a child of the root element? XSLT can transform XML to XML and is supported in VS code, I guess, by various extensions at least, like the https://marketplace.visualstudio.com/items?itemName=deltaxml.xslt-xpath, for instance. – Martin Honnen Aug 30 '21 at 17:52

2 Answers

XSLT like

<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">

  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
  
  <xsl:template match="/*/*[descendant::comment()]"/>

</xsl:stylesheet>

would remove any child of the root element that has a comment node as a descendant. You can run XSLT in Java using https://docs.oracle.com/javase/8/docs/api/javax/xml/transform/Transformer.html.
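
As a minimal sketch of how that stylesheet could be applied from Java with the JDK's built-in `javax.xml.transform` API: the class name `RemoveCommentedBlocks`, the `transform` helper, and the inlined `XSLT` and `SAMPLE` strings are hypothetical names chosen for illustration, and the sample is the XML from the question with a closing root tag added. In practice the stylesheet and input would more likely be read from files.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class RemoveCommentedBlocks {

    // The stylesheet shown above, inlined as a string (assumption: normally loaded from a file).
    static final String XSLT =
        "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">"
        + "<xsl:template match=\"@* | node()\">"
        + "<xsl:copy><xsl:apply-templates select=\"@* | node()\"/></xsl:copy>"
        + "</xsl:template>"
        + "<xsl:template match=\"/*/*[descendant::comment()]\"/>"
        + "</xsl:stylesheet>";

    // The sample from the question, with the closing root tag added.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<Translations xmlns=\"http:...\">\n"
        + "    <customApplications>\n"
        + "        <label><!-- Pricing Notifications --></label>\n"
        + "        <name>TEAM_Tesla</name>\n"
        + "    </customApplications>\n"
        + "    <customApplications>\n"
        + "        <label><!-- CRM --></label>\n"
        + "        <name>TEAM_Tender</name>\n"
        + "    </customApplications>\n"
        + "    <customApplications>\n"
        + "        <label>Actualization Portal</label>\n"
        + "        <name>Actualization_Portal</name>\n"
        + "    </customApplications>\n"
        + "</Translations>";

    // Applies the stylesheet to the given XML text and returns the transformed XML.
    static String transform(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform(SAMPLE));
    }
}
```

Because the identity template copies text nodes unchanged, the whitespace that surrounded each removed block stays in the output, so some blank lines are to be expected.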

Martin Honnen

Alternatively, you could use the XPath support in the JDK to find each label element that contains a comment and then remove its parent customApplications element from the document:

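// Imports needed by the snippets below.
import java.io.File;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// The XML sample from the question, used as example input.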
String source =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
        "<Translations xmlns=\"http:...\">\n" +
        "    <customApplications>\n" +
        "        <label><!-- Pricing Notifications --></label>\n" +
        "        <name>TEAM_Tesla</name>\n" +
        "    </customApplications>\n" +
        "    <customApplications>\n" +
        "        <label><!-- CRM --></label>\n" +
        "        <name>TEAM_Tender</name>\n" +
        "    </customApplications>\n" +
        "    <customApplications>\n" +
        "        <label>Actualization Portal</label>\n" +
        "        <name>Actualization_Portal</name>\n" +
        "    </customApplications>\n" +
        "</Translations>";

// Parse the XML string into a DOM document.
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
InputSource inputSource = new InputSource(new StringReader(source));
Document document = documentBuilder.parse(inputSource);

// Select every customApplications element whose label contains a comment.
XPathFactory xpathFactory = XPathFactory.newInstance();
XPath xpath = xpathFactory.newXPath();
String expression = "//customApplications[label//comment()]";
XPathExpression compiledExpression = xpath.compile(expression);
NodeList nodes = (NodeList) compiledExpression.evaluate(document, XPathConstants.NODESET);

// Remove each matching element from its parent.
for (int i = 0; i < nodes.getLength(); i++) {
    nodes.item(i).getParentNode().removeChild(nodes.item(i));
}

// Serialize the modified document to the console.
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.transform(new DOMSource(document), new StreamResult(System.out));

To write the result to a file instead of the console, replace the last two lines with:

Transformer transformer = TransformerFactory.newInstance().newTransformer();
Result output = new StreamResult(new File("output.xml"));
Source input = new DOMSource(document);
transformer.transform(input, output);
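
One caveat worth sketching: the code above relies on the default, non-namespace-aware `DocumentBuilderFactory`. If the parser is made namespace-aware, an unprefixed name test such as `customApplications` no longer matches, because the sample document declares a default namespace and XPath 1.0 treats unprefixed names as being in no namespace. The variant below is a sketch under that assumption and matches on `local-name()` instead; the class name `NamespaceAwareRemoval` and its helper methods are hypothetical, and the sample is the XML from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NamespaceAwareRemoval {

    // The sample from the question, shortened to two blocks.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<Translations xmlns=\"http:...\">\n"
        + "    <customApplications>\n"
        + "        <label><!-- CRM --></label>\n"
        + "        <name>TEAM_Tender</name>\n"
        + "    </customApplications>\n"
        + "    <customApplications>\n"
        + "        <label>Actualization Portal</label>\n"
        + "        <name>Actualization_Portal</name>\n"
        + "    </customApplications>\n"
        + "</Translations>";

    // Parses the XML text with namespace awareness switched on.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Removes every customApplications child of the root that contains a comment.
    // Returns the number of elements removed.
    static int removeCommentedBlocks(Document document) throws Exception {
        // local-name() ignores the namespace, so no NamespaceContext is required.
        String expression =
            "/*/*[local-name() = 'customApplications'][descendant::comment()]";
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, document, XPathConstants.NODESET);
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            node.getParentNode().removeChild(node);
        }
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        Document document = parse(SAMPLE);
        System.out.println("Removed blocks: " + removeCommentedBlocks(document));
    }
}
```

Registering a `NamespaceContext` on the `XPath` object and using a prefixed name is the stricter alternative; `local-name()` is used here only to keep the sketch short.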
İsmail Y.
  • Your code doesn't look for a `label` element with a comment and remove it from its parent, as the text says; instead it looks for a `customApplications` element with a comment node descendant. So the description does not quite fit the code, even if the intention is the same. – Martin Honnen Aug 31 '21 at 13:36
  • I updated the expression to fit the description. Many different approaches would work; my code may not be the most polished, but I am open to improvements. Thank you, @MartinHonnen – İsmail Y. Aug 31 '21 at 13:56