I have an XML document like this:

<document>
  <root>
    <field1>
      <a 
      Type="anyType">
        <b>Text</b>
      </a>
    </field1>
  </root>
</document>

I want to remove the line break inside the start tag of element a, so that the Type attribute stays on the same line as the element name.

The output should look like this:

<document>
  <root>
    <field1>
      <a Type="anyType">
        <b>Text</b>
      </a>
    </field1>
  </root>
</document>

How can I do this in Java?

The code that generates the XML file is below:

DOMSource domSource = new DOMSource(document);
StringWriter writer = new StringWriter();
StreamResult result = new StreamResult(writer);
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
transformer.transform(domSource, result);

String fileData = format(writer.toString());

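// Remove character references such as &#13; (CR, carriage return).
// Note: this pattern matches any two-character reference, e.g. &#10; as well.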
fileData = fileData.replaceAll("&#[a-zA-Z0-9]{2};", "");
saveDocumentIntoFile(fileData, fileName);

The format() method:

public String format(String xml) {
    try {
        final InputSource src = new InputSource(new StringReader(xml));
        final Node document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(src).getDocumentElement();
        final Boolean keepDeclaration = Boolean.valueOf(xml.startsWith("<?xml"));

        final DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
        final DOMImplementationLS impl = (DOMImplementationLS) registry.getDOMImplementation("LS");
        final LSSerializer writer = impl.createLSSerializer();

        // Set this to true if the output should be pretty-printed.
        writer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE);
        // Set this to true if the XML declaration should be written.
        writer.getDomConfig().setParameter("xml-declaration", keepDeclaration);

        return writer.writeToString(document);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

The saveDocumentIntoFile() method:

protected void saveDocumentIntoFile(String fileData, String fileName) throws IOException {
    // try-with-resources closes the writer even if write() throws
    try (FileWriter batchFile = new FileWriter(fileName)) {
        batchFile.write(fileData);
    }
}
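
For comparison, here is a minimal sketch of a pretty-printer that uses the JAXP Transformer with OutputKeys.INDENT instead of LSSerializer. It rests on the assumption that the Transformer's serializer does not wrap long start tags, which matches the JDK's built-in implementation as far as I know but is not guaranteed for every implementation on the classpath. The class name TransformerPrettyPrinter is hypothetical, and the indent-amount property is implementation-specific (it should be ignored by implementations that do not recognise it).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original code.
final class TransformerPrettyPrinter {

    static String format(String xml) {
        try {
            // Parse the XML string into a DOM document.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // Standard JAXP switch for indented output.
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            // Keep the XML declaration only if the input had one.
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION,
                    xml.startsWith("<?xml") ? "no" : "yes");
            // Assumption: implementation-specific property understood by the
            // JDK's built-in (Xalan-derived) transformer.
            transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");

            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Calling TransformerPrettyPrinter.format(writer.toString()) in place of format(writer.toString()) would be one way to try it; the exact indentation of the result may still differ between Java versions.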
TylerH
  • Are you generating that XML? (with Java) – Nicolás Alarcón Rapela Apr 06 '21 at 13:35
  • Yes, I have generated that XML. – NIkhil Rohilla Apr 06 '21 at 13:35
  • The root of what is happening is that a carriage return is being added ... Investigate from that starting point, and if you do not find anything, edit the post to include the part that generates the XML. – Nicolás Alarcón Rapela Apr 06 '21 at 13:38
  • How does a carriage return get added when I didn't add one? – NIkhil Rohilla Apr 06 '21 at 13:39
  • Better group the part that generates the XML :) – Nicolás Alarcón Rapela Apr 06 '21 at 13:46
  • It looks like there is an issue with `LSSerializer`. See [this](https://stackoverflow.com/questions/28347767/prevent-wrapping-of-lines-when-pretty-printing-xml-string). – Code Maverick Apr 06 '21 at 13:53
  • I checked out the link you shared, but the answer seems outdated, as the OutputFormat class it uses is deprecated. I have also noticed some discrepancies in how the file is generated when I run the application from Eclipse versus running the application jar with the java command. – NIkhil Rohilla Apr 06 '21 at 14:18
  • How do you populate 'document'? I made a small attempt to reproduce the error (inserting my own data), and in my case it works with that data. You mention that the XML is generated one way from Eclipse and another way from a jar; is it the same Java in both places? – Nicolás Alarcón Rapela Apr 06 '21 at 16:11
  • Yes, it's the same Java code in both places. The run from Eclipse generates the XML with indented tags, and the run from the jar has indentation only for some tags and not for others. I was able to fix the carriage return inside the XML tag using the second answer available in the link, but the formatting is still different in both places. – NIkhil Rohilla Apr 07 '21 at 04:50
  • The data is inserted into the document using the createElementNS, createTextNode, and appendChild methods. – NIkhil Rohilla Apr 07 '21 at 05:12
  • @NIkhilRohilla If you have found a solution to this issue, you should post an answer explaining how you solved it, then open a new question that refers to this one and explains your new problem in as much detail as possible. – Nicolás Alarcón Rapela Apr 07 '21 at 08:36
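
The comments above report that the same code formats the XML differently when run from Eclipse and from the jar. One possible cause (an assumption, not something the thread confirms) is that the two runs pick up different XML implementations from the classpath, or differ in platform defaults such as the line separator or default charset. A minimal sketch for checking this prints the concrete classes and defaults in use; the class name XmlEnvironmentReport is hypothetical.

```java
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;

// Hypothetical diagnostic helper, not part of the original code.
final class XmlEnvironmentReport {

    static String describe() {
        try {
            // The "LS" implementation is what format() uses to create its LSSerializer.
            DOMImplementation ls =
                    DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
            // Make the line separator visible instead of printing raw CR/LF.
            String lineSeparator = System.lineSeparator()
                    .replace("\r", "\\r")
                    .replace("\n", "\\n");

            return "Java version: " + System.getProperty("java.version") + "\n"
                    + "DocumentBuilderFactory: "
                    + DocumentBuilderFactory.newInstance().getClass().getName() + "\n"
                    + "TransformerFactory: "
                    + TransformerFactory.newInstance().getClass().getName() + "\n"
                    + "DOM LS implementation: "
                    + (ls == null ? "none" : ls.getClass().getName()) + "\n"
                    + "Line separator: " + lineSeparator + "\n"
                    + "Default charset: " + Charset.defaultCharset();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Logging XmlEnvironmentReport.describe() once from Eclipse and once from the jar, then comparing the two outputs, would show whether the environments actually differ.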

0 Answers