
I've developed a Java application on JDK 16, based on the dom4j library, which adds specific XML elements as children of a particular element in a provided XML document. Some elements of the input XML document, as well as some child elements of the XML element my app creates and adds to the input document, contain Greek strings as content. For example, here is one such element of the input XML file:

<DecPlaHEA394>ΠΕΙΡΑΙΑΣ</DecPlaHEA394>

and here are the XML elements I create and add to the input document:

<VEHDET>
  <FraNumVD1014>LATTCJCY7M1400121</FraNumVD1014>
  <VehCC4006>124</VehCC4006>
  <VehFue4007>ΒΕΝΖΙΝΗ ΑΜΟΛΥΒΔΗ</VehFue4007>
  <VehTyp4008>3</VehTyp4008>
  <VehEngTyp4009>DY152QMI-3C</VehEngTyp4009>
  <ProdYea4010>2021</ProdYea4010>
  <VecTra4023>DY125T-28D</VecTra4023>
  <VehFacTyp4001>DY125T-28D</VehFacTyp4001>
  <VehUseFl4015>0</VehUseFl4015>
  <ArImpDatVECDET01>20210426</ArImpDatVECDET01>
  <ImpCodVECDETGI>457</ImpCodVECDETGI>
  <CarDioEmiVECDET04>-</CarDioEmiVECDET04>
</VEHDET>

Whenever I run my application from Eclipse, any Greek content in the output XML is displayed as plain Greek text, exactly as in the input XML file. Yesterday, after packaging my application into an "uberjar" that contains all the dependencies using the Maven Assembly Plugin, I noticed that, when I run it by executing the mentioned JAR from the Windows Terminal, all the Greek contents are displayed with their UTF-8 encoding in the output XML file (I am adding a screenshot here so that the problem I'm facing is clear):

[Screenshot: the output XML file, with the Greek element content shown as garbled characters]

The encoding of the input XML file is "utf-8", and I take care to use the same encoding in my output XML. The code that generates the output XML by copying the XML content of the input XML file to it and adding the mentioned XML elements is as follows:

public int createOutputXML(int customsDeclaration, int howManyModels, ArrayList<InputFrameNumbersFileReader> inputFraNumFileReaders, ArrayList<ModelDescriptor> modelDescriptorsArray) {
    try {
        URL urlOfInputXMLFile = Paths.get(m_InputXMLFile).toUri().toURL();
        
        SAXReader reader = new SAXReader();
        Document document = reader.read(urlOfInputXMLFile);
        String encoding = document.getXMLEncoding();
        
        // Get the root element:
        Element root = document.getRootElement();
        
        // For each different model: 1 to howManyModels
        // - Find its respective GOOITEGDS: Find all the GOOITEGDS, store them in the list, and get them one at a time
        // - Find its respective TAXADDELE100: Same way
        // - Call writeVEHDATBlocksToXMLImportAndHomeUseCases() or writeVEHDATBlocksToXMLWarehousingCase() with the GOOITEGDS and the TAXADDELE100
        //   for the current model.
        
        // Get the node GOOITEGDS (parent node of these nodes)
        List<Element> gooIteGdsElemList = root.elements("GOOITEGDS");
        if ((gooIteGdsElemList != null))  {
            if (gooIteGdsElemList.size() == howManyModels) {
                for (Element gooIteGdsElem : gooIteGdsElemList) {
                    if (customsDeclaration != 3) {
                        Element taxAddEleElem = gooIteGdsElem.element("TAXADDELE100");
                        if (taxAddEleElem != null) {
                            // Start creating the VEHDET XML elements in the DOM tree of the document (as children of the GOOITEGDS, right above the block TAXADDELE100).
                            ArrayList<String> frameNumbersArray = inputFraNumFileReaders.get(gooIteGdsElemList.indexOf(gooIteGdsElem)).getFrameNumbersArray();
                            ModelDescriptor mdElem = modelDescriptorsArray.get(gooIteGdsElemList.indexOf(gooIteGdsElem));
                            writeVEHDATBlocksToXMLImportAndHomeUseCases(gooIteGdsElem, taxAddEleElem, frameNumbersArray, mdElem);
                        } else {
                            System.out.println("OutputXMLGenerator.createOutputXML: The XML element TAXADDELE100 wasn't found in the input XML file.");
                            return 1;   // PROBLEM
                        }
                    } else {
                        Element warIdGiElem = gooIteGdsElem.element("WARIDGI700");
                        if (warIdGiElem != null) {
                            // Start creating the VEHDET XML elements in the DOM tree of the document (as children of the GOOITEGDS, right above the block WARIDGI700).
                            ArrayList<String> frameNumbersArray = inputFraNumFileReaders.get(gooIteGdsElemList.indexOf(gooIteGdsElem)).getFrameNumbersArray();
                            writeVEHDATBlocksToXMLWarehousingCase(gooIteGdsElem, warIdGiElem, frameNumbersArray);
                        } else {
                            System.out.println("OutputXMLGenerator.createOutputXML: The XML element WARIDGI700 wasn't found in the input XML file.");
                            return 1;   // PROBLEM
                        }
                    }
                }
                
                // Write the changed XML document to the output XML file:
                // OBSERVATION - the encoding of the output file (not its XML content) is "UTF-8" while the same encoding of the input file is "UTF-8 BOM".
                try (FileWriter output = new FileWriter(m_OutputXMLFile)) {
                    OutputFormat format = OutputFormat.createPrettyPrint();
                    if (encoding != null) {
                        format.setEncoding(encoding);
                    } else {
                        System.out.println("OutputXMLGenerator.createOutputXML: Αποτυχία αναγνώρισης της κωδικοποίησης XML του αρχείου εισόδου XML.");
                    }
                    
                    XMLWriter writer = new XMLWriter(output, format);
                    writer.write(document);
                    writer.close();
                    System.out.println("Το τροποποιημένο αρχείο εξόδου XML είναι έτοιμο!");
                    //return 0; // SUCCESS
                } catch (IOException ioEx) {
                    System.out.println("OutputXMLGenerator.createOutputXML: IOException when trying to open the output file for writing.");
                    System.out.println(ioEx.getCause());
                    return 1;   // PROBLEM
                }
                
            } else {
                System.out.println("OutputXMLGenerator.createOutputXML: The number of GOOITEGDS elements in the input XML is not equal to the value of howManyModels.");
                //System.exit(-1);
                return 1;   // PROBLEM
            }
        } else {
            System.out.println("OutputXMLGenerator.createOutputXML: Input XML file does not have any GOOITEGDS elements, so it must be wrong.");
            //System.exit(-1);
            return 1;   // PROBLEM
        }
    } catch (DocumentException dEx) {
        System.out.println("OutputXMLGenerator.createOutputXML: DocumentException caught when trying to read the input XML file.");
        //System.exit(-1);
        return 1;   // PROBLEM
    } catch (MalformedURLException murlEx) {
        System.out.println("OutputXMLGenerator.createOutputXML: MalformedURLException caught when trying to form the URL of the XML file. ");
        //System.exit(-1);
        return 1;   // PROBLEM
    } catch (InvalidPathException ipe) {
        System.out.println("OutputXMLGenerator.createOutputXML: The path to the output XML file is invalid.");
        //System.exit(-1);
        return 1;   // PROBLEM
    }
    
    return 0;   // SUCCESS
}

private void writeVEHDATBlocksToXMLImportAndHomeUseCases(Element parentElem, Element siblingElem,
        ArrayList<String> frameNumbersArray, ModelDescriptor mdElem) {
    for (String fraNum : frameNumbersArray) {
        // Create the VEHDET XML element from the mdElem fields, as a DOMElement.
        DOMElement domEl = new DOMElement("VEHDET");
        domEl.addElement("FraNumVD1014").addText(fraNum);
        domEl.addElement("VehCC4006").addText(String.valueOf(mdElem.getM_VehCCforModel()));
        domEl.addElement("VehFue4007").addText(mdElem.getM_VehFuelforModel());
        domEl.addElement("VehTyp4008").addText(String.valueOf(mdElem.getM_VehTypeforModel()));
        domEl.addElement("VehEngTyp4009").addText(mdElem.getM_VehEngTypeforModel());
        domEl.addElement("ProdYea4010").addText(String.valueOf(mdElem.getM_ProdYearforModel()));
        domEl.addElement("VecTra4023").addText(mdElem.getM_VehTraforModel());
        domEl.addElement("VehFacTyp4001").addText(mdElem.getM_VehFacTypeforModel());
        domEl.addElement("VehUseFl4015").addText(mdElem.getM_VehUseforModel());
        domEl.addElement("ArImpDatVECDET01").addText(mdElem.getM_ArImpDateforModel());
        domEl.addElement("ImpCodVECDETGI").addText(mdElem.getM_FacCodeforModel());
        domEl.addElement("CarDioEmiVECDET04").addText(String.valueOf(mdElem.getM_CarbonDioxideEmissionsforModel()));
        
        // Add the newly created element to the XML Document object.
        //System.out.println(domEl.getNodeName());
        List<Element> childElems = parentElem.elements();
        childElems.add(childElems.indexOf(siblingElem), domEl);
    }
}

private void writeVEHDATBlocksToXMLWarehousingCase(Element parentElem, Element siblingElem, ArrayList<String> frameNumbersArray) {
    for (String fraNum : frameNumbersArray) {
        // Create the VEHDET XML element, containing only the frame number, as a DOMElement.
        DOMElement domEl = new DOMElement("VEHDET");
        domEl.addElement("FraNumVD1014").addText(fraNum);
                        
        // Add the newly created element to the XML Document object.
        //System.out.println(domEl.getNodeName());
        List<Element> childElems = parentElem.elements();
        childElems.add(childElems.indexOf(siblingElem), domEl);
    }
}

Additionally, here is my pom.xml, in case it provides any further insight:

<?xml version="1.0" encoding="UTF-8"?>

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>

<groupId>com.proskos.maven</groupId>
<artifactId>MotorbikeContainerProcessor</artifactId>
<version>0.0.1-SNAPSHOT</version>

<name>MotorbikeContainerProcessor</name>

<properties>
  <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  <maven.compiler.release>16</maven.compiler.release>
</properties>

<dependencies>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.11</version>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>org.openjfx</groupId>
    <artifactId>javafx-graphics</artifactId>
    <version>16</version>
  </dependency>
  <dependency>
    <groupId>org.openjfx</groupId>
    <artifactId>javafx-controls</artifactId>
    <version>16</version>
  </dependency>
  <dependency>
    <groupId>org.openjfx</groupId>
    <artifactId>javafx-fxml</artifactId>
    <version>16</version>
  </dependency>
  <dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.15</version>
  </dependency>
  <dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>3.15</version>
  </dependency>
  <dependency>
    <groupId>com.opencsv</groupId>
    <artifactId>opencsv</artifactId>
    <version>5.5</version>
  </dependency>
  <dependency>
    <groupId>org.dom4j</groupId>
    <artifactId>dom4j</artifactId>
    <version>2.1.3</version>
  </dependency>
</dependencies>

<build>
  <pluginManagement><!-- lock down plugins versions to avoid using Maven defaults (may be moved to parent pom) -->
    <plugins>
      <!-- clean lifecycle, see https://maven.apache.org/ref/current/maven-core/lifecycles.html#clean_Lifecycle -->
      <plugin>
        <artifactId>maven-clean-plugin</artifactId>
        <version>3.1.0</version>
      </plugin>
      <!-- default lifecycle, jar packaging: see https://maven.apache.org/ref/current/maven-core/default-bindings.html#Plugin_bindings_for_jar_packaging -->
      <plugin>
        <artifactId>maven-resources-plugin</artifactId>
        <version>3.0.2</version>
      </plugin>
      <plugin>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.8.0</version>
      </plugin>
      <plugin>
        <artifactId>maven-surefire-plugin</artifactId>
        <version>2.22.1</version>
      </plugin>
      <plugin>
        <artifactId>maven-install-plugin</artifactId>
        <version>2.5.2</version>
      </plugin>
      <plugin>
        <artifactId>maven-deploy-plugin</artifactId>
        <version>2.8.2</version>
      </plugin>
      <!-- site lifecycle, see https://maven.apache.org/ref/current/maven-core/lifecycles.html#site_Lifecycle -->
      <plugin>
        <artifactId>maven-site-plugin</artifactId>
        <version>3.7.1</version>
      </plugin>
      <plugin>
        <artifactId>maven-project-info-reports-plugin</artifactId>
        <version>3.0.0</version>
      </plugin>
    </plugins>
  </pluginManagement>
  <plugins>
    <plugin>
      <artifactId>maven-assembly-plugin</artifactId>
      <configuration>
        <archive>
          <manifest>
            <mainClass>com.proskos.maven.MotorbikeContainerProcessor.ContainerProcessorDriver</mainClass>
          </manifest>
        </archive>
        <descriptorRefs>
          <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
      </configuration>
      <executions>
        <execution>
          <id>make-assembly</id> <!-- this is used for inheritance merges -->
          <phase>package</phase> <!-- bind to the packaging phase -->
          <goals>
            <goal>single</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
</project>

Any idea of what could be wrong here?

Kapoios
  • Is there a reason why you are using an old version of Apache POI? https://search.maven.org/search?q=org.apache.poi – khmarbaise Jul 31 '21 at 12:14
  • @khmarbaise Nothing in particular. I just followed some tutorials (as I was new to this library) which are based on methods that seemingly became deprecated in newer versions of it. – Kapoios Jul 31 '21 at 12:23
  • *I noticed that, when I run it by executing the mentioned JAR from the Windows Terminal, all the Greek contents are displayed with their UTF-8 encoding in the output XML file* Curious. And, slightly unexpectedly, you show something with colour. Is that really cmd.exe? If it somehow is, did you try running with the PowerShell terminal? Either way, there would need to be a font in use with Greek glyphs. – g00se Jul 31 '21 at 14:00
  • *all the Greek contents are displayed with their UTF-8 encoding in the output XML file* is actually not the case. The encoding of the Greek in UTF-8 would be `CE A0 CE 95 CE 99 CE A1 CE 91 CE 99 CE 91 CE A3`, each character being 2 bytes long. Actually what you have looks to be ISO-8859-7 (8-bit enc. for that part of the world). You should set your output encoding to UTF-8 explicitly. And actually, even then you could have problems with Windows since its terminals are not really up-to-date. – g00se Jul 31 '21 at 14:08
  • I'm not certain, as I don't use Windows, but you might have more luck with the PowerShell terminal in displaying Unicode. You could test it by trying to paste those characters into the terminal before running your jar. – g00se Jul 31 '21 at 14:12
  • @g00se thanks for noticing that! When I open the output XML file in, e.g., Notepad++ and change the encoding to ISO-8859-7, the Greek characters are displayed correctly. As I mentioned in my question, the encoded Greek characters are not printed in Windows Terminal, but in the output XML file (from which I took the screenshot, after opening it in Notepad++). – Kapoios Jul 31 '21 at 14:27
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/235482/discussion-between-g00se-and-kapoios). – g00se Jul 31 '21 at 14:37
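
The byte-level difference g00se describes can be checked directly. The following is a minimal sketch using only the JDK; the class name `GreekBytesDemo` and its helper `hex` are illustrative and not part of the question's code. It encodes the word ΠΕΙΡΑΙΑΣ from the question (written with Unicode escapes, so the source file's own encoding cannot interfere) with UTF-8 and with ISO-8859-7 and prints the bytes as hex. It assumes the running JVM provides the ISO-8859-7 charset, which is typical but not guaranteed by the Java specification.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class GreekBytesDemo {
    // The sample word from the question's DecPlaHEA394 element, as Unicode escapes.
    static final String SAMPLE = "\u03A0\u0395\u0399\u03A1\u0391\u0399\u0391\u03A3";

    /** Encodes text with the given charset and returns its bytes as space-separated hex. */
    static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("UTF-8:      " + hex(SAMPLE, StandardCharsets.UTF_8));
        // Assumption: ISO-8859-7 is available in this JVM (it is not a mandatory charset).
        System.out.println("ISO-8859-7: " + hex(SAMPLE, Charset.forName("ISO-8859-7")));
    }
}
```

In UTF-8 each of these Greek letters takes two bytes, while ISO-8859-7 uses one byte per letter, so a file written in one encoding looks garbled in a viewer that assumes the other.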

1 Answer


The problem was solved when I applied the solution from the selected answer in this Q&A thread.
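
The linked thread is not reproduced here, so what follows is only a sketch of one likely fix, not necessarily the one that thread proposes. It assumes the cause is that `new FileWriter(m_OutputXMLFile)` in the question encodes characters with the platform default charset (which can differ between an Eclipse launch and a plain `java -jar` run on Windows before JDK 18), while `format.setEncoding(encoding)` only controls the encoding named in the XML declaration when `XMLWriter` is handed a ready-made `Writer`. The class `ExplicitCharsetWriteDemo` and its methods are hypothetical names; only JDK classes are used.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ExplicitCharsetWriteDemo {

    /** Writes text to the file as UTF-8, independent of the platform default charset. */
    static void writeUtf8(Path file, String text) throws IOException {
        // Unlike new FileWriter(String), this factory takes the charset explicitly.
        try (Writer out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            out.write(text);
        }
    }

    /** Writes text to a temporary file via writeUtf8 and returns the raw bytes on disk. */
    static byte[] bytesOnDisk(String text) {
        try {
            Path tmp = Files.createTempFile("vehdet-demo", ".xml");
            try {
                writeUtf8(tmp, text);
                return Files.readAllBytes(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // U+03A0 is the Greek capital letter pi, the first letter of the sample word.
        byte[] bytes = bytesOnDisk("<DecPlaHEA394>\u03A0</DecPlaHEA394>");
        System.out.println(bytes.length + " bytes written");
    }
}
```

In the question's code, a `Writer` created this way could be passed to `new XMLWriter(output, format)` in place of the `FileWriter`. Since Java 11 there is also a `FileWriter(String, Charset)` constructor, and starting the JVM with `-Dfile.encoding=UTF-8` is another commonly used workaround.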

Kapoios