XML - XPath DOM parser in Java

Question

I have the following XML:

<docs>
<doc>
    <person>
        <name>John Doe</name>
        <description>
            <age>23</age>
            <alias>M C</alias>
        </description>
        <description>
            <age>24</age>
            <alias>John</alias>
        </description>
    </person>
<doc>
<doc>
    <person>
        <name>John Doe</name>
        <description>
            <age>24</age>
            <alias>Steve</alias>
        </description>
    </person>
<doc>
</docs>

I have no control over the XML. All I receive are XML documents like this one, plus the XPath for each element. I have to write a Java program that reads the data and converts it to a JSON object. I am using XPath with a DOM parser: since we are given the XPaths and they may change in the future, I keep the XPath for every element in a properties file, so that a change requires minimal changes to the program. Unfortunately the program must also be case-insensitive, so I used translate() in the XPath expressions to handle that. I have the following class:

public class Person {
  private List<String> name;
  private List<String> age;
  private List<String> alias;
  //getter and setter
}
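
For illustration, a minimal sketch of how translate() can make an XPath 1.0 step case-insensitive with the XPath engine bundled in the JDK. The class name CaseInsensitiveXPath, the helper step and the sample XML are hypothetical names introduced here, not taken from the real program, and only ASCII letters are folded:

```java
import java.io.StringReader;
import java.util.Locale;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CaseInsensitiveXPath {

    private static final String UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private static final String LOWER = "abcdefghijklmnopqrstuvwxyz";

    // Builds an XPath 1.0 step that matches any element whose local name
    // equals the given name, ignoring ASCII case (hypothetical helper).
    public static String step(String name) {
        return "*[translate(local-name(), '" + UPPER + "', '" + LOWER + "') = '"
                + name.toLowerCase(Locale.ROOT) + "']";
    }

    // Counts alias elements under person/description, whatever their case.
    public static int countAliases(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        String expr = "//" + step("person") + "/" + step("description") + "/" + step("alias");
        NodeList nodes = (NodeList) xpath.evaluate(
                expr, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Person><DESCRIPTION><Alias>M C</Alias></DESCRIPTION>"
                + "<description><alias>John</alias></description></Person>";
        System.out.println(countAliases(xml)); // prints 2
    }
}
```

Because each step is generated from a plain element name, the properties file could keep readable paths and the translate() wrapping could be applied when the expression is built.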

The problem is that there are multiple doc nodes, and each can contain multiple age and alias elements. That was not a requirement earlier, so I simply used XPath to get the text. Now that no longer works, because the XPath //person/description returns 3 nodes: 2 from the first doc and 1 from the second. I need to tell which doc each description element comes from. The final JSON should look like this:

{
  "docs":
  {
    "doc":
    [
      {
        "description":
         [
           {
             "age": 23,
             "alias": "M C"
           },
           {
             "age": 24,
             "alias": "John"
           }
         ]
       },
       {
          "description":
           [
            {
              "age": 24,
              "alias": "Steve"
            }
           ]
         }
    ]
  }
 }

The only approach I could think of is to compile the XPath expression //docs/doc, which gives me 2 nodes at this point, then loop through the child nodes of each one and do something like: if

element.getTagName().equalsIgnoreCase("age")

then add the value to the age list, and build a list of lists, so that I end up with

docs[[[23, "M C"],[24, "John"]],[[24, "Steve"]]]
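
A rough sketch of that idea using only the JDK's DOM and XPath APIs. It assumes well-formed input (each doc closed with </doc>) and evaluates a relative XPath against each doc node, so every description stays attached to the doc it came from. The class name DocGrouping and the helper readPair are hypothetical, and the XPath expressions here are still case-sensitive; only the child tag check ignores case:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DocGrouping {

    // Returns one list per <doc>; each holds one [age, alias] pair per <description>.
    public static List<List<List<String>>> group(String xml) throws Exception {
        Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        List<List<List<String>>> docs = new ArrayList<>();
        NodeList docNodes = (NodeList) xpath.evaluate("//docs/doc", dom, XPathConstants.NODESET);
        for (int i = 0; i < docNodes.getLength(); i++) {
            // Relative path: evaluated against the current <doc>, not the whole document.
            NodeList descriptions = (NodeList) xpath.evaluate(
                    "person/description", docNodes.item(i), XPathConstants.NODESET);
            List<List<String>> perDoc = new ArrayList<>();
            for (int j = 0; j < descriptions.getLength(); j++) {
                perDoc.add(readPair(descriptions.item(j)));
            }
            docs.add(perDoc);
        }
        return docs;
    }

    // Reads the age and alias children of one <description>, ignoring tag case.
    private static List<String> readPair(Node description) {
        String age = null;
        String alias = null;
        NodeList children = description.getChildNodes();
        for (int k = 0; k < children.getLength(); k++) {
            Node child = children.item(k);
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace text nodes between elements
            }
            Element element = (Element) child;
            if (element.getTagName().equalsIgnoreCase("age")) {
                age = element.getTextContent().trim();
            } else if (element.getTagName().equalsIgnoreCase("alias")) {
                alias = element.getTextContent().trim();
            }
        }
        List<String> pair = new ArrayList<>();
        pair.add(age);
        pair.add(alias);
        return pair;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<docs>"
                + "<doc><person><name>John Doe</name>"
                + "<description><age>23</age><alias>M C</alias></description>"
                + "<description><age>24</age><alias>John</alias></description>"
                + "</person></doc>"
                + "<doc><person><name>John Doe</name>"
                + "<description><age>24</age><alias>Steve</alias></description>"
                + "</person></doc>"
                + "</docs>";
        System.out.println(group(xml)); // prints [[[23, M C], [24, John]], [[24, Steve]]]
    }
}
```

The outer list has one entry per doc, which matches the list-of-lists shape above; values are kept as strings in this sketch.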

Any better ideas?

Are you required to use XPath or are answers using other tools or techniques acceptable? — Justin Albano, Apr 05 '18 at 21:55
No, I just used XPath because the client gives us the XML together with the XPaths, so if they change, it will be easy to maintain. I just don't want to try a different parser such as a SAX parser. — Raghul Raman, Apr 05 '18 at 23:20

score 0 · Answer 1 · answered Apr 05 '18 at 22:35

An alternative to XPath is the org.json library (JSON-java). This library can both consume XML and produce JSON. To complete this transformation, use the following:

import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

import org.json.JSONException;
import org.json.JSONObject;
import org.json.XML;

public class Main {

    private static final int SPACES_PER_INDENT = 4;

    public static void main(String[] args) throws Exception {

        try {
            // Citation: https://stackoverflow.com/questions/1823264/quickest-way-to-convert-xml-to-json-in-java
            URI file = Main.class.getClassLoader().getResource("docs.xml").toURI();
            String xmlContents = readFromInputStream(file);
            // Convert the XML text to a JSONObject, then pretty-print it
            JSONObject jsonContents = XML.toJSONObject(xmlContents);
            String jsonString = jsonContents.toString(SPACES_PER_INDENT);
            System.out.println(jsonString);
        }
        catch (JSONException e) {
            e.printStackTrace();
        }
    }

    private static String readFromInputStream(URI uri) throws IOException {
        // Citation: http://www.baeldung.com/reading-file-in-java
        Path path = Paths.get(uri);
        StringBuilder data = new StringBuilder();

        // try-with-resources closes the stream even if reading fails
        try (Stream<String> lines = Files.lines(path)) {
            lines.forEach(line -> data.append(line).append("\n"));
        }

        return data.toString();
    }
}

The contents of docs.xml, which is present on the classpath, are as follows (note that I changed the <doc>...<doc> tags in the question to <doc>...</doc>):

<docs>
    <doc>
        <person>
            <name>John Doe</name>
            <description>
                <age>23</age>
                <alias>M C</alias>
            </description>
            <description>
                <age>24</age>
                <alias>John</alias>
            </description>
        </person>
    </doc>
    <doc>
        <person>
            <name>John Doe</name>
            <description>
                <age>24</age>
                <alias>Steve</alias>
            </description>
        </person>
    </doc>
</docs>

The pom.xml for this project is:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.stackoverflow.albanoj2</groupId>
    <artifactId>XmlToJson</artifactId>
    <version>0.0.1-SNAPSHOT</version>

    <dependencies>
        <dependency>
            <groupId>org.json</groupId>
            <artifactId>json</artifactId>
            <version>20180130</version>
        </dependency>
    </dependencies>
</project>

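The resulting output is as follows (note that a description that occurs only once in a doc is emitted as a single JSON object rather than a one-element array):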

{"docs": {"doc": [
    {"person": {
        "name": "John Doe",
        "description": [
            {
                "alias": "M C",
                "age": 23
            },
            {
                "alias": "John",
                "age": 24
            }
        ]
    }},
    {"person": {
        "name": "John Doe",
        "description": {
            "alias": "Steve",
            "age": 24
        }
    }}
]}}

If it's easier to read, the source code for this solution is contained in the following repository: https://github.com/albanoj2/XmlToJsonDocTranslation. Note that the bulk of this answer is derived from "Quickest way to convert XML to JSON in Java", but it has been tailored to suit the specific needs of this question (such as reading the XML from a file).

I saw this when I started but I don't remember why I didn't go with it. One other requirement is I need to add some other custom extra entities to json that's not there in XML. I'm sure there should be a way in the json library. — Raghul Raman, Apr 05 '18 at 23:32
You can manipulate the `JSONObject` that is returned when you convert the XML to JSON (i.e. `JSONObject jsonContents = XML.toJSONObject(xmlContents)`). You can add new keys, lists, etc. to that `JSONObject`. For more information, see: https://stleary.github.io/JSON-java/org/json/JSONObject.html. — Justin Albano, Apr 05 '18 at 23:53
I think I remember why I did not go with this one. We rename the tag names when converting to JSON because some of the tag names in the XML are not properly named/structured. For example — Raghul Raman, Apr 06 '18 at 00:30
Sorry to have this in 2 separate comments. I will look into it, but I don't think I can rename the tags like if it is tag, the key in Json will be PersonName. — Raghul Raman, Apr 06 '18 at 00:54
You can do a conversion to accomplish that, i.e. create a function that takes in a JSONObject and returns another JSONObject with the renamed tags. — Justin Albano, Apr 06 '18 at 01:10
