I have an XML document with two element types: one that has only attributes, and another that only contains elements of the first type.

<TagList name="Results">
    <Tag name="type_of_identifier" value="idvalue"/>
    <Tag name="some_other_identifier" value="otheridvalue"/>
    ....
    <Tag name="type" value="asdfaf"/>
    <TagList name="SubList">
        <Tag name="param1" value="value1"/>
        <Tag name="param2" value="value2"/>
    </TagList>
</TagList>

I'm new to XML (also Java) and I'm just at a loss as to why this was set up this way.

Is there a way I can get the value of a node by specifying the name, without having to loop through every node?

horriblyUnpythonic
  • What about XPath? See this question: http://stackoverflow.com/questions/2811001/how-to-read-xml-using-xpath-in-java – Jaroslav Kubacek Feb 25 '14 at 23:21
  • Whatever you do, don't try to use regex, or something along those lines, to narrow it down without looping through everything. See http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags for more info. – WillBD Feb 25 '14 at 23:30
  • @JaroslavKubacek Yep, XPath should do it. I was following the example of http://www.mkyong.com/java/how-to-read-xml-file-in-java-dom-parser/ and it was making things very messy. – horriblyUnpythonic Feb 26 '14 at 00:20
  • Congratulations, your instincts are absolutely correct: This is a good example of how to use XML badly. Whoever designed this document should indeed have simply written `` and `` and so on, unless they had an exceptionally good reason not to -- and the element names they did use suggest that, no, they simply didn't understand XML well enough. Of course to be sure of that you'd have to go back and ask _them_, but I think you're safe in assuming that they just didn't bother learning much about the tool before trying to use it. – keshlam Feb 26 '14 at 02:04
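
For what it's worth, the XPath route suggested in the comments might look roughly like this in Java. This is only a sketch under assumptions: the class name `TagLookup` and the helper `valueOf` are made up for illustration, the XML is assumed to be available as a string, and only the standard `javax.xml.xpath` API is used. The XPath engine still walks the tree internally, but your own code no longer has to loop.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original question.
class TagLookup {

    /** Returns the value attribute of the first Tag with the given name, or "" if there is none. */
    static String valueOf(String xml, String tagName) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Bind the name as an XPath variable instead of pasting it into the expression,
        // so names containing quotes cannot break the query.
        xpath.setXPathVariableResolver(q -> "tagName".equals(q.getLocalPart()) ? tagName : null);
        try {
            // "//Tag" matches Tag elements at any depth, including those in nested TagLists.
            return xpath.evaluate("//Tag[@name = $tagName]/@value",
                    new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath lookup failed", e);
        }
    }

    public static void main(String[] args) {
        String xml =
              "<TagList name=\"Results\">"
            + "<Tag name=\"type\" value=\"asdfaf\"/>"
            + "<TagList name=\"SubList\">"
            + "<Tag name=\"param1\" value=\"value1\"/>"
            + "</TagList>"
            + "</TagList>";
        System.out.println(valueOf(xml, "type"));
        System.out.println(valueOf(xml, "param1"));
    }
}
```

If the same name can occur in more than one list, a more specific path such as `/TagList[@name='Results']/TagList[@name='SubList']/Tag[@name='param1']/@value` should narrow the lookup to one list.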

1 Answer

Consider converting the file to something more sanitary before processing it. A simple XSLT stylesheet with two template rules:

<xsl:template match="TagList">
  <xsl:element name="{@name}">
    <xsl:apply-templates/>
  </xsl:element>
</xsl:template>

<xsl:template match="Tag">
  <xsl:element name="{@name}">
    <xsl:value-of select="@value"/>
  </xsl:element>
</xsl:template>

will convert it to something like this:

<Results>
    <type_of_identifier>idvalue</type_of_identifier>
    <some_other_identifier>otheridvalue</some_other_identifier>
    ....
    <type>asdfaf</type>
    <SubList>
        <param1>value1</param1>
        <param2>value2</param2>
    </SubList>
</Results>

The advantage of doing this is that all subsequent processing of the document becomes much easier.
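
Since the question mentions Java, here is one way the conversion could be run with the JDK's built-in `javax.xml.transform` API. Treat it as a sketch, not a definitive implementation: the class name `TagListFlattener` and the method `flatten` are hypothetical, the two template rules are wrapped in an `xsl:stylesheet` element (assumed to be XSLT 1.0), and the input is assumed to be a string.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical wrapper class for illustration only.
class TagListFlattener {

    // The two template rules from the answer, wrapped in a stylesheet element.
    static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='TagList'>"
        + "<xsl:element name='{@name}'><xsl:apply-templates/></xsl:element>"
        + "</xsl:template>"
        + "<xsl:template match='Tag'>"
        + "<xsl:element name='{@name}'><xsl:value-of select='@value'/></xsl:element>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the given XML text and returns the transformed XML. */
    static String flatten(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            // Leave out the <?xml ...?> declaration to keep the output short.
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml =
              "<TagList name=\"Results\">"
            + "<Tag name=\"type\" value=\"asdfaf\"/>"
            + "<TagList name=\"SubList\">"
            + "<Tag name=\"param1\" value=\"value1\"/>"
            + "</TagList>"
            + "</TagList>";
        System.out.println(flatten(xml));
    }
}
```

One caveat: `xsl:element` needs every `name` attribute to be a legal XML element name, so a value containing spaces or starting with a digit would make the transformation fail. After the conversion, a lookup becomes a plain path such as `/Results/SubList/param1`.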

Sheridan
Michael Kay