
I tried to parse the XML from https://www.boardgamegeek.com/xmlapi/boardgame/13/catan and get the value with the highest numvotes for Language Dependence.

This is my code:

import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomParserDemo {

    public static void main(String[] args) {

        try {

            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dbBuilder = dbFactory.newDocumentBuilder();
            // Placeholder: the XML text returned by the URL above goes here
            InputSource is = new InputSource(new StringReader("please paste XML from link"));
            Document doc = dbBuilder.parse(is);
            doc.getDocumentElement().normalize();
            System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
            NodeList nodeList = doc.getElementsByTagName("result");

            // Track the <result> element with the largest numvotes attribute
            String targetValue = "";
            int maxNumVotes = 0;
            for (int i = 0; i < nodeList.getLength(); i++) {
                Element element = (Element) nodeList.item(i);
                int numVotes = Integer.parseInt(element.getAttribute("numvotes"));
                if (numVotes > maxNumVotes) {
                    maxNumVotes = numVotes;
                    targetValue = element.getAttribute("value");
                }
            }
            System.out.println("Value: " + targetValue + " NumVotes: " + maxNumVotes);

        }
        catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Output:

[Fatal Error] :1:10703: The entity name must immediately follow the '&' in the entity reference.
org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 10703; The entity name must immediately follow the '&' in the entity reference.
    at java.xml/com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:261)
    at java.xml/com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)
    at DomParserDemo.main(DomParserDemo.java:17)

1 Answer


If you open the URL in the browser and search for &, the first hit is:

BGTG 115 - Spiel des Jahres, Then &amp; Now

That &amp; is a valid entity reference.

If you keep searching, the second hit is:

Catan: Cities & Knights

That is invalid XML. An & must be followed by a name and a ;. To have a & in the value, it must be escaped as &amp;.
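
To see the difference in isolation, here is a minimal sketch that feeds both forms to the same JAXP parser used in the question. The class name AmpersandCheck and the helper isWellFormed are made up for this illustration, and the one-element documents are stand-ins, not the real feed.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class AmpersandCheck {

    // Hypothetical helper: true if the parser accepts the string as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing "[Fatal Error]" to stderr
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Thrown for malformed input such as a bare '&'
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<name>Catan: Cities & Knights</name>"));     // false
        System.out.println(isWellFormed("<name>Catan: Cities &amp; Knights</name>")); // true
    }
}
```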

In short, the XML returned by that URL is invalid, and the Java XML parser tells you so.
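
If the feed cannot be fixed at the source, one possible workaround is to escape the bare ampersands yourself before handing the text to the parser. The following is only a sketch under assumptions: the class BareAmpersandFix, its methods, and the regex are invented for this example, and the sample string is a stand-in for the real response. It would double-escape an & inside a CDATA section, and it does not help with undefined named entities.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class BareAmpersandFix {

    // Matches an '&' that does NOT start a named (&amp;), decimal (&#38;)
    // or hexadecimal (&#x26;) reference.
    private static final String BARE_AMPERSAND =
            "&(?!(?:[A-Za-z_][A-Za-z0-9_.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)";

    // Hypothetical helper: rewrites every bare '&' as "&amp;".
    static String escapeBareAmpersands(String xml) {
        return xml.replaceAll(BARE_AMPERSAND, "&amp;");
    }

    // Hypothetical helper: repairs the text, then parses it with the standard DOM parser.
    static Document parseLenient(String xml) throws Exception {
        String repaired = escapeBareAmpersands(xml);
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(repaired)));
    }

    public static void main(String[] args) throws Exception {
        // Stand-in input: one valid reference and one bare ampersand
        String sample = "<names><name>Spiel des Jahres, Then &amp; Now</name>"
                + "<name>Catan: Cities & Knights</name></names>";
        Document doc = parseLenient(sample);
        Element second = (Element) doc.getElementsByTagName("name").item(1);
        System.out.println(second.getTextContent()); // prints "Catan: Cities & Knights"
    }
}
```

Repairing the text by regex is a last resort; it treats the symptom, and the cleaner fix is for the producer to emit &amp; in the first place.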

Andreas