I want to parse a very long string from an XML file. You can see the XML file here. The file has a "description" tag whose string I want to parse. When the "description" tag holds a short string, say 3 or 4 lines, my parser (a Java SAX parser) parses it without trouble, but when the string is hundreds of lines long the parser cannot parse it. Below is the code I am using for the parsing; please let me know where I am going wrong. I would be very thankful for any help.

Here is the parser GetterSetter class

public class MyGetterSetter
{
    private ArrayList<String> description = new ArrayList<String>();

    public ArrayList<String> getDescription()
    {
        return description;
    }

    public void setDescription(String description)
    {
        this.description.add(description);
    }
}

Here is the parser Handler class

public class MyHandler extends DefaultHandler
{
    String elementValue = null;
    Boolean elementOn = false;
    Boolean item = false;

    public static MyGetterSetter data = null;

    public static MyGetterSetter getXMLData()
    {
        return data;
    }

    public static void setXMLData(MyGetterSetter data)
    {
        MyHandler.data = data;
    }

    public void startDocument() throws SAXException
    {
        data = new MyGetterSetter();
    }

    public void endDocument() throws SAXException
    {
    }

    public void startElement(String namespaceURI, String localName, String qName, Attributes atts) throws SAXException
    {
        elementOn = true;

        if (localName.equalsIgnoreCase("item"))
            item = true;
    }

    public void endElement(String namespaceURI, String localName, String qName) throws SAXException
    {
        elementOn = false;

        if (item)
        {
            if (localName.equalsIgnoreCase("description"))
            {
                data.setDescription(elementValue);
                Log.d("--------DESCRIPTION------", elementValue + " ");
            }
            else if (localName.equalsIgnoreCase("item"))
                item = false;
        }
    }

    public void characters(char ch[], int start, int length)
    {
        if (elementOn)
        {
            elementValue = new String(ch, start, length);
            elementOn = false;
        }
    }
}
Harshad Pansuriya
user2391890

1 Answer

Use the org.w3c.dom package.

import java.net.URL;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// The class name is only a placeholder so the snippet compiles on its own.
public class DomDescriptionExample {
    public static void main(String[] args) {
        try {
            URL url = new URL("http://www.aboutsports.co.uk/fixtures/");

            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(url.openStream());

            NodeList list = doc.getElementsByTagName("item"); // get <item> nodes

            for (int i = 0; i < list.getLength(); i++) {
                Node item = list.item(i);
                NodeList descriptions = ((Element) item).getElementsByTagName("description"); // get <description> nodes within an <item>
                for (int j = 0; j < descriptions.getLength(); j++) {
                    Node description = descriptions.item(j); // use the loop index, not a fixed 0

                    System.out.println(description.getTextContent()); // print the full text content
                }
            }

        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

XPath in Java is also great for extracting bits from XML documents.

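You would use an XPath expression like //item/description (a plain /item/description only matches when <item> is the root element of the document). When you evaluate it against the XML input with a NODESET return type, it returns a NodeList like the one above, containing all the <description> elements that sit directly within an <item> element.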
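As a rough illustration of the XPath approach, here is a minimal sketch. The class name DescriptionXPath, the method extractDescriptions, and the inline sample XML are made up for this example; the real feed would be read from a stream instead of a string, and its exact structure is assumed to be RSS-like.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; the names are illustrative only.
class DescriptionXPath {

    // Returns the text of every <description> that is a direct child of an <item>.
    static List<String> extractDescriptions(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    "//item/description",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);

            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate the XPath expression", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample input; the channel-level <description> is not inside an <item>.
        String xml = "<rss><channel>"
                + "<description>channel text</description>"
                + "<item><description>first</description></item>"
                + "<item><description>second</description></item>"
                + "</channel></rss>";

        for (String description : extractDescriptions(xml)) {
            System.out.println(description);
        }
    }
}
```

Because the expression requires an <item> parent, the channel-level <description> in the sample is skipped and only the two item descriptions are printed.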

If you wanted to do it your way, with a DefaultHandler, you would need to set and unset flags so you can check whether you are in the body of a <description> element. The code above probably does something similar internally, hiding it from you. That functionality already ships with Java, so why not use it?
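If you do stay with SAX, one thing to keep in mind is that a parser is allowed to deliver the text of a single element across several characters() calls, and long text makes that more likely. The handler in the question keeps only the first chunk, which is probably why long descriptions come out cut short. Below is a minimal sketch of the flag-based handler that accumulates the chunks in a StringBuilder. The class name DescriptionHandler, the parse helper, and the string input are invented for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler; the names are illustrative only.
class DescriptionHandler extends DefaultHandler {
    private final List<String> descriptions = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private boolean inItem = false;
    private boolean inDescription = false;

    List<String> getDescriptions() {
        return descriptions;
    }

    // localName can be empty when the parser is not namespace-aware, so fall back to qName.
    private static String name(String localName, String qName) {
        return (localName == null || localName.isEmpty()) ? qName : localName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        String element = name(localName, qName);
        if (element.equalsIgnoreCase("item")) {
            inItem = true;
        } else if (inItem && element.equalsIgnoreCase("description")) {
            inDescription = true;
            text.setLength(0); // start a fresh buffer for this description
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element, so append instead of overwriting.
        if (inDescription) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String element = name(localName, qName);
        if (inDescription && element.equalsIgnoreCase("description")) {
            descriptions.add(text.toString());
            inDescription = false;
        } else if (element.equalsIgnoreCase("item")) {
            inItem = false;
        }
    }

    // Convenience wrapper for trying the handler on an XML string.
    static List<String> parse(String xml) {
        try {
            DescriptionHandler handler = new DescriptionHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.getDescriptions();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
    }
}
```

The description is only stored in endElement, once every chunk has been appended, so it no longer matters how the parser splits the text.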

Community
Sotirios Delimanolis