I am parsing an RSS XML news feed for an Android app. I am using SAXParser and everything works as it should, but I would like to limit the number of stories I retrieve and I cannot find a way to do so. For instance, say one of the news feeds has 45 stories and I just want the newest 10. As it is now, I am grabbing them all into an ArrayList and only displaying the ones I want, which I am sure is not the most efficient way of doing this.

I can provide the parsing code if necessary.

Thanks to anyone looking at this!

WeVie
  • When you are adding objects to the `ArrayList`, simply check its size and don't add to it once it has 10 elements. – Misagh Emamverdi Sep 28 '14 at 06:24
  • I thought about that but that would mean that the entire XML will still get parsed. I would rather simply stop parsing once the `ArrayList` gets to the desired size. – WeVie Sep 28 '14 at 06:26
  • You could just break out/return from the parsing operation after hitting the desired number of elements (or until there are no more to process, whichever comes first)? – MH. Sep 28 '14 at 06:37
  • How so @MH. ? Will a simple `break` work if I add a counter to the `startElement()` method where I am adding objects to the list? – WeVie Sep 28 '14 at 06:41
  • Sorry, that comment was a little inaccurate: a break or return won't work. You'll have to throw an exception, as suggested [here](http://www.ibm.com/developerworks/library/x-tipsaxstop/). A quick search on SO yields the same answer. Not the prettiest, but functional, I suppose... Have a look at the [accepted answer here](http://stackoverflow.com/questions/1345293/how-to-stop-parsing-xml-document-with-sax-at-any-time) for some concrete pointers. – MH. Sep 28 '14 at 07:03

2 Answers

2

You can stop a SAX parser from parsing any more input by having any of your callback methods (e.g. startElement) throw a SAXException.

You will need to make this exception recognizable (e.g. by using special message text, or by using a subclass of SAXException) so that when your original call of parse() comes back with an exception, you can distinguish it from other causes of parser failure.
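As a minimal sketch of that approach, the following uses a `SAXException` subclass as the stop signal. The names `StopParsingException`, `TitleHandler`, `LimitedRssParser`, and `parseTitles` are hypothetical, and the feed is assumed to use plain `<item>` and `<title>` elements without namespace processing. It also assumes the parser rethrows the handler's exception unchanged, which the JDK's bundled parser does; other implementations may be worth checking.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class LimitedRssParser {

    /** Hypothetical marker exception: signals a deliberate stop, not a parse failure. */
    static class StopParsingException extends SAXException {
        StopParsingException() {
            super("item limit reached");
        }
    }

    /** Collects the <title> of each <item> and aborts once maxItems titles are stored. */
    static class TitleHandler extends DefaultHandler {
        final List<String> titles = new ArrayList<>();
        private final int maxItems;
        private final StringBuilder text = new StringBuilder();
        private boolean inItem;
        private boolean inTitle;

        TitleHandler(int maxItems) {
            this.maxItems = maxItems;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            if (qName.equals("item")) {
                inItem = true;
            } else if (inItem && qName.equals("title")) {
                inTitle = true;
                text.setLength(0); // start collecting a fresh title
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (inTitle) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if (inTitle && qName.equals("title")) {
                inTitle = false;
                titles.add(text.toString());
            } else if (qName.equals("item")) {
                inItem = false;
                // Stop only after a complete item, so no story is cut off halfway.
                if (titles.size() >= maxItems) {
                    throw new StopParsingException();
                }
            }
        }
    }

    /** Parses the feed and returns at most maxItems titles (at least one item is always read). */
    static List<String> parseTitles(String xml, int maxItems) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        TitleHandler handler = new TitleHandler(maxItems);
        try {
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (StopParsingException e) {
            // Expected: the limit was reached. Any other SAXException still propagates.
        }
        return handler.titles;
    }
}
```

Catching the subclass rather than comparing message text keeps genuine parse errors from being mistaken for the deliberate stop. Throwing from `endElement` of `<item>` rather than from `startElement` is a design choice here, so that the last story is fully read before parsing ends.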

Michael Kay
  • Do you have an example of how to do this? – WeVie Sep 28 '14 at 14:12
  • Not a simple one, I'm afraid. There's an example of the general technique at http://grepcode.com/file/repo1.maven.org/maven2/net.sourceforge.saxon/saxon/9.1.0.8/net/sf/saxon/event/PIGrabber.java, but it's not using SAX interfaces directly, so the events and exceptions are slightly different. (This example is to read the xml-stylesheet processing instruction at the start of a file, and then abort the parse when the first element node is encountered.) – Michael Kay Sep 28 '14 at 20:39
0

I am not sure whether there is a way to stop a SAX parser from parsing. However, you can use XmlPullParser instead, since there your own loop controls when parsing stops.

XmlPullParserFactory factory = XmlPullParserFactory.newInstance();
XmlPullParser xpp = factory.newPullParser();
xpp.setInput(yourXML); // yourXML must be a java.io.Reader; for an InputStream use setInput(stream, encoding)

int eventType = xpp.getEventType();

// Stop at the end of the document or as soon as the list holds MAX_SIZE items
while (eventType != XmlPullParser.END_DOCUMENT && list.size() < MAX_SIZE) {

    if (eventType == XmlPullParser.START_TAG) {
        //do something
    } else if (eventType == XmlPullParser.END_TAG) {
        //do something
    } else if (eventType == XmlPullParser.TEXT) {
        //do something
    }
    eventType = xpp.next();
}

You can find lots of examples by searching for "XmlPullParser tutorial".

Note: if you have just 45 items, parsing is very fast, so you could simply let SAX parse the whole document.

Update: I think this is what Michael means:

@Override
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {

    // size is the number of items collected so far
    if (size >= MAX_SIZE) {
        throw new SAXException("end");
    }
    //...
}

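And when you are parsing:

try {
    saxParser.parse(yourXML, handler); // handler is the DefaultHandler containing startElement() above
} catch (SAXException e) {
    if ("end".equals(e.getMessage())) {
        // parsing was stopped deliberately after MAX_SIZE items
    } else {
        throw e; // a genuine parse failure
    }
}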
Misagh Emamverdi
  • I just used 45 as an example. What if there are hundreds? It's not so fast anymore. I am aware of PullParser but need to learn more about SAX. – WeVie Sep 28 '14 at 14:16