Suppose you want to read something like this:
<?xml ...?>
<root>
<element>data</element>
...
<otherElement>more data</otherElement>
<ignoredElement> ... </ignoredElement>
... more ignored Elements
</root>
And you want only the first 13 child elements inside root (which happen to be within the first 15 lines of your very large file).
You can use a SAX parser to read the file and abort it as soon as it has read those elements.
You can set up a SAX parser using standard J2SE:
SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser sp = spf.newSAXParser();
XMLReader reader = sp.getXMLReader();
Then you need to create a ContentHandler class that will be your data handler. I will call it DataSaxHandler. If you extend DefaultHandler, you only need to override the methods you are interested in. Here is an example you can use as a starting point (I didn't test it). It detects the start and end of each element and prints them out, ignoring attributes. It counts 15 end tags and then stops, so the output won't be well-formed:
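import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class DataSaxHandler extends DefaultHandler {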
private int countTags = 0;
private boolean inElement = false;
@Override
public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {
System.out.println("<" + qName + ">");
inElement = true;
}
@Override
public void endElement(String uri, String localName, String qName) throws SAXException {
countTags++;
System.out.println("</" + qName + ">");
inElement = false;
if (countTags >= 15) {
// Throwing from a callback is how a SAX handler aborts the parse
throw new SAXException("Read 15 end tags, stopping the parse");
}
}
@Override
public void characters(char[] ch, int start, int length) throws SAXException {
if(inElement) {
System.out.println(new String(ch, start, length));
}
}
}
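One thing to watch for if you want the values rather than console output: the parser is allowed to deliver an element's text to characters() in several chunks, so it is safer to buffer the text and read it in endElement. A rough sketch of that idea follows; TextCollectingHandler and getValues are names I made up for illustration, and it assumes simple elements without mixed content:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects "name=text" pairs instead of printing them.
public class TextCollectingHandler extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private final List<String> values = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        buffer.setLength(0); // start a fresh buffer for this element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        buffer.append(ch, start, length); // may be called more than once per text node
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String text = buffer.toString().trim();
        if (!text.isEmpty()) {
            values.add(qName + "=" + text);
        }
        buffer.setLength(0); // text after this end tag belongs to the parent
    }

    public List<String> getValues() {
        return values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><element>data</element><otherElement>more data</otherElement></root>";
        TextCollectingHandler handler = new TextCollectingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        System.out.println(handler.getValues()); // prints [element=data, otherElement=more data]
    }
}
```

The same early-stop check from endElement above can be added here unchanged.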
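Finally, register it with your SAX reader and use it to parse the file (the exception you throw to stop parsing will come out of the parse() call, so catch it there):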
reader.setContentHandler(new DataSaxHandler());
reader.parse(new InputSource(new FileInputStream(new File(PATH, "data.xml"))));
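To make the abort-and-catch part concrete, here is a minimal self-contained sketch. It reads the XML from a string so it can be run as-is. StopParsingException, LimitedHandler and firstClosedElements are hypothetical names of my own, not part of the SAX API. Using a dedicated exception type lets the caller tell a deliberate stop apart from a real parse error:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class EarlyStopExample {

    // Hypothetical marker exception: signals a deliberate stop, not a parse error.
    static class StopParsingException extends SAXException {
        StopParsingException(String message) {
            super(message);
        }
    }

    // Records the names of closed elements and aborts after maxEndTags of them.
    static class LimitedHandler extends DefaultHandler {
        private final int maxEndTags;
        final List<String> closed = new ArrayList<>();

        LimitedHandler(int maxEndTags) {
            this.maxEndTags = maxEndTags;
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            closed.add(qName);
            if (closed.size() >= maxEndTags) {
                throw new StopParsingException("Read " + maxEndTags + " end tags");
            }
        }
    }

    // Parses xml and returns the names of the first maxEndTags closed elements.
    static List<String> firstClosedElements(String xml, int maxEndTags) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        LimitedHandler handler = new LimitedHandler(maxEndTags);
        reader.setContentHandler(handler);
        try {
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (StopParsingException expected) {
            // Deliberate abort: the handler has seen enough, the rest is never read.
        }
        return handler.closed;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><a>1</a><b>2</b><c>3</c><d>4</d></root>";
        System.out.println(firstClosedElements(xml, 2)); // prints [a, b]
    }
}
```

With a real file you would pass a FileInputStream to InputSource as above, ideally opened in a try-with-resources block so the stream is closed even when the parse is aborted.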