I have some data in an XML file and I am using the Processing library to parse through that file. I ran into the BOM (byte order mark) issue, which caused some errors to be thrown. I found a workaround elsewhere, but it is very slow: I'm using Apache Commons IO's BOMInputStream to read the file one byte at a time, after skipping the bytes that represent the BOM.
I think that the source of my problem is actually my lack of knowledge about streams, readers and writers. There are so many different readers and writers and all kinds of "streams" (a word I barely understand) that I want to pull my hair out trying to figure out which one to use and how. I think I just picked the wrong implementation.
Question: Can someone show me why my code is so slow, and also help me improve my understanding of file I/O?
Code:
private static XML noBOM(String filename, PApplet p) throws FileNotFoundException, IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    File f = new File(filename);
    InputStream stream = new FileInputStream(f);
    // BOMInputStream (Apache Commons IO) skips a leading BOM if one is present
    BOMInputStream bomIn = new BOMInputStream(stream);
    int tmp = -1;
    // Copy the rest of the file one byte at a time
    while ((tmp = bomIn.read()) != -1) {
        out.write(tmp);
    }
    String strXml = out.toString();
    return p.parseXML(strXml);
}

public static Map<String, Float> lifeExpectancyFromXML(String filename, PApplet p,
        int year) throws FileNotFoundException, IOException {
    Map<String, Float> dataMap = new HashMap<>();
    XML xml = noBOM(filename, p);
    if (xml != null) {
        XML[] records = xml.getChild("data").getChildren("record");
        for (XML record : records) {
            XML[] fields = record.getChildren("field");
            String country = fields[0].getContent();
            int entryYear = fields[2].getIntContent();
            float lifeEx = fields[3].getFloatContent();
            if (entryYear == year) {
                System.out.println("Country: " + country);
                System.out.println("Life Expectancy: " + lifeEx);
                dataMap.put(country, lifeEx);
            }
        }
    } else {
        System.out.println("String could not be parsed.");
    }
    return dataMap;
}
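
For comparison, here is a minimal sketch of a chunked read that uses only java.io instead of BOMInputStream. It assumes the file is UTF-8, so the BOM is the three bytes EF BB BF, and it reads 8192 bytes per call instead of one. The names BomSkipSketch, readWithoutBom and hasUtf8Bom are made up for illustration and are not part of Processing or Apache Commons. I don't know whether this is the right approach, which is part of what I'm asking.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class BomSkipSketch {

    // Hypothetical helper: true if the data starts with the UTF-8 BOM (EF BB BF).
    static boolean hasUtf8Bom(byte[] bytes) {
        return bytes.length >= 3
                && (bytes[0] & 0xFF) == 0xEF
                && (bytes[1] & 0xFF) == 0xBB
                && (bytes[2] & 0xFF) == 0xBF;
    }

    // Hypothetical helper: reads the whole stream in 8 KB chunks and decodes it
    // as UTF-8 (an assumption about the file), dropping a leading BOM if present.
    static String readWithoutBom(InputStream raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        try (InputStream in = raw) {
            int n;
            // read(byte[]) returns the number of bytes read, or -1 at end of stream
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        byte[] bytes = out.toByteArray();
        int offset = hasUtf8Bom(bytes) ? 3 : 0;
        return new String(bytes, offset, bytes.length - offset, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for a file that begins with a UTF-8 BOM
        byte[] sample = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'a', '/', '>'};
        String xml = readWithoutBom(new ByteArrayInputStream(sample));
        System.out.println(xml); // the three BOM bytes are not part of the string
    }
}
```

In noBOM above, the idea would be to pass new FileInputStream(f) to readWithoutBom and hand the resulting string to p.parseXML as before.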