Team, I have to parse a file line by line and split each line on ",". The first string is the name and the second is the count. Finally, I have to display each key with its total count. For example:
Peter,2
Smith,3
Peter,3
Smith,5
For this input I should display Peter 5 and Smith 8.
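To make the expected result concrete, here is a minimal sketch of the aggregation on the four sample lines. It assumes Java 9 or later (for `List.of`); the class name `KeyCountSketch` and the method `sumCounts` are made up for illustration and are not part of my real code.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, used only to illustrate the expected totals.
class KeyCountSketch {

    // Sums the count of every "name,count" line, keyed by name.
    static Map<String, Long> sumCounts(List<String> lines) {
        Map<String, Long> totals = new LinkedHashMap<>(); // keeps first-seen order
        for (String line : lines) {
            String[] parts = line.split(",");
            if (parts.length == 2) {
                // merge() stores the count, or adds it to the existing total
                totals.merge(parts[0], Long.parseLong(parts[1]), Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<String, Long> totals =
                sumCounts(List.of("Peter,2", "Smith,3", "Peter,3", "Smith,5"));
        // prints "Peter 5" then "Smith 8"
        totals.forEach((name, total) -> System.out.println(name + " " + total));
    }
}
```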
I was unsure whether to choose BufferedReader or Scanner. After going through a linked discussion on the topic, I came up with these two approaches, and I would like to hear your concerns.
Approach 1: use BufferedReader.
    private HashMap<String, MutableLong> readFile(File file) throws IOException {
        final HashMap<String, MutableLong> keyHolder = new HashMap<>();
        try (BufferedReader br = new BufferedReader(new InputStreamReader(
                new FileInputStream(file), "UTF-8"))) {
            for (String line; (line = br.readLine()) != null;) {
                // processing the line.
                final String[] keyContents = line
                        .split(KeyCountExam.COMMA_DELIMETER);
                if (keyContents.length == 2) {
                    final String keyName = keyContents[0];
                    final long count = Long.parseLong(keyContents[1]);
                    final MutableLong keyCount = keyHolder.get(keyName);
                    if (keyCount != null) {
                        keyCount.add(count);
                        keyHolder.put(keyName, keyCount);
                    } else {
                        keyHolder.put(keyName, new MutableLong(count));
                    }
                }
            }
        }
        return keyHolder;
    }

    private static final String COMMA_DELIMETER = ",";
    private static volatile Pattern commaPattern = Pattern
            .compile(COMMA_DELIMETER);
I have used MutableLong since I don't want to create a new BigInteger each time. Again, the file may be very big, and I have no control over how often a key can occur.
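On the size concern, here is a minimal sketch of a mutable holder that fails loudly on overflow instead of wrapping around. It assumes a plain `long` total is acceptable as long as overflow is detected. The names `CheckedCounterDemo`, `Counter`, and `accumulate` are hypothetical stand-ins, not the MutableLong class used above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; Counter is a stand-in, not the MutableLong used above.
class CheckedCounterDemo {

    // Mutable long holder whose add() refuses to overflow.
    static final class Counter {
        private long value;

        Counter(long initial) {
            this.value = initial;
        }

        // Math.addExact throws ArithmeticException instead of silently wrapping.
        void add(long delta) {
            value = Math.addExact(value, delta);
        }

        long get() {
            return value;
        }
    }

    // Adds count to the counter for key, creating the counter on first sight.
    static void accumulate(Map<String, Counter> totals, String key, long count) {
        Counter counter = totals.get(key);
        if (counter == null) {
            totals.put(key, new Counter(count));
        } else {
            counter.add(count); // mutated in place, so no second put() is needed
        }
    }

    public static void main(String[] args) {
        Map<String, Counter> totals = new HashMap<>();
        accumulate(totals, "Peter", 2);
        accumulate(totals, "Peter", 3);
        System.out.println(totals.get("Peter").get()); // prints 5
        try {
            accumulate(totals, "Peter", Long.MAX_VALUE);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```

If totals can really exceed the `long` range, the holder would need a BigInteger field instead; this sketch only detects that situation.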
Approach 2: use Scanner and switch between two delimiters.
    private static final String LINE_SEPARATOR_PATTERN = "\r\n|[\n\r\u2028\u2029\u0085]";
    private static final String LINE_PATTERN = ".*(" + LINE_SEPARATOR_PATTERN
            + ")|.+$";
    private static volatile Pattern linePattern = Pattern.compile(LINE_PATTERN);
My question: I have gone through the hasNext implementation in Scanner, and as far as I can tell there is no harm in switching the pattern between calls. I also believe that, from Java 7 onward, Scanner uses a limited buffer that should be enough for this kind of file.
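For comparison, here is a minimal sketch of a variant that avoids switching altogether by using one delimiter that matches either a comma or a line separator. It assumes well-formed input (exactly one comma per line, no blank lines); the class name `SingleDelimiterScannerSketch`, the method `sumCounts`, and the use of `Long` totals are my own choices for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical sketch: one delimiter for both commas and line breaks.
class SingleDelimiterScannerSketch {

    // Matches a comma, CRLF, or any single line-separator character.
    private static final Pattern FIELD_OR_LINE =
            Pattern.compile(",|\\r\\n|[\\n\\r\\u2028\\u2029\\u0085]");

    // Reads alternating name and count tokens and sums the counts per name.
    static Map<String, Long> sumCounts(Scanner scanner) {
        Map<String, Long> totals = new HashMap<>();
        scanner.useDelimiter(FIELD_OR_LINE);
        while (scanner.hasNext()) {
            String name = scanner.next();
            if (!scanner.hasNext()) {
                break; // a name without a count at the end of the input
            }
            long count = Long.parseLong(scanner.next());
            totals.merge(name, count, Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        try (Scanner scanner = new Scanner("Peter,2\nSmith,3\nPeter,3\nSmith,5\n")) {
            Map<String, Long> totals = sumCounts(scanner);
            System.out.println("Peter " + totals.get("Peter")); // prints "Peter 5"
            System.out.println("Smith " + totals.get("Smith")); // prints "Smith 8"
        }
    }
}
```

The trade-off is that a malformed line shifts every later token, so the line-by-line length check from Approach 1 has no direct equivalent here.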
Does anyone prefer Approach 2 over Approach 1, or is there another option besides these? I only used System.out.println for testing; obviously the same aggregation code from Approach 1 would replace it here. Using split in Approach 1 creates multiple String instances, which can be avoided here by scanning the character sequence (am I right?).
    private HashMap<String, BigInteger> readFileScanner(File file)
            throws IOException {
        final HashMap<String, BigInteger> keyHolder = new HashMap<>();
        try (Scanner br = new Scanner(file, "UTF-8")) {
            while (br.hasNext()) {
                br.useDelimiter(commaPattern);
                System.out.println(br.next());
                System.out.println(br.next());
                br.useDelimiter(linePattern);
            }
        }
        return keyHolder;
    }
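On the extra String instances created by split: a third option I am considering is to keep BufferedReader and locate the comma with indexOf. This is a minimal sketch, assuming Java 9 or later for the `Long.parseLong(CharSequence, int, int, int)` overload. The names `IndexOfParseSketch` and `addLine` are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; shows comma splitting without String.split or a regex.
class IndexOfParseSketch {

    // Adds one "name,count" line to totals; lines without exactly one
    // comma, or with nothing after the comma, are skipped.
    static void addLine(Map<String, Long> totals, String line) {
        int comma = line.indexOf(',');
        if (comma < 0 || comma != line.lastIndexOf(',')
                || comma == line.length() - 1) {
            return; // roughly mirrors the keyContents.length == 2 check
        }
        // One String is allocated for the key, which the map needs anyway.
        String name = line.substring(0, comma);
        // The count is parsed straight from the line, with no substring.
        long count = Long.parseLong(line, comma + 1, line.length(), 10);
        totals.merge(name, count, Long::sum);
    }

    public static void main(String[] args) {
        Map<String, Long> totals = new HashMap<>();
        for (String line : new String[] {"Peter,2", "Smith,3", "Peter,3", "Smith,5"}) {
            addLine(totals, line);
        }
        System.out.println("Peter " + totals.get("Peter")); // prints "Peter 5"
        System.out.println("Smith " + totals.get("Smith")); // prints "Smith 8"
    }
}
```

Each line read by readLine is still a String of its own, so this only removes the array and the second substring, not every allocation.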