I have code like this:
    public static void main(String[] args) throws Exception {
        long start = System.currentTimeMillis();
        List<String> matches = new Vector<>(); // Race condition for ArrayList??
        BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream("AHugeFile.txt")));
        BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream("output.txt")));
        reader.lines().parallel()
              .filter(s -> s.matches("someFancyRegEx"))
              .forEach(s -> {
                  // Both the shared list and the shared writer are touched from worker threads here.
                  matches.add(s);
                  try {
                      writer.write(s);
                      writer.newLine();
                  } catch (Exception e) {
                      System.out.println("error");
                  }
              });
        System.out.println("Processing took " + (System.currentTimeMillis() - start) / 1000 + " seconds and matches " + matches.size());
        reader.close();
        writer.flush();
        writer.close();
    }
I noticed that if I replace the Vector with an ArrayList on Line 3, I get a different number of matches on each run. I'm only just getting my hands dirty with Streams, but I assume the forEach executes concurrently and that the unsynchronized writes to the ArrayList cause some adds to be lost. With a Vector, the results are consistent.
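To make that concrete, here is a minimal sketch of what I think is happening, stripped of the file I/O. The class name `ParallelAddDemo`, the helper `fill`, and the element count are made up for illustration; it only assumes that a parallel `forEach` calls `add` from several threads at once.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.IntStream;

public class ParallelAddDemo {
    // Adds n strings to the given list from a parallel stream and returns the final size,
    // or -1 if the list threw while being modified concurrently.
    static int fill(List<String> target, int n) {
        try {
            IntStream.range(0, n).parallel().forEach(i -> target.add("line " + i));
            return target.size();
        } catch (RuntimeException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        int n = 100_000;
        // Unsynchronized list: the size may come out below n, or the run may fail outright.
        System.out.println("ArrayList:        " + fill(new ArrayList<>(), n));
        // Synchronized wrapper: every add is serialized, so the size should always be n.
        System.out.println("synchronizedList: " + fill(Collections.synchronizedList(new ArrayList<>()), n));
    }
}
```

On my understanding, the first number can vary from run to run while the second stays at `n`, which would match what I see when switching between ArrayList and Vector.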
I have two questions:
- Is my reasoning correct that the ArrayList is the victim of a race condition?
- Given that the same terminal operation also writes to a file, could the writer potentially miss some lines? In my tests, running the program a few times, the results seem consistent, with the correct number of lines written out each time.
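One alternative I am considering, in case sharing the list and the writer across threads turns out to be unsafe, is to let the stream collect the matches and then write them out from a single thread. This is only a sketch under some assumptions: the matches fit in memory, reading the file as UTF-8 via `Files.lines` is acceptable (my original code uses the platform default charset), and the names `CollectThenWrite` and `findMatches` are made up for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CollectThenWrite {
    // Filters the lines in parallel and lets the collector assemble the result,
    // so no list is shared between worker threads.
    static List<String> findMatches(Stream<String> lines, Pattern pattern) {
        return lines.parallel()
                    .filter(s -> pattern.matcher(s).matches())
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        Pattern pattern = Pattern.compile("someFancyRegEx"); // compiled once instead of per line
        Path in = Paths.get("AHugeFile.txt");
        Path out = Paths.get("output.txt");

        List<String> matches;
        try (Stream<String> lines = Files.lines(in)) {
            matches = findMatches(lines, pattern);
        }

        // Only the main thread touches the writer, so write + newLine cannot interleave.
        try (BufferedWriter writer = Files.newBufferedWriter(out)) {
            for (String s : matches) {
                writer.write(s);
                writer.newLine();
            }
        }
        System.out.println("matches " + matches.size());
    }
}
```

As far as I can tell this also keeps the matches in file order, which the parallel forEach version does not promise.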