Problem: I have an array of some 700 strings, which I read into a List. I also have a directory containing over 1500 files. I need to open each of these files and check whether any of the 700 strings appears anywhere in it.
Current solution: After reading in the 700 strings (which is pretty much instantaneous), this is what I'm doing:
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Fields used by the method below:
private static int currentCount = 0;
private static final Map<String, String> matchLocations = new HashMap<>();

public static void scanMyDirectory(final File myDirectory, final List<String> listOfStrings) {
    for (final File fileEntry : myDirectory.listFiles()) {
        System.out.println("Entering file: " + currentCount++);
        if (fileEntry.isDirectory()) {
            scanMyDirectory(fileEntry, listOfStrings);
        } else {
            BufferedReader br = null;
            try {
                String sCurrentLine;
                br = new BufferedReader(new FileReader(fileEntry.getPath()));
                while ((sCurrentLine = br.readLine()) != null) {
                    for (int i = 0; i < listOfStrings.size(); i++) {
                        if (org.apache.commons.lang3.StringUtils.containsIgnoreCase(sCurrentLine, listOfStrings.get(i))) {
                            // Note: put() overwrites any earlier file recorded for the same string.
                            matchLocations.put(listOfStrings.get(i), fileEntry.getPath());
                        }
                    }
                }
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                try {
                    if (br != null) {
                        br.close();
                    }
                } catch (IOException ex) {
                    ex.printStackTrace();
                }
            }
        }
    }
}
After calling this procedure, I have all the results stored in a HashMap and I can output the results to screen or file.
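For reference, the output step is just a loop over that map (a minimal sketch, assuming matchLocations is the Map<String, String> declared above):

for (final Map.Entry<String, String> match : matchLocations.entrySet()) {
    System.out.println("\"" + match.getKey() + "\" found in " + match.getValue());
}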
Question: What is a faster way to do this? The current approach seems extremely slow, taking around 20-25 minutes to run through ~1500 files. I'm not very familiar with threading, but I've considered using it; however, the top answer in this question has put me off a little. What is the best way to improve performance?
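To make the threading idea concrete, here is the sort of structure I've been imagining. It's only a rough, untested sketch: the fixed-size pool, the one-task-per-file split, and the ConcurrentHashMap are my own guesses, and the directory path and search strings in main are placeholders.

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ThreadedScan {

    // Thread-safe map so worker threads can record matches concurrently.
    private static final Map<String, String> matchLocations = new ConcurrentHashMap<>();

    // Walk the tree on the calling thread, but hand each file to the pool.
    public static void scanMyDirectory(final File myDirectory,
                                       final List<String> listOfStrings,
                                       final ExecutorService pool) {
        final File[] entries = myDirectory.listFiles();
        if (entries == null) {
            return; // not a directory, or an I/O error
        }
        for (final File fileEntry : entries) {
            if (fileEntry.isDirectory()) {
                scanMyDirectory(fileEntry, listOfStrings, pool);
            } else {
                pool.submit(() -> scanFile(fileEntry, listOfStrings));
            }
        }
    }

    private static void scanFile(final File file, final List<String> listOfStrings) {
        try {
            // readAllLines assumes UTF-8; use a Reader with an explicit
            // charset if the files are encoded differently.
            for (final String line : Files.readAllLines(file.toPath())) {
                for (final String needle : listOfStrings) {
                    if (org.apache.commons.lang3.StringUtils.containsIgnoreCase(line, needle)) {
                        matchLocations.put(needle, file.getPath());
                    }
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(final String[] args) throws InterruptedException {
        final List<String> listOfStrings = Arrays.asList("foo", "bar"); // placeholder strings
        final ExecutorService pool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        scanMyDirectory(new File("/path/to/dir"), listOfStrings, pool); // placeholder path
        pool.shutdown();                        // stop accepting new tasks
        pool.awaitTermination(1, TimeUnit.HOURS);
        matchLocations.forEach((s, path) -> System.out.println(s + " -> " + path));
    }
}

Is something along these lines reasonable, or is there a better approach?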