I have a text corpus that I need to read, split, sort, and process in other ways. At the very first step, when I split it, I see that only one line from the Scanner ends up in the result. This is the code:
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.Scanner;

public class CorpusTest {
    public static void processCorpus(Scanner scanner) throws IOException {
        String line = "0";
        while (scanner.hasNextLine()) {
            line = scanner.nextLine();
        }
        String[] w = line.replaceAll("[^a-zA-Z\\s]", "").toLowerCase().split(" ");
        for (int i = 0; i < w.length; i++) {
            w[i].trim();
        }
        System.out.println("Word" + "\t" + "Frequency");
        System.out.println(Arrays.toString(w));
    }

    public static void main(String[] args) throws IOException {
        File temp = new File("input.txt");
        Scanner scanner = new Scanner(temp);
        CorpusTest.processCorpus(scanner);
    }
}
I tried adding:
String text = new Scanner( new File("input.txt") ).useDelimiter("\\A").next();
But I get errors, because the method above works with an array.
After the while loop, the variable line holds only the last line of the file, which is not what I want.
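For reference, here is a minimal sketch of what I think the loop would have to do instead: do the cleanup and splitting inside the loop and collect the words from every line, rather than overwriting the same variable each time. The class name CorpusSketch and the method readWords are made up for this illustration, and splitting on "\\s+" instead of a single space is my own assumption so that repeated spaces do not produce empty words.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper class, not part of the original code.
class CorpusSketch {

    // Collects the words from every line the scanner yields,
    // instead of keeping only the last line.
    static List<String> readWords(Scanner scanner) {
        List<String> words = new ArrayList<>();
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine();
            // Same cleanup as in the question: keep letters and whitespace, then lowercase.
            // trim() returns a new String, so its result has to be used.
            String cleaned = line.replaceAll("[^a-zA-Z\\s]", "").toLowerCase().trim();
            if (cleaned.isEmpty()) {
                continue; // skip blank lines and lines without any letters
            }
            // Assumption: split on runs of whitespace rather than a single space.
            for (String word : cleaned.split("\\s+")) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) throws FileNotFoundException {
        // "input.txt" is the file name used in the question.
        try (Scanner scanner = new Scanner(new File("input.txt"))) {
            System.out.println(readWords(scanner));
        }
    }
}
```

If an array is needed later, words.toArray(new String[0]) converts the list. The useDelimiter("\\A") attempt returns the whole file as a single String, so it would presumably need the same replaceAll and split calls applied to that String before anything array-based can work on it.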