I have a .txt file which I am loading using
String[] data = loadStrings("data/data.txt");
The file is already sorted and essentially looks like:
Animal
Animal
Cat
Cat
Cat
Dog
I am looking to create an algorithm in Java that counts the words in this sorted list, without using libraries such as Multisets and without Maps/HashMaps. So far I have managed to get it to print out the top occurring word, like so:
ArrayList<String> words = new ArrayList<String>();
int[] occurrence = new int[data.length]; // one slot per line, so it cannot overflow
Arrays.sort(data); // has no effect if the file is already sorted
for (int i = 0; i < data.length; i++) {
  words.add(data[i]); // Put each word into the words ArrayList
}
for (int i = 0; i < data.length; i++) {
  occurrence[i] = 1; // count the word at position i itself
  for (int j = i + 1; j < data.length; j++) {
    if (data[i].equals(data[j])) {
      occurrence[i] = occurrence[i] + 1;
    }
  }
}
int max = 0;
String most_talked = "";
for (int i = 0; i < data.length; i++) {
  if (occurrence[i] > max) {
    max = occurrence[i];
    most_talked = data[i];
  }
}
println("The most talked keyword is " + most_talked + " occurring " + max + " times.");
Rather than just the highest occurring word, I would like to get the top 5 or top 10. I hope that is clear enough. Thanks for reading.
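
To make the goal concrete, here is a rough sketch of the kind of thing I have in mind, in plain Java rather than a Processing sketch. It assumes one word per line and uses only arrays: one pass collapses each run of equal words in the sorted data into a (word, count) pair, then a simple selection loop picks the n largest counts. The names TopWords and topN are made up for illustration, and ties are broken by alphabetical order only because of how the loop happens to scan.

```java
import java.util.Arrays;

class TopWords {
    // Returns up to n entries of the form "word: count", most frequent first.
    static String[] topN(String[] data, int n) {
        String[] sorted = data.clone();
        Arrays.sort(sorted); // harmless if the input is already sorted

        // Pass 1: collapse runs of equal words into two parallel arrays.
        String[] words = new String[sorted.length];
        int[] counts = new int[sorted.length];
        int distinct = 0;
        for (int i = 0; i < sorted.length; i++) {
            if (distinct > 0 && sorted[i].equals(words[distinct - 1])) {
                counts[distinct - 1]++;      // same word as the current run
            } else {
                words[distinct] = sorted[i]; // a new run starts here
                counts[distinct] = 1;
                distinct++;
            }
        }

        // Pass 2: repeatedly pick the largest remaining count.
        int k = Math.max(0, Math.min(n, distinct));
        String[] result = new String[k];
        for (int r = 0; r < k; r++) {
            int best = -1;
            for (int i = 0; i < distinct; i++) {
                if (counts[i] > 0 && (best == -1 || counts[i] > counts[best])) {
                    best = i;
                }
            }
            result[r] = words[best] + ": " + counts[best];
            counts[best] = 0; // mark as already reported
        }
        return result;
    }

    public static void main(String[] args) {
        String[] data = {"Animal", "Animal", "Cat", "Cat", "Cat", "Dog"};
        for (String line : topN(data, 2)) {
            System.out.println(line);
        }
    }
}
```

With the sample data above and n = 2, this should print "Cat: 3" and then "Animal: 2". The selection loop costs roughly n passes over the distinct words, which seems fine for a top 5 or top 10; sorting the pairs by count would be the alternative for large n.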