I got a list of sentences. I split each sentences and filtered the unwanted words and puncuations. and then store them into
ArrayList<ArrayList<String>> sentence
then I used a hashMap to find the most common word. how could I modify the following hashmap code so I can also find the most common consecutive pairs of words.(N-grams for phrases)
HashMap<String, Integer> hashMap = new HashMap<>();
// Splitting the words of string
// and storing them in the array.
for(int i =0; i < sentence.size(); i++){
ArrayList<String> words = new ArrayList<String>(sentence.get(i));
for (String word : words) {
//Asking whether the HashMap contains the
//key or not. Will return null if not.
Integer integer = hashMap.get(word);
if (integer == null)
// Storing the word as key and its
// occurrence as value in the HashMap.
hashMap.put(word, 1);
else {
// Incrementing the value if the word
// is already present in the HashMap.
hashMap.put(word, integer + 1);
}
}
}
i dont know where to start. should i adjust the way i split or do i no split at all in the first place.