I am trying to remove stop words from a tweet. I first add the tweet's tokens to a set, then loop over the set and remove every token that appears in the stop word set. This throws a java.util.ConcurrentModificationException. Here's a snippet:
while ((line = br.readLine()) != null) {
//store tweet splits
LinkedHashSet<String> tweets = new LinkedHashSet<String>();
//We need to extract tweet and their constituent words
String[] tweet = line.split(",");
String input = tweet[1];
String[] constituent = input.split(" ");
//add all tokens in set
for (String a : constituent) {
tweets.add(a.trim());
}
System.out.println("Before: "+tweets);
//remove stop words
for (String word : tweets) {
if (stopwords.contains(word)) {
tweets.remove(word);
}
}
System.out.println("After: "+tweets);
//System.out.println("Tweet: "+sb.toString());
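For context, the for-each loop over tweets is driven by the set's iterator, and LinkedHashSet iterators are fail-fast: calling tweets.remove(word) inside the loop changes the set behind the iterator's back, so the next step of the iteration throws ConcurrentModificationException. A minimal sketch of one way around this, removing through the iterator itself, might look like the following. The class name StopWordFilter, the helper removeStopWords, and the sample line and stop words are made up for illustration and are not part of the original program.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;

class StopWordFilter {

    // Hypothetical helper: takes one "id,text" line, splits the text field
    // into tokens, and drops every token that is in the stop word set.
    static Set<String> removeStopWords(String line, Set<String> stopwords) {
        String[] fields = line.split(",");
        Set<String> tokens = new LinkedHashSet<>();
        for (String token : fields[1].split(" ")) {
            tokens.add(token.trim());
        }

        // Remove through the iterator so the iterator and the set stay in sync.
        Iterator<String> it = tokens.iterator();
        while (it.hasNext()) {
            if (stopwords.contains(it.next())) {
                it.remove();
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Sample data, assumed for this sketch only.
        Set<String> stopwords = new HashSet<>(Arrays.asList("the", "is", "a"));
        System.out.println(removeStopWords("42,the sky is a deep blue", stopwords));
        // prints [sky, deep, blue]
    }
}
```

If no per-token logic is needed, tokens.removeAll(stopwords) should give the same result in one call, and on Java 8 or later tokens.removeIf(stopwords::contains) is another option.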