I'm getting a compilation error when calling this word_frequencies function, whose parameter has type List<List<Pair<String, String>>>, with an argument of type ArrayList<ArrayList<Pair<String, String>>>, and I have no idea why.
This is the code:
word_frequencies function:
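public static TObjectIntHashMap<String> word_frequencies(List<List<Pair<String, String>>> article){
    // Count how often each word (the first element of each pair) occurs in the article.
    TObjectIntHashMap<String> word_freqs = new TObjectIntHashMap<String>();
    for (List<Pair<String, String>> sentence : article){
        for (Pair<String, String> word : sentence){
            // Increment the existing count by 1, or insert the word with a count of 1.
            word_freqs.adjustOrPutValue(word.first(), 1, 1);
        }
    }
    return word_freqs;
}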
The word_frequencies function is a static method of the class VectorSpaceModel, which lives in the package Helper.
This is the test class where I'm trying to exercise parts of that code:
package Testing;
import Helper.VectorSpaceModel;
import java.io.*;
import java.util.regex.*;
import java.util.*;
import org.apache.commons.lang3.mutable.MutableInt;
import gnu.trove.map.hash.*;
import gnu.trove.iterator.*;
import edu.stanford.nlp.util.*;
import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.ling.CoreAnnotations.*;
import edu.stanford.nlp.ling.*;
public class VSM_tests {
    public static ArrayList<ArrayList<Pair<String, String>>> tokenize(String text){
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        Annotation a = new Annotation(text);
        pipeline.annotate(a);
        ArrayList<ArrayList<Pair<String, String>>> article = new ArrayList<ArrayList<Pair<String, String>>>();
        for (CoreMap sent : a.get(SentencesAnnotation.class)){
            ArrayList<Pair<String, String>> sentence = new ArrayList<Pair<String, String>>();
            for (CoreLabel l : sent.get(TokensAnnotation.class)){
                String word = l.get(TextAnnotation.class);
                String ner_tag = l.get(NamedEntityTagAnnotation.class);
                Pair<String, String> word_ner = new Pair<String, String>(word, ner_tag);
                sentence.add(word_ner);
            }
            article.add(sentence);
        }
        return article;
    }
    public static void main(String[] args){
        File folder = new File("/Users/---/Documents/reuters/reuters/articles");
        Pattern p = Pattern.compile(".+DS_Store$");
        ArrayList<String> filenames = new ArrayList<String>();
        for (File file_entry : folder.listFiles() ){
            Matcher m = p.matcher(file_entry.getAbsolutePath());
            if (!m.matches()){
                filenames.add(file_entry.getAbsolutePath());
            }
        }
        TObjectIntHashMap<String> word_freqs = new TObjectIntHashMap<String>();
        MutableInt articles_processed = new MutableInt(0);
        word_freqs = VectorSpaceModel.get_initial_word_freqs(filenames, articles_processed);
        /*
        TObjectIntIterator<String> iter = word_freqs.iterator();
        String[] words = word_freqs.keySet().toArray(new String[word_freqs.size()]);
        for (int i = 0; i < words.length; i++){
            System.out.println(words[i] + " " + word_freqs.get(words[i]));
        }
        */
        String article_filename = "/Users/--/Documents/reuters/reuters/articles/us-global-technology-bitcoin-idUSKCN0UT2II";
        String article_text = VectorSpaceModel.read_file(article_filename);
        ArrayList<ArrayList<Pair<String, String>>> article = tokenize(article_text);
        ArrayList<String> test_var = new ArrayList<String>();
        // The compilation error is reported on this call.
        TObjectIntHashMap<String> article_freqs = VectorSpaceModel.word_frequencies(article);
    }
}
It should be perfectly fine to declare a parameter with an interface type and pass an argument whose type is an implementing class, so why does this not work with nested generic types in this case?
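To narrow this down, here is a minimal sketch that should show the same behaviour without Trove or CoreNLP. The class NestedGenericsRepro and the methods countFlat and countNested are hypothetical names made up for illustration, and String stands in for Pair<String, String>. The single-level call compiles, while the nested call (left commented out so the rest compiles) is rejected in the same way as the call to word_frequencies:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reproduction; none of these names come from the original project.
public class NestedGenericsRepro {

    // Single-level generic parameter declared with the interface type.
    static int countFlat(List<String> words) {
        return words.size();
    }

    // Nested generic parameter, the same shape as the word_frequencies parameter.
    static int countNested(List<List<String>> article) {
        int total = 0;
        for (List<String> sentence : article) {
            total += sentence.size();
        }
        return total;
    }

    public static void main(String[] args) {
        ArrayList<String> flat = new ArrayList<String>();
        flat.add("a");
        flat.add("b");

        // Compiles: an ArrayList<String> is accepted where a List<String> is expected.
        System.out.println(countFlat(flat));

        ArrayList<ArrayList<String>> nested = new ArrayList<ArrayList<String>>();
        nested.add(flat);

        // Does not compile if uncommented: the compiler reports incompatible types.
        // System.out.println(countNested(nested));

        // Compiles: the variable is declared with exactly the parameter's type.
        List<List<String>> declaredAsInterface = new ArrayList<List<String>>();
        declaredAsInterface.add(flat);
        System.out.println(countNested(declaredAsInterface));
    }
}
```

So the difference seems to be in the type arguments rather than in the outer container: only the nested ArrayList<ArrayList<String>> argument is refused.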