Hi, I have already referred to this, this, this and this, but I am still finding it difficult to build a custom name finder model. Here is the code:

public class CustomClassifierTrainer {

    private static final TokenNameFinderFactory TokenNameFinderFactory = null;
    static String onlpModelPath = "/Users/user/eclipse-workspace/openNLP/OpenNLP_models/en-ner-asiannames.bin";
    // training data set
    static String trainingDataFilePath = "/Users/user/eclipse-workspace/openNLP/trainingData/asiannames.txt";

    public static void main(String[] args) throws IOException {

        Charset charset = Charset.forName("UTF-8");

        ObjectStream<String> lineStream =
                new PlainTextByLineStream(new FileInputStream(trainingDataFilePath), charset);

        ObjectStream<NameSample> sampleStream = new NameSampleDataStream(lineStream);

        TokenNameFinderModel model;

        try {
          model = NameFinderME.train("en", "asian.person", sampleStream, TrainingParameters.defaultParams(),
                  TokenNameFinderFactory nameFinderFactory);
        }
        finally {
          sampleStream.close();
        }

        BufferedOutputStream modelOut = null;
        try {
          modelOut = new BufferedOutputStream(new FileOutputStream(onlpModelPath));
          model.serialize(modelOut);
        } finally {
          if (modelOut != null) 
             modelOut.close();      
        }



    }

}

I keep getting an error on this line:

ObjectStream<String> lineStream = new PlainTextByLineStream(new FileInputStream(trainingDataFilePath), charset);

asking me to cast argument 1. When I change it to

ObjectStream<String> lineStream = new PlainTextByLineStream((InputStreamFactory) new FileInputStream(trainingDataFilePath), charset);

then I get a runtime error saying the cast is not possible. Here is the error when I cast it:

Exception in thread "main" java.lang.ClassCastException: class java.io.FileInputStream cannot be cast to class opennlp.tools.util.InputStreamFactory (java.io.FileInputStream is in module java.base of loader 'bootstrap'; opennlp.tools.util.InputStreamFactory is in unnamed module of loader 'app')
    at openNLP.CustomClassifierTrainer.main(CustomClassifierTrainer.java:35)

The second issue is in this block:

try {
  model = NameFinderME.train("en", "asian.person", sampleStream, TrainingParameters.defaultParams(),
              TokenNameFinderFactory nameFinderFactory);
}

which gives a syntax error. I am not sure what's wrong here. Any help would be appreciated, as I have tried all the code snippets in the links mentioned above.

Regards,

Shery

1 Answer

First error: your method expects an InputStreamFactory. You're trying to pass an InputStream. An InputStream is not an InputStreamFactory. Just like a Pizza is not a Car.

If someone (the compiler) asks you for a Car, and you give him a Pizza, he won't be able to drive. Pretending that a Pizza is a Car by telling him "trust me, this pizza is a car" (which is what casting does) won't solve the problem.
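
To make the point about casting concrete, here is a minimal sketch using only JDK types. The class `CastDemo` and the method `castFails` are made-up names for illustration, and `Callable` simply plays the role of "an interface the object does not implement". The cast compiles, because the compiler cannot rule it out, but it fails at runtime with the same kind of exception as in the question.

```java
import java.io.ByteArrayInputStream;
import java.util.concurrent.Callable;

class CastDemo {

    // Returns true when casting the given object to Callable fails at runtime.
    static boolean castFails(Object value) {
        try {
            // Compiles fine: a cast only tells the compiler "trust me".
            Callable<?> callable = (Callable<?>) value;
            return false;
        } catch (ClassCastException e) {
            // The object never was a Callable, so the JVM rejects the cast.
            return true;
        }
    }

    public static void main(String[] args) {
        // An input stream is not a Callable, just as it is not a factory.
        System.out.println(castFails(new ByteArrayInputStream(new byte[0])));
    }
}
```

A cast never converts an object into something else; it only checks that the object already is of the requested type.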

So you need to actually pass an InputStreamFactory. Look at the javadoc of this interface, and you'll see that it has a single method createInputStream() which takes nothing as argument, and is supposed to create and return an InputStream.

A valid value would thus be

() -> new FileInputStream(trainingDataFilePath)

i.e. a lambda which takes no input and creates a new input stream, and can thus be inferred to be an InputStreamFactory.
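
Here is a rough, JDK-only sketch of why such a lambda is accepted. The interface `StreamFactory` is a hypothetical stand-in with the shape described above (one no-argument method that returns an InputStream); it is not OpenNLP's own type, and `LambdaFactoryDemo` and `readAll` are made-up names. An in-memory stream is used instead of a file so the example runs anywhere, and `readAllBytes` assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in: a single method, no arguments, returns an InputStream.
interface StreamFactory {
    InputStream createInputStream() throws IOException;
}

class LambdaFactoryDemo {

    // Takes a factory rather than a stream and opens a fresh stream on demand.
    static String readAll(StreamFactory factory) throws IOException {
        try (InputStream in = factory.createInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);

        // The lambda takes nothing and returns a new stream, so the compiler
        // can treat it as a StreamFactory.
        StreamFactory factory = () -> new ByteArrayInputStream(data);

        // Each call asks the factory for a new stream, so reading twice works.
        System.out.println(readAll(factory));
        System.out.println(readAll(factory));
    }
}
```

This also hints at why a library might ask for a factory rather than a stream: a factory lets the consumer reopen the data whenever it needs another pass.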

The second error is even simpler: you're not supposed to specify the types of the arguments when calling a method. Only when defining a method. So

NameFinderME.train("en", 
                   "asian.person", 
                   sampleStream, 
                   TrainingParameters.defaultParams(),
                   TokenNameFinderFactory nameFinderFactory);

should be

NameFinderME.train("en", 
                   "asian.person", 
                   sampleStream, 
                   TrainingParameters.defaultParams(),
                   nameFinderFactory);
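
The same rule in a tiny standalone sketch, with made-up names (`CallSyntaxDemo`, `add`, `bonus`): types appear in the method definition, never in the call, and any variable you pass must have been declared and initialised beforehand.

```java
class CallSyntaxDemo {

    // Definition: every parameter is declared with its type.
    static int add(int left, int right) {
        return left + right;
    }

    public static void main(String[] args) {
        // The variable has to exist before it can be passed as an argument.
        int bonus = 5;

        // Call: pass values or variable names only, no types.
        int total = add(10, bonus);
        System.out.println(total);

        // Would not compile, because a type is given at the call site:
        // add(10, int bonus);
    }
}
```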

Practice with simpler stuff to learn the Java syntax. Learn to read error messages instead of ignoring them, and to read the javadoc of the classes you're using. This is critical.

JB Nizet
  • Thank you so much for offering to help. I am not a java developer. I want to train this model here and port it to R language. I have fixed the first error as you suggested. The second error is still not resolved and giving an error: `nameFinderFactory cannot be resolved to a variable` – Shery Oct 28 '18 at 10:37
  • The error is self-explanatory. You're trying to pass a variable as argument, but this variable doesn't exist. Sorry, but this is Java code, so you're effectively a Java developer, and should thus learn the basics of the language. If you're not willing to do that, you should assign that task to a Java developer, not hope for us to do it for you. – JB Nizet Oct 28 '18 at 10:41
  • I really appreciate your help. I meant to say that I declared the variable before the `try{}` but obv it throws null pointer exception. Looked at javadoc as you suggested so not sure what to initialise this variable with. Here is the part where the trouble is `TokenNameFinderFactory nameFinderFactory = null; try { model = NameFinderME.train("en", "asian.person", sampleStream, TrainingParameters.defaultParams(), nameFinderFactory); }` – Shery Oct 28 '18 at 10:47
  • This is not a Java problem. It's an OpenNLP problem. I know nothing about this library. I could of course read its documentation to find out what it is about, what a TokenNameFinderFactory is, what the possible implementations are, etc. But you're really the one who should do that. – JB Nizet Oct 28 '18 at 11:14
  • I fixed the issue by reading the docs :) just initialised with `TokenNameFinderFactory nameFinderFactory = new TokenNameFinderFactory();` before the `try{}` and all done ... thanks again – Shery Oct 28 '18 at 11:24