
I want the functionality of Stanford Core NLP, which is written in Java, to be available in C++. To do this I am using the Java Native Interface (JNI). I have a Java object that wraps several functions so that they are easier to call from C++. However, when I call those functions, the C++ code doesn't wait for each one to complete before moving on to the next.

The Java object has a main method that calls all the relevant functions for testing. When I run just the Java, it works perfectly: the annotation waits for the setup to complete (which does take a while), and the function that gets the dependencies waits for the annotation function to complete. That is the expected and correct behavior. The problem comes when I call the Java functions from C++. Part of each Java function runs, but it quits and returns to the C++ at the points marked in the code below. I would like the C++ to wait for the Java methods to finish.

If it matters, I'm using Stanford Core NLP 3.9.2.

I used the code in StanfordCoreNlpDemo.java that comes with the NLP .jar files as a starting point.

import java.io.*;
import java.util.*;

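// Stanford Core NLP imports (the classes referenced below)
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.semgraph.SemanticGraph;
import edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations;
import edu.stanford.nlp.util.CoreMap;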

public class StanfordCoreNLPInterface {

    Annotation annotation;
    StanfordCoreNLP pipeline;

    public StanfordCoreNLPInterface() {}

    /** setup the NLP pipeline */
    public void setup() {
        // Add in sentiment
        System.out.println("creating properties");
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref, sentiment");
        System.out.println("starting the parser pipeline");
        //<---- doesn't get past this point
        pipeline = new StanfordCoreNLP(props);
        System.out.println("started the parser pipeline");
    }

    /** annotate the text */
    public void annotateText(String text) {
        // Initialize an Annotation with some text to be annotated. The text is the argument to the constructor.
        System.out.println("text");
        System.out.println(text);
        //<---- doesn't get past this point
        annotation = new Annotation(text);
        System.out.println("annotation set");
        // run all the selected annotators on this text
        pipeline.annotate(annotation);
        System.out.println("annotated");
    }

    /** print the dependencies */
    public void dependencies() {
        // An Annotation is a Map with Class keys for the linguistic analysis types.
        // You can get and use the various analyses individually.
        // For instance, this gets the parse tree of the first sentence in the text.
        List<CoreMap> sentences = annotation.get(CoreAnnotations.SentencesAnnotation.class);
        if (sentences != null && ! sentences.isEmpty()) {
            CoreMap sentence = sentences.get(0);
            System.out.println("The first sentence dependencies are:");
            SemanticGraph graph = sentence.get(SemanticGraphCoreAnnotations.EnhancedPlusPlusDependenciesAnnotation.class);
            System.out.println(graph.toString(SemanticGraph.OutputFormat.LIST));
        }
    }

    /** Compile: javac -classpath stanford-corenlp-3.9.2.jar -Xlint:deprecation StanfordCoreNLPInterface.java*/
    /** Usage: java -cp .:"*" StanfordCoreNLPInterface*/
    public static void main(String[] args) throws IOException {
        System.out.println("starting main function");
        StanfordCoreNLPInterface NLPInterface = new StanfordCoreNLPInterface();
        System.out.println("new object");
        NLPInterface.setup();
        System.out.println("setup done");

        NLPInterface.annotateText("Here is some text to annotate");
        NLPInterface.dependencies();
    }
}

I used the code in this tutorial http://tlab.hatenablog.com/entry/2013/01/12/125702 as a starting point.

#include <jni.h>

#include <cassert>
#include <iostream>


/** Build:  g++ -Wall main.cpp -I/usr/lib/jvm/java-8-openjdk/include -I/usr/lib/jvm/java-8-openjdk/include/linux -L${LIBPATH} -ljvm*/
int main(int argc, char** argv) {
    // Establish the JVM variables
    const int kNumOptions = 3;
    JavaVMOption options[kNumOptions] = {
        { const_cast<char*>("-Xmx128m"), NULL },
        { const_cast<char*>("-verbose:gc"), NULL },
        { const_cast<char*>("-Djava.class.path=stanford-corenlp"), NULL },
        { const_cast<char*>("-cp stanford-corenlp/.:stanford-corenlp/*"), NULL }
    };

    // JVM setup before this point.
    // java object is created using env->AllocObject();
    // get the class methods
    jmethodID mid =
        env->GetStaticMethodID(cls, kMethodName, "([Ljava/lang/String;)V");
    jmethodID midSetup =
        env->GetMethodID(cls, kMethodNameSetup, "()V");
    jmethodID midAnnotate =
        env->GetMethodID(cls, kMethodNameAnnotate, "(Ljava/lang/String;)V");
    jmethodID midDependencies =
        env->GetMethodID(cls, kMethodNameDependencies, "()V");
    if (mid == NULL) {
        std::cerr << "FAILED: GetStaticMethodID" << std::endl;
        return -1;
    }
    // the remaining three are instance methods, looked up with GetMethodID
    if (midSetup == NULL) {
        std::cerr << "FAILED: GetMethodID Setup" << std::endl;
        return -1;
    }
    if (midAnnotate == NULL) {
        std::cerr << "FAILED: GetMethodID Annotate" << std::endl;
        return -1;
    }
    if (midDependencies == NULL) {
        std::cerr << "FAILED: GetMethodID Dependencies" << std::endl;
        return -1;
    }
    std::cout << "Got all the methods" << std::endl;

    const jsize kNumArgs = 1;
    jclass string_cls = env->FindClass("java/lang/String");
    jobject initial_element = NULL;
    jobjectArray method_args = env->NewObjectArray(kNumArgs, string_cls, initial_element);

    // prepare the arguments
    jstring method_args_0 = env->NewStringUTF("Get the flask in the room.");
    env->SetObjectArrayElement(method_args, 0, method_args_0);
    std::cout << "Finished preparations" << std::endl;

    // run the function
    //env->CallStaticVoidMethod(cls, mid, method_args);
    //std::cout << "main" << std::endl;
    env->CallVoidMethod(jobj, midSetup);
    std::cout << "setup" << std::endl;
    env->CallVoidMethod(jobj, midAnnotate, method_args_0);
    std::cout << "annotate" << std::endl;
    env->CallVoidMethod(jobj, midDependencies);
    std::cout << "dependencies" << std::endl;
    jvm->DestroyJavaVM();
    std::cout << "destroyed JVM" << std::endl;

    return 0;
}

Compiling the C++ with g++ and -Wall gives no warnings or errors, and neither does compiling the Java with javac. When I run the C++ code I get the following output.

Got all the methods
Finished preparations
creating properties
starting the parser pipeline
setup
text
Get the flask in the room.
annotate
dependencies
destroyed JVM

Following the couts and printlns, starting in the C++, you can see that the C++ successfully gets the methods and finishes the JVM and method preparations before calling the setup method in Java. That setup method starts, calls the first println, creates the properties and assigns the values, then quits before it can start the parser pipeline and goes back to the C++. It's basically the same story from there: the annotate text function is called and successfully receives the text from the C++ method call, but quits before it creates the annotation object. I don't have as many debug printlns in dependencies because at that point it doesn't matter, but none of the existing printlns are called. At the very end the JVM is destroyed and the program ends.

Thank you for any help or insight you can provide.

Medynsky
  • Check this out -> https://stackoverflow.com/questions/992836/how-to-access-the-java-method-in-a-c-application – Sachith Dickwella Jun 24 '19 at 02:49
  • I'm not sure what I am supposed to get from this, could you be a bit more specific? This answer was one of the first things I used when learning how the JNI works. It's probably due to my inexperience, but I can't see how any of the answers help me make C++ wait for the Java methods to finish. Just to say I tried something, I made the setup method in Java return a string and tried getting that string in C++, but as before it didn't finish, so the C++ moved on before the string was returned. This caused me to get "A fatal error has been detected by the Java Runtime Environment:". – Medynsky Jun 24 '19 at 03:54
  • You write that "java object is created using env->AllocObject();". Do you call the constructor? According to the JNI documentation `AllocObject()` "Allocates a new Java object without invoking any of the constructors for the object". – Thomas Kläger Jun 24 '19 at 05:00
  • I had not done that, so I used `env->NewObject()` with the class and the constructor method ID, then used the returned `jobject` in the `CallVoidMethod()` calls for the remaining Java methods. It runs exactly as it did before. One odd thing I noticed is that all of the documentation I could find has env as a member variable, however the compiler wouldn't accept that. – Medynsky Jun 24 '19 at 05:47
  • The Java code probably throws an exception inside the method that returns too early (I'm guessing it's a `NoClassDefFoundError`, because the classpath is not what you think it is). What does `ExceptionOccurred` say when put after the method call? – user2543253 Jun 24 '19 at 15:29
  • Ah ha! That found the problem! I'm getting `java.lang.ClassNotFoundException: edu.stanford.nlp.pipeline.StanfordCoreNLP` on the `pipeline = new` line in the Java. My next question is, how do I get the JVM to use the correct class path? I edited my question to show the JavaVMOption array I'm using at the top of the C++ main method. The .class file and all of the NLP .jar files are in a stanford-corenlp folder. The class path I use when running the pure Java from inside the stanford-corenlp folder is `-cp .:"*"`, which works perfectly. – Medynsky Jun 24 '19 at 21:15
  • Moral: *every* JNI call must be error-checked. – user207421 Jun 24 '19 at 23:48

1 Answer


JNI method calls are always synchronous. When they return before they have reached the end of the method, it's because the code encountered an exception. This doesn't propagate to C++ exceptions automatically. You always have to check for exceptions after every call.
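As a rough illustration of that check, here is a minimal sketch of a helper that could be called after each `CallVoidMethod`. The name `checkJavaException` is hypothetical and not part of JNI. It is written as a template only so that the sketch compiles without the JDK headers; in a real program `Env` would be `JNIEnv` from `<jni.h>`, whose `ExceptionCheck`, `ExceptionDescribe` and `ExceptionClear` members are the calls used here.

```cpp
#include <cassert>
#include <iostream>

// Hypothetical helper: returns true if a Java exception was pending after
// the previous JNI call. Env is a template parameter only so this sketch
// compiles without <jni.h>; with the JDK headers it would be JNIEnv.
template <typename Env>
bool checkJavaException(Env* env, const char* where) {
    assert(env != nullptr);
    if (!env->ExceptionCheck()) {
        return false;  // no exception pending: the Java method ran to its end
    }
    std::cerr << "Java exception after " << where << ":" << std::endl;
    env->ExceptionDescribe();  // prints the exception and its stack trace to stderr
    env->ExceptionClear();     // clears it so that later JNI calls are safe
    return true;
}

// Possible use in the question's main(), assuming env and jobj are set up:
//   env->CallVoidMethod(jobj, midSetup);
//   if (checkJavaException(env, "setup")) return -1;
```

With a check like this after the `setup` call, the `ClassNotFoundException` from the question would be printed instead of being silently skipped.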

A common problem for code that runs fine when called from other Java code but not when called with JNI is the VM's classpath. While java.exe will resolve * and add every matching JAR to the classpath, programs using the invocation interface have to do that themselves. The -Djava.class.path in JavaVMOption takes only actual paths; wildcards are not expanded. Also, you can only use actual VM options and not arguments like -cp, because they too are only resolved by java.exe and are not part of the invocation interface.

user2543253
  • That works, thank you. I included every .jar file in the stanford-corenlp folder and the file-not-found exceptions went away. After that I kept running out of memory before the entire model could load. Even raising `-Xmx128m` to 3 gigs, while it worked, still gave me allocation failures, so I removed that VM option altogether. Now it works 100%. For anyone reading this in the future, if you get segfaults when removing a VM option, make sure the `kNumOptions` integer actually matches the number of options. – Medynsky Jun 25 '19 at 21:19