Starting with version 2.0.5, you can easily use multiple threads with the option -nthreads k. For example, your command can look like this:
java -mx6g edu.stanford.nlp.parser.lexparser.LexicalizedParser -nthreads 4 edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz file.txt > file.stp
(Releases of version 2 prior to 2013 supported multithreading only through the API; there was no way to enable it from the command line.)
Internally, you can run as many parsing threads simultaneously inside one JVM process as you want. You can do this either explicitly, by getting and using multiple LexicalizedParserQuery objects (via the parserQuery() method), or implicitly, by calling apply(...) or parseTree(...) on a single LexicalizedParser. The -nthreads k option does this for you by sending successive sentences to different parsers using the Executor framework. You can also create multiple LexicalizedParser instances at the same time, e.g., for parsing different languages.
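The pattern described above, handing successive sentences to a fixed pool of worker threads and collecting the results in input order, can be sketched roughly as follows. This is only an illustration of the general Executor idiom, not the parser's actual implementation: the class ParallelParseSketch, the method parseAll, and the parseOne function are hypothetical names, and parseOne is a stand-in for real parsing work. In real use, each task would presumably obtain its own query object from the shared LexicalizedParser via parserQuery().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch: not part of the Stanford Parser API.
class ParallelParseSketch {

    // Submits one task per sentence to a fixed-size thread pool and
    // returns the results in the same order as the input sentences.
    // parseOne is an assumed stand-in for the per-sentence parsing step.
    static <T> List<T> parseAll(List<String> sentences, int nThreads,
                                Function<String, T> parseOne)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            List<Future<T>> futures = new ArrayList<>();
            for (String sentence : sentences) {
                Callable<T> task = () -> parseOne.apply(sentence);
                futures.add(pool.submit(task));
            }
            // Waiting on the futures in submission order keeps the
            // output aligned with the input, whichever thread finishes first.
            List<T> results = new ArrayList<>();
            for (Future<T> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> sentences = Arrays.asList("The dog barks .", "Cats sleep .");
        // Placeholder "parser" that just wraps the sentence in brackets.
        List<String> trees = parseAll(sentences, 4, s -> "(ROOT " + s + ")");
        for (String tree : trees) {
            System.out.println(tree);
        }
    }
}
```

Because the futures are read back in submission order, the output lines correspond to the input sentences one for one, which is what you would want when redirecting results to a file.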
Multiple LexicalizedParserQuery objects share the same grammar (LexicalizedParser), but the memory savings aren't huge, as most of the memory goes to the transient structures used in chart parsing. So, if you are running lots of parsing threads concurrently, you will need to give a lot of memory to the JVM, as in the example above.
p.s. Sorry, yes, some of the documentation still needs updating. But -tLPP is one flag for specifying language-specific resources. The Stanford Parser has no -t flag.