I have a prototype written in Python that I need to port into Java to put into production. Python 2.7.10 has been installed using miniconda. The prototype uses a 3rd party library nltk that I installed using pip.

To avoid rewriting the code from scratch, at least initially, I want to first try calling the prototype code directly from Java using jython.

When I try executing a command like

java -jar jython-standalone-2.7.0.jar myPrototype.py

I get

Traceback (most recent call last):
    File "myPrototype.py", line 4, in <module>
       from nltk import AlignedSent
ImportError: No module named nltk

It works fine when I run python myPrototype.py.

Is there a way of configuring my jython install so that it can find all 3rd party packages that I've added to my python install? I realize that some of those might not run in jython but at least I want to have access to those that do.

I Z
  • 1. I'm pretty sure it's possible to produce production code in python =) 2. There's no clue at all about what you're doing. Could you please state what this prototype you've written is? What is its purpose? What is in `myPrototype.py`? Also, not all types and class objects work seamlessly across jython. My suggestion is to let python do python stuff and let java do java stuff. Interface them using input/output files (JSON/XML/plaintext) or a command-line interface. – alvas Jan 25 '16 at 18:15
  • @alvas 1. Not sure what you mean by this. We've developed production-quality systems written entirely in python. 2. The prototype does some text processing using the `nltk` library. What it does specifically is not important. The question is about configuring `jython` so that it can run whatever third-party packages were added using `pip` to the `python` install done using `miniconda`. Or at least those packages that are compatible with `jython`. – I Z Jan 25 '16 at 18:42
  • I don't think `jython` will be able to handle all the libraries you have with `conda`. You'll most likely hit a problem once a library relies on `numpy` or `scipy`. If the main language of your project is Java, then I think the best solution is NOT `jython` but something like a REST I/O pipeline, or data interchange formats like JSON/XML. – alvas Jan 25 '16 at 18:52
  • E.g. http://stackoverflow.com/questions/12738827/how-can-i-call-scikit-learn-classifiers-from-java – alvas Jan 25 '16 at 18:54

1 Answer

The bulk of NLTK is Python code, so you should be able to use it from Jython as long as it is on your module search path. If you are on Unix, just add a link in the current folder pointing to the nltk package in your site-packages. Or look into the documentation here: http://www.jython.org/jythonbook/en/1.0/ModulesPackages.html

NLTK also needs to be able to load its data for some functionality. You may want to either make a link from the current folder to the nltk_data directory in your home folder, or see this answer to set it from code: How to config nltk data directory from code?
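As an alternative to linking, the search path can also be extended from code at the top of the script. This is only a minimal sketch: the `~/miniconda2/...` site-packages location and the `~/nltk_data` directory are assumptions about a typical install and will need adjusting, and `add_to_search_path` and `configure_nltk` are hypothetical helper names, not part of Jython or NLTK. Whether the parts of NLTK you need actually run under Jython is a separate matter.

```python
import os
import sys

# Assumed location of the CPython site-packages directory that pip installed
# nltk into. This path is a guess; adjust it to match your miniconda install.
CPYTHON_SITE_PACKAGES = os.path.expanduser(
    "~/miniconda2/lib/python2.7/site-packages")

# Assumed location of the NLTK data directory.
NLTK_DATA_DIR = os.path.expanduser("~/nltk_data")


def add_to_search_path(directory):
    """Append directory to sys.path if it exists and is not already listed.

    Returns True if the directory is on sys.path afterwards, False if the
    directory does not exist.
    """
    if not os.path.isdir(directory):
        return False
    if directory not in sys.path:
        sys.path.append(directory)
    return True


def configure_nltk(site_packages, data_dir):
    """Make nltk importable and point it at its data; True on success."""
    if not add_to_search_path(site_packages):
        return False
    try:
        import nltk
    except ImportError:
        return False
    # nltk.data.path is the list of directories NLTK searches for its data.
    if data_dir not in nltk.data.path:
        nltk.data.path.append(data_dir)
    return True


if __name__ == "__main__":
    if not configure_nltk(CPYTHON_SITE_PACKAGES, NLTK_DATA_DIR):
        sys.stderr.write("nltk could not be set up from %s\n"
                         % CPYTHON_SITE_PACKAGES)
```

Appending the whole site-packages directory exposes every package installed there, including ones with C extensions that will fail to import under Jython, so pointing at a folder that holds only the packages you need may be the safer choice.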

Josep Valls
  • NLTK might break in `jython` when calling functions that use `numpy` though. I learnt that the hard way and spent hours trying to hack around it before giving up and ending up with CLI tricks. – alvas Jan 26 '16 at 09:30
  • Agreed completely with @alvas, but without knowing what the script is actually doing there is no way to tell. The installation instructions of NLTK state that numpy is optional, and I've used big parts of it without numpy. – Josep Valls Jan 26 '16 at 17:41