I would like to know how to execute this Java process through the Windows command line from inside Python 2.7 on Windows 8.

I thought I had already solved this problem, but I recently changed computers from Windows 7 to Windows 8 and my code stopped working. I have confirmed that the command used in the script below executes properly when run directly from cmd.exe.

import os
import subprocess

def FileProcess(inFile):
    #Create the startup info so the java program runs in the background (for windows computers)
    startupinfo = None
    if os.name == 'nt':
        startupinfo = subprocess.STARTUPINFO()
        startupinfo.dwFlags |= subprocess.STARTF_USESHOWWINDOW
    #Execute Stanford Core NLP from the command line
    print inFile
    cmd = ['java', '-Xmx1g','-cp', 'stanford-corenlp-1.3.5.jar;stanford-corenlp-1.3.5-models.jar;xom.jar;joda-time.jar', 'edu.stanford.nlp.pipeline.StanfordCoreNLP', '-annotators', 'tokenize,ssplit,pos,parse', '-file', inFile]
    output = subprocess.call(cmd, startupinfo=startupinfo)
    print inFile[(str(inFile).rfind('\\'))+1:] + '.xml'
    outFile = file(inFile[(str(inFile).rfind('\\'))+1:] + '.xml')

FileProcess("C:\\NSF_Stuff\\ErrorPropagationPaper\\RandomTuftsPlain\\PreprocessedTufts8199PLAIN.txt")

When this code is executed, I receive the error message that the output file does not exist. The java process I am executing should output an xml file when it is done.

I believe that, for some reason, subprocess.call never successfully executes the command. I have tried subprocess.Popen for the same task and I get the same result.
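
One way to test that belief is to capture the exit code and the combined output rather than discarding them. The following is only a minimal sketch: `run_and_capture` is a hypothetical helper name, not part of the original script, and it runs the command without a shell so that a failure to launch the program shows up as an exception instead of a shell message.

```python
import subprocess
import sys

def run_and_capture(cmd):
    """Run cmd (a list of arguments) without a shell.

    Returns (exit_code, output), where output merges stdout and stderr.
    exit_code is None if the executable could not be started at all.
    """
    try:
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT)
    except OSError as exc:
        # Raised when the program cannot be launched, e.g. not found on PATH.
        return None, str(exc)
    output, _ = proc.communicate()
    return proc.returncode, output
```

An exit code of None would mean the program never started, while a nonzero exit code would mean it started and then failed, in which case the captured output should say why.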

EDIT: I have changed my code so that I can capture error messages and I think I am beginning to understand the problem.

I changed my code to

import os
import subprocess

def FileProcess(inFile):
    #Create the startup info so the java program runs in the background (for windows computers)
    startupinfo = None
    if os.name == 'nt':
        startupinfo = subprocess.STARTUPINFO()
        startupinfo.dwFlags |= subprocess.STARTF_USESHOWWINDOW
    #Execute Stanford Core NLP from the command line
    print inFile
    cmd = ['java', '-Xmx1g','-cp', 'stanford-corenlp-1.3.5.jar;stanford-corenlp-1.3.5-models.jar;xom.jar;joda-time.jar', 'edu.stanford.nlp.pipeline.StanfordCoreNLP', '-annotators', 'tokenize,ssplit,pos,parse', '-file', inFile]
    proc = subprocess.Popen(cmd, stderr=subprocess.STDOUT, stdout=subprocess.PIPE, shell=True)
    print proc
    stdoutdata, stderrdata = proc.communicate()
    print stdoutdata
    print stderrdata
    outFile = file(inFile[(str(inFile).rfind('\\'))+1:] + '.xml')

FileProcess("C:\\NSF_Stuff\\ErrorPropagationPaper\\RandomTuftsPlain\\PreprocessedTufts8199PLAIN.txt")

stdoutdata contains the message "'java' is not recognized as an internal or external command, operable program or batch file."

This is a bizarre message, because java is definitely a recognized command when I run it from cmd.exe. It seems that executing the command from Python changes something about my environment variables so that java is no longer recognized as a command.

GrantD71
  • could you please show us the inFile value you use? – nio Jul 29 '13 at 17:59
  • It is provided in the code. It is just the absolute path of a .txt file. – GrantD71 Jul 29 '13 at 18:08
  • try to print out your cmd variable like this: `print " ".join(cmd)` and then try to run it from cmd.exe – nio Jul 29 '13 at 18:35
  • That print statement yields: "java -Xmx1g -cp stanford-corenlp-1.3.5.jar;stanford-corenlp-1.3.5-models.jar;xom.jar;joda-time.jar edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,parse -file C:\NSF_Stuff\ErrorPropagationPaper\RandomTuftsPlain\PreprocessedTufts8199PLAIN.txt" which produces correct output from the command line when I run it from the same location the python script is executing from. – GrantD71 Jul 29 '13 at 18:43
  • what IDE do you use? how do you run your script? it may just be a current working directory issue. I've edited the code. – nio Jul 29 '13 at 19:31
  • try to run `print os.environ['PATH']` in your script and check if you have java in it – nio Jul 29 '13 at 19:39
  • Thanks nio! I did this and it led me to the correct answer. Apparently java wasn't in my PATH variable. I didn't even bother checking this originally because I was having no problems executing java commands from the Windows command line. I'm guessing that commands executed directly from cmd.exe use a different path variable than ones executed using the subprocess module. – GrantD71 Jul 29 '13 at 19:52
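
The PATH check suggested in the comments can be automated instead of read by eye. This is a rough sketch under some assumptions: `find_on_path` is a made-up helper name, and it only approximates the lookup Windows performs (it walks the PATH directories and tries the PATHEXT extensions), so treat a miss as a hint rather than proof.

```python
import os

def find_on_path(program, path=None, pathext=None):
    """Return the first match for `program` in the PATH directories, or None."""
    if path is None:
        path = os.environ.get('PATH', '')
    if pathext is None:
        # PATHEXT is a Windows convention; elsewhere only the bare name is tried.
        pathext = os.environ.get('PATHEXT', '') if os.name == 'nt' else ''
    extensions = [''] + [ext for ext in pathext.split(os.pathsep) if ext]
    for directory in path.split(os.pathsep):
        directory = directory.strip('"')
        if not directory:
            continue
        for ext in extensions:
            candidate = os.path.join(directory, program + ext)
            if os.path.isfile(candidate):
                return candidate
    return None

if __name__ == '__main__':
    location = find_on_path('java')
    if location is None:
        print('java was not found in any PATH directory of this Python process')
    else:
        print('java resolves to ' + location)
```

Running this from the same place the script is launched (IDE, scheduled task, console) shows what that particular Python process can see, which may differ from an interactive cmd.exe window.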

2 Answers

I was able to solve my problem by adding the location of java to my PATH variable; apparently it wasn't there. I didn't bother checking this originally because I was having no problems executing java commands from the Windows command line. I'm guessing that commands executed directly from cmd.exe use a different environment variable to find the java executable than commands executed indirectly through the subprocess module.
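
If changing the system-wide PATH is not an option, the same effect can be had for a single script, since a child process inherits the environment of the Python process that starts it. The sketch below assumes a hypothetical helper `prepend_to_path` and a hypothetical Java install directory; neither comes from the original post.

```python
import os

def prepend_to_path(directory, environ=None):
    """Put `directory` at the front of PATH unless it is already listed."""
    if environ is None:
        environ = os.environ
    # Split the current PATH and drop empty entries.
    entries = [e for e in environ.get('PATH', '').split(os.pathsep) if e]
    if directory not in entries:
        entries.insert(0, directory)
    environ['PATH'] = os.pathsep.join(entries)
    return environ['PATH']

# Hypothetical install location; substitute the directory that holds java.exe.
# prepend_to_path(r'C:\Program Files\Java\jre7\bin')
```

Calling it once before subprocess.call or subprocess.Popen changes PATH only for the running Python process and the children it starts, not for the rest of the system.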

GrantD71

When I try your code, it prints the file name PreprocessedTufts8199PLAIN.txt.xml. I'm not sure whether the .txt.xml extension is the desired result. If the output file has only the .xml extension, then you're not stripping away the original .txt extension.

Try to change this line:

outFile = file(inFile[(str(inFile).rfind('\\'))+1:] + '.xml')

Into this code:

# Strip the directory, then swap the .txt extension for .xml
fnameext = os.path.basename(inFile)
fname, fext = os.path.splitext(fnameext)
xmlfname = fname + '.xml'
xmlfpath = os.path.join(".", xmlfname)
print "xmlfname:", xmlfname, " xmlfpath:", xmlfpath
print "current working directory:", os.getcwd()
outFile = open(xmlfpath, "r")

Answer for extension stripping.
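
Since it isn't clear which of the two names the java process writes, a hedged alternative is to look for both. This is only a sketch: `candidate_output_names` and `find_output` are invented names, and it uses `ntpath` so that the Windows-style path is split the same way on any platform.

```python
import ntpath
import os

def candidate_output_names(in_file):
    """Return the two plausible output names: 'name.txt.xml' and 'name.xml'."""
    base = ntpath.basename(in_file)       # strip the Windows directory part
    stem, _ext = ntpath.splitext(base)    # drop the final extension
    return [base + '.xml', stem + '.xml']

def find_output(in_file, directory='.'):
    """Return the first candidate that exists in `directory`, or None."""
    for name in candidate_output_names(in_file):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            return path
    return None
```

If `find_output` returns None right after the java call, the file was not written to that directory under either name, which points back at the command itself rather than at the file name.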

nio
  • Thanks for the response. The problem lies somewhere before this issue because the xml file is never being created at all at this point. – GrantD71 Jul 29 '13 at 18:31