
I have this bash script "run_transcribe_py.sh":

RESULT=$(/usr/bin/docker run --mount type=bind,source="/usr/src/Projects/docker/transcribe/audio",target=/home/audio --mount type=bind,source="/usr/src/Projects/docker/transcribe",target=/home/transcribe -it --rm --name transcribe transcribe python /home/transcribe/transcribe.py $1)

echo $RESULT

echo "ABC"

The Python script run in the docker:

import sys

transcript="abcdef"

print(transcript)

If I run the bash script in BASH:

# /bin/bash /usr/src/Projects/docker/transcribe/run_transcribe_py.sh /home/audio/102303160955qhhs5zq.mp3
abcdef
ABC
#

However, if I run it via a JSch exec channel from Java:

public boolean commitTranscription(ArrayList<transcriptionMetaData> pRecordingsInfo) {
        boolean retVal = false;

        JSch localJsch = null;
        localJsch = new JSch();

        Session localSession = initJSch(localJsch, AppSettings.getPbxServer(), AppSettings.getPbxUser(), AppSettings.getPbxServerPassword(), AppSettings.getPbxPort());

        try {
            for (transcriptionMetaData iterateRecData : pRecordingsInfo) {
                ArrayList<String> transcribeLines = new ArrayList<String>();

                ChannelExec shellChannel = (ChannelExec) localSession.openChannel("exec");

                try ( BufferedReader resultOfTranscription = new BufferedReader(new InputStreamReader(shellChannel.getInputStream()))) {
                    shellChannel.setCommand("/bin/bash /usr/src/Projects/docker/transcribe/run_transcribe_py.sh /home/audio/"
                            + iterateRecData.getCallLogNo() + ".mp3");
                    shellChannel.connect((int) TimeUnit.SECONDS.toMillis(10));

                    String resultLine = null;

                    while ((resultLine = resultOfTranscription.readLine()) != null) {
                        transcribeLines.add(resultLine);
                    }

                    iterateRecData.setTranscript(transcribeLines.toString());

                    if (shellChannel != null) {
                        if (shellChannel.isConnected()) {
                            shellChannel.disconnect();
                        }

                        shellChannel = null;
                    }
                }

                transcribeLines = null;
            }
        } catch (JSchException jex) {
            localLogger.error((String) logEntryRefNumLocal.get()
                    + "JSch exception in commitTranscription() method in ExperimentalRecordingsTranscription. JSch exception: " + jex.toString() + ". Contact software support." + jex.getMessage(), jex);
        } catch (RuntimeException rex) {
            localLogger.error((String) logEntryRefNumLocal.get()
                    + "Runtime exception in commitTranscription() method in ExperimentalRecordingsTranscription. Runtime exception: " + rex.toString() + ". Contact software support." + rex.getMessage(), rex);
        } catch (Exception ex) {
            localLogger.error((String) logEntryRefNumLocal.get()
                    + "Exception in commitTranscription() method in ExperimentalRecordingsTranscription. Exception: " + ex.toString() + ". Contact software support." + ex.getMessage(), ex);
        } finally {
            if (localSession != null) {
                if (localSession.isConnected()) {
                    localSession.disconnect();
                }

                localSession = null;
            }

            localJsch = null;
        }

        return retVal;
    }

Java literally sees

# 

ABC
#

The output from the docker Python script (i.e. "abcdef") is just blank in Java, whereas both lines are present if the script is run from BASH itself.

Why is this

RESULT=$(/usr/bin/docker run --mount type=bind,source="/usr/src/Projects/docker/transcribe/audio",target=/home/audio --mount type=bind,source="/usr/src/Projects/docker/transcribe",target=/home/transcribe -it --rm --name transcribe transcribe python /home/transcribe/transcribe.py $1)

echo $RESULT

invisible to Java only, but shows in BASH on the console, just above "ABC"?

Anybody got any idea?

Thanks!

EDIT: From feedback from Charles Duffy (thanks Charles!) I've changed my BASH script as referenced above to

#!/bin/bash

declare RESULT

RESULT="$(/usr/bin/docker run --mount type=bind,source="/usr/src/Projects/docker/transcribe/audio",target=/home/audio --mount type=bind,source="/usr/src/Projects/docker/transcribe",target=/home/transcribe -it --rm --name transcribe transcribe python /home/transcribe/transcribe.py "$1")"

printf '%s\n' "$RESULT"

echo "ABC"

However, this still produces the exact same blank output in Java when the BASH script is called through the JSch exec channel:


ABC

while running it straight in BASH results in

abcdef
ABC

I literally just want the "abcdef" to be visible to Java in the Java code above. Nothing changes even after cleaning up the variable declaration in BASH and printing it with printf instead of echo, as advised in the link Charles gave.

EDIT: I also tried calling the dockerised Python instance directly from Java, skipping BASH altogether. The behaviour remains exactly the same: Java never sees the output printed to stdout by the Python script running inside docker.

E.g.

shellChannel.setCommand("/bin/bash /usr/src/Projects/docker/transcribe/run_transcribe_py.sh /home/audio/" + iterateRecData.getCallLogNo() + ".mp3");

changed to

shellChannel.setCommand("/usr/bin/docker run --mount type=bind,source=\"/usr/src/Projects/docker/transcribe/audio\",target=/home/audio --mount type=bind,source=\"/usr/src/Projects/docker/transcribe\",target=/home/transcribe -it --rm --name transcribe transcribe python /home/transcribe/transcribe.py " + iterateRecData.getCallLogNo() + ".mp3");

still gives the same blank result. The Java BufferedReader never sees the output printed to stdout by Python running inside docker. If the above command line is run directly from the terminal, the result is as expected: the letters "abcdef" appear in the terminal.

Stefan
    `echo $RESULT` is inherently buggy -- see [I just assigned a variable, but `echo $variable` shows something different](https://stackoverflow.com/questions/29378566/i-just-assigned-a-variable-but-echo-variable-shows-something-else) – Charles Duffy Mar 17 '23 at 15:12
    (Also, if your call log numbers can possibly be anything other than strictly numeric, you've got security problems here -- always escape strings before substituting them into places a shell will parse as code) – Charles Duffy Mar 17 '23 at 15:13
    Anyhow -- if I'm reading this right, it looks like you're getting the channel's stdin, but you want its stdout. (Also, be mindful of stderr, though if the program you're calling is well-written it'll be diagnostics, logs, and other non-output content there). Anyhow -- by my read, `setExtOutputStream` determines where stderr will go, whereas `setOutputStream` determines where stdout will go; you care about both of those. – Charles Duffy Mar 17 '23 at 15:16
    (also, `$1` needs to be `"$1"` if you don't want the shell to split it into words on characters in IFS -- whitespace by default -- and then expand each of those words as a glob; it's deeply undesirable as default behavior, but that's what following 1970s-era standards gets you, and it's arguably better than _not_ following 1970s standards -- at which point you end up with zsh, where anyone who uses it too much writes code that's wildly buggy when targeting any other shell). – Charles Duffy Mar 17 '23 at 15:18
    Anyhow -- to _unambiguously_ print the value of your variable in bash, use `declare -p RESULT`, or `printf 'RESULT=%q\n' "$RESULT"` – Charles Duffy Mar 17 '23 at 15:20
    (also, think about using `set -x` when running your example transcript so we know which output corresponds with which command, which content is/isn't part of the prompt, etc; it's not clear to me right now how to interpret the newline before the `echo` command, for example -- you _could_ have typed it manually as a separator, it _could_ be content returned on stderr that evaded the capture, it _could_ be something else; don't make us guess). – Charles Duffy Mar 17 '23 at 15:24
  • Thx Charles made the changes in the link you gave as regards the variable reference and output of the variable, still no change. Java just "sees" a blank line if the script is run via a BASH instance started from Java, if run straight in BASH on the terminal, it works correctly and prints both the line from Python and the one directly in bash. Java still just sees the bottom line, and a blank line above it - even with the blank line being the result of a printf call. – Stefan Mar 20 '23 at 09:28
  • I've also just tried to write the output I need to a file in BASH, by using > filename.txt but the behaviour is the same - as long as the script is run in BASH, the filename.txt gets created on the filing system, if I run the script via a BASH instance started from Java, filename.txt does not get created. Somehow BASH behaves radically differently if run from Java as when run "direct" from the terminal, on the same BASH script using proper variable definition and printf to print that variable out. – Stefan Mar 20 '23 at 09:30
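The quoting point raised in the comments (`$1` versus `"$1"`) can be illustrated with a small sketch. The helper `count_args` and the file name containing a space are hypothetical and only serve the demonstration; they are not part of the original script.

```shell
#!/bin/bash
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

# Illustrative path containing a space (not a real file from the post).
file="/home/audio/my recording.mp3"

unquoted=$(count_args $file)    # unquoted: the shell splits the value on the space
quoted=$(count_args "$file")    # quoted: the value is passed as a single argument

printf 'unquoted expansion passes %s argument(s)\n' "$unquoted"
printf 'quoted expansion passes %s argument(s)\n' "$quoted"
```

With the default IFS, the unquoted form hands the command two arguments and the quoted form hands it one, which is why `"$1"` is the safer choice when forwarding a file name to docker.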

1 Answer


Ok this has been solved.

More research revealed the following.

The bash script referenced above should call docker like this:

RESULT="$(/usr/bin/docker run --mount type=bind,source="/usr/src/Projects/docker/transcribe/audio",target=/home/audio --mount type=bind,source="/usr/src/Projects/docker/transcribe",target=/home/transcribe -i --rm --name transcribe transcribe python /home/transcribe/transcribe.py "$1")"

i.e. the parameter

-it

to docker must ONLY be

-i

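Otherwise docker tries to allocate a pseudo-TTY, which does not work because the JSch exec channel runs the command non-interactively, with no terminal attached (as I understand it). In that situation docker typically refuses to start the container and reports "the input device is not a TTY" on STDERR, so nothing reaches STDOUT and RESULT stays empty.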
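A minimal sketch of how a script could choose between `-it` and `-i` on its own, assuming the only thing that matters is whether stdin is a terminal. The helper name `docker_tty_flags` is hypothetical; `-i` and `-t` are the docker flags already used above, and the docker call itself is left as a comment so the sketch runs without docker.

```shell
#!/bin/bash
# docker_tty_flags is a hypothetical helper, not part of docker.
# It prints "-it" when stdin is a terminal and "-i" otherwise.
# Only stdin is checked, because inside $(...) stdout is always a pipe.
docker_tty_flags() {
    if [ -t 0 ]; then
        echo "-it"   # interactive terminal: a pseudo-TTY can be allocated
    else
        echo "-i"    # SSH exec channel, cron, pipe: no terminal available
    fi
}

echo "flags for this invocation: $(docker_tty_flags)"

# Possible use in the script from the post (shown as a comment only):
#   RESULT="$(/usr/bin/docker run $(docker_tty_flags) --rm --name transcribe \
#       transcribe python /home/transcribe/transcribe.py "$1")"
```

Since the output is captured into a variable anyway, always passing plain `-i` as in the fix above is the simpler option; the check only helps if the same script should also stay usable interactively with a TTY.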

I also had to add this to the top of the BASH script calling docker:

exec 2>&1

to redirect STDERR into STDOUT, so that anything docker writes to STDERR also arrives on the stream Java is reading.

After making both changes, the dockerised Python script's output appeared in the Java program, in the transcribeLines ArrayList, as I needed.
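To show what the redirection changes, here is a small sketch in which a hypothetical function `emit_both` stands in for the docker command and writes one line to each stream. Command substitution captures only STDOUT unless STDERR is redirected into it.

```shell
#!/bin/bash
# emit_both is a hypothetical stand-in for the docker command:
# it writes one line to stdout and one line to stderr.
emit_both() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Without 2>&1, $(...) captures stdout only. stderr is sent to /dev/null
# here purely to keep the demonstration quiet.
only_stdout="$(emit_both 2>/dev/null)"

# With 2>&1, stderr is merged into stdout and captured as well.
merged="$(emit_both 2>&1)"

printf 'captured without 2>&1: %s\n' "$only_stdout"
printf 'captured with 2>&1:\n%s\n' "$merged"
```

The same applies one level up: the Java code reads only the channel's standard output, so an error message written to STDERR is lost there unless it is merged as above or read separately on the Java side.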

The complete, correct, and working BASH script to do what I need therefore is:

#!/bin/bash

exec 2>&1

declare RESULT

RESULT="$(/usr/bin/docker run --mount type=bind,source="/usr/src/Projects/docker/transcribe/audio",target=/home/audio --mount type=bind,source="/usr/src/Projects/docker/transcribe",target=/home/transcribe -i --rm --name transcribe transcribe python /home/transcribe/transcribe.py "$1")"

printf '%s\n' "$RESULT"

I then also had to do

chmod 666 /var/run/docker.sock

or I would get permission errors. (Yes, that is poor security, but this is a closed development server with no user besides me. I will sort out the group memberships and tighten this before publishing to our company-wide server.)
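As a starting point for the group-membership cleanup mentioned above, a sketch that checks whether the current user belongs to a group named `docker`. The helper `user_in_group` is hypothetical, and the assumption that the socket is owned by a `docker` group reflects a common default rather than anything confirmed for this server.

```shell
#!/bin/bash
# user_in_group is a hypothetical helper: it succeeds when user $1 is a
# member of group $2, based on the group names reported by `id -nG`.
user_in_group() {
    id -nG "$1" | tr ' ' '\n' | grep -qx -- "$2"
}

current_user="$(id -un)"

if user_in_group "$current_user" docker; then
    echo "$current_user is in the docker group"
else
    echo "$current_user is not in the docker group"
    # Assumption: on a default install the socket belongs to group "docker".
    # As root, the user could be added with something like:
    #   usermod -aG docker <ssh-user>
    # followed by a fresh login, after which the socket permissions
    # would no longer need to be world-writable.
fi
```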

The Java and Python code is unchanged from my original post.

Everything now works and the output from Python running inside Docker is accessible in Java when calling the above BASH script with valid parameters (the name of an .MP3 file in the correct docker-mounted folder, specified as the docker-mounted directory alias.)

In short, I needed to run docker without the "t" flag, redirect STDERR into STDOUT in the BASH script, and make /var/run/docker.sock world read-writable.

Thanks to Charles Duffy for his comments.

Regards

Stefan

    The `2>&1` lets you get away without `setExtOutputStream`, but it would be better if you _did_ read stderr from java instead of adding code to work around the fact that you don't. – Charles Duffy Mar 24 '23 at 17:37