9

WARNING -- THE CAUSE IS NOT A MISSING FILE -- ALL THREADS ARE CALLING THE SAME SCRIPT FILE

I am starting 5-6 threads that call a local script in a Red Hat box.

I've noticed that sometimes, I get the following error message

couldn't read file "/home/leo/myScript.exp": no such file or directory

Obviously, all processes are executing the script, so it seems to be something related to [1] OS has some restriction on simultaneous processes that can run a script or access a file for reading or [2] Java is trying to perform some operation in some stream that is not ready (I was assuming that commons-exec would take care of this for me)

Here's the code

ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
CommandLine commandline = CommandLine.parse("/home/leo/myScript.exp");
DefaultExecutor exec = new DefaultExecutor();
PumpStreamHandler streamHandler = new PumpStreamHandler(outputStream);
exec.setStreamHandler(streamHandler);
try {
   exec.execute(commandline); <<< error happens here
}catch(IOException io) {
   throw new Exception("");
}

If the error is [1], then I'd like to know how to relax this restriction in the linux OS

If the error is [2], then I'd like to know how do I tell commons-exec to wait for the resource to be ready (in the worst case I'll just add some retries, but I think this is not very elegant)

If the error is something else, at least knowing the cause will be good enough for me to find some solution.

UPDATE - March 15th

Well, here's the thing.

The script is an expect script that uses a library to call a Java class.

One thing I've noticed is that the script goes well until it calls a java method that creates a database connection.

Because the number of threads is low (3~5) I don't think it's a problem in the database. Instead, it seems to me that something is blocking the script to be called while the java code is being called and/or while the java code is creating the database connection.

I am still trying to get the exact exception, but the expect script looks like this (kind of)

#!/opt/tclblend/bin/expect -f
set edfDir "/usr/local/nssa/bin/edf";
set env(LD_LIBRARY_PATH) "/opt/tclblend/lib/tcljava1.4.1"; # for tclBlend

## always use absolute paths
set env(TCL_CLASSPATH) "/home/leoks/EclipseIndigo/workspace2/xyzJavaWrapper/bin";
set env(CLASSPATH) {/home/leoks/EclipseIndigo/workspace2/xyzTomEE/lib/commons-logging-1.1.1.jar:/home/leoks/EclipseIndigo/workspace2/xyzConfiguration/lib/commons-configuration-1.9.jar:/home/leoks/EclipseIndigo/workspace2/xyzConfiguration/lib/commons-lang-2.4.jar:/home/leoks/EclipseIndigo/workspace2/xyz/3rdPartyJDBCJars/ojdbc6.jar:/home/leoks/EclipseIndigo/workspace2/xyzJavaWrapper/lib/commons-dbutils-1.5.jar:/home/leoks/EclipseIndigo/workspace2/xyzJavaWrapper/tcl/tcllib/xyzConfiguration.jar};

source $edfDir/lib/statics.tcl;
source $edfDir/lib/acclib.tcl; 

package require java

java::import com.abc.xyz.legacydriver.TCLDriverWrapper
java::import com.abc.xyz.legacydriver.LegacyDriverTaskInputData
java::import com.abc.xyz.legacydriver.LegacyDriverTaskEnum

set ticket  [ lindex $argv 0 ];
set inputData [ java::call com.abc.xyz.legacydriver.TCLDriverWrapper pullNextInputData $ticket ]

where pullNextInputData looks like

public static LegacyDriverTaskInputData pullNextInputData(String token) throws Exception {

    try {
        return pullNextInputDataImpl(token);
    } catch (Exception e) {
        e.printStackTrace();
        throw e;
    }
}

private static LegacyDriverTaskInputData pullNextInputDataImpl(String token) throws Exception {
    Connection conn = null;
    try{
        conn = new TCLDriverWrapper().getConnection();
        QueryRunner run = new QueryRunner();
        ResultSetHandler<LegacyDriverTaskInputData> rsh = new BeanHandler<LegacyDriverTaskInputData>(LegacyDriverTaskInputData.class);
        LegacyDriverTaskInputData inputData = run.query(conn,"select * from LegacyDriverTask where id = ?",rsh,Long.valueOf(token));
        return inputData;
    } catch (ClassNotFoundException e) {
        e.printStackTrace();
        throw e;
    } catch (SQLException e) {
        e.printStackTrace();
        throw e;
    } finally {
        DbUtils.close(conn);
    }
}

and getConnection() is just a regular driver instantiation code like (it uses apache dbutils)

private Connection getConnectionImpl() throws Exception{    
    Class.forName("driver name");
    Properties props = new Properties();
    props.put("user", UtilConf.getProperty("javawrapper.user"));
    props.put("password", UtilConf.getProperty("javawrapper.password"));
    return DriverManager.getConnection(UtilConf.getProperty("javawrapper.jdbc"), props);
}

As soon as I get a stacktrace, I'll put it here

UPDATE - March 16th

The stacktrace does not say much :-(

2016-03-17 01:49:10,034 INFO   [QProcessor] Threads started (ok=0 nok=0 wait=0) org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit value: 1)
at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:404)
at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:166)
at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:153)
at com.ericsson.xyz.tomee.q.QWorker.onMessageImpl(QWorker.java:776)
at com.ericsson.xyz.tomee.q.QWorker.onMessage(QWorker.java:303)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.openejb.core.interceptor.ReflectionInvocationContext$Invocation.invoke(ReflectionInvocationContext.java:182)
at org.apache.openejb.core.interceptor.ReflectionInvocationContext.proceed(ReflectionInvocationContext.java:164)
at org.apache.openejb.monitoring.StatsInterceptor.record(StatsInterceptor.java:180)
at org.apache.openejb.monitoring.StatsInterceptor.invoke(StatsInterceptor.java:99)
at sun.reflect.GeneratedMethodAccessor106.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.openejb.core.interceptor.ReflectionInvocationContext$Invocation.invoke(ReflectionInvocationContext.java:182)
at org.apache.openejb.core.interceptor.ReflectionInvocationContext.proceed(ReflectionInvocationContext.java:164)
at org.apache.openejb.core.interceptor.InterceptorStack.invoke(InterceptorStack.java:80)
at org.apache.openejb.core.stateless.StatelessContainer._invoke(StatelessContainer.java:212)
at org.apache.openejb.core.stateless.StatelessContainer.invoke(StatelessContainer.java:181)
at org.apache.openejb.core.ivm.EjbObjectProxyHandler.synchronizedBusinessMethod(EjbObjectProxyHandler.java:268)
at org.apache.openejb.core.ivm.EjbObjectProxyHandler$1.call(EjbObjectProxyHandler.java:253)
at org.apache.openejb.async.AsynchronousPool$AsynchronousCall.call(AsynchronousPool.java:110)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Leo
  • 751
  • 4
  • 29
  • @Kenster the problem is that my code runs in 10 different threads, and it works in 9 of them, but fails in 1. Obviously is not that the file does not exist and I've made it clear in the question – Leo Mar 09 '16 at 17:12
  • BTW, why the downvote, someone? – Leo Mar 09 '16 at 17:12
  • Is `cmd` a constant? Show us the contents of `cmd`. And it would be helpful to see an actual error with its stacktrace. – Roman Mar 09 '16 at 17:41
  • yes. cmd is a constant (question edited) – Leo Mar 09 '16 at 19:04
  • 1
    Does the `IOException` have any stack trace that goes into the Apache code? Maybe determining the location where the error happens and then reading the source could clarify something. – 9000 Mar 11 '16 at 16:58
  • 1
    @Leo - just a thought but does `/home/leo/myScript.exp` do anything that could affect how it itself is run? – Tom Mac Mar 11 '16 at 17:08
  • I bet this is your OS failing to run the script. See [here](http://askubuntu.com/questions/133389/no-such-file-or-directory-but-the-file-exists) for an explanation. In this case, maybe you're hitting the user limit for open files and so not able to start the process up? – Peter Brittain Mar 12 '16 at 11:38
  • Is `myScript.exp` an shell script or an [expect(1)](http://linux.die.net/man/1/expect) script? Does the same error occur when you replace it with some dummy script (e.g. `#!/bin/sh` `\n` `\n` `echo AAA` `\n` `sleep 10` `\n` `echo BBB` `\n` `sleep 1`)? – vlp Mar 13 '16 at 11:06
  • it's an expect script. Why it should be different if it was shell script? – Leo Mar 13 '16 at 15:50
  • @Leo you still haven't shown the stacktrace. Are you going to? – Roman Mar 13 '16 at 19:34
  • (Please consider providing `myScript.exp` script as well). Do you have the correct shebang (like `#!/usr/bin/expect -f`). Did you try to call it with `CommandLine.parse("/usr/bin/expect -f /home/leo/myScript.exp");`? It is strange that it fails only from time to time -- are you sure the script works well for all possible cases (it is easy to overlook and use some shell idiom in an expect script, which won't work). Did you try to replace the `myScript.exp` script with some 100% working dummy (this would eliminate case [1]) – vlp Mar 13 '16 at 21:13
  • 5
    "no such file or directory" is a standard unix operating system error message. If you were to run `strace -f` on the JVM where your code is running, you might be able to identify an actual system call that's failing when your program emits this error. The strace output will include the arguments to the system call, including the filename being accessed. – Kenster Mar 14 '16 at 20:01
  • thanks @Kenster I'll try that. If I find some clue, I'll post here as part of the question. thx – Leo Mar 15 '16 at 01:20
  • I second @Kenster 's `strace` request. – Martin Cowie Mar 16 '16 at 14:35
  • Interesting problem! Could you also provide the exception stacktrace and also exact java version? – gustf Mar 16 '16 at 21:03
  • Checked the source and compared with the stack and it seams like you should be getting an `IOException` if the script could not be found, not an `ExecuteException`. Can you confirm this if you try with a dummy-string in the `CommandLine.parse...`? – gustf Mar 17 '16 at 12:16

1 Answers1

1

I'd like to thank all the people who helped in this case.

I'd delete this question but since I've put a bounty on it, I can't. (maybe when the bounty expires, if SO allows me to, I'll do it then)

I've finally found a way to not have the problem, and it seems to be related to the way the application was allocating resources for the threads, so given the information provided in the question, I think it was not possible to figure out the root cause.

Anyway, I'll try to describe here what happened.

enter image description here (click to enlarge)

The problem was I've noticed was in (5) (6) (7). But the cause was in (3).

TomEE configuration for the thread pool that serves the @Asynchronous methods had fewer threads (my app is not using the default values) than necessary, so it seems some resource was being starved.

Also the error message seems to have some bug that I haven't identified yet, so even the error message can be invalid in this case.

Because the question had not all the variables necessary for its resolution, I'm providing a whole explanation here about the problem.

Since this is a client code, I cannot make the code available, or provide all the information. Even package names had to be obfuscated.

Again, I'd like to thank anyone who spent time trying to help here and I'd like to apologize for not providing all the necessary data.

Leo
  • 751
  • 4
  • 29