18

Whether I use this:

process = Runtime.getRuntime().exec("logcat -d -v time");

or that:

process = new ProcessBuilder()
              .command("logcat", "-d", "-v", "time")
              .redirectErrorStream(true)
              .start();

Either way I get the same result: the exec() or start() call often hangs, no matter what I try. The thread running it cannot even be interrupted with Thread.interrupt(). The child process is definitely started, and if I kill it, the call returns.

These calls may hang on the very first attempt, before a Process object is returned, so there is no way to read the child's output. A simple "su -c kill xxx" command line gives the same result.

EDIT: I started debugging java_lang_ProcessManager.cpp in an NDK project with some extra debug logging. Here is what I have found so far. After the fork(), the parent does this:

int result;
int count = read(statusIn, &result, sizeof(int));            // <- hangs here
close(statusIn);

The child process is not supposed to keep the parent blocked on this read, though. This is what the child does (if it is started at all):

    // Make statusOut automatically close if execvp() succeeds.
    fcntl(statusOut, F_SETFD, FD_CLOEXEC);                      // <- so the parent will not block

    // Close remaining unwanted open fds.
    closeNonStandardFds(statusOut, androidSystemPropertiesFd);  // <- hangs here sometimes

    ...

    execvp(commands[0], commands);

    // If we got here, execvp() failed or the working dir was invalid.
    execFailed:
        int error = errno;
        write(statusOut, &error, sizeof(int));
        close(statusOut);
        exit(error);

The child can fail in 2 reproducible ways: 1- the child code is not running, but the parent believes it is; 2- the child blocks in closeNonStandardFds(statusOut, androidSystemPropertiesFd).

In either case the read(statusIn...) in the parent deadlocks, and a dead child process is left behind (it cannot be accessed: pid unknown, no Process object).

Kara
3c71
  • http://developer.android.com/reference/java/lang/Process.html – Selvin Dec 31 '11 at 14:36
  • Are you sure you read *all* the output from your process? You do read it in a loop, don't you? – JimmyB Dec 31 '11 at 15:34
  • Yes, I immediately start a thread to read the output continuously. Sometimes the very first call to exec() just hangs, so there is nothing to read: the Process object has not been returned yet, hence I can't get hold of the streams. So this has nothing to do with reading the output! I'm currently reproducing the issue with the NDK source code and adding debug info to track it down. Interestingly, I found out that adding some logging makes the problem harder to reproduce!? Is exec() thread-safe? – 3c71 Jan 01 '12 at 14:24
  • By the way, want another verifiable bug, just do this: process = Runtime.getRuntime().exec("/logcat -v time"); This will return immediately with an IOException, but I constantly get a new child process hanging there and useless! Managed to remove this exception using NDK and then kill the unwanted child process. – 3c71 Jan 01 '12 at 14:30
  • @Hanno, if you do a "kill ", there's no input or output whatsoever, and blocking on first exec() call suggest this has nothing to do with it. – 3c71 Jan 02 '12 at 20:45
  • @Selvin, thanks for the link, do you really think I didn't read it? – 3c71 Jan 02 '12 at 20:45
  • But did you try `process.destroy()`? – Selvin Jan 03 '12 at 09:24

4 Answers

6

This problem is fixed in Jelly Bean (Android 4.1) but not in ICS (4.0.4) and I guess it will never be fixed in ICS.

chrulri
  • Is there any more background information to show how this issue maps to the child and parent hanging as outlined in 3c71's edited question above? Is it because the child process calls closeNonStandardFds, which calls cpuacct_add, which locks while using the same fd (statusOut above, from the child's point of view) that the parent process is waiting on in the read function (statusIn, from the parent's point of view)? I am trying to understand whether there is a reliable way to work around this in pre-4.1 versions. Closing all open fds is tricky in a large multi-developer app. – Mick Dec 10 '12 at 17:44
  • Have a look at the commit: https://github.com/android/platform_bionic/commit/177ba8cb42ed6d232e7c8bcad5e6ee21fc51a0e8 – chrulri Dec 10 '12 at 17:59
  • It says: When forking of a new process in bionic, it is critical that it does not allocate any memory according to the comment in java_lang_ProcessManager.c: "Note: We cannot malloc() or free() after this point! A no-longer-running thread may be holding on to the heap lock, and an attempt to malloc() or free() would result in deadlock." However, as fork is using standard lib calls when tracing it a bit, they might allocate memory, and thus causing the deadlock. This is a rewrite so that the function cpuacct_add, that fork calls, will use system calls instead of standard lib calls. – chrulri Dec 10 '12 at 18:00
  • Yes thanks, I saw the comment with the fix. I am just trying to relate this to the tracing 3c71 did which suggests two places that the Android native code is hanging. In other words, understanding why specifically the parent and child hang at those points? Or maybe it can hang in multiple places and this is just one example. – Mick Dec 11 '12 at 22:09
  • I guess it's because cpuacct_add() - called by fork() after forking - caused a deadlock (heap lock) by allocating memory in fopen and/or fprintf, and thus the parent and child processes hang at their next attempt to access the FDs with close() and read(). – chrulri Dec 12 '12 at 10:16
  • In case anyone else ends up here, we're seeing this problem in 4.4. – Harvey Mar 13 '15 at 15:02
  • Agree with @Harvey - I see it on 4.4.2 - a LG 1GB RAM device. – RoundSparrow hilltx Feb 21 '16 at 00:46
4

The solution above did not prove reliable in any way and caused more issues on some devices.

So I reverted to the standard .exec() and kept digging...

Looking at the child code that hangs, I noticed that the child process hangs while trying to close all the file descriptors inherited from the parent (except the one created within the exec() call).

So I searched the whole app code for any BufferedReader/Writer and similar classes, to make sure they are all closed by the time exec() is called.

The frequency of the issue dropped considerably, and it actually never occurred again once I had removed the last open file descriptor before calling exec().
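As an illustration of the clean-up described above, here is a minimal sketch (not the code from the actual app) of reading a file so that its descriptor is released before any later exec() call. The class name CloseBeforeExec and the helper readFirstLine are hypothetical; the only point is that try-with-resources closes the reader as soon as the block exits.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical helper class, for illustration only.
class CloseBeforeExec {

    // Reads the first line of a file. try-with-resources closes the reader
    // (and its underlying file descriptor) when the block exits, even if
    // readLine() throws.
    static String readFirstLine(File file) throws IOException {
        try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
            return reader.readLine();
        }
    }
}
```

Any exec() or ProcessBuilder.start() call made after readFirstLine() returns no longer inherits that descriptor. Whether this is enough on a given device is an assumption; it only mirrors the approach that worked here.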

NB: Make sure SU binary is up-to-date, it can actually cause this issue too!

Enjoy your search ;)

3c71
  • I am facing this issue at the moment on a very large application that does a lot of concurrent stuff (with files, sockets and various things that could be causing open file descriptors). I would like to ask 3c71: when you say "NB: Make sure SU binary is up-to-date, it can actually cause this issue too!", what exactly do you mean? Is it possible that updating SU will make it work without having to make sure all the file descriptors are closed? – Nova Entropy Oct 21 '13 at 15:44
  • 1
    At the time there were issues with SU too, but those are long gone. – 3c71 Nov 10 '13 at 18:03
3

The bug fix in Bionic was committed months ago, but it still hasn't been included in Android 4.0.4.

A-IV
0

I have the same problem on ICS (it seems to work fine on Android < 4). Did you find a solution?

A simple workaround could be to call the "exec" method in a dedicated thread with a timeout-join so that this situation could be "detected" (yes I know it's not very elegant...)
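A rough sketch of that idea, assuming a hypothetical helper named callWithTimeout (nothing here is an Android API): the blocking call runs on a dedicated daemon thread, and the caller joins it with a timeout. This can only detect the hang. It does not cure it, and the stuck worker thread is simply abandoned.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper class, for illustration only.
class ExecWatchdog {

    // Runs task on a dedicated daemon thread and waits at most timeoutMs
    // (must be > 0, since join(0) would wait forever).
    // Throws TimeoutException if the task has not returned by then.
    static <T> T callWithTimeout(Callable<T> task, long timeoutMs) throws Exception {
        AtomicReference<T> result = new AtomicReference<>();
        AtomicReference<Exception> failure = new AtomicReference<>();

        Thread worker = new Thread(() -> {
            try {
                result.set(task.call());
            } catch (Exception e) {
                failure.set(e);
            }
        }, "exec-watchdog");
        worker.setDaemon(true);   // a stuck worker must not keep the JVM alive
        worker.start();
        worker.join(timeoutMs);   // timeout-join: returns even if the worker is stuck

        if (worker.isAlive()) {
            throw new TimeoutException("call did not return within " + timeoutMs + " ms");
        }
        if (failure.get() != null) {
            throw failure.get();  // rethrow e.g. an IOException from exec()
        }
        return result.get();
    }
}
```

It could be used as `Process p = ExecWatchdog.callWithTimeout(() -> new ProcessBuilder("logcat", "-d").start(), 5000);`, treating a TimeoutException as the "detected" case.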

stema
Berserker
  • I tried that, though it turned out the dedicated thread could not be timed-out or joined, nor interrupted! It's just stuck!!! Indeed I can reproduce this on Android 4 all the time, whereas on Android < 4, the issue showed only infrequently on custom kernels! – 3c71 Jan 18 '12 at 21:12
  • Yes, found an actual reliable fix, just see below ;) – 3c71 Jul 03 '12 at 18:43