
I have a Python 3.4 application that uses check_output() to invoke a C++ application. The C++ application calls fork(); the original process exits and the child process continues on. It seems check_output() also waits on the child process, instead of returning once the original process exits and reports that the daemon was started successfully.

Do I need to change how I fork() in C++, or does the Python check_output() call need to be told to wait only for the parent process to exit? Do I need to do a second fork() in C++ as described here?


Here is a stripped-down Python script that exhibits the issue:

#! /usr/local/bin/python3

import logging
import argparse
from subprocess import CalledProcessError, check_output, STDOUT

ARGS = ["user@hostname:23021:"]

if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Try launching subprocess")
    parser.add_argument("exec", type=str, help="the exec to run")
    args = parser.parse_args()

    logging.basicConfig(level=logging.INFO,
                        format='%(asctime)s - %(message)s',
                        datefmt='%Y-%m-%d %H:%M:%S')

    cmd_list = [args.exec] + ARGS
    logging.info(str(cmd_list))

    try:
        output = check_output(cmd_list,
                              stderr=STDOUT,
                              universal_newlines=True)
        logging.info("Exec OK with output:")
        logging.info(output)
    except CalledProcessError as e:
        logging.info("Exec Not OK with output:")
        logging.info(str(e))

Here is the C++ code to daemonize the C++ application:

void
daemonize()
{
  // This routine backgrounds the process to run as a daemon.
  // Returns to caller only if we are the child, otherwise exits normally.
  if (getppid() == 1) {
    return;  // Leave if we're already a daemon
  }

  // Create the backgrounded child process.
  const pid_t parent = getpid();
  const pid_t pid = fork();
  CSysParamAccess param;
  const string progName(param.getProgramKindName());

  ::close(STDIN_FILENO);
  ::close(STDOUT_FILENO);
  ::close(STDERR_FILENO);

  if (pid < 0) {
    cerr << "Error: " << progName << " failed to fork server. Aborting."
         << endl; // inform the client of the failure

    exit(appExit::failForkChild);    // Error. No child created.
  } else if (pid > 0)  {
    // We're in the parent.  Optionally print the child's pid, then exit.
    if (param.getDebug()) {
      clog << "Successful fork. The Application server's (" << progName
           << ") pid is: " << pid << "(self) from parent " << parent << endl;
    }

    ::close(STDIN_FILENO);
    ::close(STDOUT_FILENO);
    ::close(STDERR_FILENO);

    exit(appExit::normal);
  }

  ::close(STDIN_FILENO);
  ::close(STDOUT_FILENO);
  ::close(STDERR_FILENO);

  // Here only in the child (daemon).
  if (-1 == setsid()) { // Start a new session (and a new process group)
    cerr << "Error: Failed to become session leader while daemonising - errno: "
         << errno;

    exit(appExit::failForkChild);    // Error.  Child failed.
  }

  signal(SIGHUP, SIG_IGN); // Per example.

  // Fork again, allowing the parent process to terminate.
  const pid_t midParent = getpid();
  const pid_t grandChildPid = fork();

  if (grandChildPid < 0) {
    cerr << "Error: Failed to fork while daemonising - errno: " << errno;

    exit(appExit::failForkChild);    // Error.  GrandChild failed.
  } else if (grandChildPid > 0) {
    // We're in the parent.  Optionally print the grandchild's pid, then exit.
    if (param.getDebug()) {
      clog << "Successful second fork. The Application server's (" << progName
           << ") pid is: " << grandChildPid << "(self) from parent "
           << midParent << endl;
    }

    ::close(STDIN_FILENO);
    ::close(STDOUT_FILENO);
    ::close(STDERR_FILENO);

    exit(appExit::normal);
  }

  // Here only in the grandchild (daemon).
  appGlobalSetSignalHandlers();

  // Set the current working directory to the root directory.
  if (chdir("/") == -1) {
    cerr <<
      "Error: Failed to change working directory while daemonising - errno:"
         << errno;

    exit(appExit::failForkChild);    // Error.  GrandChild failed.
  }

  // Set the user file creation mask to zero.
  umask(0);

  //close(STDIN_FILENO); // Cannot close due to assertion in transfer.cpp
  // Theoretically, we would reopen stderr and stdout using the log file.
  ::close(STDIN_FILENO);
  ::close(STDOUT_FILENO);
  ::close(STDERR_FILENO);

  // We only return here if we're the grandchild process, the Application
  // server.  The summoner exited in daemonize().
  clog << "Application " << param.getProgramKindName()
       << " (" << appGlobalProgramName() << ") successfully started." << endl;
}

It works when called with echo and fails with my C++ application:

> stuckfork.py echo
2016-02-05 10:17:34 - ['echo', 'user@hostname:23021:']
2016-02-05 10:17:34 - Exec OK with output:
2016-02-05 10:17:34 - user@hostname:23021:

> stuckfork.py flumep
2016-02-05 10:17:53 - ['flumep', 'user@hostname:23021:']
  C-c Traceback (most recent call last):
  File "/home/user/Bin/Bin/stuckfork.py", line 26, in <module>
    universal_newlines=True)
  File "/usr/local/lib/python3.4/subprocess.py", line 609, in check_output
    output, unused_err = process.communicate(inputdata, timeout=timeout)
  File "/usr/local/lib/python3.4/subprocess.py", line 947, in communicate
    stdout = _eintr_retry_call(self.stdout.read)
  File "/usr/local/lib/python3.4/subprocess.py", line 491, in _eintr_retry_call
    return func(*args)
KeyboardInterrupt
>

I've narrowed the issue down: one of my C++ static constructors is doing something that causes the launching process to go defunct, which is why Python is still waiting. I'm bisecting now to see which one.
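
On the Python side, one possible stopgap is to avoid the pipe entirely, so that the launcher waits only for the process it started. The sketch below is an assumption about what would be acceptable here, not a fix for the daemon itself: run_without_pipe is a made-up helper name, and it captures output in a temporary file instead of a pipe. Anything the daemon prints after the launching process exits is not returned.

```python
import tempfile
from subprocess import Popen, STDOUT


def run_without_pipe(cmd_list):
    """Run cmd_list and wait only for the process that was launched.

    Output goes to a temporary file rather than a pipe, so a lingering
    descendant that still holds the descriptor cannot keep us blocked.
    Returns (returncode, output captured up to the launcher's exit).
    """
    with tempfile.TemporaryFile(mode="w+") as capture:
        proc = Popen(cmd_list, stdout=capture, stderr=STDOUT)
        returncode = proc.wait()  # returns when the direct child exits
        capture.seek(0)
        return returncode, capture.read()
```

With this helper, the caller has to check the return code itself, since nothing raises CalledProcessError.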

WilliamKF

2 Answers

The correct solution is to find the file descriptor that pipes output from the forked C++ child to Python, and close it.

For now, you could try the close(1) system call in the C++ child process, or just after fork() before the child's own code starts running. That should signal Python to stop trying to read from the child.

I am not sure this will work, since the code you posted is not enough to tell.
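
The effect can be reproduced without the C++ application. This is only a sketch under assumptions: it needs a POSIX system with sh and sleep on the PATH, and timed_check_output and the two command strings are names made up for the illustration. A background job that inherits the pipe on stdout keeps check_output() waiting, while one whose stdout and stderr point elsewhere does not.

```python
import time
from subprocess import check_output


def timed_check_output(shell_command):
    """Run shell_command via sh; return (output, seconds spent waiting)."""
    start = time.monotonic()
    output = check_output(["sh", "-c", shell_command],
                          universal_newlines=True)
    return output, time.monotonic() - start


# The background sleep inherits the pipe on stdout, so the read in
# check_output() does not see end-of-file until sleep exits.
INHERITS_PIPE = "sleep 2 & echo started"

# The background sleep has stdout/stderr redirected, so only sh holds the
# pipe and check_output() returns as soon as sh exits.
RELEASES_PIPE = "sleep 2 >/dev/null 2>&1 & echo started"

if __name__ == "__main__":
    for command in (INHERITS_PIPE, RELEASES_PIPE):
        output, seconds = timed_check_output(command)
        print("%r -> %r after %.1fs" % (command, output, seconds))
```

The first command should take about two seconds to return and the second should return almost immediately, both with the same output.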

Jay Kumar R

The issue was an open file descriptor, caused by this static initialization code:

FILE *origStdErr = fdopen(dup(fileno(stderr)), "a");

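The dup() call created an extra descriptor for the pipe that Python was reading, so closing descriptors 0, 1, and 2 was not enough to release it. Once that line was removed, the daemon's close(0), close(1), and close(2) had the intended effect and the Python code stopped waiting.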
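
To illustrate why that one line mattered, here is a small Python sketch of the same mechanism (it is not the original C++, and pipe_at_eof and demo are names invented for the illustration). It assumes a POSIX system. A reader only sees end-of-file on a pipe once every descriptor referring to the write end is closed, including duplicates made with dup().

```python
import os
import select


def pipe_at_eof(read_fd):
    """Return True if a read on read_fd would hit end-of-file right now.

    Assumes nothing has been written to the pipe, as in demo() below.
    """
    ready, _, _ = select.select([read_fd], [], [], 0)
    return bool(ready) and os.read(read_fd, 1) == b""


def demo():
    read_fd, write_fd = os.pipe()
    saved = os.dup(write_fd)   # plays the role of dup(fileno(stderr))
    os.close(write_fd)         # plays the role of the daemon's close(2)

    # The duplicate still refers to the write end, so no EOF yet.
    eof_with_dup_open = pipe_at_eof(read_fd)

    os.close(saved)            # the last write-end descriptor goes away
    eof_after_closing_dup = pipe_at_eof(read_fd)

    os.close(read_fd)
    return eof_with_dup_open, eof_after_closing_dup


if __name__ == "__main__":
    print(demo())
```

In the real case the duplicate lived in the daemon process, so the pipe stayed open for as long as the daemon ran.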

WilliamKF