
I have to implement something like this:

  1. start a program (e.g., a.out) under GDB
  2. set some breakpoints
  3. periodically send CTRL-C (SIGINT) to GDB to pause the execution of a.out
  4. run some commands such as "bt" and "info threads" at the stop point or at breakpoints
  5. continue the execution of a.out
  6. repeat until a.out exits

These steps can be carried out interactively from a shell, but I need to automate them in a program. I'm considering using fork to create a subprocess for GDB and popen to execute the initial GDB launch command, but how can I periodically send GDB commands (bt, continue) to that subprocess and have it execute them?

I'm stuck at this point and any thoughts would be greatly appreciated. Thanks in advance.

BruceTerp
  • What problem are you *actually* trying to solve? – Employed Russian Nov 22 '17 at 03:56
  • you'd typically replace the child process's stdin/stdout/stderr file descriptors w/ pipes back to the controlling process. I agree w/ @EmployedRussian that this sounds like an X/Y problem. – lockcmpxchg8b Nov 22 '17 at 06:04
  • Is this a one-off project? If so, try [expect](http://expect.sourceforge.net/) or [pexpect](https://pexpect.readthedocs.io/en/stable/), since it's easy to get things up quickly. Long term, you may want to use [gdb mi](https://sourceware.org/gdb/onlinedocs/gdb/GDB_002fMI.html), which is less apt to change than gdb's command line interface. – Mark Plotnick Nov 22 '17 at 15:18
  • @EmployedRussian I need to sample the execution of a program and get stack traces at each sample point. I know you may suggest using PAPI for sampling but in my case, I have to rely on gdb for now. Therefore, I need to periodically pause the gdb and get a backtrace at that point. – BruceTerp Nov 22 '17 at 18:52
  • @BruceTerp -- "I need to ..." but *why* ? http://xyproblem.info/ – Employed Russian Nov 22 '17 at 19:09
  • @MarkPlotnick I don't understand the syntax of gdb mi, how to use that in a program, could you give me a simple example? Thanks – BruceTerp Nov 22 '17 at 19:27
  • @BruceTerp Some MI examples and references to libraries that simplify things can be found at [writing front end for gdb](https://stackoverflow.com/questions/16771393/writing-front-end-for-gdb). – Mark Plotnick Nov 22 '17 at 19:42
  • @EmployedRussian Thanks for the xyproblem link. In the larger picture, I want to sample the execution of CUDA programs and get stack traces at each sample point. Current support from Nvidia (PC_Sampling) doesn't give you a stack trace, so I have to take advantage of cuda-gdb, which is very much like gdb but, as far as I know, used by far fewer people. That is why I framed the question in terms of gdb and subprocesses, to get more attention. – BruceTerp Nov 22 '17 at 19:45
  • @EmployedRussian I've done some trials. One feasible way is to call popen once at each sample point with a command like "gdb --pid=XXX -ex bt -batch". The problem is that the overhead of calling popen is too high, which makes my sampling rate too low. Therefore I want to call popen just once, then send control signals to gdb and get what I need. Do you have any idea how to sample a CUDA program and get stack traces? I'd appreciate that very much :) – BruceTerp Nov 22 '17 at 19:45
  • @lockcmpxchg8b Can you be more specific in this gdb case? The examples I found on PIPI are all just sending a string to the child process and child process print it out. But how to let the child process interpret that string as a command and execute it? – BruceTerp Nov 22 '17 at 19:48
  • does `cat /proc//stack` show the relevant info, or is the stack-trace you're interested in held within the GPUs? – lockcmpxchg8b Nov 28 '17 at 16:48
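
For readers weighing the GDB/MI route suggested in the comments: when GDB is started with `--interpreter=mi`, each output line starts with a character that identifies the kind of record, which makes the output easier to parse than the CLI text. The following is only a minimal sketch of that classification step; `classify_mi_line` and the `mi_kind` enum are hypothetical names introduced here, not part of GDB or of any answer on this page.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical names for the GDB/MI output record types. */
enum mi_kind { MI_RESULT, MI_EXEC_ASYNC, MI_STATUS_ASYNC, MI_NOTIFY_ASYNC,
               MI_CONSOLE, MI_TARGET, MI_LOG, MI_PROMPT, MI_UNKNOWN };

/* Classify one line of MI output by its leading character. */
enum mi_kind classify_mi_line(const char *line)
{
  if (strncmp(line, "(gdb)", 5) == 0)
    return MI_PROMPT;

  /* Result and async records may start with an optional numeric token. */
  while (isdigit((unsigned char)*line))
    line++;

  switch (*line)
  {
    case '^': return MI_RESULT;        /* e.g. ^done, ^running, ^error */
    case '*': return MI_EXEC_ASYNC;    /* e.g. *stopped, *running */
    case '+': return MI_STATUS_ASYNC;
    case '=': return MI_NOTIFY_ASYNC;
    case '~': return MI_CONSOLE;       /* CLI-style console text */
    case '@': return MI_TARGET;        /* output from the target */
    case '&': return MI_LOG;           /* GDB's own log messages */
    default:  return MI_UNKNOWN;
  }
}
```

In MI, commands such as `-exec-continue` and `-stack-list-frames` play the roles of `continue` and `bt`. Whether `-exec-interrupt` can stand in for CTRL-C depends on how the target is being run, so treat that part as something to verify against the GDB/MI documentation.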

1 Answer

This is a very simplistic implementation. It forks the target process with no pipes, since we only need to learn its PID. Then it forks gdb with the -p <PID> option to attach to our target. The fork for GDB sets up pipes for stdin/stdout/stderr before exec'ing, so that we can remote-control GDB.

A few interesting notes:

  1. When GDB is running a debug-target, it doesn't respond to SIGINT. You have to send the SIGINT to the debug-target. This is why I fork twice rather than launching gdb --args <target>. I need the PID of the process it's debugging so I can send SIGINT.
  2. When you attach pipes to a process's stdout and stderr, you must read them or the process will eventually block (once its output fills the pipe's buffer). My implementation is stupid here: it reads only stdout, one blocking read at a time, because I didn't want to take the time to use threads or do proper select calls.
  3. You have to be somewhat careful about when the APIs will block. Note that I'm using read/write instead of fread/fwrite because of how the latter behave when they can't read the amount I have requested.
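
Note 2 can be addressed without threads by multiplexing both pipes with select(). The following is a minimal sketch rather than part of the tracer below; `drain_ready` is a hypothetical helper name, and it assumes its two descriptors are the parent's read ends of GDB's stdout and stderr pipes.

```c
#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for data on either pipe and
 * copy whatever is available to `out`. Returns the number of bytes copied,
 * 0 on timeout, or -1 on error or when a pipe reports end-of-file. */
long drain_ready(int out_fd, int err_fd, int timeout_ms, FILE *out)
{
  int fds[2] = { out_fd, err_fd };
  int maxfd = out_fd > err_fd ? out_fd : err_fd;
  fd_set readable;
  struct timeval tv;
  long total = 0;
  int ready;
  int i;

  FD_ZERO(&readable);
  FD_SET(out_fd, &readable);
  FD_SET(err_fd, &readable);
  tv.tv_sec = timeout_ms / 1000;
  tv.tv_usec = (timeout_ms % 1000) * 1000;

  ready = select(maxfd + 1, &readable, NULL, NULL, &tv);
  if (ready <= 0)
    return ready;               /* 0: timeout, -1: select failed */

  for (i = 0; i < 2; i++)
  {
    char buf[4096];
    ssize_t amt;

    if (!FD_ISSET(fds[i], &readable))
      continue;
    /* select reported this descriptor readable, so read will not block. */
    amt = read(fds[i], buf, sizeof(buf));
    if (amt <= 0)
      return -1;
    fwrite(buf, 1, (size_t)amt, out);
    total += amt;
  }
  return total;
}
```

Calling something like this in a loop keeps stderr from filling up while the prompt is awaited on stdout; the prompt detection itself would still need to be layered on top.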

The 'tracer' program is:

#include <stdio.h>
#include <string.h>

#include <unistd.h>
#include <signal.h>
#include <fcntl.h>
#include <sys/select.h>

char gdb_pid_buf[20];

char *gdb_argv[] =
{
  "gdb",
  "-p",
  gdb_pid_buf,
  NULL
};

char *child_argv[] =
{
  "./looper",
  NULL
};

const char GDB_PROMPT[] = "(gdb)";

int wait_for_prompt(const char *prefix, int fd)
{
  char readbuf[4096];
  size_t used = 0;
  while(1)
  {
    ssize_t amt;
    char *prompt;
    char *end;

    amt = read(fd, readbuf+used, sizeof(readbuf)-used-1);
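    if(amt == -1)
    {
      return 1;
    }
    else if(amt == 0)
    {
      //0 means EOF (GDB closed its stdout) or the buffer filled up without
      //a newline; either way, stop instead of spinning forever.
      return 1;
    }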
    else
    {
      used += amt;

      readbuf[used] = '\0';
      for(end = strstr(readbuf, "\n"); end; end= strstr(readbuf, "\n"))
      {
        size_t consumed;
        size_t remaining;

        *end = '\0';
        printf("%s: %s\n", prefix, readbuf);

        consumed = (end-readbuf) + strlen("\n");
        remaining = used - consumed;
        memmove(readbuf, readbuf+consumed, remaining);
        used -= consumed;
      }

      prompt = strstr(readbuf, GDB_PROMPT);
      if(prompt)
      {
        *prompt = '\0';
        printf("%s: %s", prefix, readbuf);
        printf("[PROMPT]\n");
        fflush(stdout);
        break;
      }
    }
  }
  return 0;
}

int main(int argc, char *argv[])
{

  int stdin_pipe[2];
  int stdout_pipe[2];
  int stderr_pipe[2];

  pipe(stdin_pipe);
  pipe(stdout_pipe);
  pipe(stderr_pipe);

  int gdb_pid;
  int child_pid;

  //Launch child
  child_pid = fork();
  if(child_pid == 0)
  {
    close(stdin_pipe[0]);
    close(stdout_pipe[0]);
    close(stderr_pipe[0]);
    close(stdin_pipe[1]);
    close(stdout_pipe[1]);
    close(stderr_pipe[1]);

    execvp(child_argv[0], child_argv);
    perror("execvp looper"); //only reached if exec failed
    _exit(127);
  }

  sprintf(gdb_pid_buf, "%d", child_pid);

  //Launch gdb with command-line args to attach to child.
  gdb_pid = fork();
  if(gdb_pid == 0)
  {
    close(stdin_pipe[1]);
    close(stdout_pipe[0]);
    close(stderr_pipe[0]);

    dup2(stdin_pipe[0],0);
    dup2(stdout_pipe[1],1);
    dup2(stderr_pipe[1],2);

    execvp(gdb_argv[0], gdb_argv);
    perror("execvp gdb"); //only reached if exec failed
    _exit(127);
  }

  close(stdin_pipe[0]);
  close(stdout_pipe[1]);
  close(stderr_pipe[1]);

  //Wait for GDB to reach its prompt
  if(wait_for_prompt("GDB", stdout_pipe[0]))
    {fprintf(stderr,"child died\n");return 1;}

  printf("[SENDING \"continue\\n\"]\n");
  fflush(stdout);
  write(stdin_pipe[1], "continue\n", strlen("continue\n"));

  sleep(4);

  printf("[SENDING \"CTRL+C\"]\n");
  fflush(stdout);
  kill(child_pid, SIGINT);

  //Then read through all the output until we reach a prompt.
  if(wait_for_prompt("POST SIGINT", stdout_pipe[0]))
    {fprintf(stderr,"child died\n");return 1;}

  //Ask for the stack trace
  printf("[SENDING \"where\\n\"]\n");
  fflush(stdout);
  write(stdin_pipe[1], "where\n", strlen("where\n"));

  //read through the stack trace output until the next prompt
  if(wait_for_prompt("TRACE", stdout_pipe[0]))
    {fprintf(stderr,"child died\n");return 1;}

  kill(child_pid, SIGKILL);
  kill(gdb_pid, SIGKILL);
}

The target program, looper, is just:

#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
  while(1)
  {
    printf(".");
    fflush(stdout);
    sleep(1);
  }
}

Example output is:

$ ./a.out
.GDB: GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-80.el7
GDB: Copyright (C) 2013 Free Software Foundation, Inc.
GDB: License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
GDB: This is free software: you are free to change and redistribute it.
GDB: There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
GDB: and "show warranty" for details.
GDB: This GDB was configured as "x86_64-redhat-linux-gnu".
GDB: For bug reporting instructions, please see:
GDB: <http://www.gnu.org/software/gdb/bugs/>.
GDB: Attaching to process 8057
GDB: Reading symbols from /home/<nope>/temp/remotecontrol/looper...(no debugging symbols found)...done.
GDB: Reading symbols from /lib64/libc.so.6...(no debugging symbols     found)...done.
GDB: Loaded symbols for /lib64/libc.so.6
GDB: Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
GDB: Loaded symbols for /lib64/ld-linux-x86-64.so.2
GDB: 0x00007f681b4f9480 in __nanosleep_nocancel () from /lib64/libc.so.6
GDB: Missing separate debuginfos, use: debuginfo-install glibc-2.17-    106.el7_2.4.x86_64
GDB: [PROMPT]
[SENDING "continue\n"]
....[SENDING "CTRL+C"]
POST SIGINT: Continuing.
POST SIGINT:
POST SIGINT: Program received signal SIGINT, Interrupt.
POST SIGINT: 0x00007f681b4f9480 in __nanosleep_nocancel () from /lib64/libc.so.6
POST SIGINT: [PROMPT]
[SENDING "where\n"]
TRACE: #0  0x00007f681b4f9480 in __nanosleep_nocancel () from /lib64/libc.so.6
TRACE: #1  0x00007f681b4f9334 in sleep () from /lib64/libc.so.6
TRACE: #2  0x0000000000400642 in main ()
TRACE: [PROMPT]

You can see from the .... that the target did continue running, even though GDB's output of "Continuing." doesn't show up until later, when I read its stdout pipe.
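
The run above interrupts the target only once. For periodic sampling, the same continue / SIGINT / where cycle can be repeated in a loop. One detail worth handling in that case is that write() may transfer fewer bytes than requested or be interrupted by a signal. The sketch below wraps it in a hypothetical helper, `send_command`, which is not part of the tracer above.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write one command line to GDB's stdin pipe,
 * retrying on short writes and EINTR. Returns 0 on success, -1 on error. */
int send_command(int fd, const char *cmd)
{
  size_t len = strlen(cmd);
  size_t sent = 0;

  while (sent < len)
  {
    ssize_t amt = write(fd, cmd + sent, len - sent);
    if (amt == -1)
    {
      if (errno == EINTR)
        continue;               /* interrupted before writing: retry */
      return -1;
    }
    sent += (size_t)amt;
  }
  return 0;
}
```

With the tracer's variables, one sampling iteration would be send_command(stdin_pipe[1], "continue\n"), a sleep for the sampling interval, kill(child_pid, SIGINT), a wait_for_prompt call, send_command(stdin_pipe[1], "where\n"), and another wait_for_prompt call.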

lockcmpxchg8b
  • If this answer is on target, I'd suggest changing the title to something like "how to remote-control GDB on Linux" since the surprises were largely related to it being GDB. – lockcmpxchg8b Nov 23 '17 at 01:59
  • Thank you so much! This is exactly what I'm looking for, although it took me a while to fully understand your code since I'm not familiar with Unix/system programming. Two questions: [1] In your second note you said your method is stupid, but it seems pretty straightforward to me since you read line by line; what would be a more efficient way? [2] After I modified your code to periodically sample the stack until the child exits, the overhead is rather high: for a complex program, the run takes up to 5 times the child's original execution time. Any suggestions for improvement? Thanks! – BruceTerp Nov 25 '17 at 01:45
  • As for the second note, it ends up doing a lot of extra copying of data within readbuf. Now that I think about it, scanning backward from the end of the readbuf with `strrstr` would be much more efficient. You could also probably speed it up considerably by not printing the output of GDB to the screen (presuming you haven't already eliminated the printfs). I think the performance hit is probably intrinsic in this method; dwarf debug info is compressed, so having to parse and unpack it to interpret the stack at various locations is costly. – lockcmpxchg8b Nov 25 '17 at 03:59
  • I've tried several things and here's what I've found: [1] commenting out all the printf calls doesn't give a noticeable speedup; [2] omitting `where` only reduces the runtime from 5x to 4x; [3] simply printing readbuf every time, without the line-by-line parsing in the for loop, gives only a negligible speedup. So my questions are: 1. Does this mean the major performance bottleneck is the interrupt/continue operation, i.e. the intrinsic performance hit you mentioned? 2. How can I be more efficient using strrstr as you suggested, given that I need all the output in order? Thanks! – BruceTerp Nov 28 '17 at 04:29
  • the `strrstr` thing would just let you find the last `'\n'` in the buffer in one step. You can still process the text between the beginning of `readbuf` and that last newline, but then you only have to do one `memmove` to get the buffer ready for the next read. I also think you could add a flag to the `wait_for_prompt` method, to indicate whether you want it to write the output. That way, you can spare yourself the I/O that occurs between sending `SIGINT` and the prompt. I think you can also get rid of all of the `fflush()` after printing `"[PROMPT]"`. That might save a little I/O time. – lockcmpxchg8b Nov 28 '17 at 05:08
  • none of that is major, though. Rather than echoing all that text to the screen, it might be helpful to write it directly to a file, if you can find a write function that buffers well (i.e., is not line-buffered). (I recently learned that stdin/stdout flush on newlines) – lockcmpxchg8b Nov 28 '17 at 05:10
  • I'm wondering why I have to dup2 gdb's stdout and read it from the parent process (also, should I close those pipe ends after dup2'ing them in the gdb process?). Can't I simply let gdb print everything to the screen, or redirect it to a file, instead of waiting for the parent to read it? But after I removed all the wait_for_prompt calls, I got this error: "Couldn't get registers: No such process." Any thoughts on this? – BruceTerp Nov 28 '17 at 16:25
  • I guess the reason I have to dup2 in gdb and read the output from the parent is that we have to wait for the prompt before sending the next command, and there seems to be no better way to know when it has appeared. Also, even if I don't print anything to the screen in wait_for_prompt, I only get a minor speedup, so printing to a file with buffered writes doesn't seem very helpful either. Thanks – BruceTerp Nov 28 '17 at 19:05
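
The single-memmove idea from the comments above can be sketched as follows. Note that `strrstr` is not a standard C function, so this sketch scans backward by hand instead. `flush_complete_lines` is a hypothetical helper name, and passing NULL for `out` plays the role of the suggested "don't write the output" flag.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: emit every complete line in buf[0..used) to `out`
 * (or discard them when out is NULL), then move the trailing partial line
 * to the front with a single memmove. Returns the new value of `used`. */
size_t flush_complete_lines(char *buf, size_t used, FILE *out)
{
  size_t last = used;

  /* Scan backward until `last` sits just past the final newline. */
  while (last > 0 && buf[last - 1] != '\n')
    last--;
  if (last == 0)
    return used;                /* no complete line in the buffer yet */

  if (out)
    fwrite(buf, 1, last, out);  /* all complete lines, in order */

  memmove(buf, buf + last, used - last);
  return used - last;
}
```

Unlike the loop in wait_for_prompt, this writes the lines without a per-line prefix, and the caller still has to re-terminate the buffer with '\0' before searching the leftover text for the "(gdb)" prompt.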