I am debugging an application and trying to understand how gdb works and why I am sometimes unable to step through the program. The problem is that gdb hangs while I am stepping, and the process it is attached to enters a defunct state. Once gdb hangs, I have to kill it to free the terminal: ctrl-C does not work, so I have to find the process ID of that gdb session and run kill -9 from a different terminal window.

I'm guessing that gdb hangs because it is waiting for the application to stop at the next instruction, and the application somehow finished executing without gdb noticing. But that's just speculation based on the behavior I've observed so far. So my question is whether anyone has seen this type of behavior before and/or could suggest what the cause might be. I think that would help me improve my debugging strategy.

In case it matters I'm using g++ 4.4.3, gdb 7.1, running on Ubuntu 10.04 x86_64.

Gabriel Southern
  • Is there some minimal testcase you can give? I mean, as written, it's very hard to answer why it might be happening, other than "could be because gdb has a bug". – derobert Jan 23 '12 at 21:31
  • thanks for the suggestion. Right now the application has a lot of library code, but I'll see if I can make a small test case that I could post as an example. – Gabriel Southern Jan 23 '12 at 22:01
  • I would recommend using a more recent `gdb`. The current version is **7.3**, and a lot of progress has been made. (BTW, using a more recent `g++`, i.e. 4.6.2, would also help, since GCC has also improved its debugging information.) – Basile Starynkevitch Jan 24 '12 at 06:35
  • I'm using what's installed with the Ubuntu 10.04 LTS release, and I'm not likely to upgrade until 12.04 is out and tested with the systems we use. But testing with a newer version of gdb is a good idea for checking the specific problem I'm seeing just in case it is a bug with gdb. So thanks for the suggestion I'll try that with one of the systems I can use. – Gabriel Southern Jan 24 '12 at 07:27
  • @BasileStarynkevitch thanks for the suggestion. I finally got around to trying gdb 7.4 and it seems like that fixed the problem I had so I guess there was some bug that had already been solved. – Gabriel Southern Jan 31 '12 at 19:04

2 Answers

I had a similar problem and solved it by sending a CONT signal to the process being debugged.
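
A minimal sketch of what sending CONT does, assuming a Linux or other POSIX system and assuming the debugged process was left in the stopped state. The child below is a hypothetical stand-in for the debugged process, not the actual setup from this answer. From a shell, the equivalent is `kill -CONT` followed by the PID.

```python
import os
import signal
import time


def stop_then_continue():
    """Stop a child with SIGSTOP, resume it with SIGCONT, report what the parent saw."""
    pid = os.fork()
    if pid == 0:
        # Child: hypothetical stand-in for the debugged process.
        time.sleep(30)
        os._exit(0)

    # Put the child into the stopped state and wait until that is reported.
    os.kill(pid, signal.SIGSTOP)
    _, status = os.waitpid(pid, os.WUNTRACED)
    was_stopped = os.WIFSTOPPED(status)

    # SIGCONT makes a stopped process runnable again.
    os.kill(pid, signal.SIGCONT)
    _, status = os.waitpid(pid, os.WCONTINUED)
    was_continued = os.WIFCONTINUED(status)

    # Clean up the stand-in child so that no zombie is left behind.
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)
    return was_stopped, was_continued


if __name__ == "__main__":
    print(stop_then_continue())
```

Whether this applies to a given hang depends on the debugged process actually being stopped, which `ps` reports as state `T`.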

Michaël Witrant

I'd say the debugged process wouldn't sit idle if it were the cause of the hang. Every time GDB completes a step, it has to update any expressions you asked it to display. That may involve following pointers, and in some cases it may fail there (although I don't recall a real "hang" from that). It also typically tries to update your stack trace. If the stack has been corrupted and is no longer coherent, GDB could get trapped in an endless loop. Attaching gdb to strace to see what kind of activity is going on during the hang could be a good way to get one step closer to figuring out the problem.

(e.g. accessing sources through a no-longer-working NFS/SSHFS mount is one of the most frequent reasons for gdb to hang, here :P)

PypeBros
  • thanks for the thoughts. Can you elaborate on what you mean by "attaching gdb to strace"? – Gabriel Southern Jan 25 '12 at 03:16
  • I mean running `strace -p $(pidof gdb)` in a terminal and studying the output. It should show all the system calls executed by the program. – PypeBros Jan 25 '12 at 04:51
  • thanks for the suggestion. I haven't figured out what's causing the problem yet, but it's useful to look at. The last system call where gdb hangs is `wait4(26066,` (the PID varies from run to run, of course). Meanwhile a listing with ps -a shows that process 26066 is `defunct`. – Gabriel Southern Jan 25 '12 at 20:58
  • Was it by any chance the debugged process? – PypeBros Jan 26 '12 at 12:52
  • Yes, process `26066` was the debugged process. I guess I don't completely understand what the state `defunct` means. I was reading about it, and it sounds like this should be the correct sequence of events, because gdb should reap the process when it finishes execution. So I suppose that the problem is not with this last system call but with something that happened earlier on. I'm still trying to figure it out though. – Gabriel Southern Jan 26 '12 at 16:40
  • http://en.linuxreviews.org/Defunct_process seems to explain it well: your defunct process is dead, but it is waiting for its parent process (gdb, afaik) to acknowledge this. GDB should thus have received `SIGCHLD` earlier. I'm puzzled here: `wait4()` should be precisely what allows the debugged process to leave the defunct state ... – PypeBros Jan 28 '12 at 10:39
  • thanks for the troubleshooting suggestions. It looks like the problem I had was related to some sort of bug in gdb 7.1, because it is working correctly with gdb 7.4. – Gabriel Southern Jan 31 '12 at 19:06
  • I had the exact same problem while debugging multithreaded code. After upgrading to gdb 7.4, I am getting a stacktrace for segmentation fault instead of the hang. – amit kumar Jan 11 '13 at 09:03
  • I have this issue of gdb hanging on a call instruction. strace reports an infinite loop of `ptrace(PTRACE_PEEKTEXT, ...)` calls, all returning 0. Ubuntu Focal, x86_64. – fuzzyTew May 21 '20 at 16:07
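
The `defunct` state discussed in the comments above can be reproduced outside gdb. This is a rough sketch, assuming Linux (it reads `/proc/<pid>/stat`). The helper names `proc_state` and `demo_zombie` are made up for illustration. A child exits, stays a zombie (state `Z`, shown by `ps` as `defunct`) until its parent waits for it, and disappears once it has been reaped.

```python
import os
import time


def proc_state(pid):
    """Return the one-letter process state from /proc/<pid>/stat (Linux-specific)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The state letter is the first field after the parenthesised command name.
    return data.rsplit(")", 1)[1].split()[0]


def demo_zombie():
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately, like a debugged program that has finished.
        os._exit(0)

    # Parent: do not wait yet. Poll until the kernel marks the child as a zombie.
    deadline = time.time() + 5
    state = proc_state(pid)
    while state != "Z" and time.time() < deadline:
        time.sleep(0.01)
        state = proc_state(pid)

    # Reaping the child is what lets it leave the defunct state.
    reaped_pid, _ = os.waitpid(pid, 0)
    still_listed = os.path.exists(f"/proc/{pid}")
    return state, reaped_pid == pid, still_listed


if __name__ == "__main__":
    print(demo_zombie())
```

In this sketch the wait call returns promptly because the child has already exited, which matches the expectation voiced above that `wait4()` on a defunct child should not block.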