I am having trouble with a bug caused by a pointer being overwritten with an invalid value. I have not been able to find it using valgrind (in its default mode) or GDB, because they only point me to the invalid pointer and NOT to whatever overwrote it with the incorrect value.

It's always the same variable; however, I never explicitly set it to the bad value. Some other line in the program must be accessing memory out of bounds and, by chance, hitting the storage for this pointer.

I am unsure what debugging tools/options I should use to approach this bug.

Example crash:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff6ffc700 (LWP 2425)]
0x00000000004058b2 in writeToConn (conn=0x7ffff0004f40) at streamHandling.c:115
115             ssize_t result = send(conn->fd, conn->head->data->string + position, conn->head->data->size - position, 0);
(gdb) print conn
$1 = (struct connection *) 0x7ffff0004f40
(gdb) print conn->head->data
$2 = (struct dbstring *) 0x35

Unfortunately I can't simply watch the variable conn->head->data because I have about 5,000 conn structs.

This code works most of the time; however, when run under a moderately heavy load it crashes after a few seconds.

charliehorse55
  • What is the type of `conn->head` and how is that structure defined? – Joachim Isaksson Oct 14 '12 at 06:49
  • What kind of pointer is it? Maybe you can use a thunk and see what is modifying it. I do not know about GDB, but WinDbg has data breakpoints that fire every time the state is modified. EDIT: maybe try this: http://www.technochakra.com/debugging-types-of-data-breakpoints-in-gdb/ –  Oct 14 '12 at 06:49
  • When in doubt, print values out... – norman Oct 14 '12 at 06:52
  • I already searched the code for instances where I change that value manually. I only change it once when I allocate the buffer. I should add that this code works 99% of the time, but when I run it on 5,000 items it will crash on this after a few seconds. – charliehorse55 Oct 14 '12 at 06:53
  • @IonTodirel I can't use that feature, because I make thousands of `conn` structs and only 1 causes the segmentation fault. Unless there is a way to watch ALL of them, and then print out a history of the changes made to the particular one that caused the crash. – charliehorse55 Oct 14 '12 at 06:57
  • Yes, there is. At least in WinDbg, you can keep logging the data without stopping and perhaps do the diffs offline. –  Oct 14 '12 at 07:01
  • Have you looked at the memory surrounding the bad pointer? Are any other pointers bad? If so, do they have any particular defining characteristic (e.g. 0x35 is '5' in ASCII)? Does the damage extend past `conn->head`? – nneonneo Oct 14 '12 at 07:02
  • And, if there is no other bad memory nearby, you should trash all your object/binary files and rebuild (a major cause of randomly overwritten struct values is mismatched headers and libraries, or headers and binaries). – nneonneo Oct 14 '12 at 07:03
  • I've solved my problem. I was freeing memory without removing it from its list. Serves me right for not setting the pointer to NULL. I'm leaving this question open, as I would still like to hear any other strategies for solving these types of bugs. – charliehorse55 Oct 14 '12 at 07:23
  • @charliehorse55, but valgrind should have told you that you are using memory that you have previously freed. Perhaps you hadn't enabled enough options for this type of error? – Jens Gustedt Oct 14 '12 at 09:17
  • Ah, but I am doing a lot of allocation in between the `free()` and the memory access. I assume it just gets reallocated. I'm sure there is an option in Valgrind to support that, but I guess I didn't have it turned on. – charliehorse55 Oct 14 '12 at 09:19
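
To illustrate the root cause described in the comments above (a node freed while still linked into its list), here is a minimal sketch of a removal routine that unlinks before freeing. The struct layouts and the names `conn_push` and `conn_pop` are assumptions made for illustration; the asker's real definitions are not shown in the question.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the asker's types; the real layouts are unknown. */
struct dbstring { char *string; size_t size; };
struct node { struct dbstring *data; struct node *next; };
struct connection { int fd; struct node *head; };

/* Push a copy of `text` at the front of the connection's list.
 * Returns 0 on success, -1 if any allocation fails. */
int conn_push(struct connection *conn, const char *text)
{
    size_t len = strlen(text);
    struct node *n = malloc(sizeof *n);
    struct dbstring *d = malloc(sizeof *d);
    char *copy = malloc(len + 1);

    if (n == NULL || d == NULL || copy == NULL) {
        free(n);
        free(d);
        free(copy);
        return -1;
    }
    memcpy(copy, text, len + 1);
    d->string = copy;
    d->size = len;
    n->data = d;
    n->next = conn->head;
    conn->head = n;
    return 0;
}

/* Remove the head node. The node is unlinked BEFORE it is freed, so
 * conn->head never points at freed (and possibly reallocated) memory. */
void conn_pop(struct connection *conn)
{
    struct node *old = conn->head;

    if (old == NULL)
        return;
    conn->head = old->next;   /* unlink first; becomes NULL on the last node */
    free(old->data->string);
    free(old->data);
    free(old);
}
```

If the node were freed without the unlink step, a later allocation could reuse that memory and `conn->head->data` would read whatever the new owner wrote there, which would match a value like `0x35`.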

2 Answers

You can have gdb automatically execute commands when a breakpoint is triggered, with Break Commands.

You could set up a Break Command to run whenever a struct connection is allocated, and have it add a watchpoint on the field of interest.
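
As a rough sketch of how that could look: the C code below adds an empty hook function that is called once per allocated connection, and the comment at the top shows a gdb command file that breaks on the hook and attaches a watchpoint. The struct layouts, the names `conn_new` and `conn_created`, and the file name `watch.gdb` are all hypothetical. It assumes a build with `-g -O0` so the hook is not optimized away.

```c
#include <stdlib.h>

/*
 * Hypothetical gdb command file, loaded with: gdb -x watch.gdb ./prog
 *
 *   break conn_created
 *   commands
 *     silent
 *     watch -l conn->head->data
 *     continue
 *   end
 *   run
 *
 * `watch -l` watches the address the expression evaluates to, so the
 * watchpoint stays valid after conn_created() returns.
 */

/* Hypothetical stand-ins for the asker's types; the real layouts are unknown. */
struct dbstring { char *string; size_t size; };
struct node { struct dbstring *data; struct node *next; };
struct connection { int fd; struct node *head; };

/* Empty hook: exists only to give gdb a stable place to break. */
void conn_created(struct connection *conn)
{
    (void)conn;
}

/* Allocate a connection with one empty head node, then call the hook.
 * Returns NULL if allocation fails. */
struct connection *conn_new(int fd)
{
    struct connection *conn = calloc(1, sizeof *conn);
    struct node *head = calloc(1, sizeof *head);

    if (conn == NULL || head == NULL) {
        free(conn);
        free(head);
        return NULL;
    }
    conn->fd = fd;
    conn->head = head;
    conn_created(conn);   /* the gdb breakpoint fires here */
    return conn;
}
```

One caveat: x86 hardware has only a handful of debug registers, so with thousands of connections gdb has to fall back to software watchpoints, which are far slower. Restricting the watch to a subset of connections may be more practical.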

caf
  • Exactly what I was looking for. I had actually already solved my bug by carefully reading over the code, so I added the bug back in and tried this. It worked! I'll keep it in mind if this pops up again. – charliehorse55 Oct 14 '12 at 08:37
  • Look into this post too: http://stackoverflow.com/questions/58851/can-i-set-a-breakpoint-on-memory-access-in-gdb. There are read and access watchpoints (rwatch, awatch) that can be set on a memory location. The program will run slower than normal, as GDB checks for memory access at each step. – Kamath Oct 14 '12 at 17:19

Would a stack backtrace help? Here is a page that tells how to do it.

How can one grab a stack trace in C?

Marichyasana