
We have a POSIX multi-threaded C++ program running on Linux 2.6.32 that core-dumps in one of its threads. Analysing the core file with a cross-compiled gdb-7.2, we see that the faulting instruction is:

0x11491178 <+208>:   lwz     r0,8(r9)

and registers in the frame show:

(gdb) info reg
r0             0x0      0
….
r9             0xdeaddead       3735936685

This makes sense, as r9 holds an invalid address (in fact, it is the heap scrub pattern we write) in the context of the process/thread.

The confusing bit is that r9 is loaded like this:

0x1149116c <+196>:   lwz     r9,0(r4)

and r4 contains the value of the (first and only) function parameter "data". GDB tells me the following about data:

(gdb) p data
$6 = (TextProcessorIF *) 0x4b3fe858

(gdb) p *data
$7 = {_vptr.TextProcessorIF = 0x128b5390}

(gdb) info symbol 0x128b5390
vtable for TextProcessorT<unsigned short> + 8 in section .rodata 

All of this is correct in this context. So r9 should have held 0x128b5390 rather than the pattern 0xdeaddead, which is written when memory is freed and returned to the heap.

So why does r9 contain the scrubbed value when the memory contains a valid object? My theory is that the core contains a snapshot of memory taken as the process died, which is well after the actual crash happened. After SIGSEGV has been raised, this region of heap memory can still be used by other threads, since they keep logging data until the process dies. So it is possible that the memory pointed to by data was allocated again and was in use (or had been used) by the time the memory snapshot was taken and preserved in the core.
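To make the suspected ordering concrete, here is a minimal single-threaded sketch in C++. It is only an illustration of the theory, not evidence for it: `OneSlotHeap`, `Snapshot` and `simulate` are hypothetical names, the one-slot "heap" stands in for the real allocator, and the steps that would happen on different threads are simply run in sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Scrub pattern written on free (the value observed in r9).
constexpr std::uint32_t kScrub = 0xdeaddeadu;

// Hypothetical one-slot heap standing in for the real allocator.
struct OneSlotHeap {
    alignas(std::uint32_t) unsigned char slot[16];

    // Store a value in the first word, where the vptr of an object would live.
    void store_first_word(std::uint32_t value) {
        std::memcpy(slot, &value, sizeof value);
    }

    // Read the first word, the way `lwz r9,0(r4)` reads it.
    std::uint32_t first_word() const {
        std::uint32_t word;
        std::memcpy(&word, slot, sizeof word);
        return word;
    }

    // Free with scrubbing: every word of the slot gets the pattern.
    void release() {
        for (std::size_t i = 0; i < sizeof slot; i += sizeof kScrub) {
            std::memcpy(slot + i, &kScrub, sizeof kScrub);
        }
    }
};

struct Snapshot {
    std::uint32_t reg_at_fault;  // value the faulting thread loaded into its register
    std::uint32_t mem_at_dump;   // value in memory when the core is finally written
};

// Runs the suspected sequence of events in order.
Snapshot simulate(std::uint32_t live_vptr) {
    assert(live_vptr != kScrub);

    OneSlotHeap heap;
    heap.store_first_word(live_vptr);               // 1. object is alive
    heap.release();                                 // 2. object is freed and scrubbed
    std::uint32_t reg = heap.first_word();          // 3. stale pointer is dereferenced
    heap.store_first_word(live_vptr);               // 4. slot is reused for a new object
    return Snapshot{reg, heap.first_word()};        // 5. core is written
}
```

Calling `simulate(0x128b5390u)` yields a register value of 0xdeaddead and a memory value of 0x128b5390, which is the same mismatch seen between `info reg` and `p *data`. The sketch says nothing about whether step 4 happened before or after the signal was raised in the real process.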

My questions are:
A) Is my theory correct?
B) Am I right in presuming that the heap memory snapshot is taken not at the time of the crash (when the signal is raised) but in the final moments of the process?
C) Can the address/location that caused the SIGSEGV still be used (by other threads)?

Thanks!

Jerry YY Rain
abhikaro
  • I believe your theory is incorrect (but I am not sure). Core dumps are related to *processes* not *threads*, even if `SIGSEGV` is (generated by and) emitted to the faulty thread. – Basile Starynkevitch Jul 25 '14 at 07:08
  • Thanks! The theory is about heap usage, which is common to all threads in the process. I am aware that dumps collect process-wide information. What I am postulating is that the heap is still being used by other threads in the process as they run their shutdown code, so what I get in the core is a heap whose contents may well have changed since the signal was raised. – abhikaro Jul 25 '14 at 07:16
  • Have you checked what values you are putting into the heap? It is possible that the prev and next addresses are being overwritten, causing your heap chunks to be misaligned. Dump the location of the crash with `x/64x ADDRESS-10` (adjust this to match your case). This might be the issue, as 0xdeaddead does not look like a proper memory address within the range of your application. – Phobos Aug 12 '14 at 23:52

1 Answer


Do you use signal handler(s) for SIGSEGV? Are they asynchronous and re-entrant?

See "How to write a signal handler to catch SIGSEGV?"
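For reference, a minimal sketch of a SIGSEGV handler that sticks to async-signal-safe calls and hands control back to the default action right away, so the process does not keep running for long before the core is written. The names `on_segv` and `install_segv_handler` are made up for this example, and whether your program installs any handler at all is an assumption.

```cpp
#include <cassert>
#include <csignal>
#include <signal.h>
#include <unistd.h>

// Hypothetical handler: only async-signal-safe functions are called here.
extern "C" void on_segv(int signo) {
    static const char msg[] = "SIGSEGV caught, restoring default action\n";

    // write(2) is async-signal-safe; printf, malloc and most logging code are not.
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;

    // Restore the default disposition and re-raise, so the default action
    // (terminate and dump core) runs once this handler returns.
    signal(signo, SIG_DFL);
    raise(signo);
}

// Installs the handler; returns true on success.
bool install_segv_handler() {
    struct sigaction sa = {};
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGSEGV, &sa, nullptr) == 0;
}
```

Call `install_segv_handler()` once at startup, before the worker threads are created. A handler that logs through the normal logging path or touches the heap instead can keep the process alive and mutating memory after the fault.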

Community
kinsu