When the program exits, glibc prints the following message:

*** glibc detected *** double free or corruption (!prev): 0x09a8fcb8 ***

It looks like a double free of a single object. I then used gdb to examine the core dump. Here is the backtrace (further frames are omitted):

#0 0x005197a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1 0x0055a825 in raise () from /lib/tls/libc.so.6
#2 0x0055c289 in abort () from /lib/tls/libc.so.6
#3 0x0058ecda in __libc_message () from /lib/tls/libc.so.6
#4 0x0059556f in _int_free () from /lib/tls/libc.so.6
#5 0x0059594a in free () from /lib/tls/libc.so.6
#6 0x00c0f001 in operator delete (ptr=0x0) at ../../../../gcc-4.2.2/libstdc++-v3/libsupc++/del_op.cc:49
#7 0x00bea48d in std::string::_Rep::_M_destroy (this=0x9a8fcb8, __a=@0xbfe134af)
  at /home/robert_bu/src/build_gcc-4.2.2/i686-pc-linux-gnu/libstdc++-v3/include/ext/new_allocator.h:97
#8 0x070807e8 in __tcf_0 () from ./../bin/../lib/librlxvm_kmmpv_ocp_tl2.so
#9 0x0055d5a7 in exit () from /lib/tls/libc.so.6
...

Valgrind then showed that the string is deleted from two different .so files (libkmm.so.2.0.0 and libpv.so.2.0.0). Detailed output (some lines are masked):

==28125== Invalid free() / delete / delete[]
==28125==    at 0x400588F: operator delete(void*) (vg_replace_malloc.c:387)
==28125==    by 0x446548C: std::string::_Rep::_M_destroy(std::allocator<char> const&) (new_allocator.h:97)
==28125==    by 0x55FA7E7: __tcf_0 (in /home/alan_tao/vm/test/lib/libkmm.so.2.0.0)
==28125==    by 0x55D5A6: exit (in /lib/tls/libc-2.3.4.so)
==28125==    by 0x42B10D9: stop_sim() (in /home/alan_tao/vm/test/lib/libcomm.so.2.0.0)
==28125==    by 0x807C83A: func_on_exit(int) (in /home/alan_tao/vm/test/bin/engine)
==28125==    by 0x55A917: ??? (in /lib/tls/libc-2.3.4.so)
...
==28125==  Address 0x4a484d0 is 0 bytes inside a block of size 525 free'd
==28125==    at 0x400588F: operator delete(void*) (vg_replace_malloc.c:387)
==28125==    by 0x446548C: std::string::_Rep::_M_destroy(std::allocator<char> const&) (new_allocator.h:97)
==28125==    by 0x650C0B7: __tcf_0 (in /home/alan_tao/vm/test/lib/libpv.so.2.0.0)
==28125==    by 0x55D5A6: exit (in /lib/tls/libc-2.3.4.so)
==28125==    by 0x42B10D9: stop_sim() (in /home/alan_tao/vm/test/lib/libcomm.so.2.0.0)
==28125==    by 0x807C83A: func_on_exit(int) (in /home/alan_tao/vm/test/bin/engine)
==28125==    by 0x55A917: ??? (in /lib/tls/libc-2.3.4.so)

...


The valgrind output shows that one string is deleted twice, but I cannot tell which static string it is. Does anyone know how to find out which string's deletion causes the error, and how to fix it? Thanks.

PS: The program runs on Linux 2.6.9, is built with GCC 4.2.2, and uses shared libraries (.so files).

Update: Listing the faulting library in gdb with the command "l __tcf_0" shows the following code:

inline std::vector<const char*>& get_phase_name_vec(){
  static std::vector<const char*> phase_name_vec(END_RESP+1, (const char*)NULL);
  return phase_name_vec;
}

This comes from the OSCI TLM header files, which the libraries above have to include. It lives in a separate namespace, "tlm". Any idea how to fix this error?

Alan
  • Peek into the app's memory to see what this string is, it may still be there, not fully overwritten. Experiment. Add logging to your app. Be creative. – Alexey Frunze Dec 12 '11 at 09:55
  • Are you able to show us a code, where you think the app crashes? – maverik Dec 12 '11 at 09:59
  • Thanks! I have tried this. The pointer value at 0x9a8fcb8 is 0 – Alan Dec 12 '11 at 10:02
  • There are many static strings stored in different classes, as well as individual shared const strings such as: const ::std::string ARR_I2S[33]= { "0" ,"1" ,"2" ,"3" ,"4" ,"5" ,"6" ,"7" ,"8" ,"9" ,"10","11","12","13","14","15", "16","17","18","19","20","21","22","23","24","25","26","27","28","29","30","31","" }; – Alan Dec 12 '11 at 10:06
  • Can you post a minimal sample where the problem reproduces? It's pointless to assume what your code looks like. – INS Dec 12 '11 at 10:36

1 Answer

The problem is solved. There was a global variable name collision between libkmm.so.2.0.0's source and the TLM header files.
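
To illustrate this kind of collision, here is a minimal sketch. It assumes the clash was a namespace-scope variable with external linkage defined under the same name in two shared libraries. In that situation the dynamic linker can bind both definitions to a single object, each library registers its own destructor for it, and the same std::string ends up being freed twice at exit. The names g_private_name and kmm::module_name are hypothetical; the real variable is not shown in this post.

```cpp
#include <string>

// All names below are hypothetical and only illustrate the pattern.
//
// Problematic pattern, if the same definition is compiled into two
// shared libraries:
//   std::string g_name = "kmm";   // external linkage, one shared symbol
//
// Option 1: an unnamed namespace gives the variable internal linkage,
// so each library constructs and destroys its own private object.
namespace {
std::string g_private_name = "kmm";
}

// Option 2: a library-specific namespace keeps external linkage but
// makes the symbol name unique, so it cannot clash with another library.
namespace kmm {
std::string module_name = "kmm";
}

// Accessor for the internal-linkage variable, usable within this
// translation unit only.
const std::string& private_name() { return g_private_name; }
```

Either option removes the shared symbol, so only one destructor runs per object. Renaming the variable would work as well; which choice fits best depends on whether other translation units need to see it.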

It seems that "l __tcf_0" did not show the right variable. Thanks to Alex, maverik, Iulian Şerbănoiu and everyone else who read this question.
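
Since the debugger listing can point at the wrong variable, one way to narrow down a suspect is to log object addresses from each library, along the lines of the logging suggested in the comments. The following is only a sketch; describe_string is a hypothetical helper and not part of any library used here.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: formats the address of a std::string object and
// of its character buffer, so the values can be compared between
// libraries and against the address in the glibc or valgrind report.
inline std::string describe_string(const char* label, const std::string& s) {
    std::ostringstream out;
    out << label
        << " object=" << static_cast<const void*>(&s)
        << " data=" << static_cast<const void*>(s.data())
        << " size=" << s.size();
    return out.str();
}
```

Calling it for each suspect global from every library, for example `std::cerr << describe_string("ARR_I2S[0]", ARR_I2S[0]) << '\n';`, makes a collision visible: if two libraries print the same object address for what should be separate variables, they are sharing one object. With the reference-counted std::string in GCC 4.2, the freed heap block begins a small header before the data address, so the data value should sit slightly above the address reported by valgrind rather than match it exactly.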

Alan