Background

While moving to a newer version of the CC compiler, a segfault appeared in a module that used to work.

Observations so far

  1. From the core file I could tell which function the segfault originated in. Inspecting that function, I could not find anything suspicious.

  2. The first major problem is that the segfault reproduces only when compiling in "release" mode (optimizations turned on) and not in "debug" mode. It also doesn't reproduce with g++.

  3. I then started adding print statements, and a bigger problem arose: when I added cout/printf calls at certain lines (to binary-search for the crashing line or to print pointer values), the segfault no longer reproduced. Moreover, I added a cout at one line that preserved the segfault, which supposedly means the segfault happens before that line. Yet commenting out lines after that line made the segfault go away.

To me, this screams memory corruption (specifically of the stack), but I have no idea how to make progress on this without looking at the generated assembly.

Any ideas? Thanks in advance.

I'm working on SunOS_5.10_Studio_12_5.12_64, CC version "Sun C++ 5.12 SunOS_sparc 2011/11/16"

More details in response to comments

  1. The code is single-threaded.
  2. Valgrind is not available on Solaris, so it's not an option.
infokiller
  • Is the code multi-threaded? Have you tried it with another compiler on another platform? One with better warnings and higher conformance than the SunOS compiler? – pmr Jul 25 '12 at 15:47
  • You should investigate if Valgrind is available for your platform. It runs on most Unix-like systems. – Graham Borland Jul 25 '12 at 15:48
  • Is all the code built with the same compiler flags, i.e. is there any way the old version is getting linked in? – rerun Jul 25 '12 at 15:50
  • @pmr- Yes, with g++ on Linux. No special warnings. – infokiller Jul 25 '12 at 15:50
  • Quick note: cout is buffered IO, so you should use cerr to get a more timely printout. cout may well have your printout sitting in the buffer (not yet visible on screen) before the crash, while cerr would put it straight to the screen before the crash. You can even add a pthread_yield after the cerr for complete certainty. Though if this is multithreaded and this is a timing issue, this will all change the timing :/ – John Humphreys Jul 25 '12 at 15:55
  • If it happens only in release mode, the first thing I would check are uninitialized variables. – João Augusto Jul 25 '12 at 16:02
  • @w00te I use cout with endl (which flushes the buffer) – infokiller Jul 25 '12 at 16:03
  • @JoãoAugusto there are no uninitialized variables (none that lack a ctor, and most of them are strings/vectors from std) – infokiller Jul 25 '12 at 16:08
  • If you can extract a call stack from your core dump, I would definitely also look at the callers of the crashing function. And if you do not have the specific line of code where it's crashing, I would chase that too, by trying to set up an optimised build that also has debug information, so you can zero in on the crashing site. Last, if that does not work, I would try to break up the crashing function into progressively smaller chunks, and chase the one that keeps crashing. Getting closer to the exact crashing site will hopefully give you better insight into what may be occurring. – Nicolas Tisserand Apr 09 '18 at 23:15
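
The buffering point raised in the comments can be sketched as follows. This is a minimal illustration, not code from the question: `flushes_each_write`, `trace_point`, and the `TRACE` macro are hypothetical names. It relies on the standard guarantee that `std::cerr` starts with `unitbuf` set (flushed after every output operation), while `std::cout` does not.

```cpp
#include <iostream>
#include <ostream>
#include <sstream>

// Returns true if the stream is flushed after every output operation.
inline bool flushes_each_write(const std::ostream& os) {
    return (os.flags() & std::ios_base::unitbuf) == std::ios_base::unitbuf;
}

// Hypothetical trace helper: composes the message first, hands it to the
// stream in one insertion, then flushes explicitly so nothing stays in the
// stream buffer if the process crashes on the next statement.
inline void trace_point(std::ostream& os, const char* file, int line) {
    std::ostringstream msg;
    msg << "reached " << file << ':' << line << '\n';
    os << msg.str();
    os.flush();
}

// Convenience macro (hypothetical): marks the current source location on cerr.
#define TRACE() trace_point(std::cerr, __FILE__, __LINE__)
```

Note that `cout << ... << endl`, as used by the asker, also flushes, so buffering alone probably does not explain the missing output. Any added call can still change how the optimizer lays out the surrounding code, which is consistent with the crash disappearing when prints are added.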

1 Answer

You should use a memory debugger/profiler like valgrind. It will quickly tell you the location of corruption. On Solaris you can try libumem.
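
When no memory checker is at hand, one manual way to test the stack-corruption hypothesis from the question is to bracket a suspect buffer with guard words and verify them before the function returns. The sketch below is not part of the original answer: `GuardedBuffer`, `kGuard`, `init_guarded`, `guards_intact`, and `simulate_write` are hypothetical names, and the overrun is simulated through an `unsigned char*` view of the whole object so that the example itself stays well-defined.

```cpp
#include <cstddef>
#include <cstring>

// Arbitrary, easily recognisable guard pattern (an assumption of this sketch).
const unsigned int kGuard = 0xDEADBEEFu;

// A suspect buffer bracketed by guard words.
struct GuardedBuffer {
    unsigned int head;  // guard word placed before the buffer
    char data[16];      // stands in for the buffer under suspicion
    unsigned int tail;  // guard word placed after the buffer
};

// Arms both guards and clears the payload.
inline void init_guarded(GuardedBuffer& b) {
    b.head = kGuard;
    b.tail = kGuard;
    std::memset(b.data, 0, sizeof b.data);
}

// True while neither guard word has been overwritten.
inline bool guards_intact(const GuardedBuffer& b) {
    return b.head == kGuard && b.tail == kGuard;
}

// Simulates code that writes n bytes starting at data[0] without checking
// the buffer size. The write is clamped to the enclosing object, so the
// demonstration never touches memory outside GuardedBuffer.
inline void simulate_write(GuardedBuffer& b, std::size_t n) {
    unsigned char* bytes = reinterpret_cast<unsigned char*>(&b);
    const std::size_t start = offsetof(GuardedBuffer, data);
    for (std::size_t i = 0; i < n && start + i < sizeof(GuardedBuffer); ++i) {
        bytes[start + i] = 0x41;
    }
}
```

In a real function, the check would go just before each return, reporting to `cerr` when `guards_intact` fails. Adding guards changes the stack layout, so an intact guard does not rule out corruption elsewhere.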

perreal
  • @Graham Borland and perreal: valgrind is not available for Solaris: http://valgrind.org/info/platforms.html – infokiller Jul 25 '12 at 16:01
  • You can use the dbx memcheck module. That comes with the Sun Workshop. – Mark B Jul 25 '12 at 16:01
  • @JohnnyW, there is libumem for Solaris – perreal Jul 25 '12 at 16:19
  • http://stackoverflow.com/questions/1881343/locate-bad-memory-access-on-solaris has some of the memory checkers that are currently available for Solaris. And while it's not integrated upstream yet, a student did work on a valgrind port as part of his thesis: https://dip.felk.cvut.cz/browse/pdfcache/pavlupe1_2012dipl.pdf (code at https://bitbucket.org/setupji/valgrind-solaris ) – alanc Jul 30 '12 at 23:15