
I'm developing software on an embedded platform and keep getting inexplicable (to me) segmentation faults. I was hoping to get some debugging ideas from those of you with more embedded platform experience. I couldn't find any useful information with a Google search.

Details:

  • C++ compiled with GCC-ARM toolchain (4.9.3)
  • ARM Cortex-M3 processor (on an LPC1768 dev board, if you're curious)
  • I can prevent the segmentation fault by modifying the build order of the source files (i.e., file order in the Makefile). This file order is essentially arbitrary.
  • The segmentation fault always occurs inside a class constructor while an object is being instantiated at program startup (before main() is reached).
  • If I comment out the code in the given class constructor where the segfault occurs, the segfault will occur in the constructor of some other class object instantiation (of a different class, of course).

I'm at a loss. It looks like the object instantiation is writing over other memory to cause the segfault, but shouldn't the kernel prevent that? I'm not writing directly to memory here, I'm just doing a normal object instantiation.

My guess: I believe I've read that ARM-based architectures put both the ROM and the RAM on the same flash memory block. Changing the file order in the Makefile changes the order of the objects in the ROM. Upon startup, the aforementioned object instantiation in the RAM block inadvertently overwrites some ROM memory. With one source file order, the overwritten memory doesn't matter and therefore doesn't trigger the segfault; with the other, it does matter and does trigger the segfault.

That guess may reveal how little I know about how hard fault handlers work. Please forgive my naiveté with embedded platforms.

Any thoughts on what I might investigate? Is this sort of issue characteristic of a particular source or Makefile error?

Here are a couple of examples of the segfaults:

```
Program received signal SIGSEGV, Segmentation fault.
0x00003cde in HardFault_Handler ()
(gdb) where
#0  0x00003cde in HardFault_Handler ()
#1  <signal handler called>
#2  dataComm::dataComm (this=0x10000218 <dc>) at dataComm/dataComm.cpp:12
#3  0x000002e6 in __static_initialization_and_destruction_0 (
    __initialize_p=1, __priority=65535) at main.cpp:22
#4  _GLOBAL__sub_I_dataIn () at main.cpp:87
#5  0x00006b32 in __libc_init_array ()
#6  0x0000016e in _start ()
#7  0x0000016e in _start ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
```

and

```
Program received signal SIGSEGV, Segmentation fault.
0x00003586 in HardFault_Handler ()
(gdb) where
#0  0x00003586 in HardFault_Handler ()
#1  <signal handler called>
#2  0x00004f94 in spi_format ()
#3  0x000044d2 in mbed::SPI::aquire() ()
#4  0x0000205e in FSM::FSM (this=0x1000070c <fsm>)
    at FiniteStateMachine/FSM.cpp:54
#5  0x0000097c in __static_initialization_and_destruction_0 (
    __initialize_p=1, __priority=65535) at initExoVars/initExoVars.cpp:37
#6  _GLOBAL__sub_I_txtLeft () at initExoVars/initExoVars.cpp:215
#7  0x0000606a in __libc_init_array ()
#8  0x0000016e in _start ()
#9  0x0000016e in _start ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
```
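
One thing that may be worth investigating alongside the backtraces: on a Cortex-M3 the cause of a hard fault is usually recorded in the Configurable Fault Status Register (CFSR, at 0xE000ED28), which can be read from gdb with `x/wx 0xE000ED28`. The following is only a minimal host-side sketch for turning such a value into flag names. The bit positions are taken from the ARMv7-M register layout; `FaultBit`, `kFaultBits`, and `decodeCfsr` are hypothetical names made up for illustration and are not part of the project or of mbed.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// One named flag in the CFSR. Hypothetical helper type for this sketch.
struct FaultBit {
    unsigned bit;
    const char* name;
};

// Subset of CFSR flags defined for ARMv7-M (Cortex-M3).
// Bits 0-7: MemManage status, bits 8-15: BusFault status,
// bits 16-31: UsageFault status.
static const FaultBit kFaultBits[] = {
    {0, "IACCVIOL"},   {1, "DACCVIOL"},  {3, "MUNSTKERR"},
    {4, "MSTKERR"},    {7, "MMARVALID"},
    {8, "IBUSERR"},    {9, "PRECISERR"}, {10, "IMPRECISERR"},
    {11, "UNSTKERR"},  {12, "STKERR"},   {15, "BFARVALID"},
    {16, "UNDEFINSTR"}, {17, "INVSTATE"}, {18, "INVPC"},
    {19, "NOCP"},      {24, "UNALIGNED"}, {25, "DIVBYZERO"},
};

// Returns the names of all flags set in a raw CFSR value,
// in ascending bit order.
std::vector<std::string> decodeCfsr(std::uint32_t cfsr) {
    std::vector<std::string> names;
    for (const FaultBit& fb : kFaultBits) {
        if (cfsr & (1u << fb.bit)) {
            names.push_back(fb.name);
        }
    }
    return names;
}
```

For example, passing the value printed by gdb to `decodeCfsr` would list which fault flags were set. If BFARVALID or MMARVALID is among them, the faulting address should be available in BFAR (0xE000ED38) or MMFAR (0xE000ED34) respectively.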
  • https://isocpp.org/wiki/faq/ctors#static-init-order In addition: http://stackoverflow.com/questions/3035422/static-initialization-order-fiasco – PaulMcKenzie Jul 25 '15 at 02:40
  • It doesn't look like Make's fault. It looks as if the visible error occurs in one place in the code, when the Bad Thing has already happened -- silently -- somewhere else, which is often a sign of heap corruption. Any chance you could give us a [minimal complete example](http://stackoverflow.com/help/mcve)? – Beta Jul 25 '15 at 03:26
  • The order of source files inside the Makefile shouldn't affect program execution, and the compilation order is largely determined by the dependency tree that make constructs when parsing the Makefile. I suggest checking all global-scope static declarations across compilation units to see whether their initialization order has an impact. Try to eliminate global-scope static declarations. – simon Jul 25 '15 at 05:39
  • You're on a Cortex-M3 - you don't have an MMU and there is no guarantee the kernel (if anything really fitting that description exists) is enforcing any memory protection using the MPU. So your link order may well affect what code gets overwritten via a stray pointer. – unixsmurf Jul 26 '15 at 10:23
  • It sounds like you don't have enough memory to store the .data section. Switching the order of the files moves some initializers ahead of others that don't cause a segmentation fault or aren't used at startup. – LPs Jul 27 '15 at 06:21
  • @PaulMcKenzie After further code review, all signs indeed point to SIOF. There are build warnings that mention initialization order (not sure why I neglected to examine those earlier), static objects constructed referring to other static objects, and of course the segfault is within the static initializer. I'm actually a consultant rather than a developer on the project (literally none of the code is mine), but I'll pass this info on to the developers and help them (many novice programmers) address it. If SIOF is indeed the issue I'll post an "answer" here for future reference. Thanks! – cydonian Jul 27 '15 at 20:38
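
For future readers, here is a minimal sketch of the construct-on-first-use idiom described in the isocpp FAQ linked in the comments, which is one common way to address static objects that refer to other static objects. It is not the project's code: `Bus`, `Device`, `bus()`, and `device()` are hypothetical stand-ins for the real globals (such as `dc` and `fsm` in the backtraces), and the real classes would of course do more in their constructors.

```cpp
// Hypothetical dependency, standing in for something like a shared peripheral.
struct Bus {
    Bus() : ready(true) {}
    bool ready;
};

// Construct on first use: the function-local static is built the first
// time bus() is called, so no caller can see an unconstructed Bus,
// regardless of the order in which object files are linked.
Bus& bus() {
    static Bus instance;
    return instance;
}

// Hypothetical object whose constructor depends on Bus.
struct Device {
    // Calling bus() here forces Bus to be constructed first.
    Device() : sawReadyBus(bus().ready) {}
    bool sawReadyBus;
};

// Same pattern for Device, so its own users get the same guarantee.
Device& device() {
    static Device instance;
    return instance;
}
```

Code that previously touched a namespace-scope object would call `bus()` or `device()` instead. One caveat to check on a bare-metal target: GCC guards function-local statics with thread-safety helpers by default, which some embedded builds disable with `-fno-threadsafe-statics`. Whether that matters depends on the toolchain and runtime setup. Another option, not shown here, is to construct the objects explicitly at the start of main() in a known order.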
