
I have a C++ program that uses a shared C library (namely Darknet) to load and make use of lightweight neural networks.

The program runs flawlessly under Ubuntu Trusty on an x86_64 box, but crashes with a segmentation fault under the same OS on an ARM device. The cause of the crash is that calloc returns NULL when allocating memory for an array. The code looks like this:

l.filters = calloc(c * n * size * size, sizeof(float));
...
for (i = 0; i < c * n * size * size; ++i)
    l.filters[i] = scale * rand_uniform(-1, 1);

So the application segfaults as soon as it tries to write the first element.
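To narrow this down, the allocation could be wrapped so that a failure is reported instead of dereferenced. The following is only a sketch: `mul_size` and `alloc_filters` are hypothetical helpers, not part of Darknet, and it assumes `c`, `n` and `size` are plain `int` values. It computes the element count in `size_t` with an overflow check and prints `errno` if calloc returns NULL.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not in Darknet): store a*b in *out.
 * Returns 1 on success, 0 if the product would overflow size_t. */
static int mul_size(size_t a, size_t b, size_t *out)
{
    if (a != 0 && b > SIZE_MAX / a)
        return 0;
    *out = a * b;
    return 1;
}

/* Hypothetical helper (not in Darknet): allocate c*n*size*size zeroed
 * floats. On failure it prints the reason to stderr and returns NULL. */
static float *alloc_filters(int c, int n, int size)
{
    size_t count;
    float *filters;

    if (c < 0 || n < 0 || size < 0) {
        fprintf(stderr, "negative dimension: c=%d n=%d size=%d\n", c, n, size);
        return NULL;
    }
    if (!mul_size((size_t)c, (size_t)n, &count) ||
        !mul_size(count, (size_t)size, &count) ||
        !mul_size(count, (size_t)size, &count)) {
        fprintf(stderr, "element count overflows size_t\n");
        return NULL;
    }

    errno = 0;
    filters = calloc(count, sizeof(float));
    if (filters == NULL) {
        /* Note: calloc may also return NULL for a zero-sized request. */
        fprintf(stderr, "calloc(%zu, %zu) failed: %s\n",
                count, sizeof(float), strerror(errno));
    }
    return filters;
}
```

Calling `alloc_filters(c, n, size)` in place of the bare calloc and checking the result for NULL before the initialization loop would at least show which values reach the allocator and what `errno` says when it fails.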

In my case the amount of memory to be allocated is 4.7 MB, while more than 1 GB is available. I also tried running it right after a reboot to rule out heap fragmentation, but got the same result.

What is more interesting, when I try to load a larger network, it works just fine, even though the two networks have the same configuration for the layer where the crash happens...

Valgrind tells me nothing new:

==2591== Invalid write of size 4
==2591==    at 0x40C70: make_convolutional_layer (convolutional_layer.c:135)
==2591==    by 0x2C0DF: parse_convolutional (parser.c:159)
==2591==    by 0x2D7EB: parse_network_cfg (parser.c:493)
==2591==    by 0xBE4D: main (annotation.cpp:58)
==2591==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==2591==
==2591==
==2591== Process terminating with default action of signal 11 (SIGSEGV)
==2591==  Access not within mapped region at address 0x0
==2591==    at 0x40C70: make_convolutional_layer (convolutional_layer.c:135)
==2591==    by 0x2C0DF: parse_convolutional (parser.c:159)
==2591==    by 0x2D7EB: parse_network_cfg (parser.c:493)
==2591==    by 0xBE4D: main (annotation.cpp:58)
==2591==  If you believe this happened as a result of a stack
==2591==  overflow in your program's main thread (unlikely but
==2591==  possible), you can try to increase the size of the
==2591==  main thread stack using the --main-stacksize= flag.
==2591==  The main thread stack size used in this run was 4294967295.
==2591==
==2591== HEAP SUMMARY:
==2591==     in use at exit: 1,731,358,649 bytes in 2,164 blocks
==2591==   total heap usage: 12,981 allocs, 10,817 frees, 9,996,704,911 bytes allocated
==2591==
==2591== LEAK SUMMARY:
==2591==    definitely lost: 16,645 bytes in 21 blocks
==2591==    indirectly lost: 529,234 bytes in 236 blocks
==2591==      possibly lost: 1,729,206,304 bytes in 232 blocks
==2591==    still reachable: 1,606,466 bytes in 1,675 blocks
==2591==         suppressed: 0 bytes in 0 blocks
==2591== Rerun with --leak-check=full to see details of leaked memory
==2591==
==2591== For counts of detected and suppressed errors, rerun with: -v
==2591== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 402 from 8)
Killed

I am really confused about what the cause might be. Could anybody help me?

Dmytro Prylipko
  • What are the values of `c`, `n` and `size` ? What is the type of `i`? – Eugene Sh. Apr 07 '16 at 20:54
  • And what about `in use at exit: 1,731,358,649 bytes in 2,164 blocks`? – LogicStuff Apr 07 '16 at 20:57
  • Perhaps you have memory leaks causing allocations to fail? valgrind shows a lot of allocated but not freed memory. Print `errno` when the allocation fails. – kaylum Apr 07 '16 at 20:57
  • I haven't read through all of this but maybe there are some clues here: http://stackoverflow.com/questions/19868584/does-linux-malloc-behave-differently-on-arm-vs-x86 – yano Apr 07 '16 at 21:50
  • Is the problem specific to calloc(), or does it also occur if you replace the call to calloc() with a function that calls malloc(), then manually clears the memory buffer before returning it? (actually this sounds like maybe your heap is getting corrupted somehow) – Jeremy Friesner Apr 07 '16 at 22:26
  • Given that heap size, I'd guess it may simply have run out of room to expand (between other things dotted about the address space) - does the same thing really run "flawlessly" on x86, or is it just that the 64-bit address space gives it enough room to merrily leak until execution completes? – Notlikethat Apr 07 '16 at 22:27
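
The experiment suggested in the comments, swapping calloc() for malloc() plus a manual clear, could be tried with something like the sketch below. `zeroed_malloc` is a hypothetical name used only for illustration; it is not a Darknet or libc function. If this version also returns NULL, the problem is probably not specific to calloc().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for calloc(): allocate nmemb*size bytes with
 * malloc() and clear them by hand. Returns NULL on overflow or failure. */
static void *zeroed_malloc(size_t nmemb, size_t size)
{
    size_t bytes;
    void *p;

    /* Reject requests whose byte count would overflow size_t. */
    if (size != 0 && nmemb > (size_t)-1 / size)
        return NULL;
    bytes = nmemb * size;

    /* Request at least one byte so a zero-sized call still yields a
     * pointer that can be passed to free(). */
    p = malloc(bytes ? bytes : 1);
    if (p != NULL)
        memset(p, 0, bytes);
    return p;
}
```

It takes the same arguments as calloc(), so the call in the question would become `l.filters = zeroed_malloc(c * n * size * size, sizeof(float));`.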
