I need to allocate 16-byte-aligned memory, and from what I can tell the accepted way to do so is posix_memalign(), going by the man page (other Stack Overflow questions suggest the same). The code below is simplified to exclude unrelated parts (i.e. other platforms), but I have kept some context (sse is just a namespace):

#include <malloc.h>
#include <stdlib.h>
float* sse::alloc(unsigned int count)
{
    void* p;
    int r = posix_memalign(&p,16,sizeof(float)*count);
    if ( r == 0 )
        return (float*)p;
    /* else output error and exit(1) - has never failed */
    else exit(1);
}

void sse::free(float* p)
{
    free(p);
}

The code that uses it is pretty self-explanatory:

int main(int argc, char* argv[])
{
    const unsigned int total = 16000;
    float *array = sse::alloc(total), *arr2 = sse::alloc(total);
    /* null ptr checks */
    // ...
    sse::free(array); sse::free(arr2);
    return 0;
}

I have commented out all non-essential code to test this, and have confirmed that it does indeed 'hang' on free(p); - when I Ctrl-C in gdb it reports the line. The behaviour is no different in valgrind, and the SSE code (using Intel intrinsics) (in place of ...) runs successfully. I have been compiling with fairly standard options: -g -O3 -std=c++11, and have tried no/less optimisation, no debug, and some unnecessary casts. Some information about my system (please ask if you would like more):

  • uname -a: Linux (name) 3.12.3-1-ARCH #1 SMP PREEMPT (date) x86_64 GNU/Linux
  • g++ --version: 4.8.2
  • gdb --version: 7.6.1
  • valgrind --version: 3.9.0

Since the man pages clearly state that free() is the correct function, I am incredibly stumped, and I would prefer to avoid writing a mechanism to use new/delete and padding by 15 bytes (for obvious reasons). If there is an alternative that I am unaware of, I am happy to try that. Also, information about potential causes of such a hang could prove useful, as it is particularly difficult to search for some of these terms (still easier than searching for 'stack overflow').

P̲̳x͓L̳
Neofish

1 Answer

You need to call the global free. Change your free routine to:

void sse::free(float* p)
{
    ::free(p);
}

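Because the call is made inside namespace sse, unqualified name lookup finds sse::free before the global free, so free(p) calls itself recursively. The :: prefix forces the call to the C library's free.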
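
For reference, here is a minimal, self-contained sketch of the allocator with the fix applied. It assumes a POSIX system. Returning nullptr on failure (instead of the question's exit(1)), taking the count as std::size_t, and the is_aligned16 helper are illustrative changes of mine, not part of the original code.

```cpp
#include <cstddef>   // std::size_t
#include <cstdint>   // std::uintptr_t
#include <stdlib.h>  // posix_memalign (POSIX) and the global ::free

namespace sse {

// Returns 16-byte-aligned storage for `count` floats, or nullptr on failure.
// (Assumption: the caller checks for nullptr; the question's code exits instead.)
float* alloc(std::size_t count)
{
    void* p = nullptr;
    if (posix_memalign(&p, 16, sizeof(float) * count) != 0)
        return nullptr;
    return static_cast<float*>(p);
}

// Releases memory obtained from sse::alloc.
// The leading :: selects the C library's free; a plain free(p) here
// would resolve to sse::free and recurse forever.
void free(float* p)
{
    ::free(p);
}

// Hypothetical helper, only used to check the alignment guarantee.
bool is_aligned16(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

} // namespace sse
```

Calling sse::alloc(16000) and then sse::free on the result should now return immediately rather than hang.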

Anya Shenanigans
  • Ouch, that has to sting. – Casey Jan 13 '14 at 23:47
  • Omg I can't believe I missed that.. I'm so ashamed. Thanks. – Neofish Jan 13 '14 at 23:47
  • @Neofish If you'd compiled *without* optimizations, the stack overflow would have been more evident; however, the compiler decided to helpfully "optimize" the infinite recursion into an infinite loop instead. – Adam Rosenfield Jan 13 '14 at 23:51
  • @Adam Useful to know, although it seems to make that optimisation anyway (might be my CFLAGS/etc). Can readily make it overflow by changing to a pointer-reference, or better, a C-style cast to void* raises a compile-time type error, which would make it a bit more clear. – Neofish Jan 14 '14 at 00:01
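
The lookup rule behind both the hang and the compile-time error mentioned in the last comment can be seen in isolation. In this sketch every name (tag, demo, unqualified, qualified) is hypothetical; it only illustrates that a function declared in a namespace hides a global function of the same name for unqualified calls made inside that namespace.

```cpp
// Hypothetical names throughout; only the lookup rule is the point.
int tag() { return 0; }               // global function

namespace demo {

int tag() { return 1; }               // hides ::tag inside namespace demo

int unqualified() { return tag(); }   // lookup stops at demo::tag
int qualified()   { return ::tag(); } // explicitly selects the global one

} // namespace demo
```

This is also why casting the argument to void* inside sse::free produces a type error: the only candidate the unqualified call sees is sse::free(float*), and void* does not convert implicitly to float* in C++.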