I need to allocate 16-byte-aligned memory, and from what I can tell the accepted way to do so is posix_memalign(), going by the man page (other Stack Overflow questions indicate the same). The code below is simplified to exclude unrelated parts (i.e. other platforms), but I have kept some context (sse is just a namespace):
    #include <malloc.h>
    #include <stdlib.h>

    float* sse::alloc(unsigned int count)
    {
        void* p;
        int r = posix_memalign(&p,16,sizeof(float)*count);
        if ( r == 0 )
            return (float*)p;
        /* else output error and exit(1) - has never failed */
        else exit(1);
    }

    void sse::free(float* p)
    {
        free(p);
    }
The code that uses it is pretty self-explanatory:
    int main(int argc, char* argv[])
    {
        const unsigned int total = 16000;
        float *array = sse::alloc(total), *arr2 = sse::alloc(total);
        /* null ptr checks */
        // ...
        sse::free(array); sse::free(arr2);
        return 0;
    }
I have commented out all non-essential code to test this, and have confirmed that it does indeed 'hang' on free(p); when I press Ctrl-C in gdb, it reports that line. The behaviour is no different in valgrind, and the SSE code (using Intel intrinsics) that goes in place of the ... runs successfully. I have been compiling with fairly standard options, -g -O3 -std=c++11, and have tried less or no optimisation, no debug information, and some unnecessary casts. Some information about my system (please ask if you would like more):
- uname -a: Linux (name) 3.12.3-1-ARCH #1 SMP PREEMPT (date) x86_64 GNU/Linux
- g++ --version: 4.8.2
- gdb --version: 7.6.1
- valgrind --version: 3.9.0
Since the man pages clearly state that free() is the correct function for releasing this memory, I am stumped, and I would prefer to avoid writing a mechanism that uses new/delete and pads by 15 bytes (for obvious reasons). If there is an alternative that I am unaware of, I am happy to try it. Information about potential causes of such a hang would also be useful, as it is particularly difficult to search for some of these terms (still easier than searching for 'stack overflow').
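
One potential cause worth ruling out, assuming the definitions sit at namespace scope exactly as shown: inside sse::free, the unqualified call free(p) is looked up in namespace sse first and finds sse::free itself, so the function would call itself rather than the C library's free(). At -O3 that tail call can be compiled into a plain loop, which would look like a hang rather than a crash. Below is a minimal sketch that qualifies the call as ::free; the use of std::size_t and static_cast is my own substitution, not part of the original code.

```cpp
#include <cstddef>
#include <stdlib.h>  // posix_memalign (POSIX) and the global ::free

namespace sse {

// Allocate `count` floats on a 16-byte boundary; exits on failure,
// mirroring the error handling in the original snippet.
float* alloc(std::size_t count)
{
    void* p = nullptr;
    if (posix_memalign(&p, 16, sizeof(float) * count) != 0)
        exit(1);
    return static_cast<float*>(p);
}

// Release memory obtained from sse::alloc.
void free(float* p)
{
    // The leading :: selects the global C free(). An unqualified
    // free(p) here would resolve to sse::free and recurse forever.
    ::free(p);
}

} // namespace sse
```

Renaming the wrapper (for example to a hypothetical sse::release) would avoid the name clash altogether.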