With transparent hugepages, simply allocating a 4M-aligned buffer will work. Use aligned_alloc or posix_memalign to get a pointer you can free. (Note that aligned_alloc is required to fail if the buffer size isn't a multiple of the alignment. /facepalm).
Depending on your setting for /sys/kernel/mm/transparent_hugepage/defrag, you may need to use madvise(MADV_HUGEPAGE) on the buffer to strongly encourage the kernel to use hugepages.
Also note that x86-64 uses 2M hugepages. x86-32 uses 4M hugepages. Aligning to 4M is fine if you want the easy solution for both.
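As a rough sketch of the allocation described above (not the only way to do it): round the size up to a multiple of the alignment so aligned_alloc is happy, then hint the kernel with madvise. The names HUGE_ALIGN, round_up_size and alloc_thp_buffer are made up for illustration, and this assumes Linux with glibc.

```c
#define _DEFAULT_SOURCE          /* expose madvise() and MADV_HUGEPAGE on glibc */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

/* 4M covers both the 2M (x86-64) and 4M (x86-32) hugepage sizes. */
#define HUGE_ALIGN ((size_t)4 << 20)

/* Hypothetical helper: round n up to the next multiple of align. */
static size_t round_up_size(size_t n, size_t align)
{
    assert(align != 0);
    return (n + align - 1) / align * align;
}

/* Hypothetical helper: returns a 4M-aligned buffer you can pass to free(),
 * or NULL on allocation failure. */
static void *alloc_thp_buffer(size_t bytes)
{
    size_t size = round_up_size(bytes, HUGE_ALIGN);
    void *buf = aligned_alloc(HUGE_ALIGN, size);
    if (buf == NULL)
        return NULL;

#ifdef MADV_HUGEPAGE
    /* Only a hint: if the kernel refuses, the buffer is still usable
     * with normal pages, so just report the error and carry on. */
    if (madvise(buf, size, MADV_HUGEPAGE) != 0)
        perror("madvise(MADV_HUGEPAGE)");
#endif
    return buf;
}
```

Whether the kernel actually backs the buffer with hugepages still depends on the transparent_hugepage settings and on memory fragmentation at the time the pages are first touched.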
request the memory to be either uncacheable or write-through?
AFAIK, you can't easily do that through normal Linux APIs. NT (non-temporal) stores work on normal write-back memory, so use those instead. (They override the memory type: they're weakly-ordered and bypass the cache.)
But if you're not writing full cache lines at a time, you definitely want cached writes. Especially if there's any spatial or temporal locality, but even if there isn't, letting the store buffer do its job (hiding the latency of cache-miss stores) is a good thing.
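For the case where you are writing full cache lines, here's a minimal sketch of NT stores using the SSE2 intrinsics _mm_stream_si128 and _mm_sfence. The function name fill_nt is made up for illustration, and it assumes x86 with SSE2, a 64-byte cache line, a 64-byte-aligned destination, and a size that's a multiple of 64.

```c
#include <assert.h>
#include <emmintrin.h>   /* SSE2: _mm_stream_si128, _mm_set1_epi8, _mm_sfence */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill a buffer with `value`, one full 64-byte
 * cache line per iteration, using non-temporal stores. */
static void fill_nt(void *dst, uint8_t value, size_t bytes)
{
    /* Assumed preconditions: 64-byte aligned, whole cache lines only. */
    assert(((uintptr_t)dst & 63) == 0);
    assert(bytes % 64 == 0);

    __m128i v = _mm_set1_epi8((char)value);
    char *p = dst;

    for (size_t i = 0; i < bytes; i += 64) {
        /* Four 16-byte NT stores cover one whole cache line. */
        _mm_stream_si128((__m128i *)(p + i),      v);
        _mm_stream_si128((__m128i *)(p + i + 16), v);
        _mm_stream_si128((__m128i *)(p + i + 32), v);
        _mm_stream_si128((__m128i *)(p + i + 48), v);
    }

    /* NT stores are weakly ordered: fence before anything that needs
     * to observe them in order (e.g. publishing the buffer to another thread). */
    _mm_sfence();
}
```

Writing all four 16-byte chunks of a line back to back is the point: partial-line NT stores are where you lose, which is why the cached path is the better default for anything else.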