I am using std::aligned_alloc() in one of my projects to allocate aligned memory for optimized PCIe read/write.

When I read about aligned_alloc from here, it says:

Defined in header <stdlib.h>

void *aligned_alloc( size_t alignment, size_t size );

Passing a size which is not an integral multiple of alignment or an alignment which is not valid or not supported by the implementation causes the function to fail and return a null pointer (C11, as published, specified undefined behaviour in this case, this was corrected by DR 460). Removal of size restrictions to make it possible to allocate small objects at restrictive alignment boundaries (similar to alignas) has been proposed by n2072.

From what I understood, the only remaining restriction is that the alignment parameter should be a valid alignment value (and a power of two). Fine. To get a valid alignment value, we can use alignof(max_align_t).

[My system: 128 GB RAM, 2 CPUs (AMD EPYC 7313 16-Core Processor). It is a server machine running the latest CentOS 7.]

I now have a couple of doubts here:

In my system, for almost every combination of 'alignment value' and 'size', aligned_alloc() returns success. (Unless the alignment is some huge value). How is this possible? Is it implementation specific?

My code snippet:

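```
#include <cstdlib>   // std::aligned_alloc, std::free (C++17)
#include <iostream>
int main() {
    void* a = std::aligned_alloc(64, 524288000);
    if (a == nullptr)
        std::cout << "Failed" << std::endl;
    else
        std::cout << "Success" << std::endl;
    std::free(a);    // freeing a null pointer is a no-op
}
```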

Here is what values I tried for aligned_alloc() and their results:

aligned_alloc(64, 524288000) - Success

aligned_alloc(4096, 524288000) - Success

aligned_alloc(64, 331) - Success

aligned_alloc(21312323, 889998) - Success

aligned_alloc(1, 331) - Success

aligned_alloc(0, 21) - Success

aligned_alloc(21312314341, 331) - Success

aligned_alloc(21312312243413, 331) - Failed

Please comment if any more information is needed to clarify the question. Thanks.

  • *or an alignment which is not valid or not supported by the implementation causes the function to fail and return a null pointer* - this is completely implementation dependent. – 273K May 18 '22 at 04:19
  • https://stackoverflow.com/questions/794632/programmatically-get-the-cache-line-size. (Yes, the page size is always a multiple of the cache line.) – rici May 18 '22 at 04:36
  • @273K Then isn't the official definition wrong? – Vaishakh Krishnan May 18 '22 at 04:45
  • Why is it wrong? That was a quote from the links in your question. – 273K May 18 '22 at 05:11
  • @273K I think you misunderstood my question here. All I was saying is the official definition says "so" but it is actually implementation dependent. If the definition says that an invalid value for alignment should return NULL, then in all invalid cases (mentioned in my original question) it should return NULL. But that is not the case. – Vaishakh Krishnan May 18 '22 at 07:12
  • It's hard to answer two questions at once. Please remove the second question, or the first question. While I can answer about "How is this possible?", I have no idea about PCI. – KamilCuk May 18 '22 at 07:20
  • @VaishakhKrishnan Which official documentation? C11? Plus DR 460? Plus n2072? Which implementation do you use? Which of these documents does it follow? – Daniel Langr May 18 '22 at 07:20
  • @KamilCuk Okay, I will remove the second question here and ask it separately. Thanks. – Vaishakh Krishnan May 18 '22 at 07:27
  • @DanielLangr That is the confusion. They have proposed the changes mentioned in DR460/n2072, which is not reflected in the official definition. Am I referring to the wrong definition link? – Vaishakh Krishnan May 18 '22 at 07:32
  • Although the glibc version you are using may be too old to "know" about the DR as mentioned in one of the answers, it is still not implemented in current glibc. The open bug report is [here](https://sourceware.org/bugzilla/show_bug.cgi?id=20137). So glibc still works under the assumption that using a size which is not a multiple of the alignment, or an unsupported alignment, is UB. It might be interesting to review how many of the most commonly used C standard library implementations implement that DR correctly. – user17732522 May 18 '22 at 13:04
  • @VaishakhKrishnan cppreference.com is not an official reference for either C or C++. The official definitions are in the standard documents. In the C11 standard using unsupported alignment or non-integer multiple size is UB, but a defect report (DR 460) corrects that after publication of the C11 standard to returning a null pointer instead. The proposal n2072 then suggests removing the requirement that the size be an integer multiple, which as far as I can tell has been incorporated into the C17 revision of the standard. – user17732522 May 18 '22 at 13:42
  • But cppreference has links to the closest freely available drafts of the standards, for example for C on [this page](https://en.cppreference.com/w/c/links). – user17732522 May 18 '22 at 13:43

2 Answers

Glibc has this line of code https://github.com/lattera/glibc/blob/master/malloc/malloc.c#L3278

```
  /* Make sure alignment is power of 2.  */
  if (!powerof2 (alignment))
    {
      size_t a = MALLOC_ALIGNMENT * 2;
      while (a < alignment)
        a <<= 1;
      alignment = a;
    }
```

> How is this possible?

(Weeeeellll, the fact that something is in a specification doesn't restrict reality.) There is just code that makes it possible. If you want to know what exactly happens, inspect the source code - glibc is open-source.

CentOS 7 "latest" is quite old: I see glibc 2.17, which is from 2012 ( https://centos.pkgs.org/7/centos-x86_64/glibc-2.17-317.el7.x86_64.rpm.html and https://sourceware.org/glibc/wiki/Glibc%20Timeline ). DR 460 is from 2014. For that glibc the DR does not exist, so we can say that glibc followed the C11 standard, under which the behavior is undefined.

> Is it implementation specific?

"Implementation specific" resembles a... specific term that standards use to classify behavior ("implementation-defined"). In C11 the behavior is undefined. In C17 the behavior is that aligned_alloc should fail with an invalid alignment. In real life, everything is implementation specific, as glibc comes with its own implementation of aligned_alloc.
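Because the result for an invalid alignment depends on the libc, one option is to validate the arguments before calling aligned_alloc at all. The following is only a minimal sketch under stated assumptions: the names `is_power_of_two`, `round_up_size`, and `checked_aligned_alloc` are hypothetical (they come from no library), and it assumes a C++17 toolchain whose `<cstdlib>` provides `std::aligned_alloc`, as glibc-based ones do. It rejects only what can be rejected portably; a power-of-two alignment can still be unsupported by a given implementation.

```cpp
#include <cstddef>
#include <cstdlib>   // std::aligned_alloc (C++17)

// True if `alignment` is a nonzero power of two.
constexpr bool is_power_of_two(std::size_t alignment) {
    return alignment != 0 && (alignment & (alignment - 1)) == 0;
}

// Rounds `size` up to the next multiple of `alignment`.
// Only meaningful when `alignment` is a power of two.
constexpr std::size_t round_up_size(std::size_t size, std::size_t alignment) {
    return (size + alignment - 1) & ~(alignment - 1);
}

// Hypothetical wrapper: returns nullptr for any alignment that is not a
// power of two, and pads the size to a multiple of the alignment so the
// call also satisfies the stricter pre-C17 wording.
inline void* checked_aligned_alloc(std::size_t alignment, std::size_t size) {
    if (!is_power_of_two(alignment)) return nullptr;
    const std::size_t padded = round_up_size(size, alignment);
    if (padded < size) return nullptr;  // the rounding wrapped around
    return std::aligned_alloc(alignment, padded);
}
```

With this wrapper, `checked_aligned_alloc(21312323, 889998)` and `checked_aligned_alloc(0, 21)` return a null pointer regardless of the libc, while `checked_aligned_alloc(64, 331)` requests 384 bytes. Memory obtained this way is released with `std::free`.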

If you are wondering not about alignment, but about why you can specify a size greater than your available RAM, then welcome to virtual memory; see the question "Malloc allocates memory more than RAM".

KamilCuk

Looks like you found a bug. The libc doesn't seem to fail as specified by the standard but just gives you memory instead. Personally I don't see anything wrong with 331 bytes aligned to a 64-byte boundary. It's just not something C/C++ types ever produce, because a struct with 64-byte alignment is always padded at the end to a multiple of 64.

None of your allocations use a lot of RAM, half a gig at most. So you are not running out of memory.

As for why insanely huge alignment works?

If the code isn't stupid it will use mmap() with a fixed address to allocate memory to the closest page. So no matter the alignment you should never have more than 2 * 4095 bytes wasted (assuming 4k pages, could be 16k or 64k too).
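To illustrate why a large alignment need not waste much memory, here is a minimal Linux/POSIX sketch. It uses a different technique from the fixed-address mmap() mentioned above: it reserves too much address space and then unmaps the unused head and tail. It is not claimed to be what glibc does. `mmap_aligned` is a hypothetical name, and the sketch assumes the alignment is a power of two that is at least the page size.

```cpp
#include <sys/mman.h>   // mmap, munmap, mprotect
#include <unistd.h>     // sysconf
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not glibc code: returns at least `size` bytes of
// zero-filled memory whose address is a multiple of `alignment`, or nullptr.
inline void* mmap_aligned(std::size_t alignment, std::size_t size) {
    const long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0) return nullptr;
    const std::size_t page = static_cast<std::size_t>(page_size);
    if (size == 0 || alignment < page || (alignment & (alignment - 1)) != 0)
        return nullptr;

    // Round the request up to whole pages, then over-reserve by `alignment`
    // so that an aligned start address must exist inside the span.
    const std::size_t length = (size + page - 1) & ~(page - 1);
    const std::size_t span = length + alignment;
    if (length < size || span < length) return nullptr;  // size_t overflow

    // PROT_NONE reserves address space only; nothing is accessible yet.
    void* raw = mmap(nullptr, span, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED) return nullptr;

    const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(raw);
    const std::uintptr_t start = (base + mask) & ~mask;
    const std::size_t head = static_cast<std::size_t>(start - base);
    const std::size_t tail = span - head - length;
    assert(head % page == 0 && tail % page == 0);

    // Give back everything outside [start, start + length).
    if (head != 0) munmap(raw, head);
    if (tail != 0) munmap(reinterpret_cast<void*>(start + length), tail);

    // Make only the kept range readable and writable.
    void* result = reinterpret_cast<void*>(start);
    if (mprotect(result, length, PROT_READ | PROT_WRITE) != 0) {
        munmap(result, length);
        return nullptr;
    }
    return result;
}
```

After trimming, the only slack left is the rounding of `size` up to whole pages. A block obtained this way would be released with `munmap` on the same page-rounded length, not with `free`.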

And as KamilCuk pointed out: https://github.com/lattera/glibc/blob/master/malloc/malloc.c#L3278

```
  /* Make sure alignment is power of 2.  */
  if (!powerof2 (alignment))
    {
      size_t a = MALLOC_ALIGNMENT * 2;
      while (a < alignment)
        a <<= 1;
      alignment = a;
    }
```

It seems that glibc rounds the alignment up to the next power of 2. So all your huge odd numbers would become multiples of the page size and waste even less. How that fulfills the standard, though, I don't know.
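To see what the rounding does to the values from the question, the quoted loop can be mirrored in a small compile-time sketch. `effective_alignment` and `kMallocAlignment` are hypothetical names; the value 16 for MALLOC_ALIGNMENT is an assumption (typical for x86-64 glibc), and the sketch assumes a 64-bit `size_t` and C++14. It reproduces only the quoted snippet, not the rest of glibc's allocation path.

```cpp
#include <cstddef>

// Assumption: MALLOC_ALIGNMENT is 16, as is typical on x86-64 glibc.
constexpr std::size_t kMallocAlignment = 16;

// Mirrors the quoted loop: an alignment that is not a power of two is
// replaced by the smallest power of two >= it, starting the search at
// 2 * MALLOC_ALIGNMENT.
constexpr std::size_t effective_alignment(std::size_t alignment) {
    // Same bit test as a powerof2-style macro; note that it also accepts 0.
    const bool power_of_two = ((alignment - 1) & alignment) == 0;
    if (power_of_two) return alignment;
    std::size_t a = kMallocAlignment * 2;
    while (a < alignment) a <<= 1;
    return a;
}

// The alignments tried in the question:
static_assert(effective_alignment(64) == 64, "already a power of two");
static_assert(effective_alignment(21312323) == 33554432, "32 MiB");
static_assert(effective_alignment(21312314341) == 34359738368, "32 GiB");
static_assert(effective_alignment(21312312243413) == 35184372088832, "32 TiB");
```

Under these assumptions the "odd" alignments become 32 MiB, 32 GiB, and 32 TiB respectively.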

As for your last case: The address space of the architecture is only so big. You can see that in /proc/cpuinfo under Linux:

address sizes : 43 bits physical, 48 bits virtual

Relevant here is the 48 bits virtual. That goes from -128TB to 128TB, or from 0 to 128TB and from (16Gozillabyte - 128TB) to 16Gozillabyte, depending on how you view the address space (signed or unsigned addresses). Either way user space has a maximum of 128TB to work with. Your last alignment is ~19TB, or 32TB after rounding up. That's plenty small enough to work with, but it looks like glibc isn't smart enough to mmap that properly.
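The arithmetic behind these figures can be checked at compile time. This is only a sketch: the constant and function names are hypothetical, sizes are in binary units (1 TiB = 2^40 bytes), and it assumes the usual split in which user space gets half of the 48-bit virtual range.

```cpp
#include <cstdint>

constexpr std::uint64_t kTiB = static_cast<std::uint64_t>(1) << 40;

// Bytes addressable with `bits` virtual address bits (bits must be < 64).
constexpr std::uint64_t address_space_bytes(unsigned bits) {
    return static_cast<std::uint64_t>(1) << bits;
}

// 48 bits virtual, as reported by /proc/cpuinfo.
constexpr std::uint64_t kTotal = address_space_bytes(48);
// Assumption: user space gets half of the virtual range.
constexpr std::uint64_t kUserHalf = kTotal / 2;
// 21312312243413 rounded up to the next power of two.
constexpr std::uint64_t kRoundedAlignment = static_cast<std::uint64_t>(1) << 45;

static_assert(kTotal == 256 * kTiB, "48 bits cover 256 TiB");
static_assert(kUserHalf == 128 * kTiB, "each half is 128 TiB");
static_assert(kRoundedAlignment == 32 * kTiB, "the last alignment rounds to 32 TiB");
static_assert(kRoundedAlignment < kUserHalf, "32 TiB fits in the user half");
```

So a 32 TiB alignment is not ruled out by the address space alone.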

Goswin von Brederlow
  • "_How that fulfills the standard, though, I don't know._": They just implement according to C11, not the DR. So using an unsupported alignment, i.e. not a power of 2, is just UB, and behaving as if another alignment was given satisfies the standard's requirements. The bug report asking to implement the DR is [here](https://sourceware.org/bugzilla/show_bug.cgi?id=20137). – user17732522 May 18 '22 at 13:07