I am writing a custom memory allocator and trying to better understand what is considered allocated versus unallocated memory. I am told that for a basic, "naive" sbrk() memory allocator, the size passed to sbrk() must be aligned to (a multiple of) 16 bytes. This means that if I need to allocate, for example, 5 bytes of memory, the operation (5 + (16-1)) & ~(16-1) is applied, which rounds up to 16 in this case. If the requested size were 17 instead of 5, it would round up to 32.
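
For reference, the rounding described above can be sketched in C roughly as follows. The helper names `align_up` and `padding_for` are made up for illustration, and the mask trick assumes the alignment is a power of two.

```c
#include <assert.h>
#include <stddef.h>

/* Round `size` up to the next multiple of `align`.
   Assumption: `align` is a nonzero power of two, which the mask trick needs. */
static size_t align_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + (align - 1)) & ~(align - 1);
}

/* Number of padding bytes that the rounding adds to a request. */
static size_t padding_for(size_t size, size_t align)
{
    return align_up(size, align) - size;
}
```

With a 16-byte alignment, `align_up(5, 16)` gives 16 (11 bytes of padding) and `align_up(17, 16)` gives 32 (15 bytes of padding).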

This means that we are getting back from the operating system more bytes than the user requested for the sake of alignment. My question is, are the 11 bytes (in the case of the first example) or 15 bytes (in the case of the second example) considered "allocated" or not? In a proper implementation of a memory allocator, could the user actually use more than the requested bytes between the requested size and the 16 byte boundary? If not, how is this enforced?

the_endian
  • What do you mean by "could the user actually use more than the requested bytes"? Are you asking, for example, whether the user can count on the allocator to allocate more than was asked for and treat the extra as valid space? – andresantacruz Aug 18 '19 at 04:55
  • Kinda. So for example in the case of the official stdlib malloc, I know that when I malloc(5), there will always be at least 16 bytes allocated, is there any actual guard stopping me from malloc(5) and then using 10 bytes which is 5 bytes past what I requested? Is that padding memory considered allocated or does the allocator use those bytes for other things? I would never do this but I'm curious as to whether these extra bytes are considered allocated or free or what? I am trying to reason about how much "wasted" memory there is. – the_endian Aug 18 '19 at 04:58
  • If you mean, could you overflow a buffer in the allocated space without anything _bad_ happening, then I suppose the answer is: Yes. However, it would be just that: a buffer overflow that happened not to corrupt anything important. – daShier Aug 18 '19 at 04:58
  • On any reasonably modern machine, memory is not managed on a byte or 16-byte basis, but in *pages* by the operating system (which `sbrk` interfaces). – EOF Aug 18 '19 at 04:59
  • I would add: Why would you care? If you want more than the 5 bytes, then request what you want. – daShier Aug 18 '19 at 04:59
  • @EOF so stdlib's malloc will return an entire page for a malloc(5) call and the remaining bytes are just wasted? – the_endian Aug 18 '19 at 05:00
  • @daShier I am writing a custom allocator and trying to determine how this is handled. – the_endian Aug 18 '19 at 05:00
  • @the_endian I think malloc will request an entire page, and serve user allocations from it, only when there is not enough space available in the pages it already holds. – andresantacruz Aug 18 '19 at 05:02
  • Whatever the `sizeof (struct foo)` is will be allocated. It is not up to you to worry about if and how much padding is added in your struct -- that is up to your compiler's implementation. You just have to ensure there is a sufficiently sized block reserved to hold it. An [awkwardly written, but good basic malloc/realloc introduction](https://github.com/zyfjeff/C-HOW-TO/blob/master/c-malloc/Malloc_tutorial.pdf) is worth wading through if you are writing an allocator with `sbrk` (you can extend to handle x86-64 if you are careful). Note, current allocators use `mmap` instead of `sbrk`. – David C. Rankin Aug 18 '19 at 05:24
  • @DavidC.Rankin I guess the_endian means padding outside the struct, added to meet its alignment requirements. – Antti Haapala -- Слава Україні Aug 18 '19 at 05:30
  • Ahh, that does put a different light on it. In that case, a good allocator would not consider those bytes "allocated"; they would be available, but awkward to make use of without a good consolidation routine on `free` of the struct memory (though that goes beyond a "naive" allocator). – David C. Rankin Aug 18 '19 at 05:35
  • You can do it either way. You can allocate the whole nine yards to the user and consider it all allocated, or you can add the padding to your free list and consider it unallocated. – user207421 Aug 18 '19 at 06:14

3 Answers

There is no general requirement for a memory allocator to align the returned chunk of memory to 16 bytes. There is, however, a requirement to align it to the strictest alignment requirement of the specific machine/platform (i.e. the returned chunk of memory must be suitably aligned for any data type that machine/platform supports).
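
As a minimal sketch of how that strictest alignment can be inspected, C11 exposes it as the alignment of `max_align_t`. The helper names `strictest_alignment` and `is_aligned` below are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Strictest fundamental alignment on this platform (C11 max_align_t). */
static size_t strictest_alignment(void)
{
    return alignof(max_align_t);
}

/* Returns 1 if `p` is aligned to `align` bytes, 0 otherwise.
   Assumption: `align` is a nonzero power of two. */
static int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}
```

On many 64-bit platforms the value is 16, which is probably where the "16 bytes" rule of thumb comes from, but a custom allocator should derive the value from the platform rather than hard-code it.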

Furthermore, any non-trivial memory allocator will most likely request memory from the OS in much larger chunks than what is typically requested from malloc(). sbrk()/mmap() (or whatever facility your OS provides) are typically costly operations in terms of performance, and any memory allocator will aim to call them as rarely as possible. Typically, it will allocate memory from the OS in chunks of page size (or a multiple of it) and satisfy malloc() requests from there, using (library-specific) internal management that keeps track of these (typically smaller) allocations and frees by maintaining an internal 'free list'. Obviously, this 'free list' must live somewhere.
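
The idea of taking one large chunk and carving small requests out of it can be sketched as below. This is a deliberately simplified bump allocator, not how any particular library does it: the static array stands in for a chunk that would really come from sbrk()/mmap(), there is no free list, and the names `chunk_alloc`, `CHUNK_SIZE` and `ALIGNMENT` are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_SIZE 4096u  /* assumed page size; real code would query the OS */
#define ALIGNMENT  16u    /* assumed strictest alignment */

/* Stand-in for one chunk obtained from the OS via sbrk()/mmap(). */
static _Alignas(16) unsigned char chunk[CHUNK_SIZE];
static size_t chunk_used = 0;  /* bytes already handed out, padding included */

/* Carve the next aligned slice out of the chunk.
   Returns NULL when the chunk is exhausted; a real allocator would then
   request another chunk from the OS and would also support freeing. */
static void *chunk_alloc(size_t size)
{
    size_t rounded = (size + (ALIGNMENT - 1)) & ~(size_t)(ALIGNMENT - 1);
    if (rounded < size || rounded > CHUNK_SIZE - chunk_used)
        return NULL;                       /* overflow or out of space */
    assert(chunk_used % ALIGNMENT == 0);   /* invariant kept by the rounding */
    void *p = chunk + chunk_used;
    chunk_used += rounded;
    return p;
}
```

Here the OS facility is touched once per chunk rather than once per request, and the padding after each request stays under the allocator's control.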

Without intimate knowledge of where this 'free list' lives, you can't tell whether memory outside the chunk you allocated is 'unused' because of alignment or is in use by the library's internal management. Touching that memory is therefore always asking for serious trouble.

mfro

My question is, are the 11 bytes (in the case of the first example) or 15 bytes (in the case of the second example) considered "allocated" or not?

No, they are not considered allocated. To be more specific, those bytes are not owned by the user's program, they are owned by the memory allocator.

In a proper implementation of a memory allocator, could the user actually use more than the requested bytes between the requested size and the 16 byte boundary?

No, the user's program is only allowed to use the memory that was requested.

If not, how is this enforced?

It's not enforced. That's something that sets C apart from other languages. C has lots of rules, but it doesn't enforce them. The programmer must understand the rules, and follow the rules. If the programmer does not follow the rules, the result is undefined behavior. See also this post. In the words of the C specification (emphasis added):

Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

user3386109

I don't think @EOF is correct, nor is @dedecos completely correct either.

A proper memory manager would allocate memory to an application in pages, yes, adding another page only when needed. Within each page, however, memory is allocated in blocks (in your example: blocks of 16 bytes).

Any remainder in the last block (the 11 bytes in your example of requesting 5 bytes) will not be assigned to a future malloc, even a malloc(5), since there is no packing smaller than the block increment. Those extra bytes are simply wasted space.

Below the level of a page, the memory manager needs a bitmap to track allocations: one bit per minimum block size (16 bytes in your case), so there's no provision for allocating anything smaller than a block.
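
A bitmap scheme of the kind described above could look roughly like this. It is only one possible design (many allocators use free lists or per-block headers instead), and every name here (`bitmap_alloc`, `bitmap_free`, `PAGE_SIZE`, `BLOCK_SIZE`) is invented for the sketch, with a static array standing in for a page obtained from the OS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE  4096u
#define BLOCK_SIZE 16u
#define NUM_BLOCKS (PAGE_SIZE / BLOCK_SIZE)  /* 256 blocks per page */

static _Alignas(16) unsigned char page[PAGE_SIZE];
static uint8_t bitmap[NUM_BLOCKS / 8];       /* one bit per block, 1 = in use */

static int  bit_get(size_t i)   { return (bitmap[i / 8] >> (i % 8)) & 1; }
static void bit_set(size_t i)   { bitmap[i / 8] |= (uint8_t)(1u << (i % 8)); }
static void bit_clear(size_t i) { bitmap[i / 8] &= (uint8_t)~(1u << (i % 8)); }

/* First-fit search for enough consecutive free blocks to hold `size` bytes. */
static void *bitmap_alloc(size_t size)
{
    if (size == 0 || size > PAGE_SIZE)
        return NULL;
    size_t need = (size + BLOCK_SIZE - 1) / BLOCK_SIZE;
    size_t run = 0;                          /* length of current free run */
    for (size_t i = 0; i < NUM_BLOCKS; i++) {
        run = bit_get(i) ? 0 : run + 1;
        if (run == need) {
            size_t first = i + 1 - need;
            for (size_t j = first; j <= i; j++)
                bit_set(j);
            return page + first * BLOCK_SIZE;
        }
    }
    return NULL;                             /* no run of free blocks fits */
}

/* The bitmap stores no sizes, so in this sketch the caller passes it back. */
static void bitmap_free(void *p, size_t size)
{
    size_t first = (size_t)((unsigned char *)p - page) / BLOCK_SIZE;
    size_t need = (size + BLOCK_SIZE - 1) / BLOCK_SIZE;
    assert(first + need <= NUM_BLOCKS);
    for (size_t j = first; j < first + need; j++)
        bit_clear(j);
}
```

Two consecutive `bitmap_alloc(5)` calls return addresses 16 bytes apart: the 11 trailing bytes of the first block are never offered to the second request, because a whole block is the smallest unit the bitmap can describe.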

daShier