
When memory is dynamically allocated by calling malloc(), the implementation internally records the size of the allocation somehow (to track used memory, etc.), which is why we only need to pass the pointer to free() when we no longer need that chunk.

However, given only the pointer, we cannot retrieve that size in a portable, OS- and compiler-independent way. Some non-portable ways exist, such as _msize on Windows/Visual C or malloc_usable_size in glibc. So the only portable option is still to pass every needed size along with its pointer, which can be very error-prone.

So, the question is: why did the C standard developers decide not to include a portable function for this in the standard?

P.S. It may be improper to ask "why", since such questions usually involve opinion at least to some extent, but here I believe there is some fundamental reason for the decision.

trolley813

2 Answers


Because you can get pointers to things that aren't returned from malloc and friends.

int x = 10;
int * p = &x;

The function you're talking about would have to figure out whether or not p is returned from malloc (possibly expensive). If it's not (as in this case), it has no way of knowing the amount of space allocated. You'd also run into problems if you got a pointer to something allocated by malloc, but not the exact pointer allocated by malloc.

int * p = malloc(sizeof(int) * 10);
int * p2 = p + 5;

What's the right result if I ask for the size of p2?

A more consistent approach involves passing sizes along where they're needed. This lets you work with addresses regardless of where they came from, including offsets to some block of memory (e.g., arrays, like I'm doing with p2 above).
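
As a minimal sketch of that approach (the helper names `sum_ints` and `demo_offset_sum` are made up for illustration), a function that takes the length explicitly works the same for a pointer to a local variable, a whole allocated block, or an offset into one:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Sum n ints starting at p. The caller supplies the length, so the
   function works for any storage: automatic, static, or allocated. */
long sum_ints(const int *p, size_t n)
{
    assert(p != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += p[i];
    return total;
}

/* Hypothetical demo: sums the upper half of a 10-element allocated
   block through an offset pointer. Returns -1 if malloc fails. */
long demo_offset_sum(void)
{
    int *p = malloc(sizeof *p * 10);
    if (p == NULL)
        return -1;
    for (int i = 0; i < 10; i++)
        p[i] = i;

    int *p2 = p + 5;            /* not a pointer returned by malloc */
    long s = sum_ints(p2, 5);   /* the caller knows 5 elements remain */

    free(p);                    /* free the original pointer, not p2 */
    return s;
}
```

Calling `sum_ints(&x, 1)` on a plain local `int x` works just as well, which is exactly the case a size-query function could not handle.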

Stephen Newell
  • I don't agree with the first paragraph. If that were the reason, I don't think `free` would exist; C doesn't have a protective API (is "protective" the correct word?) – David Ranieri May 02 '20 at 18:24
  • @DavidRanieri, it is true that a function such as the OP asks about could carry the same restriction as `free`, that the argument must be a pointer returned by an allocation function and not subsequently freed. But so what? In order to accommodate use of pointers such as those Stephen mentions, most code needs to be written with the assumption that the hypothetical size function cannot be used, therefore it would serve little purpose to provide such a function. Freeing is a different use case altogether. – John Bollinger May 02 '20 at 18:32
  • @JohnBollinger, yes it would be very error prone, but is there anything that is not error prone when it comes to pointers? Without going any further: dereferencing `void *` with an incorrect type, it is something that the compiler cannot control, not even emit a warning, but `void *` is useful. – David Ranieri May 02 '20 at 19:23
  • @DavidRanieri, the problem is not that a magic memory size function would be error prone. It is that it would be *unsuited* for most cases, and *unnecessary* for almost all the rest. Again, such a function could have been provided, with the same limitations as `free`, but it would go largely unused, as indeed the implementation-specific examples presented in the question do. Remember that this is not in any case a question of feasibility, but of second-guessing the decision process of those charged with writing and maintaining the standard. – John Bollinger May 02 '20 at 19:46

There is no technical issue stopping the C Standard committee from adding a new library function to retrieve the number of bytes accessible via a valid pointer previously returned by malloc(), calloc(), realloc(), aligned_alloc(), strdup() or any similar function. The number returned would not necessarily be the size initially passed to the allocation function, and it is conceivable that this information might not be available at all, so a return value of 0 would indicate that the information is not available.

The reason such a function has not yet been added might be that the C Standard committee is usually very reluctant to add new functions. For example, it took more than 30 years for strdup() to finally make its way into the C Standard (it will be part of the next version) despite consistent implementations having been available in most C libraries for decades.

This function would have undefined behavior for any pointer not previously returned by a memory allocation function or already freed, just like free or realloc. Whether it is defined for NULL is debatable, but a return value of 0 seems appropriate in this case. If the size is not known, which is possible for dummy allocators that do not store this information, a return value of 0 would indicate this condition too.
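
To make these semantics concrete, here is a minimal portable sketch that records the size itself in a small header in front of each block. The names `sized_malloc`, `sized_size` and `sized_free` are hypothetical, not part of any standard or library, and the sketch assumes C11 for `max_align_t`. Like the function discussed above, `sized_size` returns 0 for NULL and has undefined behavior for any other pointer that did not come from `sized_malloc`.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header placed in front of every block. The union makes the header
   size a multiple of the strictest fundamental alignment, so the
   user pointer that follows it stays suitably aligned. */
typedef union {
    size_t size;
    max_align_t align;
} sized_header;

/* Allocate size bytes and remember the requested size. */
void *sized_malloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(sized_header))
        return NULL;                    /* total would overflow */
    sized_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    return h + 1;                       /* user block starts after the header */
}

/* Return the size originally requested, or 0 for NULL. */
size_t sized_size(const void *ptr)
{
    if (ptr == NULL)
        return 0;
    return ((const sized_header *)ptr - 1)->size;
}

/* Release a block obtained from sized_malloc(). */
void sized_free(void *ptr)
{
    if (ptr != NULL)
        free((sized_header *)ptr - 1);
}
```

The cost is one header per allocation, and blocks from this wrapper must never be mixed with plain `malloc()`/`free()`.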

Here is an excerpt from the man page for malloc_usable_size in the GNU C library:

NAME

malloc_usable_size - obtain size of block of memory allocated from heap

SYNOPSIS

   #include <malloc.h>
    
   size_t malloc_usable_size(void *ptr);

DESCRIPTION

The malloc_usable_size() function returns the number of usable bytes in the block pointed to by ptr, a pointer to a block of memory allocated by malloc(3) or a related function.

RETURN VALUE

malloc_usable_size() returns the number of usable bytes in the block of allocated memory pointed to by ptr. If ptr is NULL, 0 is returned.

ATTRIBUTES

Multithreading (see pthreads(7)): the malloc_usable_size() function is thread-safe.

CONFORMING TO

This function is a GNU extension.

NOTES

The value returned by malloc_usable_size() may be greater than the requested size of the allocation because of alignment and minimum size constraints. Although the excess bytes can be overwritten by the application without ill effects, this is not good programming practice: the number of excess bytes in an allocation depends on the underlying implementation.

The main use of this function is for debugging and introspection.

SEE ALSO

malloc(3)
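
Outside the man page excerpt, a short usage sketch may help. It assumes a glibc-based system where `<malloc.h>` declares `malloc_usable_size()`; the helper name `usable_size_of_new_block` is made up for illustration. It only reports the value and does not rely on the excess bytes:

```c
#include <malloc.h>   /* non-standard header: declares malloc_usable_size() */
#include <stddef.h>
#include <stdlib.h>

/* Allocate `requested` bytes, ask the allocator how many bytes are
   usable in that block, then release it. Returns 0 if malloc fails. */
size_t usable_size_of_new_block(size_t requested)
{
    void *p = malloc(requested);
    if (p == NULL)
        return 0;
    size_t usable = malloc_usable_size(p);  /* may exceed `requested` */
    free(p);
    return usable;
}
```

The exact value depends on the allocator, so code should treat it as diagnostic information only, as the NOTES section says.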

chqrlie
  • .. the value returned could also be non-zero and incorrect. If another thread frees or reallocs the block as the 'allocsize' function returns.... – Martin James May 02 '20 at 19:57
  • @MartinJames: of course, any unprotected concurrent access to shared variables invokes undefined behavior. Merely dereferencing a pointer that another thread can free concurrently has undefined behavior too. – chqrlie May 03 '20 at 00:34
  • Is the behavior for "offset" pointers well-defined for `malloc_usable_size`? I.e. `char *x = malloc(N); x += 8`. In my simple tests it returned `0` (unknown?). Is this guaranteed behavior? I want to use it to check whether the pointer points to the start of the allocation (as returned by malloc) or not. – Dan M. Apr 04 '22 at 13:51
  • @DanM.: Not at all: `malloc_usable_size` has undefined behavior for any non-null pointer not returned by an allocation function or already freed. If you add an offset to implement *tagged pointers*, you must remove this *offset* from tagged pointers you pass to this function, as well as to `free()` or `realloc()`. Assuming allocated pointers are 16-byte aligned and you use 4-bit tags, you should write `malloc_usable_size((void *)((uintptr_t)(x) & ~(uintptr_t)15))` to get the actual object pointer. Beyond an offset of 15, there is no reliable way to compute the original pointer returned by `malloc()`. – chqrlie Apr 05 '22 at 07:28
  • @DanM.: If you map memory yourself using `mmap`, you can detect whether the pointer `x` points to the beginning of a page by testing its low 12 or so bits, depending on the system page size, but again, this is not a generic solution for offsets beyond 4095. – chqrlie Apr 05 '22 at 07:37
  • @DanM.: For a generic solution, you can store allocated pointer values in a hash table or some other fast access structure and look up `x` to check if it does point to the beginning of the block... performance should be fine with a hash table, but be careful to add all pointer values when you allocate new blocks and to remove them when you free them. – chqrlie Apr 05 '22 at 07:39
  • @chqrlie yeah, a hash table probably won't cut it perf-wise. I want to be able to "instrument" all allocations in the lib (i.e. add 4-8 bytes of info before the allocation), but the problem is the allocations coming from the "outside". I want to be able to detect when I need to adjust the pointer before freeing it and when I don't (for example if it's coming from `strdup`). – Dan M. Apr 05 '22 at 11:57
  • @DanM.: I guess you should instrument `strdup()` too. For your use case, if the offset is less than 16, testing the low 4 bits of the pointer is a simple solution because pointers returned by `malloc()` are 16-byte aligned on current desktop systems. Be aware that misaligning pointers returned by your wrappers is a problem: this will cause misaligned accesses to members of allocated structures. Make sure to keep them at least 8-byte aligned. Alternately, you could keep the ancillary information in a separate place, for example in the hash table. Don't underestimate hash table efficiency:) – chqrlie Apr 05 '22 at 14:15