17

Everyone knows that:

  • realloc resizes an existing block of memory or copies it to a larger block.
  • calloc ensures the memory is zeroed out, guards against arithmetic overflow when computing the total size, and is generally geared toward large arrays.

Why doesn't the C standard provide a function like the following that combines both of the above?

void *recalloc(void *ptr, size_t num, size_t size);

Wouldn't it be useful for resizing huge hash tables or custom memory pools?

Matt
  • 7
    If you're just going to zero everything then there isn't much point resizing - just free the old block and then calloc a new block. – Paul R Feb 12 '15 at 20:52
  • 12
    @PaulR: Presumably it would only zero out the new memory (or rather *ensure that it is* zeroed out). – Matt Feb 12 '15 at 20:53
  • 5
    The point of the standard library is not to provide a rich set of cool functions. It is to provide an essential set of building blocks, from which you can build your own cool functions. Your proposal for recalloc would be trivial to write, therefore, is not something the standard lib should provide. – abelenky Feb 12 '15 at 20:57
  • It doesn't seem like a particularly useful or common use case, but you could always implement your own wrapper around realloc I guess. – Paul R Feb 12 '15 at 20:58
  • 2
    Would still be somewhat redundant, given that it's trivial to implement "manually". Why add a function to the standard for a super-niche use-case? – Oliver Charlesworth Feb 12 '15 at 20:58
  • 2
    I can count the situations where I need `calloc()` on the fingers of one hand (without using binary ;-) ), and a `recalloc()` routine would only be useful in even more extreme corner cases. As such, I think, there is simply no need for it. – cmaster - reinstate monica Feb 12 '15 at 20:58
  • 1
    @abelenky: How would you write it in a platform-independent way? You would have to know all about the system's virtual memory pages. – Matt Feb 12 '15 at 20:58
  • 2
    I believe it'd be written as a `realloc`, followed by a `memset` of the "fresh" part of the memory. – abelenky Feb 12 '15 at 21:00
  • 1
    @abelenky: you would need to know the size of the current allocation though, and there is no portable way of getting that. – Paul R Feb 12 '15 at 21:02
  • 1
    The IRIX operating system does have [recalloc](http://nixdoc.net/man-pages/IRIX/man3x/malloc.3x.html) . – M.M Feb 12 '15 at 21:08
  • 3
    Seems like a fair enough proposal to me, all things considered – M.M Feb 12 '15 at 21:09
  • 1
    @abelenky. Additionally to the essential functions, there is, of course, a bunch of useless or very specialized (`fgets`, `strncpy`, `atoi`, ...) or misdesigned (`scanf`, ...) functions. A lot of functions in the C library are there almost entirely for historic reasons... – mafso Feb 12 '15 at 21:12
  • Do you want a `recalloc` equivalent to `free` + `calloc` or equivalent to `calloc` + `memcpy` + `free`, that is, should the old contents of the memory be kept or should the whole new size of allocated space be initialized? – mafso Feb 12 '15 at 21:18
  • @mafso: If I wanted `free` + `calloc`, then I would have just used `free` + `calloc`. – Matt Feb 12 '15 at 21:20
  • 4
    `calloc()` has another feature that `malloc()` does not: on arcane systems like DOS, it can allocate an array larger than `SIZE_MAX` bytes. Thus code could `calloc(60000u, sizeof (double))`, even when `size_t` was 16-bit. I have wondered about the C compliance of this, but it appears to be correct. – chux - Reinstate Monica Feb 12 '15 at 21:42
  • Think about the potential bugs if `recalloc` was accidentally typed where `realloc` was intended. Two functions with the same signature and extremely similar names, yet massively different functionality, is asking for bugs. – abelenky Sep 10 '20 at 15:00

2 Answers

16

Generally in C, the point of the standard library is not to provide a rich set of cool functions. It is to provide an essential set of building blocks, from which you can build your own cool functions.

Your proposal for recalloc would be trivial to write, and therefore is not something the standard lib should provide.
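As a rough illustration of what such a hand-written version could look like, here is a minimal sketch built from `realloc` plus `memset`. The name `recalloc_sketch` and the extra `old_num` parameter are assumptions made for this example: standard C offers no portable way to query the size of an existing allocation, so the caller has to pass the old element count in (with the same `size` as before).

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of any standard library.
 * Resizes ptr from old_num to new_num elements of `size` bytes each and
 * zeroes only the newly added tail. The caller must supply old_num
 * (0 when ptr is NULL), because the allocator will not tell us. */
void *recalloc_sketch(void *ptr, size_t old_num, size_t new_num, size_t size)
{
    /* Reject a new_num * size overflow, in the spirit of calloc. */
    if (size != 0 && new_num > SIZE_MAX / size)
        return NULL;

    size_t old_bytes = old_num * size;
    size_t new_bytes = new_num * size;

    /* Avoid realloc(ptr, 0), whose behavior varies between implementations. */
    void *p = realloc(ptr, new_bytes ? new_bytes : 1);
    if (p == NULL)
        return NULL;  /* the original block is still valid */

    /* Zero only the bytes that were not part of the old request. */
    if (new_bytes > old_bytes)
        memset((char *)p + old_bytes, 0, new_bytes - old_bytes);
    return p;
}
```

A call such as `table = recalloc_sketch(table, old_count, new_count, sizeof *table)` would grow an array and leave the added elements zeroed. Note that this always touches the new bytes with `memset`, so it cannot take advantage of pages the operating system has already zeroed.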

Other languages take a different approach: C# and Java have super-rich libraries that make even complicated tasks trivial. But they come with enormous overhead. C has minimal overhead, and that aids in making it portable to all kinds of embedded devices.

abelenky
  • 5
    You would have to know all about the system's virtual memory to write such a function efficiently without having to call `memset`. – Matt Feb 12 '15 at 21:00
  • 9
    @abelenky what he's getting at is that some OS's take `calloc` pages from a different pool than `malloc` pages (if possible) and the calloc uses lazy allocation with copy-on-write from a page of all zeroes. This is why `calloc` on Linux can be faster than `malloc` (and much faster than `malloc` followed by `memset`). – M.M Feb 12 '15 at 21:03
  • 4
    @abelenky: You'd have to go through the entire rest of the block rather than relying on pre-zeroed copy-on-write memory. – Matt Feb 12 '15 at 21:03
  • I don't think that idea had been invented when K&R were adding malloc to the language tho :) – M.M Feb 12 '15 at 21:04
  • 3
    @MattMcNabb: Neither had `size_t`, and many other things everyone uses. – Matt Feb 12 '15 at 21:08
  • 4
    @abelenky I would not say that the proposed version of `recalloc` would be trivial to write, even with `memset`: for `recalloc` to work as a zero-extended copy, you would have to keep track of the initial size of the memory. IMHO, the rationale behind `realloc` was to relieve the user from keeping track of the sizes of allocated memory. If the standard library were only to provide essential functions, it would never have introduced `realloc` in the first place, since it can easily be expressed as a conditional malloc-copy-free sequence. – aprelev Sep 18 '19 at 22:37
5

I assume you're interested in only zeroing out the new part of the array:

Not every memory allocator knows how much memory you're using in an array. For example, if I do:

char* foo = malloc(1);

foo now points to at least a chunk of memory 1 byte large. But most allocators will allocate much more than 1 byte (for example, 8, to keep alignment).

This can happen with other allocations, too. The memory allocator will allocate at least as much memory as you request, though often just a little bit more.

And it's this "just a little bit more" part that screws things up (in addition to other factors that make this hard). Because we don't know if it's useful memory or not. If it's just padding, and you recalloc it, and the allocator doesn't zero it, then you now have "new" memory that has some nonzeros in it.

For example, what if I recalloc foo to get it to point to a new buffer that's at least 2 bytes large? Will that extra byte be zeroed? Or not? It should be, but note that the original allocation gave us 8 bytes, so our reallocation doesn't allocate any new memory. As far as the allocator can see, it doesn't need to zero any memory (because there's no "new" memory to zero). Which could lead to a serious bug in our code.
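One way to sidestep that ambiguity in user code is to record the requested size yourself. The following is only a sketch under that assumption: `tracked_header`, `tracked_recalloc`, and `tracked_free` are hypothetical names invented here, and the blocks they return must not be mixed with plain `malloc`/`free`. It uses `max_align_t` from C11 so the user data stays suitably aligned.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header stored in front of every block. The union pads it so
 * that the user data following it stays suitably aligned. */
typedef union {
    size_t requested;   /* bytes the caller asked for, not bytes reserved */
    max_align_t align;
} tracked_header;

/* Resize a block obtained from tracked_recalloc (or NULL) to num * size
 * bytes, zeroing every byte beyond the previously *requested* size. */
void *tracked_recalloc(void *ptr, size_t num, size_t size)
{
    /* Reject sizes where num * size plus the header would overflow. */
    if (size != 0 && num > (SIZE_MAX - sizeof(tracked_header)) / size)
        return NULL;

    size_t new_bytes = num * size;
    tracked_header *old = ptr ? (tracked_header *)ptr - 1 : NULL;
    size_t old_bytes = old ? old->requested : 0;

    tracked_header *h = realloc(old, sizeof *h + new_bytes);
    if (h == NULL)
        return NULL;    /* the old block remains valid */

    h->requested = new_bytes;
    char *data = (char *)(h + 1);
    /* Padding the allocator added earlier is irrelevant here: we zero from
     * the old requested size, not from the old reserved size. */
    if (new_bytes > old_bytes)
        memset(data + old_bytes, 0, new_bytes - old_bytes);
    return data;
}

/* Release a block obtained from tracked_recalloc. */
void tracked_free(void *ptr)
{
    if (ptr)
        free((tracked_header *)ptr - 1);
}
```

With this, growing `foo` from 1 byte to 2 always zeroes the second byte, even when the underlying allocator does not move or extend the block. The cost is one header per allocation, which is the kind of bookkeeping a standard `recalloc` would have to impose on every implementation.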

Cornstalks
  • 1
    Which is why such a function would be useful only for large blocks of memory. – Matt Feb 12 '15 at 21:02
  • @Matt: Not necessarily. You could request a buffer size of 100001 bytes. The allocator will round that up to the nearest alignment size, and allocate 100008 bytes (now you've got 7 bytes of padding). If you realloc to 100002, you've still got the same problem. Having a large block of memory doesn't change the problem. – Cornstalks Feb 12 '15 at 21:04
  • The need to zero the padded space or not idea would imply a new need to keep track of the previous allocated size as well as the true allocated size (at least its LSBits) . Good point. – chux - Reinstate Monica Feb 12 '15 at 21:04
  • And it doesn't need to know how much you were using. If it gave you a zeroed-out block the first time, then it only needs to zero out the new part. Those 7 bytes would have already been zero. – Matt Feb 12 '15 at 21:05
  • @Matt: sure, but `malloc()` doesn't need to give you a zeroed out chunk of memory. If it did, then this wouldn't be a problem, but the reality is that you can allocate memory that isn't zeroed. You can't quite rely on it being zeroed in the past. – Cornstalks Feb 12 '15 at 21:06
  • 1
    @Cornstalks: Well then there would have to be a caveat that it only works properly with memory previously allocated by `calloc` or itself. – Matt Feb 12 '15 at 21:07
  • 1
    @Matt Consider a calloc-re-allocation of 100001, then 100008, and repeat. Each increase would oblige a zeroing of 7 bytes. – chux - Reinstate Monica Feb 12 '15 at 21:07
  • 2
    @Matt: you could do that, but now you're painting yourself into a weird corner, and the usefulness of this `recalloc` function is decreasing. You could make it work by adding more and more arbitrary restrictions, but at some point (sooner rather than later) it's just easier to require the user to keep track of the memory and zero out the new memory after a `realloc`. – Cornstalks Feb 12 '15 at 21:09
  • @chux: No, when you reach a new page, you would be given a new clean page of zeros. Then you can increase by 7 all you want and everything is already zero. – Matt Feb 12 '15 at 21:10
  • @Cornstalks: 1. That's not a restriction, because it still works as an allocator with memory allocated by `malloc`. 2. It's not even arbitrary, since it makes perfect sense. – Matt Feb 12 '15 at 21:11
  • @Matt: 1. that's just begging for a horrible bug. 2. it makes perfect sense, but that choice is arbitrary. Why not require `malloc` to add some book-keeping so it knows it gave you 8 bytes, but you only requested 1 byte? That's an equally valid choice that also makes perfect sense. The restrictions are "arbitrary" because there are multiple ways to solve this problem, none of which are obvious winners (different people have different goals). – Cornstalks Feb 12 '15 at 21:17
  • @Cornstalks: If you used `malloc` the first time, why would you use `recalloc` the second time? – Matt Feb 12 '15 at 21:19
  • @Matt: I'm sure there are various scenarios we can imagine. Here's one: I'm a video filter, and I've been given a frame buffer. My job (as a video filter) is to add a pillarbox/letterbox to the frame. To do this, I need to make the frame buffer larger. I also want the new memory to be zeroed out (so it's black). So I call `recalloc`. But perhaps the frame buffer wasn't originally allocated with `calloc`, in which case its padding bytes are not guaranteed to be zero. – Cornstalks Feb 12 '15 at 21:24
  • 2
    @Matt My suggestion, put more clearly, was a calloc-re-allocation of 100001, then 100008, then back to 100001, and then repeating that. The true block size never changes. But I do see that on a decrease, the padded bytes could be zeroed, thus negating the need to keep the previous request size. Thanks for the challenge. – chux - Reinstate Monica Feb 12 '15 at 21:24
  • @chux: good point. The allocator *must* keep track of how many bytes you intended the buffer to be. – Cornstalks Feb 12 '15 at 21:26
  • I don't see a problem with extra bytes. Allocator should (and does) keep track of both how much memory you requested and how much memory is utilised to satisfy your request. `recalloc`-ing the previously allocated 1 byte to 2 bytes would force runtime to clear the second byte, even if the reallocation does not occur. – aprelev Sep 18 '19 at 22:53
  • @aprelev There's no requirement for the allocator to keep track of how much memory you requested. Some implementations may, but [some implementations do not](https://git.musl-libc.org/cgit/musl/tree/src/malloc/malloc.c). – Cornstalks Sep 19 '19 at 04:39
  • @Cornstalks Indeed, standard does not require it. But efficient implementation of `realloc` does. – aprelev Sep 23 '19 at 19:11
  • @aprelev But the fact that the standard does not require it means that the extra bytes *are* a problem. – Cornstalks Sep 24 '19 at 04:12
  • 2
    @Cornstalks As far as standard is concerned, there are no extra bytes. Allocating extra bytes to "keep alignment" is something specific implementation might do, just as well as it might keep track of requested vs issued memory. As I said, `recalloc` will force the implementation to clear those extra bytes, sure. Just as alignment requirements on some architectures force the implementation to allocate those extra bytes in the first place. – aprelev Sep 24 '19 at 16:04