
According to the standard (C17 draft, 7.22.3.2), calloc

void *calloc(size_t nmemb, size_t size);

"allocates space for an array of nmemb objects, each of whose size is size" (and initializes all bits to zero).

For calloced arrays of T, I have only ever seen code like this:

T *p = calloc(nmemb, sizeof(T));

T *p;
p = calloc(nmemb, sizeof(T));

But given that calloc allocates space for an array, the following ought to be fine too:

T (*arrp)[nmemb] = calloc(nmemb, sizeof(T));

T (*arrp)[nmemb];
arrp = calloc(nmemb, sizeof(T));

(Here, the top version of each pair is technically speaking an initialization, not an assignment.)

What type can the result of calloc be assigned to: a pointer to an array (type T (*)[]), a pointer to the type contained within the array (type T *), or either?

The following code

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int (*iarrp)[5];
    int *ip;
    float (*farrp)[1];
    float *fp;
    int i;

    iarrp = calloc(5, sizeof(int));
    for (i = 0; i < 5; ++i)
        (*iarrp)[i] = -i;
    ip = calloc(5, sizeof(int));
    for (i = 0; i < 5; ++i)
        ip[i] = i + 100;
    for (i = 0; i < 5; ++i)
        printf("%d: %d, %d\n", i, (*iarrp)[i], ip[i]);

    farrp = calloc(1, sizeof(float));
    (*farrp)[0] = 5.5;
    fp = calloc(1, sizeof(float));
    *fp = 6.6;
    printf("%.2f, %.2f\n", (*farrp)[0], *fp);

    free(iarrp);
    free(ip);
    free(farrp);
    free(fp);

    return 0;
}

compiles just fine for me with GCC (gcc -std=c17 -pedantic -Wall -Wextra) and with MSVC (cl /std:c17 /Wall), with the output being as expected:

0: 0, 100
1: -1, 101
2: -2, 102
3: -3, 103
4: -4, 104
5.50, 6.60

The background for asking this question is this: For an array arr of type T[], of the following three expressions

  • arr; type: T[] (before decay), T * (after decay)
  • &arr[0]; type: T *
    • this is what arr decays to
  • &arr; type: T (*)[]
    • the & prevents decay

the first two produce the same value. The third expression can theoretically have a value different from the first two expressions, though I understand that this is uncommon. The standard only guarantees that (void *)&arr == (void *)&arr[0] holds true.
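
The guaranteed part can be checked with a small sketch. The helper name array_addresses_agree and the element count 5 are made up for illustration; the comparison goes through void * because int (*)[5] and int * are not compatible types:

```c
#include <stdbool.h>

/* Hypothetical helper: reports whether &arr, arr (after decay), and &arr[0]
   designate the same address once the types are made comparable. */
bool array_addresses_agree(void)
{
    int arr[5] = {0};
    int (*whole)[5] = &arr;   /* & prevents decay: pointer to the whole array */
    int *decayed = arr;       /* array-to-pointer decay: pointer to element 0 */
    int *first = &arr[0];     /* explicit address of element 0 */

    /* whole and first have different types, so compare them as void *. */
    return decayed == first && (void *)whole == (void *)first;
}
```

A call such as array_addresses_agree() is expected to return true on any conforming implementation; the sketch says nothing about whether the two pointer representations are bit-identical.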

Lover of Structure
  • How do you want to use that pointer and to what functions do you want to pass it? Will any function where you pass it expect a pointer to an array? – Gerhardh Aug 21 '23 at 10:41
  • @Gerhardh I spent some time pondering the word choice in the question title. In the end I picked "should" because I don't think the standard's wording makes the choice clear. Understanding the word "should" in the sense of "which choice is *better*" isn't the meaning I intended. That said, I'll make an edit to the title clarifying this. – Lover of Structure Aug 21 '23 at 10:48
  • _"Or are both choices fine?"_ - Yes they are, but the `T*` version is the more idiomatic way to do it since you then don't have to dereference the pointer before using the subscript operator. – Ted Lyngmo Aug 21 '23 at 10:48
  • @TedLyngmo If the result can be legally assigned to a pointer to an array, that certainly makes sense, but then that's interesting because these are subtly different types. I'll make an edit, please check back in 3 minutes. – Lover of Structure Aug 21 '23 at 10:53
  • I've actually used both in combination: `int (*iarrp)[5] = malloc(sizeof *iarrp); int *iarrp_decayed = *iarrp;` where I've used the decayed, `int*`, for easy access to the array and the original `iarrp` to carry the type information, like the size. – Ted Lyngmo Aug 21 '23 at 10:54
  • A pointer to the array references the whole array, not an element in the array. A pointer to the type references an element in the array. When you write `T (*arrp)[nmemb] = calloc(nmemb, sizeof(T));` you actually allocate a single-element array of arrays – 0___________ Aug 21 '23 at 11:02
  • @TedLyngmo Okay -- note that the question is essentially about why `calloc`'s return value can (presumably) be assigned to either type even though the types are not compatible. – Lover of Structure Aug 21 '23 at 11:06
  • If you want to allocate memory for a string, you are likely to pass it to functions handling strings, i.e. expecting a `char*` as far as it is about standard C library functions. If you want to pass a pointer to those functions, using `char*` would avoid type mismatch and casts for each function call which you would need when you use `char (*)[]`. Is that something you would consider being "better"? – Gerhardh Aug 21 '23 at 11:06
  • @LoverofStructure `calloc` just returns a `void*` to a bunch of bytes. You can assign it to any (non-function) pointer type. See Eric's answer below. – Ted Lyngmo Aug 21 '23 at 11:07
  • @Gerhardh To answer your question: Yes, that would certainly match my understanding of "better" (but then I edited "should" out of the question, because I intended to ask about possibility, not which option is better). – Lover of Structure Aug 21 '23 at 11:13
  • There is not much sense in asking questions about the C memory/object model; it is defective/underspecified beyond ridiculous – Language Lawyer Aug 21 '23 at 12:11
  • @chux-ReinstateMonica Regarding the cast, I am not entirely sure. My notes originally had `(char *)`, and then I thought that was wrong (because the standard generally mentions `unsigned char` as the type of raw memory), but I don't entirely remember the source. If you have any pointers [<- lame pun, and it was unintentional], I would appreciate that. – Lover of Structure Aug 21 '23 at 12:49
  • @LoverofStructure Converting to `unsigned char*` is also OK for comparison purposes, per "A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined." – chux - Reinstate Monica Aug 21 '23 at 19:05
  • @LoverofStructure Yet converting to `void *` is better to focus on the real part of the question. – chux - Reinstate Monica Aug 21 '23 at 19:24

2 Answers


The dynamically allocated memory returned by calloc has no effective type, per C 2018 6.5 6. In C semantics, it is just a region of memory that can be used for any object type.

It can acquire an effective type by storing a value into it using a non-character type. (And this effective type can be changed by future assignments—the memory can be reused for other types.)
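
As a minimal sketch of that reuse (the helper name reuse_allocation and its out-parameters are hypothetical, not from the question), the same allocation can first hold an int and then a float, as long as each read uses the type of the most recent store:

```c
#include <stdlib.h>

/* Hypothetical helper: uses one allocation first as an int, then as a float.
   Returns 1 on success, 0 if the allocation fails. */
int reuse_allocation(int *int_out, float *float_out)
{
    /* Big enough for either type; calloc's result is suitably aligned for both. */
    size_t size = sizeof(int) > sizeof(float) ? sizeof(int) : sizeof(float);
    void *mem = calloc(1, size);
    if (mem == NULL)
        return 0;

    int *ip = mem;
    *ip = 42;            /* this store gives the memory effective type int */
    *int_out = *ip;      /* read through an int lvalue: matches the store */

    float *fp = mem;
    *fp = 1.5f;          /* a later store changes the effective type to float */
    *float_out = *fp;    /* read through a float lvalue: matches the new store */

    free(mem);
    return 1;
}
```

Reading *ip after the float store would be the problematic case; the sketch deliberately avoids it.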

The C standard is unclear about the precise formal semantics of effective types with aggregates. Elements of arrays and members of structures may be stored individually in memory without storing the entire aggregate. Nonetheless, it is clear that we may use dynamically allocated memory to store aggregates, either all at once (for structures) or with individual elements or members.

The issue of whether the memory “should” be accessed using a pointer of type T * or T (*)[] has no consequence. Given declaration T *p0, a value is stored to the memory with *p0 = value;. In this assignment, the * is applied to the address stored in p0 to form an lvalue for a T element in the allocated memory. Given declaration T (*p1)[nmemb], a value is stored to the memory with (*p1)[i] = value;. In this assignment, the * is applied to the address stored in p1 to form an lvalue for the array and this lvalue is converted to a pointer to the first element of the array, and then the subscript operator is applied to form an lvalue for the element i of the array. So, in either case, the lvalue used for the actual store of a value into the memory is an lvalue for the element. The choice of T * or T (*)[] only affects intermediate calculations of the lvalue, not the final lvalue used for the store.
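
A short sketch of this point, assuming T is int and using a made-up constant NMEMB and helper name: stores made through the pointer to the array are visible through the pointer to the element and vice versa, because both end up forming an int lvalue for the same element.

```c
#include <stdlib.h>

enum { NMEMB = 4 };  /* hypothetical element count */

/* Hypothetical helper: returns 1 if both pointer forms reach the same
   elements, 0 if they do not, and -1 if the allocation fails. */
int stores_reach_same_elements(void)
{
    int (*p1)[NMEMB] = calloc(NMEMB, sizeof(int));
    if (p1 == NULL)
        return -1;

    int *p0 = *p1;   /* *p1 is the array lvalue; it decays to a pointer to element 0 */
    int ok = 1;

    for (int i = 0; i < NMEMB; ++i) {
        (*p1)[i] = 10 * i;              /* store via the pointer to the array */
        if (p0[i] != 10 * i)            /* read via the pointer to the element */
            ok = 0;
        p0[i] += 1;                     /* store via the pointer to the element */
        if ((*p1)[i] != 10 * i + 1)     /* read via the pointer to the array */
            ok = 0;
    }

    free(p1);
    return ok;
}
```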

The question is more interesting for structures. Given typedef struct { int i, j; } T; T *p = calloc(1, sizeof *p);, we could do *p = (T) { 7, 13 }; to assign a value to the whole structure or p->i = 7; to assign a value to just one member. If we do the latter, have we set the effective type of the memory to T? Or have we only set the effective type of part of the memory to int? The C standard does not specify this adequately.
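
The two ways of storing can be written out as follows. This is only a sketch with hypothetical helper names; it shows that both forms work in practice and does not settle which effective type the standard assigns in the member-only case.

```c
#include <stdlib.h>

typedef struct { int i, j; } T;

/* Hypothetical helper: stores the whole structure at once.
   Returns i + j, or -1 if the allocation fails. */
int store_whole_struct(void)
{
    T *p = calloc(1, sizeof *p);
    if (p == NULL)
        return -1;
    *p = (T){ 7, 13 };        /* a single store through an lvalue of type T */
    int sum = p->i + p->j;
    free(p);
    return sum;
}

/* Hypothetical helper: stores only member i; j keeps the zero from calloc.
   Returns i + j, or -1 if the allocation fails. */
int store_one_member(void)
{
    T *p = calloc(1, sizeof *p);
    if (p == NULL)
        return -1;
    p->i = 7;                 /* a single store through an lvalue of type int */
    int sum = p->i + p->j;    /* j is still 0 */
    free(p);
    return sum;
}
```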

Eric Postpischil

You ask:

What type can the result of calloc be assigned to

The answer can be found in the standard (e.g. C17 7.22.3):

The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object

That's it. Any pointer-to-object type.

Then you say:

... the following ought to be fine too:

T (*arrp)[nmemb] = calloc(nmemb, sizeof(T));

Well... yes, as long as you access the memory like (*arrp)[i] or arrp[0][i] (and 0 <= i < nmemb), it will work.

But it's not really how calloc is intended to be used. Your code tells calloc that each element has the size sizeof(T). However, you store the return value into a pointer that points to an element with size nmemb * sizeof(T). This is because your arrp is a pointer to an array of nmemb Ts.

For your code the intended form would be:

T (*arrp)[nmemb] = calloc(1, nmemb * sizeof(T));

A better way of writing this would be:

T (*arrp)[nmemb] = calloc(1, sizeof *arrp);

This brings us to the "normal" use of this code: allocating 2-dimensional arrays (aka "arrays of arrays"). Like this:

T (*arrp)[num_columns] = calloc(num_rows, sizeof *arrp);
               ^                     ^
                \-------------------/
       notice: Unlike your example these values are not the same

This code gives you a dynamic allocation that corresponds to the static/automatic allocation:

T arr[num_rows][num_columns];
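
A minimal sketch of that use, assuming T is int. The helper name sum_of_table is made up, and the column count is a compile-time constant here only so the example does not depend on variable-length array support:

```c
#include <stdlib.h>

enum { NUM_COLUMNS = 3 };  /* hypothetical; a constant avoids needing VLA support */

/* Hypothetical helper: allocates a zeroed num_rows x NUM_COLUMNS table,
   fills cell [r][c] with r * NUM_COLUMNS + c, and returns the sum of all
   cells, or -1 if the allocation fails. */
long sum_of_table(size_t num_rows)
{
    /* Each element of the allocation is one row, i.e. an int[NUM_COLUMNS]. */
    int (*arrp)[NUM_COLUMNS] = calloc(num_rows, sizeof *arrp);
    if (arrp == NULL)
        return -1;

    long sum = 0;
    for (size_t r = 0; r < num_rows; ++r) {
        for (size_t c = 0; c < NUM_COLUMNS; ++c) {
            arrp[r][c] = (int)(r * NUM_COLUMNS + c);  /* same syntax as a 2D array */
            sum += arrp[r][c];
        }
    }

    free(arrp);   /* one free releases the whole table */
    return sum;
}
```

Here the first calloc argument is the number of rows and the second is the size of one row, so the element size matches what arrp points to.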

BTW:

This could be an interesting read:

Why does calloc require two parameters and malloc just one?

Support Ukraine
  • The reason why calloc has 2 parameters is because it has a dysfunctional, flawed API and that's about it. Any use of calloc which is not `calloc(1, ...` is code smell, IMO. – Lundin Aug 21 '23 at 14:28
  • @Lundin I disagree... any use of `calloc` where the size-argument doesn't match the size of what the pointer storing the return value points to has a code smell... a big one. – Support Ukraine Aug 21 '23 at 18:54
  • The first parameter is very often used for various flavours of code obfuscation. You _can_ use it without making the code less readable and error prone, but many programmers love to complicate things for the heck of it. `T (*arrp)[num_columns] = calloc(num_rows, sizeof *arrp);` is in my opinion needlessly complicated and could be written as `T (*arrp)[num_columns] = calloc(1, sizeof(T[num_rows][num_columns]) );` to clearly declare the intent of the call - allocating a 2D array. And it's self-documenting code. The `sizeof *var` trick doesn't hold in many situations where you change the type. – Lundin Aug 22 '23 at 06:29
  • For example K&R 1st edition ("The Obfuscation Bible") has code like `char* ptr; .. calloc(BUFFSIZE, 1)`. So this function already had a rotten API in the early 1970s. We shouldn't pretend that it makes sense to have 2 arguments. – Lundin Aug 22 '23 at 06:36
  • @Lundin I have not made any claim about the `calloc` API being good or bad so I'm a bit surprised that you bring that up. To me it is what it is. To me the intent of it is very clear (number of elements and size of a single element). When I use `calloc` I prefer to follow that intent. IMO that gives simple and easy-to-read code. It's fair that you find another way less complicated. The good thing is that both ways will work. One of the linked answers suggests the "size of element" parameter may have had some use "long ago". But to me that answer isn't really convincing. – Support Ukraine Aug 22 '23 at 07:44