
I don't really understand why method 1 works but not method 2. I don't really see why it works for characters and not an int.

#include <stdlib.h>
#include <stdio.h>
int main(void)
{
    ///  WORK (METHODE 1)
    char **string_array = malloc(sizeof(char **) * 10);
    string_array[0] = "Hi there";
    printf("%s\n", string_array[0]); /// -> Hi there
    
    /// DOES NOT WORK (METHODE 2)
    int **int_matrix = malloc(sizeof(int **) * 10);
    int_matrix[0][0] = 1;  // -> Segmentation fault
    
    /// WORK (METHODE 3)
    int **int_matrix2 = malloc(sizeof(int *));
    for (int i = 0; i < 10; i++)
    {
        int_matrix2[i] = malloc(sizeof(int));
    }
    int_matrix2[0][0] = 42;
    printf("%d\n", int_matrix2[0][0]); // -> 42
}
  • Aside: Wrong type in `char **string_array = malloc(sizeof(char **) * 10);`. Better as `char **string_array = malloc(sizeof(char *) * 10);`. Best as `char **string_array = malloc(sizeof *string_array * 10);`. – chux - Reinstate Monica Nov 22 '21 at 22:58
  • i don't really understand what you mean by "wrong type" –  Nov 22 '21 at 23:02
  • memory is reserved for `int_matrix[0]`, although typed incorrectly. Memory is not reserved for `int_matrix[0][0]`, so dereferencing with that second square bracket invokes undefined behavior, which in your case manifests in a segfault. – yano Nov 22 '21 at 23:03
  • The `int` equivalent of Method 1 would be `int_matrix[0] = (int[]){2, 5, 7, 9};` (nearly) – M.M Nov 22 '21 at 23:07
  • @Twyy Your code used `sizeof(char **)`--> Why that incorrect type? – chux - Reinstate Monica Nov 22 '21 at 23:13
  • METHODE 3 also fails as `int **int_matrix2 = malloc(sizeof(int *)); for (int i = 0; i < 10; i++) { int_matrix2[i] = malloc(sizeof(int)); }` is only valid for the first loop iteration. – chux - Reinstate Monica Nov 22 '21 at 23:18

3 Answers


In terms of the types, you want to allocate memory for the type "one level up" from the pointer you're assigning it to. For example, an int pointer (an int*), points to one or more ints. That means, when you allocate space for it, you should allocate based on the int type:

#define NUM_INTS 10
...
int* intPtr = malloc(NUM_INTS * sizeof(int));
//                                      ^^ // we want ints, so allocate for sizeof(int)

In one of your cases, you have a double int pointer (an int**). This must point to one or more int pointers (int*), so that's the type you need to allocate space for:

#define NUM_INT_PTRS 5
...
int** myDblIntPtr = malloc(NUM_INT_PTRS * sizeof(int*));
//                                               ^^ "one level up" from int** is int*

However, there's an even better way to do this. You can take the size of the object the pointer points to rather than the size of a type:

int* intPtr = malloc(NUM_INTS * sizeof(*intPtr));

Here, intPtr is an int* type, and the object it points to is an int, and that's exactly what *intPtr gives us. This has the added benefit of less maintenance. Pretend some time down the line, int* intPtr changes to int** intPtr. For the first way of doing things, you'd have to change code in two places:

int** intPtr = malloc(NUM_INTS * sizeof(int*));
// ^^ here                              ^^ and here

However, with the 2nd way, you only need to change the declaration:

int** intPtr = malloc(NUM_INTS * sizeof(*intPtr));
// ^^ still changed here                 ^^ nothing to change here

With the change of declaration from int* to int**, *intPtr also changed "automatically", from int to int*. This means that the paradigm:

T* myPtr = malloc(NUM_ITEMS * sizeof(*myPtr));

is preferred, since *myPtr will always refer to the correct object we need to size for the correct amount of memory, no matter what type T is.
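As a minimal sketch of that paradigm applied to a table of int pointers: the helper name `make_int_ptr_table`, the NULL check, and the initialisation loop are my own additions for illustration, not part of the question's code.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate a table of `count` int pointers, all set to NULL.
   The size comes from the pointed-to object (*table), not from a spelled-out type. */
int **make_int_ptr_table(size_t count)
{
    int **table = malloc(count * sizeof *table); /* sizeof *table == sizeof(int *) */
    if (table == NULL)
    {
        return NULL; /* allocation failed, the caller must check */
    }
    for (size_t i = 0; i < count; i++)
    {
        table[i] = NULL; /* give every slot a known value before use */
    }
    return table;
}
```

The caller owns the result and releases it with free(). Each slot still has to be pointed at real storage before it is dereferenced.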

yano

Others have already answered most of the question, but I thought I would add some illustrations...

When you want an array-like object, i.e., a sequence of consecutive elements of a given type T, you use a pointer to T, T *, but you want to point to objects of type T, and that is what you must allocate memory for.

If you want to allocate 10 T objects, you should use malloc(10 * sizeof(T)). If you have a pointer to assign the array to, you can get the size from that pointer:

T * ptr = malloc(10 * sizeof *ptr);

Here *ptr has type T and so sizeof *ptr is the same as sizeof(T), but this syntax is safer for reasons explained in other answers.

When you use

T * ptr = malloc(10 * sizeof(T *));

you do not get memory for 10 T objects, but for 10 T * objects. If sizeof(T*) >= sizeof(T) you are fine, except that you are wasting some memory, but if sizeof(T*) < sizeof(T) you have less memory than you need.

How much memory you want versus how much you get.
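To make the size mismatch concrete, here is a small sketch. The type `struct sample` and both function names are hypothetical, chosen only because the struct is larger than a pointer on common platforms.

```c
#include <stddef.h>

/* Hypothetical record type, assumed to be larger than a pointer. */
struct sample
{
    double values[8];
};

/* Bytes needed for `count` struct sample objects. */
size_t bytes_needed(size_t count)
{
    return count * sizeof(struct sample);
}

/* Bytes you would request if you wrote sizeof(struct sample *) by mistake. */
size_t bytes_with_wrong_type(size_t count)
{
    return count * sizeof(struct sample *);
}
```

On a system with 8-byte doubles and 8-byte pointers, ten elements would need 640 bytes, while the mistaken request would ask for only 80.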

Whether you run into this problem or not depends on your objects and the system you are on. On my system, all pointers have the same size, 8 bytes, so it doesn't really matter if I allocate

  char **string_array = malloc(sizeof(char **) * 10);

or

  char **string_array = malloc(sizeof(char *) * 10);

or if I allocate

  int **int_matrix = malloc(sizeof(int **) * 10);

or

  int **int_matrix = malloc(sizeof(int *) * 10);

but it could matter on other architectures.

For your third solution, you have a different problem. When you allocate

  int **int_matrix2 = malloc(sizeof(int *));

you allocate space for a single int pointer, but you immediately treat that memory as if you had 10

    for (int i = 0; i < 10; i++)
    {
        int_matrix2[i] = malloc(sizeof(int));
    }

You can safely assign to the first element, int_matrix2[0] (but there is a problem with how you do it that I get to); the following 9 addresses you write to are not yours to modify.

What you have in your int * array and what you think you have.
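A hedged sketch of how method 3 could be repaired: reserve room for all the pointers first, then give each one its own int. The names `make_rows` and `free_rows` and the error handling are assumptions of mine, not taken from the question.

```c
#include <stdlib.h>

/* Hypothetical helper: build `rows` rows of one int each. */
int **make_rows(size_t rows)
{
    int **m = malloc(rows * sizeof *m); /* room for `rows` pointers, not just one */
    if (m == NULL)
    {
        return NULL;
    }
    for (size_t i = 0; i < rows; i++)
    {
        m[i] = malloc(sizeof *m[i]); /* one int per row, as in the question */
        if (m[i] == NULL)
        {
            while (i > 0)
            {
                free(m[--i]); /* undo the rows allocated so far */
            }
            free(m);
            return NULL;
        }
    }
    return m;
}

/* Release every row, then the table of pointers itself. */
void free_rows(int **m, size_t rows)
{
    if (m == NULL)
    {
        return;
    }
    for (size_t i = 0; i < rows; i++)
    {
        free(m[i]);
    }
    free(m);
}
```

With `int **int_matrix2 = make_rows(10);` all ten slots are valid, so writing `int_matrix2[9][0]` is as safe as writing `int_matrix2[0][0]`.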

The next issue is that once you have allocated the first dimension of your matrix, you have an array of pointers. Those pointers are not initialised, and presumably pointing at random places in memory.

Uninitialised array of pointers.

That isn't a problem yet; it doesn't do any harm that these pointers are pointing into the void. You can just point them to somewhere else. This is what you do with your char ** array. You point the first pointer in the array to a string, and it is happy to point there instead.

Safe update of char ** array.

Once you have pointed the pointers somewhere safe, you can access the memory there. But you cannot safely dereference the pointers while they are uninitialised. That is what you try to do with your integer array. At int_matrix[0] you have an uninitialised pointer. The type system doesn't warn you about that (it can't), so you can easily compile code that modifies int_matrix[0][0], but if int_matrix[0] is pointing into the void, int_matrix[0][0] is not an address you can safely read or write. What happens if you try is undefined, and undefined is generally a way of saying that something bad will happen.

Unsafe dereference of the int_matrix.
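The int counterpart of the char ** case can be sketched like this: first point the slot at a real object, then write through it. The function name `write_through_first_slot` and its parameters are hypothetical.

```c
#include <stdlib.h>

/* Sketch: point slot 0 of a 10-pointer table at `row` (storage owned by the
   caller) and write `value` through it. Returns 0 on success, -1 if malloc fails. */
int write_through_first_slot(int *row, int value)
{
    int **int_matrix = malloc(10 * sizeof *int_matrix);
    if (int_matrix == NULL)
    {
        return -1;
    }
    int_matrix[0] = row;      /* like string_array[0] = "Hi there": only the pointer changes */
    int_matrix[0][0] = value; /* safe now, because int_matrix[0] points at a real int */
    free(int_matrix);         /* frees the table only; row still belongs to the caller */
    return 0;
}
```

Calling it with a local array, for example `int row[4] = {0}; write_through_first_slot(row, 1);`, leaves `row[0]` equal to 1.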

You can get what you want in several ways. The closest to what it looks like you are trying is to implement matrices as arrays of pointers to arrays of values.

Array of arrays matrix.

There, you just have to remember to allocate the arrays for each row in your matrix as well.

#include <stdio.h>
#include <stdlib.h>

int **new_matrix(int n, int m)
{
    int **matrix = malloc(n * sizeof *matrix);
    for (int i = 0; i < n; i++)
    {
        matrix[i] = malloc(m * sizeof *matrix[i]);
    }
    return matrix;
}

void init_matrix(int n, int m, int **matrix)
{
    for (int i = 0; i < n; i++)
    {
        for (int j = 0; j < m; j++)
        {
            matrix[i][j] = 10 * i + j + 1;
        }
    }
}

void print_matrix(int n, int m, int **matrix)
{
    for (int i = 0; i < n; i++)
    {
        for (int j = 0; j < m; j++)
        {
            printf("%d ", matrix[i][j]);
        }
        printf("\n");
    }
}

int main(void)
{
    int n = 3, m = 5;
    int **matrix = new_matrix(n, m);
    init_matrix(n, m, matrix);
    print_matrix(n, m, matrix);

    return 0;
}

Here, each row can lie anywhere in memory, but you can also put the rows in contiguous memory: allocate all the memory in a single malloc and compute indices to get at the two-dimensional matrix structure.

Flattened matrix.

Row i starts at offset i * m in this flat array, so the element at row i, column j is matrix[i * m + j].

#include <stdio.h>
#include <stdlib.h>

int *new_matrix(int n, int m)
{
    int *matrix = malloc(n * m * sizeof *matrix);
    return matrix;
}

void init_matrix(int n, int m, int *matrix)
{
    for (int i = 0; i < n; i++)
    {
        for (int j = 0; j < m; j++)
        {
            matrix[m * i + j] = 10 * i + j + 1;
        }
    }
}

void print_matrix(int n, int m, int *matrix)
{
    for (int i = 0; i < n; i++)
    {
        for (int j = 0; j < m; j++)
        {
            printf("%d ", matrix[m * i + j]);
        }
        printf("\n");
    }
}

int main(void)
{
    int n = 3, m = 5;
    int *matrix = new_matrix(n, m);
    init_matrix(n, m, matrix);
    print_matrix(n, m, matrix);

    return 0;
}

With the exact same memory layout, you can also use multidimensional arrays. If you declare a matrix as int matrix[n][m] you will get what amounts to an array of length n where the objects in the arrays are integer arrays of length m, exactly as on the figure above.

If you just write that declaration inside a function, you are putting the matrix on the stack (it has automatic storage duration), but you can also allocate such matrices dynamically if you use a pointer to int [m] arrays.

#include <stdio.h>
#include <stdlib.h>

void *new_matrix(int n, int m)
{
    int(*matrix)[n][m] = malloc(sizeof *matrix);
    return matrix;
}

void init_matrix(int n, int m, int matrix[static n][m])
{
    for (int i = 0; i < n; i++)
    {
        for (int j = 0; j < m; j++)
        {
            matrix[i][j] = 10 * i + j + 1;
        }
    }
}

void print_matrix(int n, int m, int matrix[static n][m])
{
    for (int i = 0; i < n; i++)
    {
        for (int j = 0; j < m; j++)
        {
            printf("%d ", matrix[i][j]);
        }
        printf("\n");
    }
}

int main(void)
{
    int n = 3, m = 5;
    int(*matrix)[m] = new_matrix(n, m);
    init_matrix(n, m, matrix);
    print_matrix(n, m, matrix);

    int(*matrix2)[3 * m] = new_matrix(2 * n, 3 * m);
    init_matrix(2 * n, 3 * m, matrix2);
    print_matrix(2 * n, 3 * m, matrix2);

    return 0;
}

The new_matrix() function returns a void * because the return type cannot depend on the runtime arguments n and m, so I cannot return the right type.

Don't let the function types fool you here. The functions that take a matrix[n][m] argument do not check whether the matrix has the right dimensions. You can get a little type checking with pointers to arrays, but pointer decay will generally limit the checking. The last solution is really only different syntax for the previous one, and the arguments n and m determine how the (flat) memory that matrix points to is interpreted.
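A small sketch of the "different syntax, same memory" point: fill a block through two-dimensional indexing, then read it back through a flat int pointer. The function name `same_layout` is hypothetical, and the code assumes a compiler that supports variable length array types (C99, optional in C11).

```c
#include <stdlib.h>

/* Returns 1 if matrix[i][j] and flat[i * m + j] name the same ints,
   0 if they do not, and -1 if the allocation fails. */
int same_layout(int n, int m)
{
    int (*matrix)[m] = malloc(n * sizeof *matrix); /* n rows of m ints in one block */
    if (matrix == NULL)
    {
        return -1;
    }
    for (int i = 0; i < n; i++)
    {
        for (int j = 0; j < m; j++)
        {
            matrix[i][j] = 10 * i + j + 1;
        }
    }

    int *flat = (void *)matrix; /* same block, viewed as n * m consecutive ints */
    int ok = 1;
    for (int i = 0; i < n; i++)
    {
        for (int j = 0; j < m; j++)
        {
            if (flat[i * m + j] != 10 * i + j + 1)
            {
                ok = 0;
            }
        }
    }
    free(matrix);
    return ok;
}
```

Here `malloc(n * sizeof *matrix)` sizes the block from the row type int [m], so a single free() releases the whole matrix.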

Thomas Mailund
  • What tool do you use for making those drawings? – tstanisl Nov 23 '21 at 09:39
  • I use OmniGraffle. It's a macOS/iOS tool only, I think. It's pretty nice on a tablet for some quick sketching, and then it synchronises to the desktop/laptop. I've used it for a couple of years for books and papers, and I'm pretty happy with it. – Thomas Mailund Nov 23 '21 at 09:50
  • Thank you. I will try it. BTW, a minor improvement to your answer: you may consider using `void print_matrix(int n, int m, int matrix[static n][m])`. This `static` indicates that the `matrix` pointer points to an array with at least `n` valid elements. Moreover, it makes the declaration of `matrix` visually different from the declaration of an array, which helps to avoid surprises when a `sizeof matrix` expression is used. – tstanisl Nov 23 '21 at 09:58
  • You are right. I'll fix that. – Thomas Mailund Nov 23 '21 at 10:52

Method 1 works only because you assign a reference to the string literal "Hi there" to the char * element of the array string_array. A string literal is simply a char array.

Try string_array[0][0] = 'a'; without that assignment and it will fail as well, because you will dereference an uninitialized pointer.

Same happens in method 2.

Method 3. You allocate memory for one int value and store a reference to it in the [0] element of the array. As the pointer references a valid object, you can dereference it (int_matrix2[0][0] = 42;).
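If the goal is a string whose characters can be changed with something like string_array[0][0] = 'a';, one option is to point the slot at writable heap memory first. Writing through a pointer to a string literal is undefined behaviour too, so the sketch below copies the text. The helper name `copy_string` is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap copy of `text` that the caller may
   modify and must free. Returns NULL if the allocation fails. */
char *copy_string(const char *text)
{
    size_t size = strlen(text) + 1; /* +1 for the terminating '\0' */
    char *copy = malloc(size);
    if (copy == NULL)
    {
        return NULL;
    }
    memcpy(copy, text, size);
    return copy;
}
```

After `string_array[0] = copy_string("Hi there");` and a NULL check, the assignment `string_array[0][0] = 'a';` writes into memory the program owns.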

0___________