33

I've always programmed in Java, which is probably why I'm so confused about this:

In Java I declare a pointer:

int[] array

and initialize it or assign it some memory:

int[] array = {0,1,0}
int[] array = new int[3]

Now, in C, it's all so confusing. At first I thought it was as easy as declaring it:

int array[]

and initializing it or assigning it some memory:

int array[] = {0,1,0}
int array[] = malloc(3*sizeof(int))
int array[] = calloc(3,sizeof(int))

Unless I'm wrong, all of the above is equivalent Java-C, right?

Then, today I came across some code in which I found the following:

pthread_t tid[MAX_OPS];

and some lines below, without any kind of initialization...

pthread_create(&tid[0],NULL,mou_usuari,(void *) 0);

Surprisingly (at least to me), the code works! At least in Java, that would return a nice "NullPointerException"!

So, in order:

  1. Am I correct with all of the Java-C "translations"?

  2. Why does that code work?

  3. Is there any difference between using malloc(n*sizeof(int)) and calloc(n,sizeof(int))?

Thanks in advance

bluehallu

5 Answers

48

You can't assign memory to an array. An array has a fixed size, for the whole of its lifespan. An array can never be null. An array is not a pointer.

malloc returns the address of a memory block that is reserved for the program. You can't "assign" that memory block to an array, but you can store its address in a pointer. Luckily, array subscripting is defined in terms of pointers, so you can "use pointers like arrays", e.g.

int *ptr = malloc(5 * sizeof *ptr);
ptr[2] = 5; // access the third element "of ptr"
free(ptr); // always free at the end

When you declare an array without a size (i.e. array[]), it simply means the size of the array is determined from the initializer list. That is

int array[] = {1, 2, 3, 4, 5}; // is equal to
int array[5] = {1, 2, 3, 4, 5};

Trying to declare an array without a size and without an initializer is an error.
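
A minimal sketch of the two points above: the compiler infers the size from the initializer list, and an array is not a pointer. The function names are hypothetical and only serve the illustration.

```c
#include <stddef.h>

/* Number of elements in an array whose size the compiler infers
   from its initializer list. */
size_t inferred_length(void)
{
    int array[] = {1, 2, 3, 4, 5};
    /* sizeof sees the whole array object here, not a pointer. */
    return sizeof array / sizeof array[0];
}

/* Contrast: once an array is passed to a function, only a pointer
   arrives, so sizeof reports the size of the pointer itself. */
size_t size_seen_through_pointer(const int *p)
{
    return sizeof p;
}
```

Calling inferred_length() yields 5, while size_seen_through_pointer() returns sizeof(int *) no matter how large the original array was.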


The code pthread_t tid[MAX_OPS]; declares an array named tid with MAX_OPS elements of type pthread_t.

If the array has automatic storage (i.e. the declaration is inside a function and is not static), then each of the array's elements has an indeterminate value, and trying to read such a value would cause undefined behavior. Luckily, all the function call does is take the address of the first element of the array as its first argument, and it presumably initializes that element inside the function.
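
As a rough sketch of why this works, the following starts a few threads using an uninitialized pthread_t array. The value of MAX_OPS and the worker function are assumptions made for the example; they are not taken from the asker's code.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_OPS 4  /* hypothetical value; the question does not show it */

/* Hypothetical thread body: returns its argument doubled. */
static void *worker(void *arg)
{
    intptr_t n = (intptr_t)arg;
    return (void *)(n * 2);
}

/* Starts MAX_OPS threads and returns the sum of their results,
   or -1 if a thread could not be created or joined. */
long run_workers(void)
{
    pthread_t tid[MAX_OPS];  /* storage exists, contents are indeterminate */
    long sum = 0;

    for (intptr_t i = 0; i < MAX_OPS; i++) {
        /* &tid[i] is a valid address; pthread_create writes the ID there
           and never reads the old contents. */
        if (pthread_create(&tid[i], NULL, worker, (void *)i) != 0)
            return -1;
    }
    for (int i = 0; i < MAX_OPS; i++) {
        void *result;
        if (pthread_join(tid[i], &result) != 0)
            return -1;
        sum += (long)(intptr_t)result;
    }
    return sum;
}
```

The array elements are only read (by pthread_join) after pthread_create has stored a thread ID in them, so no indeterminate value is ever used. Depending on the toolchain, the program may need to be built with -pthread.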


The difference between calloc and malloc is that the memory block that calloc returns is initialized to zero. That is:

int *ptr = calloc(5, sizeof *ptr);
// is somewhat equal to
int *ptr = malloc(5 * sizeof *ptr);
memset(ptr, 0, 5 * sizeof *ptr);
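
A small sketch that puts the two variants side by side, with the NULL checks the snippet above leaves out. The helper names are made up for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Zero-initialized block of n ints via calloc; NULL on failure. */
int *zeroed_with_calloc(size_t n)
{
    return calloc(n, sizeof(int));
}

/* Same result via malloc followed by memset; NULL on failure. */
int *zeroed_with_malloc(size_t n)
{
    int *p = malloc(n * sizeof *p);
    if (p != NULL)
        memset(p, 0, n * sizeof *p);
    return p;
}

/* Returns 1 if every one of the n ints is zero, otherwise 0. */
int all_zero(const int *p, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (p[i] != 0)
            return 0;
    }
    return 1;
}
```

Both helpers return a block for which all_zero() reports 1; the caller is responsible for calling free() on either result.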

The difference between

int *ptr = malloc(5 * sizeof *ptr);
// and
int array[5];

is that array has automatic storage (it is typically stored on the stack) and is "released" when it goes out of scope. The block that ptr points to, however, is dynamically allocated (on the heap) and must be freed by the programmer.
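
To illustrate the lifetime difference, here is a sketch with a hypothetical helper that hands a heap block back to its caller, something a local array cannot do safely.

```c
#include <stdlib.h>

/* Returns a heap block holding 0, 1, ..., n-1, or NULL on failure.
   The block stays valid after this function returns; the caller frees it. */
int *make_counting_array(size_t n)
{
    int local[3] = {0, 1, 2};  /* automatic: gone when the function returns */
    (void)local;               /* returning &local[0] would be a bug */

    int *heap = malloc(n * sizeof *heap);
    if (heap == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        heap[i] = (int)i;
    return heap;               /* outlives this function call */
}
```

A caller would use it as int *a = make_counting_array(3); and later call free(a); once the data is no longer needed.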

J-a-n-u-s
eq-
  • 1
    The 1st paragraph has some dangerously ambiguous assertions. The OP was not trying to assign memory to an array, he was attempting to assign a (void *), the return from malloc() to an array, and if that array had been a int *Array[i], probably in a for{} loop, it would work fine, and is the basis for how dynamic, multidimensional arrays are allocated off the heap. Also, C99 supports variable sized arrays allocated off the stack, a feature few C programmers use, most preferring alloca(), myself included. http://stackoverflow.com/q/1018853/2548100 – user2548100 Jan 16 '14 at 18:55
  • 1
    calloc() is pretty much just memset(malloc(n * mysize),0, (n * mysize)). Especially because C uses null-terminated strings, calloc() is very useful, especially when viewing strings in a debugger, which typically shows the string only up to the null-terminator. If you're just starting out with C, use calloc instead of malloc, it will save you from making a lot of non-terminated C string errors that can and probably will crash your program. For production/release code, use calloc() only when you actually need to initialize the buffer/array/vector to (_int8) 0. – user2548100 Jan 16 '14 at 19:01
  • 7
    Just to wrap things up, and for completeness, an Array IS a pointer. In fact, any array name in C is exactly, precisely a pointer to the base of the first byte of the 1st object in the array, and nothing more. For people coming from Java, .Net, etc, it's helpful to know that C keeps the type of objects/variables completely separate from the storage allocated to hold them. This is why you can cast a pointer as an int, create UNIONs, etc. Very, very flexible, but dangerous for noobies. When you allocate an int array, its just storage at a location. You can put anything you like in that storage. – user2548100 Jan 16 '14 at 19:09
6

You are missing three very basic, closely related (and easily confused!) C topics:

  • the difference between array and pointers
  • the difference between static and dynamic allocation
  • the difference between declaring variables on the stack and on the heap

If you write int array[] = malloc(3*sizeof(int)); you would get a compilation error (something like 'identifier' : array initialization needs curly braces).

This means that declaring an array allows only static initialization:

  • int array[] = {1,2,3}; that reserves 3 contiguous integers on the stack;
  • int array[3] = {1,2,3}; which is the same as the previous one;
  • int array[3]; that still reserves 3 contiguous integers on the stack, but does not initialize them (the content will be random garbage)
  • int array[4] = {1,2,3}; when the initializer list doesn't initialize all the elements, the rest are set to 0 (C99 §6.7.8/19): in this case you'll get 1,2,3,0
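
The last case in the list can be checked with a tiny sketch (the function name is hypothetical):

```c
/* Returns the element that the initializer list did not cover. */
int last_of_partial(void)
{
    int array[4] = {1, 2, 3};  /* array[3] has no explicit initializer */
    return array[3];           /* set to 0 by the partial-initializer rule */
}
```

This is also why int array[4] = {0}; is a common idiom for zeroing a whole local array.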

Note that in all these cases you are not allocating new memory, you are just using the memory already committed to the stack. You would run into a problem only if the stack is full (you guessed it: that would be a stack overflow). For this reason declaring int array[]; would be wrong and meaningless.

To use malloc you have to declare a pointer: int* array.

When you write int* array = malloc(3*sizeof(int)); you are actually doing three operations:

  1. int* array tells the compiler to reserve a pointer on the stack (an integer variable that contains a memory address)
  2. malloc(3*sizeof(int)) allocates on the heap 3 contiguous integers and returns the address of the first one
  3. = copies that return value (the address of the first integer you have allocated) into your pointer variable
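
The three operations can be written out as separate statements. This is only a sketch; the function name and the value stored are made up for the example.

```c
#include <stdlib.h>

/* Performs the three steps one at a time and returns the stored value,
   or -1 if the allocation fails. */
int first_of_three(void)
{
    int *array;                             /* 1. pointer variable (automatic) */
    void *block = malloc(3 * sizeof(int));  /* 2. heap block for 3 ints */
    if (block == NULL)
        return -1;
    array = block;                          /* 3. copy the address into the pointer */

    array[0] = 42;                          /* use the block through the pointer */
    int value = array[0];
    free(block);                            /* release the heap block */
    return value;
}
```

Note that in C the void * returned by malloc converts to int * without a cast.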

So, to come back to your question:

pthread_t tid[MAX_OPS];

is an array on the stack, so it doesn't need to be allocated (if MAX_OPS is, say, 16, then the stack will have enough contiguous bytes reserved to hold 16 pthread_t values). The content of this memory will be garbage (stack variables are not initialized to zero), but pthread_create returns a value through its first parameter (a pointer to a pthread_t variable) and disregards any previous content, so the code is just fine.

lornova
  • 5
    for the `int array[4]`, they're all initialized. When the initializer list doesn't initialize all the elements, the rest are set to 0/NULL (C99 §6.7.8/19). – Matthew Flaschen Nov 21 '10 at 14:29
  • This is confusing; "heap" and "dynamic allocation" refer to the same thing. "static initialization" means initializing static variables, which is not the case when talking about so-called "stack" variables. The type of allocation in `int array[3];` inside a function , is "automatic allocation" (or "stack" informally, some systems don't have a stack), not "static". – M.M Apr 28 '16 at 21:49
1

C offers static memory allocation as well as dynamic: you can allocate arrays off the stack or in executable memory (managed by the compiler). This is just the same as how, in Java, you can allocate an int on the stack or an Integer on the heap. Arrays in C are just like any other stack variable: they go out of scope, etc. In C99 they can also have a variable size, although they cannot be resized.

The main difference between {} and malloc/calloc is that {} arrays are statically allocated (don't need freeing) and automatically initialized for you, whereas malloc/calloc arrays must be freed explicitly and you have to initialize them explicitly. But of course, malloc/calloc arrays don't go out of scope and you can (sometimes) realloc() them.
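
As a sketch of the realloc() point, the hypothetical helper below resizes a malloc'd array without losing the original block if the call fails. It assumes new_n is greater than zero, since a zero-size realloc behaves differently across implementations.

```c
#include <stdlib.h>

/* Resizes the block *p to hold new_n ints (new_n > 0 assumed).
   Returns 1 on success and 0 on failure; on failure *p is left
   untouched and is still valid. */
int resize_ints(int **p, size_t new_n)
{
    int *tmp = realloc(*p, new_n * sizeof **p);
    if (tmp == NULL)
        return 0;   /* old block is still allocated */
    *p = tmp;       /* existing elements are preserved */
    return 1;
}
```

Assigning the result of realloc straight back to the original pointer would leak the old block on failure, which is why the sketch goes through a temporary.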

Puppy
  • 1
    The arrays are only static if outside any function or explicitly marked `static`; otherwise they're automatic – M.M Apr 28 '16 at 21:50
0

2 - This array declaration is static:

pthread_t tid[MAX_OPS];

We don't need to allocate a memory block ourselves, unlike with dynamic allocation:

pthread_t *tid = (pthread_t *)malloc( MAX_OPS * sizeof(pthread_t) );

Don't forget to free the memory:

free(tid);

3 - The difference between malloc and calloc is that calloc allocates a block of memory for an array and initializes all its bits to 0.

remy_jourde
-1

I find it helpful when you are programming in C (as opposed to C++) to indicate *array explicitly, to remember that there is a pointer that can be moved around. So I would like to start by rephrasing your example as:

int array[] = {0,1,2};
int *array = malloc(3*sizeof(int));
int *array = calloc(3,sizeof(int));

The first makes it clear that there is something called array which is pointing to a block of memory that contains a 0, 1 and 2. array can't be moved elsewhere.

Your next code: pthread_t tid[MAX_OPS];

Does in fact cause an array of sizeof(pthread_t) * MAX_OPS bytes to be allocated. But it does not allocate a pointer called *tid. There is an address of the base of the array, but you can't move it elsewhere.

The pthread_t type is actually a cover for a pointer. So tid above is actually an array of pointers. And they are all statically allocated but they are not initialized.

pthread_create takes the location at the beginning of the array (&tid[0]), which is a pointer, and allocates a block of memory to hold the pthread data structure. The pointer is set to point to the new data structure and the data structure is allocated.

Your last question --- the difference between malloc(n*sizeof(int)) and calloc(n,sizeof(int)) is that the latter initializes each byte to 0, while the former does not.

vy32