What is the effect of dynamically allocating 100 bytes for an int*, and then trying to set values to it using pointer arithmetic?

Question

A test I got back recently had a question to the effect of, "assuming this code compiles, what will it do?"

Code:

int *ptr;

ptr = (int *) malloc( 25 * sizeof(int)); //100 bytes

*ptr = 'x';
*(ptr + 1) = 'x';
.... //go through all the values from 1 to 99 as well
*(ptr +99) = 'x';

I have written the code and run it, and the result, when printed with printf("%d", *ptr), is 120, the ASCII value of 'x'. I understand that the int is simply being set to 'x', and that printing it as an int shows the ASCII value, but I am stumped as to what the actual effect of the malloc is, and what all of the *(ptr + i) expressions actually do.

malloc does not necessarily allocate 100 bytes; it allocates `25*sizeof(int)` bytes. `*(ptr + n)` works because the allocated memory is contiguous, so `*(ptr + 0)` gives you the int at the first location, `*(ptr + 1)` the second, and so on. It's very similar to the way C arrays are accessed, and in fact you can use the same syntax, i.e. `*(ptr + 0)` is the same as `ptr[0]` — George, Dec 13 '16 at 14:57
`*(ptr +99) = 'x';` is the same as `ptr[99] = 'x';`, but you have a problem here: you allocate space for 25 `int`s, so the maximum index you can use is 24 (e.g. `ptr[24] = ...;`). Accessing anything further (`ptr[25]`, `ptr[26]`, ...) results in undefined behaviour. — Jabberwocky, Dec 13 '16 at 14:59
Summing up, the answer to the test question is: "Invoke undefined behavior". I'm now wondering if the teacher did that on purpose or honestly did not understand what they wrote. — UnholySheep, Dec 13 '16 at 15:05
@unholySheep that is definitely the answer he was going for, thanks — Tcpowers, Dec 13 '16 at 15:09

awerchniak · Accepted Answer · 2016-12-13T16:26:11.070

In C, arrays and pointers are very similar, and for simplicity's sake it is convenient in this case to think of them as the same. You can think of that malloc either as dynamically allocating an array of 25 integers (much like declaring int ptr[25], but with dynamic storage), or as reserving space for 25 consecutive int objects in memory and marking it as valid. Either way, ptr == &ptr[0]. The dereferencing operator, *, means 'access the object stored at this address', and it essentially 'undoes' the & operator, so *ptr == *(&ptr[0]) == ptr[0]. The statement *ptr = 'x'; therefore simply sets ptr[0] to 'x', which has an ASCII value of 120 (it prints as 120 with %d because the elements are of type int, not char). The rest of the assignments do the same for ptr[1], ptr[2], and so on. Anything past ptr + 24 is undefined behavior, because you have only allocated 25 integers: depending on your compiler and operating system you may get a segmentation fault, an invalid write reported by a memory checker, or a program that appears to work while silently corrupting memory. You must not write to ptr[99] if you only allocated 25 slots.

Arrays and pointers are similar in many cases, but they're [**not the same thing**](http://c-faq.com/aryptr/). Some examples http://stackoverflow.com/q/1704407/995714 http://stackoverflow.com/q/39444244/995714 — phuclv, Dec 13 '16 at 16:03
You are right. For simplicity's sake in this example it is convenient to think of them as the same, as they act very similarly. I have updated the response to reflect your point. Thanks — awerchniak, Dec 13 '16 at 17:43

Sourav Ghosh · Answer 2 · 2016-12-13T15:09:09.113

The actual effect of malloc() is to make the statement *ptr = 'x'; and the subsequent accesses actually valid.

Without the memory allocation, an attempt to dereference the pointer would invoke undefined behavior.

That said,

You must check that malloc() succeeded before attempting to dereference the returned pointer.
Pointer arithmetic honors the data type, so an expression like (ptr + 1) points to the memory location of the next integer, not the next byte of memory. Consequently, dereferencing (ptr + <n>) for any n > 24 will invoke UB.
The assumption that 25 * sizeof(int) == 100 bytes is very much implementation-specific. If sizeof(int) is less than 4 bytes, the block is smaller than 100 bytes, so you'll end up accessing out-of-bounds memory even if you alias the pointer to char * and treat the block as 100 bytes.

In the OP's code, `ptr[25]` is UB, independently of the size of `int` — Jabberwocky, Dec 13 '16 at 15:08

score 1 · Answer 3 · answered Dec 13 '16 at 15:22

I am stumped when it comes to what the actual effect of the malloc is

The malloc call allocates the space for your array. When you initially declare ptr, it's not initialized to point to a valid memory location:

     +---+
ptr: |   | ----> ???
     +---+

Attempting to read or write through ptr at this time will lead to undefined behavior; your code may crash outright, or it may corrupt storage somehow, or it may appear to run without any issues.

The malloc call allocates space from the heap (a.k.a., a dynamic memory pool) and assigns the address of the first element of that space to ptr:

     +---+
ptr: |   | ---+
     +---+    |
      ...     |
       +------+
       |
       V
     +---+
     |   | ptr[0]
     +---+
     |   | ptr[1]
     +---+
      ...

Please note that the (int *) cast on the malloc call has not been necessary since the 1989 standard, and is actually considered bad practice (under C89, it could mask a bug). IMO, the best way to write a malloc call is

T *p = malloc( N * sizeof *p );

where T is any type, and N is the number of elements of type T you want to allocate. Since the expression *p has type T, sizeof *p is equivalent to sizeof (T).

and what all of the *(ptr + i) actually does.

*(ptr + i) is equivalent to ptr[i], so

*ptr = 'x';
*(ptr + 1) = 'x';

are equivalent to writing

ptr[0] = 'x';
ptr[1] = 'x';

Please note that

*(ptr +99) = 'x';

is outside the range of the array you've allocated; you only set aside enough space for 25 integers. Again, this operation (and any operation *(ptr + i) = 'x'; where i is greater than 24) will lead to undefined behavior, and your code may crash, corrupt data, or appear to work.

Pointer arithmetic takes the pointed-to type into account; ptr + 1 yields the address of the next integer object following the one at ptr. Thus, if ptr is 0x8000 and sizeof (int) is 4, then ptr + 1 yields 0x8004, not 0x8001.

score 0 · Answer 4 · answered Dec 13 '16 at 15:13

C pointer arithmetic is based on the definition that ptr[i] is *(ptr + i).

That means that when you allocate space for 25 ints, any access past element 24 (the last one) invokes Undefined Behaviour: you are trying to access memory whose contents and purpose you do not know.

But it is allowed to access any object at the byte level, provided you use a pointer to char (or to unsigned char). So assuming that in your compiler sizeof(int) is 4, this is fine:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *iptr = malloc(25 * sizeof *iptr); // 100 bytes if sizeof(int) is 4
    if (iptr == NULL) return 1;            // allocation failed
    char *cptr = (char *) iptr; // cast of pointer to any to pointer to char is valid
    for (size_t i = 0; i < 25 * sizeof *iptr; i++) cptr[i] = 'x'; // store chars 'x'
    for (int i = 0; i < 25; i++) {
        printf(" %x", (unsigned int) iptr[i]);  // print the resulting ints in hex
    }
    printf("\n");
    free(iptr);
    return 0;
}

Assuming your implementation uses ASCII for characters (quite common) and sizeof(int) is 4, you should get 25 values all equal to 0x78787878, as 0x78 is the ASCII code of 'x'. This part is not fixed by the standard, though: the character encoding and the size and representation of int are implementation-defined.

Is the cast `cptr = (char *) iptr;` really necessary? (Can't remember the standard off the top of my head, but I'm pretty sure `cptr = iptr;` works) — UnholySheep, Dec 13 '16 at 15:16
