Memory allocated without allocation using malloc, how?

Question

Below is a simple code snippet:

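#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *p;
    p = (int *)malloc(sizeof(int)); /* allocate memory for 1 int */
    /* %p expects a void * argument */
    printf("P=%p\tQ=%p", (void *)p, (void *)(p + 2));
    free(p);
    return 0;
}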

In one sample run, it gave me the output as below:

P=0x8210008 Q=0x8210010

The starting address of p is 0x8210008; the next bytes are 0x8210009, 0x821000A, and 0x821000B, so the 4 bytes for the int end there. We haven't allocated any more memory using malloc. So how does p+2 lead us to 0x8210010, which is 8 bytes after p (0x8210008)?

In C you'll never get very far learning by tests like this; because of *undefined behavior* you'll find a lot of inconsistency--none of which you should count on. — Dave, Jul 31 '12 at 05:05
Please use the search facilities of SO first: duplicate of [I can use more memory than how much I've allocated with malloc(), why?](http://stackoverflow.com/questions/3509714/i-can-use-more-memory-than-how-much-ive-allocated-with-malloc-why) — Jens Gustedt, Jul 31 '12 at 09:56

score 5 · Answer 1 · answered Jul 31 '12 at 04:00

Because it's treating it as an integer-element offset from the pointer. You have allocated an array for a single integer. When you ask for p+2 it's the same as &p[2]. If you want two bytes from the beginning, you need to cast it to char* first:

char *highWordAddr = (char*)p + 2;

score 4 · Answer 2 · answered Jul 31 '12 at 03:59

C is happy to let you do whatever pointer arithmetic you like. Just because p+2 looks like any other address doesn't mean it's valid. In fact, in this case, it's not.

Be very careful any time you see pointer arithmetic that you're not going outside your allocated bounds.

sblom


This fails to answer the asker's question because it does not explain why they get a value 8 greater than the base address instead of 2 greater. First, if they had printed p+1, they would get an address 4 greater because adding an integer to a pointer produces the address of the next object of that type, not the address of the next byte in memory, so the C implementation implements the address arithmetic as needed to make that happen. Second, adding 2 exceeds the range in which the compiler is required to make pointer arithmetic work, so the behavior is undefined, as noted in other comments. – Eric Postpischil Jul 31 '12 at 08:01

score 4 · Accepted Answer · answered Jul 31 '12 at 08:29

First, the fact that you have printed an address does not imply that memory is allocated at that address. You have simply added numbers and produced other numbers.

Second, the number you got by adding two was eight greater than the base address, rather than two greater, because when you add integers to pointers in C, the arithmetic is done in terms of pointed-to elements, not in terms of bytes in memory (unless the pointed-to elements are bytes). Suppose you have an array of int, say int x[8], and you have a pointer to x[3]. Adding two to that pointer produces a pointer to x[5], not a pointer to two bytes beyond the start of x[3]. It is important to remember that C is an abstraction, and the C standard specifies what happens inside that abstraction. Inside the C abstraction, pointer arithmetic works on numbers of elements, not on raw memory addresses. The C implementation (the compiler and the tools that turn C code into program execution) is required to perform whatever operations on raw memory addresses are required to implement the abstraction specified by the C standard. Typically, that means the compiler multiplies an integer by the size of an element when adding it to a pointer. So two is multiplied by four (on a machine where an int is four bytes), and the eight that results is added to the base address.

Third, you cannot rely on this behavior. The C standard defines pointer arithmetic only for pointers that point to objects inside arrays, including one fictitious object at the end of the array. Additionally, pointers to individual objects act like arrays of one element. So, if you have a pointer p that points to an int, you are allowed to calculate p+0 or p+1, because they point to the only object in the array (p+0) and the fictitious object one beyond the last element in the array (p+1). You are not allowed to calculate p-1 or p+2, because these are outside the array. Note that this is not a matter of dereferencing the pointer (attempting to read or write memory at the calculated address); even merely calculating the address results in behavior that is not defined by the C standard. Your program could crash, it could give you “correct” results, or it could delete all files in your account, and all of those behaviors would be conforming to the C standard.

It is unlikely that merely calculating an out-of-bounds address would produce such weird behavior. However, the standard permits it because some computer processors have unusual address schemes that require more work than simple arithmetic. Perhaps the second-most common address scheme after the flat address space is a base address and offset scheme. In such a scheme, the high 16 bits of a four-byte pointer might contain a base address, and the low 16 bits might contain an offset. For a given base address b and offset o, the corresponding virtual address might be 4096*b+o. (Such a scheme is capable of addressing only about 2²⁸ bytes, and many different values of base and offset can refer to the same address. For example, base 0 and offset 4096 refer to the same address as base 1 and offset 0.) With a base-and-offset scheme, the compiler might implement pointer arithmetic by adding only to the offset and ignoring the base. (Such a C implementation can support arrays only up to 65536 bytes, the extent addressable by the offset alone.) In such an implementation, if you have pointer-to-int p with an encoding of 0x0000fffc (base 0, offset 65532), and int is four bytes, then p+2 will have the value 0x00000004, not the value that is eight greater (0x00010004).

That is an example where pointer arithmetic produces values that you would not expect from a flat-address machine. It is harder to imagine an implementation where pointer arithmetic that is not valid according to the C standard would produce a crash. However, consider an implementation in which memory must be manually swapped by a process, because the processor does not have the hardware to support virtual memory. In such an implementation, pointers might contain addresses of structures in memory that describe disk locations and other information used to manage the memory swapping. In such an implementation, doing pointer arithmetic might require reading the structures in memory, and so doing invalid pointer arithmetic might involve reading invalid addresses.

score 1 · Answer 4 · answered Jul 31 '12 at 03:59

This is called pointer arithmetic. http://www.learncpp.com/cpp-tutorial/68-pointers-arrays-and-pointer-arithmetic/

AlexDev


No, it's called undefined behavior. – R.. GitHub STOP HELPING ICE Jul 31 '12 at 04:00
@R It's not undefined; if the address of P=0x8210008, then the address of p+2 will necessarily be 0x8210010 – AlexDev Jul 31 '12 at 04:02
No it won't. Pointer arithmetic is defined only within arrays and up to the location "one past" the last array element. Since the length of the array in question is 1, adding 2 goes outside those bounds and thereby invokes undefined behavior. – R.. GitHub STOP HELPING ICE Jul 31 '12 at 04:06
It's perfectly fine to calculate that address, even if it's not legally useful... Provided you don't read or write to it. But then, if you do and you're lucky, you might have some other block at that address and avoid a crash... I suppose that would fall under 'undefined behaviour' =) – paddy Jul 31 '12 at 04:11
@R Didn't know that! But still the behavior in question is due to pointer arithmetic, or the way it's usually implemented on today's common architectures and under normal conditions such as not overflowing etc. – AlexDev Jul 31 '12 at 04:14
@paddy No, it's not fine, "If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined." Of course that's the kind of undefined behaviour that in practice doesn't lead to surprises (if you are careful enough to not dereference the pointer), but it's still UB. – Daniel Fischer Jul 31 '12 at 04:49
Are you sure? If I pass an arbitrary pointer value from the middle of an array to some DLL I'm calling and it adds on a value to that that would overflow the memory block but never actually dereferences the pointer, does the program now exhibit undefined behaviour? And are we now splitting hairs? =) – paddy Jul 31 '12 at 05:06
Indeed, @paddy, the standard says it's UB, so it is UB. And of course we're splitting hairs. Haven't you noticed the tag on the question? – Daniel Fischer Jul 31 '12 at 05:13
This is undefined behavior because the standard explicitly states it is undefined behavior. From ISO/IEC 9899, Second edition, 1999-12-01, 6.5.6, paragraph 8, which describes adding an integer to a pointer (without dereferencing the result, just the addition alone): “If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.” – Eric Postpischil Jul 31 '12 at 07:56
