Why can't we compare pointers that don't point to elements within the same array?

Question

I have been learning C, following "Let Us C" by Yashavant P. Kanetkar.

A line in the pointers chapter says that we can only compare pointers with less than (<) and greater than (>) when they point to elements within the same array.

Why is comparing arbitrary pointers not valid?

There are some architectures where this would not work - the standard therefore covers this case, even though it might work in most common architectures. BTW, the Kanetkar books are terrible in general - consider getting [a decent C book](http://stackoverflow.com/questions/562303/the-definitive-c-book-guide-and-list) to learn from. — Paul R, Aug 10 '15 at 08:19
do all languages that support pointers have this undefinability? — ashwani, Aug 10 '15 at 08:42
May be a duplicate of [Rationale for pointer comparisons outside an array to be UB](http://stackoverflow.com/q/31151097/1708801) in [my answer](http://stackoverflow.com/a/31151779/1708801) I point out the probable justifications for not totally ordering pointers. — Shafik Yaghmour, Aug 10 '15 at 09:32

score 13 · Accepted Answer · edited May 23 '17 at 10:29

Because C makes no assumption about the host machine, and nothing stops the latter from allocating two arrays in two completely separate address spaces.

It's not just about theoretical exotic architectures either. 16-bit compilers for x86 machines provided two kinds of pointers. Near pointers were 16 bits wide and behaved as you'd expect; however, they only let you access 64K of RAM. If you wanted to access more than 64K of RAM (not 64K for each block: 64K for the whole program!) you had to use far pointers.

Far pointers were 32 bits wide, and made of two 16-bit halves, the segment and the offset; for example 1234:0000 is a pointer that has segment 0x1234 and offset 0. The actual memory address was segment * 16 + offset. Typically, farmalloc returned a pointer with zero offset, and pointer arithmetic only modified the offset. So you could have

    char *x = farmalloc(64);     // returns 1234:0000 for address 0x12340
    char *y = farmalloc(64);     // returns 1238:0000 for address 0x12380

Now if you compute `x + 128`, the result is 1234:0080, for address 0x123C0. It compares less than 1238:0000 (because 0x1234 < 0x1238) but it points to a higher address (because 0x123C0 > 0x12380).

Why? Because adding 128 to `x`, which pointed to a 64-byte object, was undefined behavior.

The memory model compiler settings defined whether pointers were near or far by default. For example, the "small" memory model had 64K for code and 64K for all global variables, auto variables (stack) and the malloc heap combined. Note that the code was in a separate segment, so you couldn't just take a 16-bit ("near") function pointer and dereference it to read machine language! If you had to do that, you had to ask the compiler to put the code in the same segment as the rest (the "tiny" memory model).

Some memory models had the compiler always use far pointers, which was slower but necessary if data+stack+heap exceeded 64K ("compact" or "large" memory models).

Code pointers and data pointers could also differ in size, so you could have a memory model where function pointers were near but data pointers were far, or vice versa. This is the case with the aforementioned "compact" model (64K code limit but far pointers for data) and its dual, the "medium" model (far pointers for code, 64K data limit).

There was also a way for compilers to use flat 32-bit pointers for everything (the so-called "huge" memory model), but it was slow and nobody used it.

answered Aug 10 '15 at 08:17

Quentin


"Because C makes no assumption about the host machine" - can you please explain this part? I didn't get it. What is a host machine? – ashwani Aug 10 '15 at 08:25

@user132458 it's the machine that ends up running your program. – Quentin Aug 10 '15 at 08:26

I'm confused. Let's say pointer p1 points to memory location 1000, which is an element of array a, and pointer p2 points to memory location 5000, which is an element of array b. In this case shouldn't (p1 – ashwani Aug 10 '15 at 08:36

@user132458 your example assumes that pointers are ordered integers, and that all pointers point into the same address space. C does not guarantee such properties. – Quentin Aug 10 '15 at 08:38
@user132458: Ask grandpa for his old XT/AT (clone), take the pile of Turbo-C or MSC workbench diskettes, install them and start playing with pointers, you will wonder.... ;-)) – alk Aug 10 '15 at 09:26

@alk I'm not sure whether this is the best or the worst advice of the day. – Quentin Aug 10 '15 at 09:32

@PaoloBonzini you should probably make this edit an answer on its own. – Quentin Aug 10 '15 at 10:30
@Quentin, no problem, I hope your answer gets accepted now. – Paolo Bonzini Aug 10 '15 at 11:22
I am a beginner in C and I don't understand many of these terms, such as near and far pointers. I know this is a good explanation, but how can a beginner like me get a clear picture of the points made in this answer? Please help, @PaoloBonzini – ashwani Aug 10 '15 at 18:08

@user132458 the gist of it is, pointers are not always consecutive, increasing integers. Some memory models let you obtain two pointers `a` and `b` whose bits, when reinterpreted as integers, would give `a < b`, when in fact `a` points to the higher address. Or, you can have two pointers that compare near-equal, but don't point into the same array because each one refers to a separate memory segment. – Quentin Aug 10 '15 at 19:47

score 5 · Answer 2 · answered Aug 10 '15 at 08:19

Undefined behavior applies here. You cannot compare two pointers with the relational operators (<, >, <=, >=) unless they both point into the same object, such as elements of the same array, or to the position one past the last element of that array.

Dayal rai

