Two pointer representation dilemma

Question

I don't understand how a machine can have more than one pointer representation. The following passage from GNU says:

if the target machine has two different pointer representations, the compiler won't know which representation to use for that argument.

How is that possible? How does that statement relate to #define SEM_FAILED ((sem_t*)-1)? What does the latter do? I understand it to be a null pointer with a constant value. But how is it represented in memory, given that it points to -1? And what if that value happens to be a valid location?

The rest of your question is too involved for me to answer right now, but I want to address one particular misconception: `(sem_t*)-1` is _not_ a null pointer. It is the result of casting a _nonzero_ integer constant to a pointer type; this produces a pointer with an implementation-defined _non-null_ value. Whether it points to an object is also implementation-defined. (Pointers can be non-null but still not point to an object.) — zwol, Apr 06 '19 at 16:15
A pointer to `long` need not be able to represent mis-aligned addresses (`address % sizeof(long) != 0`) at all. But a pointer to `char` (or a pointer to void, which is constrained to the same representation) must be able to designate every single byte. — Deduplicator, Apr 06 '19 at 16:16
I've worked on computers that did not have byte addresses so a char * representation was different than int *. — stark, Apr 06 '19 at 16:16
This is why the cast in `printf("%p", (void*)myPtr)` is used. — Weather Vane, Apr 06 '19 at 16:18
@zwol: in addition to your remark, Does `int x = 0; void *p = (void *)x;` make `p` a null pointer? one that compares equal in `p == NULL`? — chqrlie, Apr 06 '19 at 16:34
@chqrlie: Yes, at least for the recent C versions. I *think* since C99. — alk, Apr 06 '19 at 18:15
@alk: I searched the C Standard for an answer and did not find one... Do you have a quote? — chqrlie, Apr 06 '19 at 19:48
@chqrlie: For C11 please see [here](http://port70.net/~nsz/c/c11/n1570.html#6.3.2.3p3) and [here](http://port70.net/~nsz/c/c11/n1570.html#6.3.2.3p4). — alk, Apr 07 '19 at 07:38
@alk: yes of course, the definition of a null pointer is *An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant.66) If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.* But this only applies to a constant expression, which is not the case of `(void *)x`. — chqrlie, Apr 07 '19 at 09:39
@alk: All other cases are covered by *An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation* So `(void *)x` might not be a null pointer, and thus would not compare equal to `NULL` or `0`. — chqrlie, Apr 07 '19 at 09:40
@concurrencyboy Did you mean `(sem_t *)0` ? That is a null pointer (with type pointer-to-`sem_t`). There are no requirements in POSIX for what `sem_t` actually is, so `(sem_t)0` could be a null pointer, could just be a number, or could be an invalid cast, depending on the concrete type of `sem_t`. — zwol, Apr 07 '19 at 14:15
@chqrlie I was going to quote the same text you did: it is implementation-defined whether `int x = 0; void *p = (void *)x` sets `p` to a null pointer. You've now made me wonder whether I was right, earlier, to say that `void *p = (void *)-1` is guaranteed to produce a _non-null_ pointer. It might be time for a language-lawyer question. — zwol, Apr 07 '19 at 14:18
@zwol the value of `(sem_t*)-1` is implementation-defined. The implementation might define it to be a null pointer or anything else — M.M, Apr 08 '19 at 05:26

score 1 · Answer 1 · answered Apr 08 '19 at 05:19


I believe this is alluding to the "near" and "far" pointers found on some 16-bit architectures? From what I understand, they used different offset scalings to work around being stuck with just 64 KB of address space.


l.k


It is not that one. – Antti Haapala -- Слава Україні Apr 08 '19 at 05:23

score 1 · Answer 2 · answered Apr 08 '19 at 05:34

Some of the very first architectures that C targeted had 36-bit or 18-bit words (the type int). Only whole words were directly addressable, at addresses like 0, 1, 2, using the native pointers. However, using one word per character would have wasted too much memory, so a 9-bit char type was added, with 2 or 4 characters in one word. Since these characters were not addressable by a word pointer, char * was made from two words: one pointing to the word, and another telling which of the bytes within the word should be manipulated.
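To make the two-part char * more concrete, here is a minimal simulation in ordinary C. It is only a sketch under assumptions: the names `memory`, `word_ptr`, `char_ptr`, `load_char`, `store_char` and `next_char` are all hypothetical, the 36-bit word is stored in a `uint64_t`, and four 9-bit bytes per word is assumed. Real compilers for such machines may have packed the two parts differently.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_COUNT     8u   /* size of the simulated memory (assumed) */
#define BYTES_PER_WORD 4u   /* four 9-bit bytes in a 36-bit word (assumed) */
#define BYTE_BITS      9u

/* Hypothetical word-addressed memory; each element holds one 36-bit word. */
static uint64_t memory[WORD_COUNT];

/* What an int * could look like on such a machine: just a word address. */
typedef struct { uint32_t word; } word_ptr;

/* What a char * could look like: a word address plus a byte index. */
typedef struct { uint32_t word; uint32_t byte; } char_ptr;

/* Read the 9-bit byte that p designates. */
static unsigned load_char(char_ptr p)
{
    assert(p.word < WORD_COUNT && p.byte < BYTES_PER_WORD);
    return (unsigned)((memory[p.word] >> (p.byte * BYTE_BITS)) & 0x1FFu);
}

/* Write a 9-bit byte without disturbing the rest of the word. */
static void store_char(char_ptr p, unsigned value)
{
    assert(p.word < WORD_COUNT && p.byte < BYTES_PER_WORD);
    uint64_t mask = (uint64_t)0x1FFu << (p.byte * BYTE_BITS);
    memory[p.word] = (memory[p.word] & ~mask)
                   | ((uint64_t)(value & 0x1FFu) << (p.byte * BYTE_BITS));
}

/* Advance to the next byte, rolling over into the next word. */
static char_ptr next_char(char_ptr p)
{
    if (++p.byte == BYTES_PER_WORD) {
        p.byte = 0;
        p.word++;
    }
    return p;
}
```

In this model `sizeof(char_ptr)` is larger than `sizeof(word_ptr)`, which is the size mismatch that matters once pointers are passed to a function whose parameter types the compiler cannot see.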

Of course, the problem now is that char * is two words wide, whereas int * is just one, and this matters when calling a function without a prototype or with an ellipsis: while (void*)0 would have a representation compatible with (char *)0, it wouldn't be compatible with (int *)0, hence an explicit cast is required.

There is another problem with NULL. While GCC seems to assure that NULL will be of type void *, the C standard does not guarantee that, so even using NULL in a function call like execl that expects char *s as variable arguments is wrong without a cast, because an implementation can define

#define NULL 0
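As an illustration of the cast, here is a small sketch of a variadic function that scans its arguments up to a null char * sentinel, roughly the way execl does. The names `total_length` and `demo` are made up for this example and are not part of any library.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical variadic function: sums the lengths of its string
   arguments and stops at a null char * sentinel. */
static size_t total_length(const char *first, ...)
{
    size_t total = 0;
    va_list ap;

    va_start(ap, first);
    /* Every variable argument is read back as a char *, so every
       argument, including the sentinel, must be passed as one. */
    for (const char *s = first; s != NULL; s = va_arg(ap, char *))
        total += strlen(s);
    va_end(ap);
    return total;
}

static size_t demo(void)
{
    /* The cast gives the sentinel the type, and therefore the size and
       representation, that va_arg(ap, char *) expects. */
    return total_length("ls", "-l", (char *)NULL);
    /* Without the cast, an implementation with "#define NULL 0" would
       pass an int here, which need not match a char * at all. */
}
```

On most current platforms the uncast call happens to work, which is probably why the mistake is so common; the cast costs nothing and removes the dependency on how NULL is defined.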

(sem_t*)-1 is not a null pointer; it is the integer -1 converted to a pointer, with implementation-defined results. On POSIX systems it will (by necessity) result in an address that can never be the location of any sem_t.

It is actually a really bad convention to use -1 here since the resulting address most likely doesn't have a correct alignment for sem_t, so the entire construct has undefined behaviour in itself.
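The way such a sentinel is meant to be used can be sketched as follows. Everything here is a hypothetical stand-in (`my_sem_t`, `MY_SEM_FAILED`, `my_sem_open`, `open_succeeded`), not the real POSIX API, and the code assumes an implementation on which converting -1 to a pointer yields a value that compares unequal to the address of any real object.

```c
/* Hypothetical stand-ins for sem_t and SEM_FAILED. */
typedef struct { int count; } my_sem_t;
#define MY_SEM_FAILED ((my_sem_t *)-1)

static my_sem_t the_semaphore;

/* Returns a usable semaphore, or the sentinel when asked to fail. */
static my_sem_t *my_sem_open(int should_fail)
{
    if (should_fail)
        return MY_SEM_FAILED;
    the_semaphore.count = 1;
    return &the_semaphore;
}

/* Callers only compare against the sentinel; they never dereference it. */
static int open_succeeded(my_sem_t *s)
{
    return s != MY_SEM_FAILED;
}
```

The sentinel is just a distinguished pointer value used as an error marker, in the same spirit as returning -1 from a function that yields an int.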
