-1

I was looking at "Why can't you do bitwise operations on pointer in C, and is there a way around this?" and noticed that most of the responses say that bitwise operations are not well defined on pointers because a pointer is not very well defined in the standard. However, this has never come up in any of my systems classes, and I wasn't aware that a pointer could be anything but the value of the memory address that the pointer points to. Are there any implementations of C where a pointer is not represented as the memory address that the pointer points to?

Are there any plans to change this so that a pointer is well defined in a future C standard?

WillOw
  • 2
    There have been implementations — though unfortunately I can't cite any just now — that have implemented "smart pointers", where a pointer is a 3-5 word structure incorporating not only the "real" pointer but also the base and size of the object that it points to, so that array-bounds checks can be made. – Steve Summit Oct 30 '21 at 18:38
  • There are some systems where memory is addressed in a complex form, or where addresses are even larger than the system/processor's native representations. Examples are paged or segmented addresses. In those cases the C standard defines a **trap representation**, which can be any kind of data object needed to access memory. Those complex data objects cannot be used in standard logical or math operations. – Frankie_C Oct 30 '21 at 18:47
  • 1
    I don't think "trap representation" is the term you wanted, @Frankie_C. As used in the language specification, that means "an object representation that need not represent a value of the object type." – John Bollinger Oct 30 '21 at 18:51
  • 1
    For what it's worth, I think the issue is not that pointers are not "well defined", but rather, that their definition does not include arbitrary arithmetic operators. That's not a limitation recently imposed by the C Standards, though — as far as I can remember, arithmetic was *never* defined for pointers, even back in K&R days. If you're writing, say, a linker or a memory manager, you always had to use an integer type to hold memory addresses you were doing arithmetic on. I think all that's changed is that you're more likely to need explicit casts these days, and that `uintptr_t` exists. – Steve Summit Oct 30 '21 at 19:31
  • ...So, although I claim no particular knowledge about the way the Standards are going, my guess is that there aren't too many changes likely in the way pointers are defined. – Steve Summit Oct 30 '21 at 19:39

3 Answers

3

There's really absolutely no problem here:

  1. You want to do bitwise operations on a pointer?

    The very link you cited gives a common solution:

https://stackoverflow.com/a/15868352/421195

But you can get around it with casting:

#include <stdint.h>

void *ptr1;
// Find page start
void *ptr2 = (void *) ((uintptr_t) ptr1 & ~(uintptr_t) 0xfff);

As for C++, just use reinterpret_cast instead of the C-style casts.

  2. This will ONLY work if the (platform-specific!) implementation of a "pointer" happens to convert to an unsigned integer such as `uintptr_t` in the obvious way, that is, if your particular platform happens to have a "flat" memory model.

  3. Q: Why is this even an issue in the first place?

    A: Because a "pointer" doesn't always map easily to an "unsigned int" (or an "unsigned long").

    EXAMPLE: 16 bit DOS pointers:

    • Near pointer: 16 bits. Holds only an offset within the current segment on a 16-bit machine.
    • Far pointer: typically 32 bits, a 16-bit segment plus a 16-bit offset. To dereference it, the compiler loads the segment into a segment register and the offset into another register.
    • Huge pointer: also typically 32 bits, but arithmetic on it can cross segment boundaries.
    • You can't just naively "twiddle bits". Arithmetic on a far pointer changes only the offset, so the segment part stays fixed; arithmetic on a huge pointer can change the segment part as well.

"Real mode DOS" is just one of many (again, platform-specific) examples where the "flat" memory model doesn't easily apply.

paulsm4
2

I was looking at "Why can't you do bitwise operations on pointer in C, and is there a way around this?" and noticed that most of the responses say that bitwise operations are not well defined on pointers because a pointer is not very well defined in the standard.

That's a poor characterization of what the answers to that question say. Among the things they actually do say are that you cannot perform bitwise operations on pointers because

  • the standard says you can't
  • it would not be useful or meaningful
  • the semantics [of bitwise operations on pointers] are not well defined
  • the standard does not impose requirements on the representation of pointers

The primary message to take from that is that you cannot perform bitwise operations on pointers because the language specification does not define the meaning of such an operation. And why would it? What do you think the meaning or significance would be of twiddling the bits of an address? What, if anything, would the result point to? Under what conditions must the result be valid at all?

However, this has never come up in any of my systems classes, and I wasn't aware that a pointer could be anything but the value of the memory address that the pointer points to.

The language specification identifies pointer values with object addresses, but what the language spec means by that is not necessarily the kind of object that can serve as the operand of a CPU instruction that requires an address. There have been C implementations where these differ.

You also seem to be assuming a flat address space, such that it is reasonable to construe an "address" as a (single) number. There have been and still are machine architectures where that is not the case.

Even considering performing bitwise operations on a C pointer requires thinking at the wrong level of abstraction.

Are there any plans to change this so that a pointer is well defined in a future C standard?

Pointers are already well defined for the purposes they are intended to serve. They are not integers, and are not intended to be treated as integers. This is unlikely to change, because it would not serve a useful purpose. Their representations are unlikely to be specified any more so than they already are, because that would be counterproductive.

John Bollinger
-1

Are there any implementations of C where a pointer is not represented as the memory address that the pointer points to?

Typically, no; this is the fundamental idea of a pointer. You may consider NULL as an example of a pointer that we use logically as something that isn't an address. In a security context, pointers can be encrypted, or their lower bits can be used as tags for dynamic analysis. Here is one example. So in these contexts pointers still represent addresses, but at rest they may not look like addresses.

Are there any plans to change this so that a pointer is well defined in a future C standard?

The standard is pretty clear about what a pointer represents. Section 6.2.5 describes a pointer type as follows:

A pointer type may be derived from a function type or an object type, called the referenced type. A pointer type describes an object whose value provides a reference to an entity of the referenced type. A pointer type derived from the referenced type T is sometimes called "pointer to T".

So any use of a pointer that does not reference an object is a misuse of the standard.

Zack R.
  • 1
    This answer seems internally inconsistent. On one hand, it starts off denying that there are pointer implementations that are not represented as memory addresses. Then it immediately turns around and says that actually there are, or can be, pointer representations that are not themselves memory addresses, but instead are only related to addresses. And of course pointer representations are related to memory addresses, because as this answer says, that's the nature of a pointer. – John Bollinger Oct 30 '21 at 19:59
  • 2
    But perhaps the biggest thing this answer misses is to challenge the OP's apparent assumption that an address is a number. There have been hardware architectures where addresses are *not* (single) numbers, and some such machines are still in operation. Everything in C's model of computation is *numeric*, because all object representations can be accessed as sequences of `char`, which is a numeric type, but it doesn't make any more sense to twiddle bits of an address than it does to twiddle bits of a string. – John Bollinger Oct 30 '21 at 20:05