1

The following snippet is, of course, not a good idea:

char *vram = (char*)0xB8000;
memset(vram, 32, 0x18000);

Nor is this:

volatile char *LCDC = (volatile char*)0xFF40;
char LCDCshadow = *LCDC;

And the following is clearly Undefined Behavior:

int *dontdoit = 0;
*dontdoit;

because when 0 is used in pointer contexts, it becomes the value of the null pointer, and dereferencing the null pointer is undefined behavior.

But are the first two examples Undefined Behavior, or simply Implementation defined/Unspecified?
And if it's the latter, how does one generate a valid pointer with a value of 0?

Orion

3 Answers

1

An integer constant expression with value 0, when converted to a pointer, yields a NULL pointer regardless of the actual representation of a NULL pointer.

Section 6.3.2.3p3 of the C standard states:

An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant. If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.
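The guarantees in that paragraph can be checked directly. The following is a minimal sketch, not part of the original answer; the function name `null_constant_yields_null` is made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if pointers initialized from a null
   pointer constant behave as null pointers in comparisons. */
bool null_constant_yields_null(void)
{
    int object = 42;
    int *from_zero = 0;          /* integer constant expression 0 */
    int *from_cast = (void *)0;  /* the same constant cast to void * */

    return from_zero == NULL
        && from_cast == NULL
        && from_zero != &object  /* unequal to a pointer to any object */
        && !from_zero;           /* a null pointer tests false */
}
```

None of these checks depends on how the implementation represents a null pointer internally.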

Converting any other integer value to a pointer value is implementation defined. From section 6.3.2.3p5:

An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.

The above typically applies to embedded implementations where it makes sense to access a specific memory address.
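One integer-to-pointer conversion that the standard does pin down is the round trip through `uintptr_t`: a valid `void *` converted to that type and back compares equal to the original. The sketch below assumes the implementation provides `uintptr_t` (it is an optional type); the function name is invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: converts a valid object pointer to uintptr_t and
   back, then reports whether the restored pointer equals the original. */
bool round_trip_preserves_pointer(void)
{
    int object = 0;
    void *original = &object;

    uintptr_t as_integer = (uintptr_t)original;  /* pointer -> integer */
    void *restored = (void *)as_integer;         /* integer -> pointer */

    return restored == original;
}
```

Converting an arbitrary literal such as 0xB8000 carries no such guarantee. Whether the result is usable depends entirely on the implementation's documentation.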

If you had an implementation that supported a non-zero NULL pointer, you could assign the value 0 to it through a variable, for example:

int zero = 0;
int *zeroptr = (int *)zero;

In this case, the value of the pointer would be 0 but would not be NULL.
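Whether such a pointer compares equal to a null pointer can be observed on a given implementation. This is a rough sketch under two assumptions: it uses `uintptr_t` instead of `int` to avoid size-mismatch warnings on 64-bit targets, and the function name is hypothetical. On mainstream platforms, where the null pointer is numerically zero, it is expected to return true, but the standard does not promise either result.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: converts a zero held in a variable (so not an
   integer constant expression) to a pointer and compares it with NULL. */
bool runtime_zero_compares_null(void)
{
    uintptr_t zero = 0;          /* a variable, not a null pointer constant */
    int *zeroptr = (int *)zero;  /* implementation-defined conversion */

    return zeroptr == NULL;      /* result depends on the implementation */
}
```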

dbush
  • Strictly speaking, the first sentence is a requirement only when the value 0 is an *integer constant expression*. For example, `(0,0)` is not one, but has the value 0, and thus could theoretically be used to obtain "address 0" on an implementation where the null pointer is not numerically zero. – R.. GitHub STOP HELPING ICE Nov 12 '19 at 02:55
1

Conversion of integers to pointers is a construct which most but not all implementations can meaningfully support. Further, the Standard has always been focused on features that all compilers are required to support, rather than offering recommendations for those that most compilers should support when practical, and has sought to avoid having compilers accept or reject different syntactic constructs based upon what features they meaningfully support. The effect is that all compilers are required to syntactically accept conversions from integers to pointers, regardless of whether they will process it meaningfully, but the Standard doesn't describe any situations where they are meaningful.

Even on platforms where the behavior would be meaningful, implementation documentation isn't always clear about what constructs are and are not supported. Consider, for example:

extern int x;
int test2(void)
{
    x=1;
    int res=*(volatile int*)0x12345678;
    x=2;
    return res;
}

If x is defined in assembly, a linker-control script, or other language that allows absolute placement, the programmer might know it will be at address 0x12345678. While clang, given the above code, would allow for the possibility that the volatile-qualified read from address 0x12345678 might interact with the first write to x, gcc would not. The authors of gcc would take the attitude that the Standard doesn't require them to support such cases, so any code requiring such support is "broken", but the Standard doesn't require that compilers support any meaningful constructs involving integer-to-pointer conversions other than those that produce null pointers.

supercat
  • While I accepted Dbush's answer, I really appreciate yours, since it brings to light some of the implications of "Implementation Defined" that I wasn't aware of. Another source of tug between programmers' "I said I wanted X to happen!" and compilers' "You said you wanted fast code!" – Orion Nov 13 '19 at 02:18
  • @Orion: Some compilers that aren't interested in satisfying *paying* customers seem oblivious to the fact that what customers want is to *reliably and efficiently accomplish certain tasks*. Designate as C' the language formed by extending the language with the rule "If some parts of the Standard and a platform's documentation describe the behavior of some action, give those priority over anything else in the Standard that would characterize it as Undefined". A quality C implementation claiming to be suitable for a task should not make the task harder than it would be in a C' implementation. – supercat Nov 13 '19 at 16:04
-1

The following snippet is, of course, not a good idea

Yeah, but only because it doesn't compile on a conforming C compiler. char *vram = 0xB8000; is not even valid C, and like all invalid C it has undefined behavior. See "Pointer from integer/integer from pointer without a cast" issues.

This code however is fine:

char *vram = (char*)0xB8000;
memset(vram, 32, 0x18000);

What happens there is beyond the scope of the C language.


volatile char *LCDC = 0xFF40;
char LCDCshadow = *LCDC;

Same bug with no cast leading to invalid C here. Otherwise perfectly fine code.


And if it's the latter, how does one generate a valid pointer with a value of 0?

The integer constant 0, or that constant cast to a void pointer, (void*)0, is a special item called a null pointer constant. Whenever a pointer is assigned a null pointer constant, that pointer becomes a null pointer. The actual contents of a null pointer are implementation-defined. Upon encountering code like int* ptr = 0;, the compiler may give ptr any value that makes sense for the given implementation - not necessarily 0.

For example:

uint8_t data [sizeof(int*)];
int* ptr = 0;
memcpy(data, &ptr, sizeof(int*));

This doesn't necessarily result in data being 00 00 00 00 for a 32-bit pointer - it is implementation-defined.
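A fuller version of that experiment might look like the sketch below. The function names are invented for illustration, and it assumes `uint8_t` is available. The all-zero check reports what this particular implementation does. The round trip back through memcpy, by contrast, should always give a null pointer again, since the bytes are copied unchanged.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies the object representation of a null int *
   into out, which must have room for sizeof(int *) bytes. */
void null_pointer_bytes(uint8_t *out)
{
    int *ptr = 0;
    memcpy(out, &ptr, sizeof ptr);
}

/* True if every byte of the representation is zero. This is common,
   but the language does not guarantee it. */
bool null_pointer_is_all_zero_bytes(void)
{
    uint8_t data[sizeof(int *)];
    null_pointer_bytes(data);
    for (size_t i = 0; i < sizeof data; i++) {
        if (data[i] != 0) {
            return false;
        }
    }
    return true;
}

/* True if copying the bytes back yields a null pointer again. */
bool null_survives_byte_copy(void)
{
    uint8_t data[sizeof(int *)];
    int *back;
    null_pointer_bytes(data);
    memcpy(&back, data, sizeof back);
    return back == NULL;
}
```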

This was how the language was designed, with some confused intention to handle exotic addresses etc. In practice though, this means that we can never create a pointer to the physical address zero. So systems that have such a physical address, most notably pretty much every single microcontroller system created, don't treat null pointers any differently in the way C intended, because they need to use the physical address 0. This means that accessing a null pointer on such a system will result in an access to address 0.

I've had bugs in real-world systems where an accidental null pointer access caused I/O ports to go active and hardware to misbehave because of it.

So it boils down to the fact that null pointers are a known language design mistake. They should have used a keyword null that couldn't be mixed up with address 0. C++ has attempted to fix this in later standards, but C remains broken.

Lundin
  • Implementations where reads and writes of address zero would be meaningful are perfectly free to extend the language to define null-pointer accesses to behave in such fashion. The authors of the Standard thought that such freedom should eliminate the need to have any other more explicit way to specify such actions. – supercat Nov 12 '19 at 17:57
  • Ok, it's not valid because it's missing a cast - fair. A comment would've sufficed, though. Accessing a null pointer is ***undefined*** behavior, though, which means, even if 0 is a perfectly valid address that you 100% intend to access, the compiler can, will, and has just 'nope'd your decision, and all your other code, out the window. Accessing other integer constants brings the posted question into fore: Will the compiler ignore my code, a'la Undefined behavior? – Orion Nov 13 '19 at 02:12
  • @Orion The question is stating that the 2 first cases are questionable cases, but neither of them are save for the incorrect syntax. Fixing that with the cast makes them both _well-defined behavior_. Your question is making lots of incorrect assumptions save for the case with the null pointer, which I have addressed in detail. – Lundin Nov 13 '19 at 08:01