
I'm compiling a C++ program to run in a freestanding environment and the CPU I'm running on defines a 32-bit peripheral register to be available (edit: memory-mapped) at PERIPH_ADDRESS (aligned correctly, and not overlapping with any other C++ object, stack etc.).

I compile the following code with PERIPH_ADDRESS predefined, later link it with a full program and run it.

#include <cstdint>

struct Peripheral {
    const volatile uint32_t REG;
};

static Peripheral* const p = reinterpret_cast<Peripheral*>(PERIPH_ADDRESS);

uint32_t get_value_1() {
    return p->REG;
}

static Peripheral& q = *reinterpret_cast<Peripheral*>(PERIPH_ADDRESS);

uint32_t get_value_2() {
    return q.REG;
}

extern Peripheral r;
// the address of r is set in the linking step to PERIPH_ADDRESS

uint32_t get_value_3() {
    return r.REG;
}
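
For concreteness, the "linking step" mentioned for `r` can be sketched as a GNU ld linker-script assignment (the concrete address 0x40010000 and the assumption that the C++ symbol `r` is unmangled are mine, not part of the question):

```ld
/* Define the symbol r at the peripheral's address. No storage is
   reserved; the object is assumed to exist in hardware at that address. */
PROVIDE(r = 0x40010000);
```

Equivalently, GNU ld accepts `--defsym=r=0x40010000` on the command line. (Under the Itanium C++ ABI, a namespace-scope variable named `r` keeps the unmangled symbol name `r`.)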

Does any of the get_value functions (either directly or through p/q) have undefined behavior? If yes, can I fix it?

I think an equivalent question would be: can any conforming compiler refuse to compile the expected program for me? For example, one with a UB sanitizer turned on.

I have looked at [basic.stc.dynamic.safety] and [basic.compound#def:object_pointer_type], but those seem only to restrict the validity of pointers to dynamic objects. I don't think they apply to this code, because the "object" at PERIPH_ADDRESS is never assumed to be dynamic. I think I can safely say that the storage denoted by p never reaches the end of its storage duration; it can be considered static.

I've also looked at Why does C++ disallow the creation of valid pointers from a valid address and type? and the answers given to that question. They also only refer to dynamic objects' addresses and their validity, so they do not answer my question.

Other questions I've considered but couldn't answer myself that might help with the main question:

  • Do I run into any UB issues because the object was never constructed within the C++ abstract machine?
  • Or can I actually consider the object to be one with static storage duration "constructed" properly?

Obviously, I'd prefer answers that reference any recent C++ standard.

palotasb
  • I'm confused; a register is not memory, it's local storage inside a processor. If you don't take the address of an int/double, the compiler may fit the object in one; otherwise it will store it in memory where a pointer can be taken. – Matthieu Brucher Nov 08 '18 at 18:14
  • Maybe you can safely read it using some compiler intrinsic function? Maybe compiler even offers intrinsic for this particular case? – user7860670 Nov 08 '18 at 18:16
  • @MatthieuBrucher accessing the register is a read/write from memory in many microchips. – SergeyA Nov 08 '18 at 18:24
  • @MatthieuBrucher OP says they are working in a specific environment where the register is mapped logically to a memory address. There are systems where certain memory addresses do not actually refer to memory but instead refer to other things. The hardware designers can do it however they want, so on a machine with 1K memory the address 2000 might refer to a register, 2001 a different register, 2002 an I/O hardware to which some arbitrary device is plugged in, for 2003 the bits might all refer to different single-bit I/O pins or to interrupt statuses or whatever, anything really. – Loduwijk Nov 08 '18 at 19:09
  • Related [What is the strict aliasing rule](https://stackoverflow.com/a/51228315/1708801) – Shafik Yaghmour Nov 08 '18 at 19:37
  • I don't quite get the point of the struct. Why not just `const volatile uint32_t& value1 = *reinterpret_cast<const volatile uint32_t*>(PERIPH_ADDRESS)`? – molbdnilo Nov 08 '18 at 19:46
  • Fair enough, memory-mapped makes sense. – Matthieu Brucher Nov 08 '18 at 19:49
  • Things can be undefined according the standard but well defined according to a specific implementation – such things are perfectly valid in the implementation, but you can't turn to the standard for interpretation. ("Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment[...]") – molbdnilo Nov 08 '18 at 19:51
  • @molbdnilo I actually care about being able to access through a non-fundamental object type so I can build a useful driver using multiple subregisters. But I didn't want the question to be about the struct layout. – palotasb Nov 08 '18 at 20:14
  • @palotasb Ah, so the struct "covers" a whole range of addresses? That makes a lot of sense. – molbdnilo Nov 08 '18 at 20:20
  • Yes, for example see the ARM Cortex-M4 System Control Block [typedef](https://github.com/ARM-software/CMSIS_5/blob/1ea17c52c965b2a1a0086bcd414a7429b36802a0/CMSIS/Core/Include/core_cm4.h#L440-L463) and [usage](https://github.com/ARM-software/CMSIS_5/blob/1ea17c52c965b2a1a0086bcd414a7429b36802a0/CMSIS/Core/Include/core_cm4.h#L1549) similar to my example 1. (`__IOM` etc. are const/volatile defines.) – palotasb Nov 08 '18 at 21:03
  • @molbdnilo: Not only that, but according to the published Rationale, the authors of the Standard did not want to preclude the use of C as a form of "high-level assembler", but instead observed that C's ability to write machine-specific code was one of its strengths. They also expected that many implementations would treat situations where the Standard imposes no requirements as opportunities to implement various useful "popular extensions", but regarded the question of when to do so as a Quality of Implementation issue outside the Standard's jurisdiction. – supercat Nov 08 '18 at 22:00
  • By definition, the meaning of reading/writing a volatile object is defined by an ABI, not a programming language standard. – curiousguy Nov 09 '18 at 02:30
  • @curiousguy: What do you mean "by definition". The C Standard does not recognize the concept of an ABI, and would not forbid an implementation targeting one platform from processing volatile writes in a manner that emulates a different platform. The concept of an implementation which targets a certain platform and generates code that conforms to that platform's ABI is certainly a useful one, but the Standard doesn't recognize such a thing. – supercat Nov 13 '18 at 19:40

7 Answers

Answer (score 3)

It is implementation-defined what a cast from an integer to a pointer means [expr.reinterpret.cast]:

A value of integral type or enumeration type can be explicitly converted to a pointer. A pointer converted to an integer of sufficient size (if any such exists on the implementation) and back to the same pointer type will have its original value; mappings between pointers and integers are otherwise implementation-defined.

Therefore this is implementation-defined rather than undefined. If your implementation promises you the result of the cast is a valid pointer†, you are fine.

The linked question concerns pointer arithmetic, which is unrelated to the problem at hand.

† By definition, a valid pointer points to an object, implying subsequent indirections are also well-defined. Care should be exercised in making sure the object is within its lifetime.

Passer By
Answer (score 2)

Does any of the get_value functions (either directly or through p/q) have undefined behavior?

Yes. All of them. They are all accessing the value of an object (of type Peripheral) that as far as the C++ object model is concerned does not exist. This is defined in [basic.lval/11], AKA: the strict aliasing rule:

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:

It's not the "cast" that's the problem; it's the use of the results of that cast. If there is an object there of the specified type, then the behavior is well-defined. If there isn't, then it is undefined.

And since there is no Peripheral there, it is UB.

Now, if your execution environment promises that there is an object of type Peripheral at that address, then this is well-defined behavior. Otherwise, no.

If yes, can I fix it?

No. Just rely on the UB.

You're working in a restricted environment, using a free-standing implementation, probably meant for a specific architecture. I wouldn't sweat it.

Nicol Bolas
  • I agree with the fact that the snippet does not contain any code that defines a `Peripheral` object at the address. Is an implementation really allowed to assume that it isn't defined somewhere else (e.g., in another translation unit, that is later linked)? – palotasb Nov 08 '18 at 19:35
  • @palotasb: I don't understand what you mean by that. Undefined behavior is a *runtime* condition, not a compile-time determined thing. It is possible to provide enough information to be certain that some piece of code has well-defined behavior, but the reverse is not the case. – Nicol Bolas Nov 08 '18 at 19:41
  • Yes, but it's caused by broken invariants assumed to be true by the compiler or linker, isn't it? The compiler must generate correct code for the case when the invariants are kept, and I'm trying to rely on this fact. – palotasb Nov 08 '18 at 19:47
  • I've now added the last idea I mentioned in my question to the code example, using the linker to set the address of an `extern`-declared object. Does your answer cover that one too, saying it's UB I can rely on? – palotasb Nov 08 '18 at 19:52
  • @palotasb: "*Yes, but it's caused by broken invariants assumed to be true by the compiler or linker, isn't it?*" UB is caused by your code doing something that the specification says has no defined behavior. "Language-lawyer" questions don't care about what "the compiler or linker" does; it's for questions where you care about what the *language* says. – Nicol Bolas Nov 08 '18 at 20:01
  • @palotasb: "*using the linker to set the address of an extern-declared object*" What determines if this is UB or not is whether there is an object of that type at that address at the time when you access it. If there is, then it is well-defined. If there is not, then it is UB. Your hypothetical other translation unit that defines `r` could easily have destroyed that object (via direct destructor call) before the main one tries to access it, and it will still be UB even though there is a declared variable at that address. – Nicol Bolas Nov 08 '18 at 20:03
  • I mentioned compiling and linking, because the standard refers to it in [[basic.link\]](http://eel.is/c++draft/basic.link) and [[dcl.link\]](http://eel.is/c++draft/dcl.link#9). It also says: "_Linkage from C++ to objects defined in other languages [...] is implementation-defined [...]. Only where the object layout strategies [...] are similar enough can such linkage be achieved._" This might be what I've been looking for with the "if yes, how can I fix it?" question. The object really _is_ defined outside C++ and the layout strategy can be defined to be same as C++. Can this be UB-free then? – palotasb Nov 08 '18 at 20:48
  • @palotasb: You are overthinking this. The question is very simple: is there an object there at the time you access the lvalue? If the answer is "yes", then the behavior is well-defined. If the answer is "no", then the behavior is undefined. Linkage is *irrelevant*; what matters is the state of the abstract machine at the time the code is invoked. – Nicol Bolas Nov 08 '18 at 21:10
  • Of course I am, I tagged it as language-lawyer. :) To summarize, you're saying that it is UB while no object of the correct type exists at that location (from the abstract machine point of view), but at least implementation-defined if there is one? I still assume that has to be _static storage duration_ because [basic.stc.dynamic.safety] refers to "_dynamic objects_" whose pointers still need to be safely derived. Linkage is relevant, because that might be one way to language-legally bring a non-C++ object into C++. – palotasb Nov 08 '18 at 22:12
  • @NicolBolas: From the point of view of the Standard, the only differences between UB and IDB is whether implementations are *required* to define behavior. If 95% of platforms could cheaply and usefully specify a useful behavior for some action, but saying anything useful about it would be impractical on the remaining 5%, the harm from requiring implementations for that 5% go out of their way to behave in a consistent fashion that can be documented, regardless of cost or whether the behavior serves any useful purposes, would be greater than the benefits of requiring that implementations... – supercat Nov 08 '18 at 22:20
  • ...to something that they would do with or without an explicit requirement. The Standard fails to define terms like "object" in ways that can sensibly handle all corner cases where part of the Standard would describe the behavior of some actions and another part says an overlapping category of actions invokes UB, and generally punts by expecting that implementations can behave sensibly without being required to do so. – supercat Nov 08 '18 at 22:24
  • @NicolBolas Your definition of "exists" is too narrow: you seem to be saying that the object must exist in the same object unit (e.g. executable). However we already know that it can exist in a linked library for example, including being provided via a binding to by code written in some other language. There's no real reason to say the object can't exist indirectly in some other place (in this case, in hardware) as long as it's compatible with the C++ object model, which it is. That's why the standard is deliberately vague about such things, rather than saying "a definition must be visible". – Lightness Races in Orbit Nov 09 '18 at 17:08
  • @LightnessRacesinOrbit: "*you seem to be saying that the object must exist in the same object unit (e.g. executable).*" Where did I say that? I said it *must exist*; I didn't specify where. So long as an object exists in that location, in accord with the standard's specification of how objects come into existence, then the code is well-formed. I said or implied *nothing* about "object units", "executable", "a definition must be visible", or anything of the kind. – Nicol Bolas Nov 09 '18 at 18:04
  • @NicolBolas You did imply it, by claiming that the object doesn't exist. It does exist. Since we have ruled out all apparent reasons that you may believe the object doesn't exist, I'm utterly confused as to your viewpoint! – Lightness Races in Orbit Nov 10 '18 at 20:17
  • @LightnessRacesinOrbit: "*It does exist.*" [intro.object]/1 describes the ways in which an object comes into being in C++. Unless one of those things has happened to that address of memory, then there is no object there. The OP has not stated or implied that any of these things have been done to that memory (indeed, the fact that this is even a question strongly implies that none of them have been done). Therefore, unless the compiler itself creates an object there, as far as the C++ object model is concerned, there is no object there. – Nicol Bolas Nov 10 '18 at 20:19
  • @LightnessRacesinOrbit: I'm not the one who has to prove that an object doesn't exist; I'm just telling you what the rules are. When you access that address, there must be an object there. Otherwise, you get UB. – Nicol Bolas Nov 10 '18 at 20:20
  • @Nicol No, that's not all you're doing. Your answer literally says that the object doesn't exist and the program has UB. That is your claim. So, yes, you have to prove that it is true; however, you will struggle, because (as I am explaining to you) this is incorrect. Clearly though this attempt to help you improve your answer is not constructive and therefore no further discussion will be useful. – Lightness Races in Orbit Nov 11 '18 at 00:53
  • @LightnessRacesinOrbit: "*Your answer literally says that the object doesn't exist and the program has UB.*" Because nothing as presented either in the code or the text describing it says that an object exists in that memory location. Indeed, nothing in the question text states or implies that the object exists in accord with C++. If the OP wants to claim that there is an object there, then they can produce the code that puts it there or other documentation that states that a `Peripheral` lives in that address. Otherwise, the code example and its description must be taken as it is. – Nicol Bolas Nov 11 '18 at 02:09
Answer (score 1)

This summarizes the very helpful answers posted originally by @curiousguy, @Passer By, @Pete Becker, and others. It is mostly based on the standard text (hence the language-lawyer tag) with references provided by other answers. I made this a community wiki because none of the answers were completely satisfying but many had good points. Feel free to edit.

The code is implementation-defined in the best case, but it could have undefined behavior.

The implementation-defined parts:

  1. reinterpret_cast from integer type to pointer type is implementation-defined. [expr.reinterpret.cast/5]

    A value of integral type or enumeration type can be explicitly converted to a pointer. A pointer converted to an integer of sufficient size (if any such exists on the implementation) and back to the same pointer type will have its original value; mappings between pointers and integers are otherwise implementation-defined. [ Note: Except as described in [basic.stc.dynamic.safety], the result of such a conversion will not be a safely-derived pointer value. — end note ]

  2. Access to volatile objects is implementation-defined. [dcl.type.cv/5]

    The semantics of an access through a volatile glvalue are implementation-defined. If an attempt is made to access an object defined with a volatile-qualified type through the use of a non-volatile glvalue, the behavior is undefined.

The parts where UB has to be avoided:

  1. The pointers must point to a valid object in the C++ abstract machine, otherwise the program has UB.

    As far as I can tell, if the implementation of the abstract machine is a program produced by a sane, conformant compiler and linker running in an environment that has the register memory-mapped as described, then the implementation can be said to have a C++ uint32_t object at that location, and there is no UB with any of the functions. This seems to be allowed by [intro.compliance/8]:

    A conforming implementation may have extensions (including additional library functions), provided they do not alter the behavior of any well-formed program. [...]

    This still requires liberal interpretation of [intro.object/1], because the object is not created in any of the listed ways:

    An object is created by a definition ([basic.def]), by a new-expression, when implicitly changing the active member of a union ([class.union]), or when a temporary object is created ([conv.rval], [class.temporary]).

    If the implementation of the abstract machine has a compiler with a sanitizer (-fsanitize=undefined, -fsanitize=address), then one might have to add extra information to the compiler to convince it that there is a valid object at that location.

    Of course the ABI has to be correct, but that was implied in the question (correct alignment and memory-mapping).

  2. It is implementation-defined whether an implementation has strict or relaxed pointer safety [basic.stc.dynamic.safety/4]. With strict pointer safety, objects with dynamic storage duration can only be accessed through a safely-derived pointer [basic.stc.dynamic.safety]. The p and &q values are not that, but the objects they refer to do not have dynamic storage duration, so this clause does not apply.

    An implementation may have relaxed pointer safety, in which case the validity of a pointer value does not depend on whether it is a safely-derived pointer value. Alternatively, an implementation may have strict pointer safety, in which case a pointer value referring to an object with dynamic storage duration that is not a safely-derived pointer value is an invalid pointer value [...]. [ Note: The effect of using an invalid pointer value (including passing it to a deallocation function) is undefined, see [basic.stc].

The practical conclusion seems to be that implementation-defined support is needed to avoid UB. For sane compilers, the resulting program is UB-free, or it might have UB that can very well be relied on (depending on how you look at it). Sanitizers, however, can justifiably complain about the code unless they are explicitly told that the correct object exists in the expected location. The derivation of the pointer should not be a practical problem.

Nicol Bolas
palotasb
  • Fun story: in "An object is created by a definition ([basic.def]), by a new-expression, when implicitly changing the active member of a union ([class.union]), or when a temporary object is created ([conv.rval], [class.temporary])" the text about unions is new. Previous C++ standards did not include the union text, which **formally means that unions were actually not supported in C++ before**, which would be the position of zero committee members. Which tells you a lot about the difference between intent and what the C++ committee puts on paper. – curiousguy Nov 10 '18 at 00:29
  • (...) The first C++ std was voted in 1997. The next revisions all had the same text without the union part. So all these years, people used to refer to a language semantic which only allowed objects to be created by a definition, `new` or the implicitly by the compiler (temporary object). That didn't seem to bother any C++ expert that defers to the std regarding fundamental C++ semantics, which is troubling. – curiousguy Nov 10 '18 at 00:35
  • @palotasb: "*then the implementation can be said to have a C++ `uint32_t` object at that location, and there is no UB with any of the functions*" But you're not accessing a `uint32_t`; you're accessing a *`Peripheral`*, which *contains* a `uint32_t`. Even if the implementation promised that there was a `uint32_t` at that address, it doesn't promise there is a `Peripheral` there. So accessing through such an object is still UB. – Nicol Bolas Nov 10 '18 at 20:28
  • @NicolBolas"_it doesn't promise_" A distinction without a difference! – curiousguy Nov 11 '18 at 03:07
  • @curiousguy: In cases where it had been obvious to everyone how a piece of code should behave, the fact that the Standard didn't forbid compilers from doing something stupid wasn't seen as a problem until compiler writers started regarding the fact that the Standard would allow that code to be processed a certain way as evidence that doing so wouldn't make an implementation less useful. – supercat Nov 11 '18 at 16:08
Answer (score 1)

As a practical matter, of the constructs you suggested, this one

struct Peripheral {
    volatile uint32_t REG;  // NB: "const volatile" should be avoided
};

extern Peripheral r;
// the address of r is set in the linking step to PERIPH_ADDRESS

uint32_t get_value_3() {
    return r.REG;
}

is the most likely not to run afoul of "surprising" optimizer behavior, and I would argue that its behavior is implementation-defined at worst.

Because r is, in the context of get_value_3, an object with external linkage that is not defined in this translation unit, the compiler has to assume that that object does exist and has already been properly constructed when generating code for get_value_3. Peripheral is a POD object, so there's no need to worry about static constructor ordering. The feature of defining an object to live at a particular address at link time is the epitome of implementation-defined behavior: it's an officially documented feature of the C++ implementation for the hardware you are working with, but it's not covered by the C++ standard.

Caveat 1: absolutely do not attempt this with a non-POD object; in particular, if Peripheral had a nontrivial constructor or destructor, that would probably cause inappropriate writes to this address at startup.

Caveat 2: Objects that are properly declared as both const and volatile are extremely rare, and therefore compilers tend to have bugs in their handling of such objects. I recommend using only volatile for this hardware register.

Caveat 3: As supercat points out in the comments, there can be only one C++ object in a particular memory region at any one time. For instance, if there are multiple sets of registers multiplexed onto a block of addresses, you need to express that with a single C++ object somehow (perhaps a union would serve), not with several objects assigned the same base address.

zwol
  • The notion that compilers must allow for things they have no way to disprove may have been a good one once, but is no longer reliable. Compilers used to avoid making assumptions about external symbols, but today's compilers assume that it's impossible for programmers to know anything about the values of external symbols even when using build systems that would allow programmers to control them. – supercat Nov 12 '18 at 22:10
  • @supercat I do not think this code does anything that would be broken by a compiler of the type you describe. – zwol Nov 13 '18 at 00:51
  • Because of whole-program optimization, wrapping operations in external functions is no longer a reliable way of preventing compilers from making assumptions about what objects can be register-cached across function calls. If a compiler happens to notice that e.g. a `volatile` object is accessed using different structure types at different parts of the code, a compiler that determines that one part of the code will execute may be allowed to conclude that the other part can't possibly execute. Today's compilers aren't "sophisticated" enough to make such inferences... – supercat Nov 13 '18 at 04:13
  • ...but that doesn't mean future compilers won't be. – supercat Nov 13 '18 at 04:14
  • @supercat Oh, I see what you mean. Yes, if you assigned two different objects to the same region of memory you could get in trouble, even with the current generation of compilers, I think. I will add a note about that. – zwol Nov 13 '18 at 12:45
Answer (score 0)

I don't know if you're looking for a language-lawyer answer here, or a practical answer. I'll give you a practical answer.

The language definition doesn't tell you what that code does. You've gotten an answer that says that the behavior is implementation-defined. I'm not convinced one way or the other, but it doesn't matter. Assume that the behavior is undefined. That doesn't mean that bad things will happen. It means only that the C++ language definition doesn't tell you what that code does. If the compiler you're using documents what it does, that's fine. And if the compiler doesn't document it, but everyone knows what it does, that's fine, too. The code you've shown is a reasonable way of accessing memory-mapped registers in embedded systems; if it didn't work, lots of people would be upset.

Pete Becker
  • The question tag is very clear on what OP is looking for. – SergeyA Nov 08 '18 at 18:23
  • @SergeyA -- I'm confused. If **you believe** that this is strictly a language-lawyer question, why did you **write a comment** to the question about the behavior of some microchips? – Pete Becker Nov 08 '18 at 18:35
  • I was replying to another user who was curious about CPU registers. It is not directly related to the question, in my view. – SergeyA Nov 08 '18 at 18:45
  • @SergeyA is correct that OP tagged `language-lawyer`, but this is still a valid answer. "It doesn't really matter because..." might not be the accepted answer, but it is still valid and useful. _However_, I did not up-vote because of the part "if the compiler doesn't document it, but everyone knows what it does, that's fine, too"... I have seen many bugs introduced because of people relying on this thinking. If it's documented, fine. If it's not, please do not rely on it as it could change. – Loduwijk Nov 08 '18 at 19:13
  • This really is a language lawyer question, I can reasonably rely on GCC/Clang handling this code well. UB Sanitizer in GCC also handles this the way I expect it to (no UB). – palotasb Nov 08 '18 at 19:36
  • This is still a useful answer for someone looking for a practical result IMHO. – palotasb Nov 08 '18 at 19:56
  • @Aaron It's wrong to say that if the compiler doesn't document it, you can't rely on it. Because most compiler writers don't know what a specification is. They couldn't write a spec if their life depended on it. You don't have a specification and can't rely on anything but intuition. – curiousguy Nov 11 '18 at 05:48
  • @curiousguy: Most compiler specs are bad because most specs for just about everything are bad. A bigger issue, though, is that constructs which had been universally supported prior to C89, except on implementations targeting quirky processors, are regarded by the Standard's authors as "popular extensions", but compiler writers never saw them as extensions worthy of documentation because such support was normal and it was only the quirky processors' behavior that was abnormal. – supercat Nov 11 '18 at 16:14
Answer (score 0)

Neither the C nor the C++ standard formally covers even the act of linking object files compiled by different compilers. The C++ standard doesn't provide any guarantee that you can interface with modules compiled with any C compiler, or even say what it means to interface with such modules; the C++ language doesn't defer to the C standard for any core language feature, and there is no C++ class formally guaranteed to be compatible with a C struct. (The C++ standard doesn't even formally recognize that there is a C programming language with some fundamental types spelled the same as in C++.)

All interfacing between compilers is by definition done by an ABI: Application Binary Interface.

Using objects created outside the implementation must be done following the ABI; that includes system calls that create the representation of objects in memory (like mmap) and volatile objects.

curiousguy
Answer (score -1)

Code like the above effectively seeks to use C as a form of "high-level assembler". While some people insist that C is not a high-level assembler, the authors of the C Standard had this to say in their published Rationale document:

Although it strove to give programmers the opportunity to write truly portable programs, the C89 Committee did not want to force programmers into writing portably, to preclude the use of C as a “high-level assembler”: the ability to write machine-specific code is one of the strengths of C. It is this principle which largely motivates drawing the distinction between strictly conforming program and conforming program (§4).

The C and C++ Standards deliberately avoid requiring that all implementations be usable as high-level assemblers, and make no attempt to define all the behaviors necessary to make them suitable for such purposes. Consequently, the behavior of constructs like yours, which effectively treat the compiler as a high-level assembler, is not defined by the Standard. The authors of the Standard explicitly recognize the value of some programs' ability to use the language as a high-level assembler, however, and thus clearly intend that code like yours be usable on implementations designed to support such constructs; the failure to define the behavior in no way implies a view that such code is "broken".

Even before the Standard was written, implementations intended for low-level programming, on platforms where it would make sense to process conversions between pointers and like-sized integers as simply reinterpreting the bits, would essentially unanimously process such conversions that way. Such processing greatly facilitates low-level programming on such platforms, but the authors of the Standard saw no reason to mandate it. On platforms where such behavior wouldn't make sense, such a mandate would be harmful, and on those where it would make sense, compiler writers would behave appropriately with or without it, making it unnecessary.

Unfortunately, the authors of the Standard were a bit too presumptuous. The published Rationale states a desire to uphold the Spirit of C, whose principles include "Don't prevent the programmer from doing what needs to be done". This would suggest that if, on a platform with naturally strong memory ordering, it were necessary to have a region of storage "owned" by different execution contexts at different times, a quality implementation intended for low-level programming on such a platform, given something like:

#define BUFF_OWNER_MAINLINE  0
#define BUFF_OWNER_INTERRUPT 1

extern volatile uint8_t buffer_owner;
extern volatile uint8_t * volatile buffer_address;
extern uint8_t buffer[];
extern uint8_t result;

buffer_address = buffer;
buffer_owner = BUFF_OWNER_INTERRUPT;
/* ... buffer might be asynchronously written at any time here ... */
while(buffer_owner != BUFF_OWNER_MAINLINE)
{  // Wait until the interrupt handler is done with the buffer and...
}  // won't be accessing it anymore.
result = buffer[0];

should read a value from buffer[0] after the code has read buffer_owner and received the value BUFF_OWNER_MAINLINE. Unfortunately, some implementations would rather reuse an earlier-observed value of buffer[0] than treat the volatile accesses as possibly releasing and re-acquiring ownership of the storage in question.

In general, compilers will process such constructs reliably with optimizations disabled (and would in fact do so with or without volatile), but cannot handle such code efficiently without the use of compiler-specific directives (which would also render volatile unnecessary). I would think the Spirit of C should make it clear that quality compilers intended for low-level programming should avoid optimizations that would weaken volatile semantics in ways that would prevent low-level programmers from doing the things that may be needed on the target platform, but apparently it's not clear enough.
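For completeness, the ownership hand-off sketched above can be expressed in a well-defined way in standard C++ by swapping volatile for std::atomic with acquire/release ordering; this is a different technique than the answer advocates, and the names BUFF_OWNER_* and buffer are illustrative, carried over from the fragment above rather than from any real API.

```cpp
#include <atomic>
#include <cstdint>

enum : uint8_t { BUFF_OWNER_MAINLINE = 0, BUFF_OWNER_INTERRUPT = 1 };

std::atomic<uint8_t> buffer_owner{BUFF_OWNER_INTERRUPT};
uint8_t buffer[16];

uint8_t read_when_owned() {
    // Acquire-load: once BUFF_OWNER_MAINLINE is observed, every write the
    // releasing side made to buffer before its release-store is visible,
    // so the compiler may not reuse an earlier-observed buffer[0].
    while (buffer_owner.load(std::memory_order_acquire) != BUFF_OWNER_MAINLINE) {
    }
    return buffer[0];
}

void release_to_mainline(uint8_t value) {
    buffer[0] = value;  // plain write, ordered by the release-store below
    buffer_owner.store(BUFF_OWNER_MAINLINE, std::memory_order_release);
}
```

Unlike volatile, these operations carry inter-thread ordering guarantees the compiler must honor, at the cost of requiring C11/C++11 atomics, which were not available when the pattern above was established practice.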

supercat
  • This might be off topic, but still interesting. Is what you mention an implementation bug in the compilers or does the standard actually allow reusing the old `buffer[0]` value? – palotasb Nov 08 '18 at 21:08
  • @palotasb: A reload of `buffer[0]` would be useless on implementations that are used for some purposes, but essential on those used for some others. The authors of the Standard expect that people writing implementations intended to be suitable for various purposes will be better positioned than the Committee to determine whether code written for such purposes would require that `buffer[0]` be read after `buffer_owner`. – supercat Nov 08 '18 at 21:26
  • Specifically you need barriers telling the compiler not to reorder the reads/writes. You need the same on processors, though most compiler barriers include the required assembly for the processor as well. Neither the compiler nor processor runs _your_ code exactly. They often do [reorder reads/writes](https://en.wikipedia.org/wiki/Memory_barrier). – Flip Nov 09 '18 at 06:20
  • @Flip: The authors of the Standard have expressly said that they did not wish to preclude the use of C as a form of "high-level assembler". They also said that the "Spirit of C" embodied the principle "Don't prevent the programmer from doing what needs to be done". Their words. I would think that in cases where an implementation acting as a "high-level assembler" could do useful things without compiler-specific directives, they would want implementations claiming to be suitable for low-level programming to be capable of doing those same things likewise. – supercat Nov 09 '18 at 16:02
  • @Flip: In cases where an assembly-language programmer would need to add processor-specific memory barriers, it would be reasonable for a C compiler to require use of a vendor-specific extension to do likewise. But if all that's necessary is to prevent the *compiler* from reordering loads/stores of objects that have been exposed to the outside world across parts of the code that are known to interact with the outside world in ways the compiler can't understand, there's no reason that should require compiler-specific syntax. – supercat Nov 09 '18 at 16:07
  • @supercat See [Gcc](https://gcc.gnu.org/onlinedocs/gcc/Volatiles.html). Sequences on volatile data aren't reordered by the compiler. But non-volatiles are reordered and can be reordered *relative to volatiles*. So volatile is insufficient. Then there is still the processor which may reorder your volatiles. Which is why `atomic` was introduced. Atomics make explicit memory order guarantees. – Flip Nov 12 '18 at 08:12
  • @Flip: If a non-optimizing compiler for a particular platform would be able to e.g. create a mutex without requiring compiler-specific syntax, a compiler intended for low-level programming on that same platform that respects the Spirit of C (which, according to the authors of the Standard, includes the principle "Don't prevent the programmer from doing what needs to be done") should be able to do likewise. The authors of the C Standard thought compiler writers would be able to recognize what semantics programmers would need on their target platforms, but failed to emphasize that... – supercat Nov 12 '18 at 15:47
  • ...the reason the Standard doesn't mandate such things is that they are Quality of Implementation issues outside the jurisdiction of the Standard. – supercat Nov 12 '18 at 15:49
  • @supercat On out-of-order processors, there isn't a way of creating a mutex without using memory barriers. Which until atomics wasn't part of the standard. Unless you have the compiler generate implicit memory barriers around all volatiles (like MSVC). Which is, needlessly in most cases, a lot slower. Which also conflicts with the "Spirit of C". – Flip Nov 13 '18 at 08:08
  • @Flip: In the paragraph before the code example, I specifically said *on a platform with naturally-strong memory ordering* [a category which, incidentally, includes 99%+ of the microcontrollers used in embedded systems]. It's reasonable to say that if a platform would require memory barriers for correctness, such barriers are the responsibility of the programmer rather than the implementation. If the platform could guarantee correctness to be achieved without barriers, however, a quality compiler should avoid being the only obstacle to that. – supercat Nov 13 '18 at 20:02