Context:

I have a chunk of memory that's shared between two processes via shm_open. I am guaranteed that after calling ftruncate and mmap on it, the whole chunk has the bit pattern 00000.... I need to share a boolean value across the two processes.

A more concrete question:

Is the following guaranteed to be okay (the assertion doesn't fail and UB does not occur) on reasonable POSIX systems?

void *my_shared_memory_region = calloc(1024, 1);
bool *my_bool = reinterpret_cast<bool*>(my_shared_memory_region);
assert(*my_bool == false);

I believe that there are some restrictions on the actual values that can live inside of a bool, so I'm not sure. I think this question is distinct from this one because reinterpret_cast doesn't do the same kind of conversions that C-style casts do.

Patrick Collins
  • Technically it's undefined behaviour in C++ to do anything in malloc'd space other than creating a new object using placement-new. Personally I regard this as more of a defect in the standard, and would expect your code to work. BTW you can use `static_cast` here. – M.M Apr 01 '16 at 03:05
  • @M.M: Whoops, thanks! – Patrick Collins Apr 01 '16 at 03:05
  • @M.M.: Actually, an object with trivial initialization (this includes `bool`) exists as soon as storage of sufficient size and correct alignment is acquired. That said, you have to consider it to contain an indeterminate value until you write to it using an lvalue of the same type (strict aliasing rule). – Ben Voigt Apr 01 '16 at 03:15
  • @PatrickCollins: Please note that you're using `reinterpret_cast` on the pointer, not the data pointed to. So conversions and style of cast are a non-issue. – Ben Voigt Apr 01 '16 at 03:16
  • @BenVoigt The actual code uses `shm_open`, `ftruncate` and `mmap` --- `ftruncate` [guarantees](http://man7.org/linux/man-pages/man2/ftruncate.2.html) that the memory "reads as null bytes ('\0')." I think the same guarantee is provided by `calloc`. Is writing to it with an lvalue still required? – Patrick Collins Apr 01 '16 at 03:20
  • The shared memory aspect doesn't really affect the underlying issue of expecting all-0 memory to hold `false` values when accessed as a `bool`. See [this question](http://stackoverflow.com/questions/33380742/is-it-safe-to-memset-bool-to-0) for discussion of that aspect. – Tony Delroy Apr 01 '16 at 03:20
  • @BenVoigt I need to use this to signal from one process to another that a particular chunk of shared memory is ready for reading, so it's very important that I can do the read *before* initializing it with anything. Is that guaranteed by anything? – Patrick Collins Apr 01 '16 at 03:21
  • @TonyD I don't expect it to change anything significantly, but I thought I would provide the motivation for why I need to do this rather than initializing it in a more sane way. – Patrick Collins Apr 01 '16 at 03:21
  • Sure. Bottom line though is that it'll work on common systems but isn't guaranteed portable. – Tony Delroy Apr 01 '16 at 03:24
  • @TonyD: good enough for me! If you or someone else would like to post an answer so I can accept it, please do so. – Patrick Collins Apr 01 '16 at 03:25
  • @BenVoigt well, I used to think that too, but T.C. and Columbo convinced me otherwise in a recent comments discussion. Apparently "the lifetime of an object of type T begins when storage for type T is obtained" does not mean "obtaining storage begins the lifetime of an object". Instead, the conditions in which an object is created are described elsewhere, and then that quote defines the lifetime given that we have already determined an object is being created. [cont...] – M.M Apr 01 '16 at 03:28
  • [...] In fact your interpretation leads to immediate nonsense. On 32-bit system, `malloc(4)` obtains storage for an `int`, and a `float`. Does that mean it begins lifetime of 2 objects? Or if you say it begins 1 object, which one was it? T.C.'s view was that it creates no objects, it just obtains storage. – M.M Apr 01 '16 at 03:29
  • @M.M: It creates one object, whose type can validly be `int` or `float`. Because of strict aliasing, you can't read either type unless you first write a value, and then you can only read as the type you wrote (or `char`). When you write, you determine the type (but then you can overwrite with the other type -- reusing memory also creates a new object) – Ben Voigt Apr 01 '16 at 03:30
  • But the text in [basic.life] talks about "object of type `T`", there is no mention anywhere in the standard of "objects of no type", or "objects that can validly have multiple types until written". If you disagree then cite the Standard . The behaviour you describe about writes setting the type is a C thing but is not in C++ (and C++ doesn't include it by reference to C either) – M.M Apr 01 '16 at 03:35
  • M.M. when you simply allocate memory with `malloc`, you're not creating any object. The object starts existing when you write to that memory through e.g. a `T*`. Consider *"Before the lifetime of an object has started but after the storage which the object will occupy has been allocated"* - notice the distinction drawn between storage and later occupation. – Tony Delroy Apr 01 '16 at 03:39
  • @PatrickCollins: *"If you or someone else would like to post an answer"* - seems we're enjoying this other discussion too much, but you can post one yourself if nobody else jumps in ;-). – Tony Delroy Apr 01 '16 at 03:41
  • @TonyD I assume you refer to something like `int *ptr = (int *)malloc(size); *ptr = 5;`. We agree that `malloc` does not create an object. You claim that `*ptr = 5` creates an object, however nowhere in the definition of the assignment operator (or otherwise) does it say that this creates an object. (If you disagree please cite where the standard says that `*ptr = 5` would begin lifetime of an object) – M.M Apr 01 '16 at 03:47
  • @M.M. It's in the act of assigning to this untyped, waiting storage that you satisfy the *"storage with the proper alignment and size for type `T` is obtained"* in paragraph 1 of 3.8 Object lifetime. Put another way, the assignment is associating/binding/supplying the waiting storage for an object creation event. – Tony Delroy Apr 01 '16 at 03:50
  • @TonyD the assignment operator doesn't obtain storage – M.M Apr 01 '16 at 03:50
  • @M.M. The storage is already waiting there... but you're obtaining it specifically for use by the object under construction by assigning to it. – Tony Delroy Apr 01 '16 at 03:52
  • The assignment operator doesn't construct any objects either. Nor does it change the size or alignment of any existing storage, so it's hard to see what you mean by quoting 3.8/1. The `malloc` already obtained storage with the proper alignment and size for `int`. – M.M Apr 01 '16 at 03:54
  • @M.M. True that there's no construction - that's because we're talking about types for which trivial initialization suffices. The act of assigning doesn't have to change size/alignment - it's the programmer's responsibility to ensure beforehand that the `T*` through which the assignment is done complies with the needs of a `T` object. Anyway, I can't make this any clearer, so will leave it with you.... – Tony Delroy Apr 01 '16 at 03:58
  • Agree that there's not much point discussing in comments, a moderator will delete it soon enough anyway. *If* the standard said something like "using the assignment operator where the left-hand side has a type with trivial initialization, and the left-hand side refers to storage that does not yet contain an object, then an object is created", then we would agree. But it doesn't say anything like that. Read over [expr.ass] perhaps; in fact point 2 of that section says that the left-hand side should already refer to an object. – M.M Apr 01 '16 at 04:00
  • @M.M. *"should already refer to an object"* - a conceptual view of the assignment implicitly triggering trivial initialization and object creation is consistent with that, and my understanding - right or wrong - of the scenario. We're keeping the mods busy. – Tony Delroy Apr 01 '16 at 05:06
  • Why not use `unsigned char`? It seemingly exists just for things like that. – n. m. could be an AI Apr 01 '16 at 05:16
  • @n.m. That's a good idea. I guess I was just thinking "I'm not writing C right now, I should use a 'real type' instead of numeric types for everything." I will probably suggest that tomorrow. – Patrick Collins Apr 01 '16 at 06:05
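
For what it's worth, the `unsigned char` suggestion above could look roughly like the sketch below. The helper names `is_ready`, `mark_ready` and `demo_ready_flag` are made up for illustration, and `calloc` stands in for the `shm_open` + `ftruncate` + `mmap` region, as in the question. `unsigned char` has no padding bits, so an all-zero byte reads as 0.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: the flag lives in the first byte of the region.
// A nonzero byte means "ready".
inline bool is_ready(const void* region) {
    return *static_cast<const unsigned char*>(region) != 0;
}

// Hypothetical helper: set the flag by writing a nonzero byte.
inline void mark_ready(void* region) {
    *static_cast<unsigned char*>(region) = 1;
}

// Demo under the assumption that calloc behaves like the zero-filled
// shared mapping described in the question.
inline bool demo_ready_flag() {
    void* region = std::calloc(1024, 1);
    if (region == nullptr) return false;

    const bool before = is_ready(region);  // zero-filled, so not ready yet
    mark_ready(region);
    const bool after = is_ready(region);   // flag is now set

    std::free(region);
    return !before && after;
}
```

This only covers the "what does a zero byte mean" part. It says nothing about ordering or visibility between the two processes, which would need some form of synchronisation on top.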

1 Answer

The core issue - even if the memory's all zeros, is it valid to read from it as if from a properly initialised bool - is the same as for this question.

Long story short: it's undefined behaviour that works on common systems but isn't guaranteed portable. Specific implementations are allowed to document behaviour for cases the Standard leaves undefined, so it's worth doing some research for the specific platforms/compilers you care about.
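
As one possible starting point for that research, here is a small sketch that checks how a given compiler actually stores `false`. The function name `false_is_all_zero_bytes` is hypothetical. It only inspects the object representation on the implementation it is compiled with; it does not settle the object-lifetime debate from the comments.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical check: copy the bytes of a properly initialised `false`
// and report whether every byte is zero on this implementation.
inline bool false_is_all_zero_bytes() {
    const bool f = false;
    unsigned char bytes[sizeof(bool)];
    std::memcpy(bytes, &f, sizeof(bool));  // inspecting bytes via memcpy is well-defined

    for (std::size_t i = 0; i < sizeof(bool); ++i) {
        if (bytes[i] != 0) return false;
    }
    return true;
}
```

If this returns `true` on a platform you care about, an all-zero byte pattern is at least a valid representation of `false` there, which is the property the question relies on.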

Tony Delroy
  • Is it really *undefined* behavior, or are the values just *unspecified*? – Cornstalks Apr 01 '16 at 05:50
  • The value itself might best be termed *uninitialised*, as it hasn't gone through proper C++ initialisation, making the read therefrom - by `bool *my_bool = reinterpret_cast<bool*>(my_shared_memory_region);` - undefined behaviour. Interestingly, footnote 48) in the C++11 Standard bothers to describe one manner in which the undefined behaviour may manifest: *"Using a `bool` value in ways described by this International Standard as “undefined,” such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither `true` nor `false`."* – Tony Delroy Apr 01 '16 at 05:56
  • The linked question uses the term 'unspecified' ("an otherwise-correct program is not rendered incorrect by this operation, but it's not guaranteed to return the same value on all systems") as opposed to 'undefined' ("nasal demons, a conforming implementation may choose to set your CPU on fire, etc"). Your example is compelling, but the language described there strongly suggests that this isn't nasal demon territory. – Patrick Collins Apr 01 '16 at 21:16
  • @PatrickCollins: I think "unspecified" in the linked answer is in the common English usage sense, rather than Standardese. If you consider the answer's logic, it's based on a boost docs claim that C11 had some guarantee, but the only evidence found was a C99 defect report. [This answer](http://stackoverflow.com/a/11139915/410767) squarely addresses that and my reading is that a bool might have padding bits, and their legal content / import isn't dictated by the Standard. That leaves a possibility of trap representations. – Tony Delroy Apr 02 '16 at 01:00