22

The example below compiles, but the output is rather strange:

#include <iostream>
#include <cstring>

struct A
{
    int a;
    char b;
    bool c;
};

int main()
{
    A v;
    std::memset( &v, 0xff, sizeof(v) );

    std::cout << std::boolalpha << ( true == v.c ) << std::endl;
    std::cout << std::boolalpha << ( false == v.c ) << std::endl;
}

The output is:

true
true

Can someone explain why?

If it matters, I am using g++ 4.3.0.

Jon
BЈовић

5 Answers

11

Found this in the C++ standard, section 3.9.1 "Fundamental types" (note the magic footnote 42):

6. Values of type bool are either true or false. 42)

42) Using a bool value in ways described by this International Standard as ‘‘undefined,’’ such as by examining the value of an uninitialized automatic variable, might cause it to behave as if it is neither true nor false.

This is not perfectly clear to me, but it seems to answer the question.

Roman L
  • It means that it can be anything, like a third state of a boolean variable. UB is just like that. Could have caused nasal demons :P – BЈовић Dec 23 '10 at 13:39
  • Note that the footnote 42 does not apply here, as the memory has been initialized with the memset to all `1`s – David Rodríguez - dribeas Dec 23 '10 at 16:47
  • 1
    @David Rodríguez - dribeas: I think uninitialized variable is only an example of "undefined" usage (note the "such as") – Roman L Dec 23 '10 at 19:11
  • 2
    The `bool` member in the structure is assigned an undefined value by the `memset` function. This is a good example to stop using `memset` and use constructors instead. – Thomas Matthews Dec 23 '10 at 19:29
  • @VJo: The uninitialized example does not apply as the execution of `memset` initializes the memory. The problem is with the initialization itself as it is not setting the variable to `true` or `false`, but rather to `0xff`. – David Rodríguez - dribeas Dec 30 '10 at 00:43
  • No such thing as "an engineered bool". – Lightness Races in Orbit Feb 21 '11 at 23:25
  • @DavidRodríguez-dribeas By my reading, the point is that being uninitialised is only an example. The footnote refers to UB, which includes things the Standard hasn't defined, and the footnote is linked from a section that _does_ define that `bool`s may only hold the values `true` or `false`. A `bool` that has been `memset` in such a way that it holds another value invokes UB. – underscore_d May 13 '17 at 23:11
2

The result of overwriting the memory location used by v is undefined behaviour. Anything may happen, according to the standard (including your computer flying off and eating your breakfast).

Vlad
  • 4
    I believe that it's actually legal to do this, as v is a POD. – Puppy Dec 23 '10 at 13:05
  • @DeadMG: I doubt that. The optimizer might have its own assumptions about the possible contents of `bool`. – Vlad Dec 23 '10 at 13:07
  • @Vlad: Of course it does- that's my answer too. That's specific to `v.c`, though, not `v` in general. – Puppy Dec 23 '10 at 13:11
  • @DeadMG: I don't think the standard specially distinguishes `bool`. UB is just UB. Have you got a reference which specially says something about bools? – Vlad Dec 23 '10 at 13:14
  • 1
    The structure A is POD, therefore you can do whatever you like with it, including filling some random values into memory location occupied by objects of such types. – BЈовић Dec 23 '10 at 13:19
  • @VJo: if this were true, you wouldn't get the behaviour which you describe. – Vlad Dec 23 '10 at 13:47
  • @downvoter: could you please find a reference in the standard saying that overwriting a POD, except for bools, is not UB? – Vlad Dec 23 '10 at 13:52
  • @Vlad I didn't downvote your answer, but this link should provide all the information about POD (http://www.fnal.gov/docs/working-groups/fpcltf/Pkg/ISOcxx/doc/POD.html) – BЈовић Dec 23 '10 at 13:57
  • 3
    @VJo: 3.9/2 says, "For any object (other than a base-class subobject) of POD type T, whether or not the object holds a valid value of type T, the underlying bytes (1.7) making up the object can be copied into an array of char or unsigned char. If the content of the array of char or unsigned char is copied back into the object, the object shall subsequently hold its original value.". It doesn't say, "you can do whatever you like with it", and in particular it doesn't say that you can write whatever bytes you like and the result be a valid value of the object (in this case, `bool`). – Steve Jessop Dec 23 '10 at 14:27
  • 2
    There's a defect report on the standard that it doesn't properly say how invalid values lead to undefined behavior, but it certainly doesn't define the behavior when you read memory that isn't a valid value representation for the type you read it as. All-bits-1 presumably is not a valid value representation of `bool` on your implementation. – Steve Jessop Dec 23 '10 at 14:31
2

A boolean value whose memory is set to a value that is not one or zero has undefined behaviour.

Puppy
2

I think I found the answer. 3.9.1-6 says:

Values of type bool are either true or false.42) [Note: there are no signed, unsigned, short, or long bool types or values. ] As described below, bool values behave as integral types. Values of type bool participate in integral promotions (4.5).

where footnote 42 says:

42) Using a bool value in ways described by this International Standard as ‘‘undefined,’’ such as by examining the value of an uninitialized automatic variable, might cause it to behave as if it is neither true nor false.

BЈовић
  • Be aware that notes are informative, not normative. Using the value of any uninitialized automatic variable is undefined behavior, whether its type is `bool` or something else. – Steve Jessop Dec 23 '10 at 14:29
1

I can't seem to find anything in the standard that indicates why this would happen (most probably my fault here). This includes the reference provided by 7vies, which is not very helpful in itself. It is definitely undefined behavior, but I can't explain the specific behavior observed by the OP.

As a practical matter, I'm very surprised that the output is

true
true

Using VS2010, the output is much easier to explain:

false
false

In this latter case, what happens is:

  • comparisons to boolean true are implemented by the compiler as tests for equality to 0x01, and since 0xff != 0x01 the result is false.
  • same goes for comparisons to boolean false, only the value compared with is now 0x00.

I can't think of any implementation detail that would cause false to compare equal to the value 0xff when interpreted as bool. Does anyone have any ideas about that?

Jon
  • Nice point, but this is really compiler-specific. UB allows it to do whatever optimizations it wants, and for that specific compiler this leads to a `true`. Your `0xff != 0x01` is far from the only way to check it... – Roman L Dec 23 '10 at 13:55
  • @7vies: sure. I think we all agree this is UB, and you can't use this code to produce meaningful results. But it is still interesting to know *why* it behaves that way. – Jon Dec 23 '10 at 13:56
  • @Jon, for example instead of doing `x == true` it could do `x != false` which is equivalent when you consider booleans, but will result in a different behavior for ints (`0xff == 1` is `false` but `0xff != 0` is `true`). – Roman L Dec 23 '10 at 14:02
  • @7vies: We still agree on all of that. I'm just wondering what GCC is doing in this case to come up with this result. – Jon Dec 23 '10 at 14:03
  • @Jon, you mean why exactly it prefers, say, `x != false` instead of `x == true`? When I look at generated assembly in release mode, I usually have a hundred similar questions about why it does things this tricky way and not some other way. It's just impossible to have answers to all those questions :) – Roman L Dec 23 '10 at 14:26
  • 6
    @Jon: looking at disassembly (on my machine(TM)) from GCC, it uses `movzbl` to read a `bool`, which copies 8 bits. It doesn't need to actually test against `true` because `true ==` is redundant and optimized away even with no optimization compiler options. So `0xff` is passed. `false ==` becomes an xor with 1, so `0xfe` is passed. I don't know what the stream does with the value when the boolalpha manipulator is in effect, but given the results I would guess something equivalent to: if bit pattern is all-zeros print "false" else print "true". – Steve Jessop Dec 23 '10 at 14:44