71

This is my code:

#include <cstring>
#include <iostream>
int main() {
    bool a;
    memset(&a, 0x03, sizeof(bool));
    if (a) {
        std::cout << "a is true!" << std::endl;
    }
    if (!a) {
        std::cout << "!a is true!" << std::endl;
    }
}

It outputs:

a is true!
!a is true!

It seems that the ! operator on bool only inverts the last bit, but every value that does not equal 0 is treated as true. This leads to the shown behavior, which is logically wrong. Is that a fault in the implementation, or does the specification allow this? Note that the memset can be omitted, and the behavior would probably be the same because a contains memory garbage.

I'm on gcc 4.4.5, other compilers might do it differently.

Bernhard Barker
flyx
  • 34
    Wow, but why would you even... – SingerOfTheFall Apr 24 '14 at 12:05
  • 2
    http://ideone.com/tn7UMB – 4pie0 Apr 24 '14 at 12:05
  • 4
    It seems that this was fixed in at least GCC 4.6 [Demo](http://coliru.stacked-crooked.com/a/d0498b9a2721b8e7). – rubenvb Apr 24 '14 at 12:06
  • 3
    LLVM 5.1 (clang-503.0.38) doesn't exhibit this issue either. – Ja͢ck Apr 24 '14 at 12:09
  • 6
    The compiler is allowed to assume that a `bool` value is either `true` or `false`, because a `bool` can only have those values. (That non-zero integer values are *converted to* - not *treated as* - `true` is irrelevant because no conversion is taking place in this code.) – molbdnilo Apr 24 '14 at 12:38
  • @molbdnilo: Um, so you're saying that no conversion takes place at `if (a)`, but the value isn't treated as `true` either? Then why does the following code block get executed? – flyx Apr 24 '14 at 13:02
  • 21
    @flyx: Undefined behavior.. no mans land. Anything goes – Engineer2021 Apr 24 '14 at 13:19
  • 2
    There are more possibilities than it just flipping the last bit. The logical not operator could be implemented using a bit-wise not by the compiler, that is if it represents its bool values as all-set. But having it as 0x03 would violate this assumption, as both 0x03 and ~0x03 (0xFC) are true. I'd be curious what you see in memory when you set this bool properly, or alternatively what behavior you see if you memset it to 0xFF. – Apriori Apr 24 '14 at 17:28
  • It would be Undefined Behavior for a different reason if you left out the `memset`: [Using a `memoryless`-variable.](http://stackoverflow.com/questions/22839466/reading-using-modifying-uninitialised-variables-guarantees) – Deduplicator Apr 25 '14 at 17:12
  • `memset` should not be used in C++ code. That code should be rejected during code review. And even legal use of `memset` should be rejected. If you write C++ code, then stop doing C style code. – Phil1970 Jan 18 '17 at 19:31

3 Answers

92

The standard (3.9.1/6 Fundamental types) says:

Values of type bool are either true or false.

....

Using a bool value in ways described by this International Standard as “undefined,” such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false.

Your program's use of memset leads to undefined behaviour, one consequence of which might be that the value behaves as if it is neither true nor false.

David Heffernan
  • 14
    I would add 3.9.1/1 "For unsigned character types, all possible bit patterns of the value representation represent numbers. These requirements do not hold for other types." – aschepler Apr 24 '14 at 12:15
  • But if it's neither true nor false, neither of the if-statements should have been executed?! It seems that the OP's code does cause it to behave as both true and false :-/ – Bergi Apr 26 '14 at 14:05
  • @Bergi By definition, if it behaves as both, then it _is_ neither. – underscore_d May 13 '17 at 22:59
42

It's not "logically wrong", it's undefined behaviour. bool is only supposed to contain one of two values, true or false. Assigning a value to it will cause a conversion to one of these values. Breaking type-safety by writing an arbitrary byte value on top of its memory (or, as you mention, leaving it uninitialised) will not, so you might well end up with a value that's neither true nor false.

Mike Seymour
  • Is it NULL or is it nothing? – Ben Apr 25 '14 at 09:30
  • 3
    @Ben is what NULL or nothing? The value? Neither. In the OP's example, the value is 3, which as you can see is neither true nor false. It is 3. – Mr Lister Apr 25 '14 at 13:31
  • The *representation* is 3, which does not correspond to a valid value for a `bool`. – M.M Apr 26 '14 at 07:31
  • 2
    @MattMcNabb That would be implementation dependent. The value of a boolean, when converted to an int, is guaranteed to be `0` or `1`. However, the internal representation may be different. For all we know, bools may be stored as a 32-bit float, and then the representation for `true` would be `00 00 80 3F`. – Mr Lister Apr 26 '14 at 08:19
  • OK, although OP's evidence indicates that on his implementation it does not correspond to a valid value :) – M.M Apr 26 '14 at 08:20
  • -1: it is "logically wrong": we should be able to feel free to `memset(..)` an array of a primitive type without having to check the manual. It ought to be type-safe. If it's UB per the standard — there has to be a good reason for it? And in this case there doesn't seem to be. Typically one expects that zero means false and anything else means true, with `!` switching that around. That's the convention. There has to be a pretty good reason to break the convention. Otherwise we shall consider this a flaw in the standard. – Evgeni Sergeev Aug 22 '14 at 06:08
  • 2
    @EvgeniSergeev: I've no idea what you're talking about. `bool` takes two values, neither of which is `0x03`; using `memset` to reinterpret a `bool` as one or more bytes, and overwrite it with a non-`bool` value, gives undefined behaviour. It's certainly not the case that "zero means false and anything else means true"; only `true` means true. The "convention" you're talking about is a workaround for ancient dialects of C which lacked a boolean type; not for C++. – Mike Seymour Aug 22 '14 at 10:58
  • @MikeSeymour Are you sure you don't know? The convention is right there in C++: it's used to make sense of `if (expr)` where the `expr` evaluates to a `char`, an `int`, a `double`, etc. In that case "zero means false and anything else means true". And `if (!expr)` always gives the opposite result. But not in the case of `bool`. It might be Undefined Behaviour, but it's also Unreasonable Behaviour, so it really ought to be defined. – Evgeni Sergeev Aug 22 '14 at 19:47
  • 1
    @EvgeniSergeev: Oh, I see, you were referring to type conversions to `bool`, not values of `bool`. But this code specifically suppresses type conversions, by using `memset` to write directly to the underlying memory with no knowledge of whatever type might be there. It's (in my view) quite reasonable to get undefined behaviour if you deliberately circumvent the language's protection against it. If you want type conversions then use assignment (for a single object) or `std::fill/copy` (for a sequence), and leave `memset/memcpy` for when you really need to dangerously muck around with raw memory. – Mike Seymour Aug 23 '14 at 10:15
  • @MikeSeymour That's the right way of doing it. But it's obviously an issue that trips up a lot of people. In many cases it's probably a variant of `memset(bool_array1, 0xff, sizeof(bool_array1));` which one would expect to work regardless of which of the few natural choices of representations are used by the compiler for `bool` and arrays thereof — because we know that a `bool` will be one bit somewhere; it may be padded and aligned. Perhaps this should even stay UB in the standard, for flexibility reasons, but compiler implementations should ensure this line of code works as expected. – Evgeni Sergeev Aug 24 '14 at 02:34
4

Internally it is likely using a bitwise not (~ operator) to invert it, which would work when the bool was either zero or all ones:

 a = 00000000 (false)
!a = 11111111 (true)

However if you set it to three:

 a = 00000011 (true)
!a = 11111100 (also true)
Zebra North
  • On all compilers I have seen so far the generated assembly is using `testb` (or an equivalent) instruction for the OP's code, and is internally transforming this just into one compare and if/else. If we just take the `!a` version, it gets transformed into one `xor` and one `testb`... this call can be played with on the gcc explorer. – PlasmaHH Apr 26 '14 at 10:11
  • That's implementation detail for your compiler. Where in the standard can I find this info? – David Heffernan Apr 26 '14 at 22:42
  • I really doubt any implementation stores `true` as all ones. Given how `bool` can be converted to an `int` with value `0` or `1`, it's far more likely that it's stored that way too. – underscore_d May 13 '17 at 23:07