I've found a difference in the value stored in a bool variable between Visual-C++ and clang++, in the case where the stored value is neither true nor false (because it was corrupted somehow), and I'm not sure whether it's a Visual-C++ bug or just UB that I should ignore.

Take the following sample:

#include <cstdint>
#include <iostream>
#include <string>
#include <limits>
bool inLimits(bool const v)
{
    return (static_cast<std::int32_t>(v) >= static_cast<std::int32_t>(std::numeric_limits<bool>::min()) && static_cast<std::int32_t>(v) <= static_cast<std::int32_t>(std::numeric_limits<bool>::max()));
}
int main()
{
    bool b{ false };
    bool const* const pb = reinterpret_cast<bool const*>(&b);
    std::uint8_t * const pi = reinterpret_cast<std::uint8_t*>(&b);

    std::cout << "b: " << b << " pb: " << (*pb) << " pi: " << std::to_string(*pi) << std::endl;
    std::cout << "b is " << (inLimits(b) ? "" : "not ") << "in numeric limits for a bool" << std::endl;

    *pi = 3; // Simulate a bad cast during boolean creation
    bool const b2{ b };
    bool const b3{ *pb };

    std::cout << "b: " << b << " pb: " << (*pb) << " pi: " << std::to_string(*pi) << std::endl;
    std::cout << "b2: " << b2 << " b3: " << b3 << std::endl;

    std::cout << "b is " << (inLimits(b) ? "" : "not ") << "in numeric limits for a bool" << std::endl;
    std::cout << "b2 is " << (inLimits(b2) ? "" : "not ") << "in numeric limits for a bool" << std::endl;
    std::cout << "b3 is " << (inLimits(b3) ? "" : "not ") << "in numeric limits for a bool" << std::endl;

    return 0;
}

This is the output of Visual-C++

b: 0 pb: 0 pi: 0
b is in numeric limits for a bool
b: 3 pb: 3 pi: 3
b2: 3 b3: 3
b is not in numeric limits for a bool
b2 is not in numeric limits for a bool
b3 is not in numeric limits for a bool

and this is the output of clang++

b: 0 pb: 0 pi: 0
b is in numeric limits for a bool
b: 1 pb: 1 pi: 3
b2: 1 b3: 1
b is in numeric limits for a bool
b2 is in numeric limits for a bool
b3 is in numeric limits for a bool

It looks like clang++ performs a limits check when a new boolean is constructed by value, as well as when it is used with the stream operator.

Should I just ignore this, or is it a bug that only Visual-C++ has? Thanks!

Edit: For those who did not understand the purpose of the sample, it was just a showcase to "simulate" a memory corruption or a bug in another part of the code that caused a boolean value to be initialized with something else than true or false, whatever the binary representation of a bool.

(I was wondering if I have to protect my code from improper usage somewhere else using an assert for example, but only if this behavior is not UB)

Second edit: Added numeric_limits code.

Chris
  • The reinterpret_cast can always create undefined behavior when you do things like this. So, perhaps this behavior is outside the standard? – jwimberley Apr 12 '17 at 14:09
  • Agreed. If you want to *copy* representations, use `memcpy`. – chris Apr 12 '17 at 14:11
  • It is likely traceable to a subtle difference in the code generation strategy. MSVC tends to allow garbage in the high bits of a register when it is returning a `bool` value, leaving it up to the caller to zero-extend or otherwise use only the applicable low bits of the register. Clang and GCC, on the other hand, are more likely to ensure that the high bits are zeroed on the callee side. Of course, this isn't a hard-and-fast rule, and it absolutely should not matter which strategy the compiler uses. As the other answers have said, your code is *broken* because it invokes *undefined behavior*. – Cody Gray - on strike Apr 12 '17 at 14:44
  • "Should I just ignore this, or is it a bug that only Visual-C++ has?" Neither. You should not write such code. – Slava Apr 12 '17 at 14:51
  • For those who did not understand the purpose of the sample, it was just a showcase to "simulate" a memory corruption or a bug in another part of the code that caused a boolean value to be initialized with something else than true or false, whatever the binary representation of a bool. (I was wondering if I have to protect my code from improper usage somewhere else using an assert for example, but only if this behavior is not UB) – Chris Apr 12 '17 at 16:29
  • That last comment means the question makes even less sense. You cannot demonstrate anything meaningful with undefined behavior. You aren't initializing a Boolean value with anything, you're doing raw memory reinterpretation in such a way that is illegal according to the language specification. You cannot protect your code from improper usage if other programmers are invoking undefined behavior because anything can happen. There's no assert that would catch it. – Cody Gray - on strike Apr 13 '17 at 08:39
  • if (v < std::numeric_limits::min() || v > std::numeric_limits::max()) does the job for me, but not consistently across compilers, that's why I asked the question (I mean that for some compilers, the if is never triggered because the value is changed upon creation). – Chris Apr 13 '17 at 09:17
  • The standard doesn't define the representation of bool – M.M Apr 13 '17 at 09:37
  • Similar question, http://stackoverflow.com/questions/28207856/changing-a-bool-to-a-value-other-than-0-or-1/28208832#28208832 – M.M Apr 13 '17 at 09:43
  • @M.M Again, I never assumed it did (except I chose '3' as a test value and assumed it was neither true nor false on the compilers I tested, but I could have chosen 666 to be sure). The question was not about the representation of the bool, but if it should be put in the limits defined by std::numeric_limits upon assignment, which is not the case with VisualStudio. – Chris Apr 13 '17 at 09:45
  • @M.M Your link question is only similar in the sense of how I manipulate the stored bool value to "fake" a non valid bool (which comes from another code I do not control, from another library coded in C for example). I was asking about bool assignment from a value supposed to be a bool. I think there is no real answer in the standard and will assume it's just UB :) Thanks all for your inputs. – Chris Apr 13 '17 at 09:49
  • Huh? Your question is entirely about the representation of bool. `numeric_limits` has nothing to do with it. Assignment to a `bool` converts the argument to `true` or `false` and stores it ; the examples in your question both comply with the standard in this respect – M.M Apr 13 '17 at 09:50
  • @M.M I updated the question with the numeric_limits check that was in my library code but that I forgot to include here. And as you can see, the assignment does not convert to true or false in the Visual case; the value is outside numeric_limits (which, per the spec, has min=false and max=true) – Chris Apr 13 '17 at 10:05
  • @Chris Visual Studio does convert it to `true`, you are just misunderstanding the output of your program. Try `bool b = 3; cout << b << '\n';` The line `*pi = 3;` assigns to a `uint8_t`, not a `bool`. – M.M Apr 13 '17 at 10:30
  • @M.M I'm not sure what you mean by "convert it to bool", since once conversion (assignment in this case) has been made, I thought the standard imposed it still matches numeric_limits min/max, which is not the case here. Putting all the printing aside, which is irrelevant for my actual question. But anyway, I was not trying here to write bad code on purpose, but was wondering if I could protect my code from bad code from others (the ones passing an altered bool value), in case it was NOT UB and throw or assert in such a case. If it's really UB, then I do nothing. – Chris Apr 13 '17 at 12:18
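
To make the distinction in the last few comments concrete, here is a minimal sketch. The helper names printedBool, promotedInLimits and exampleConversion are made up for illustration and are not from the question. It relies only on value conversions, which are well-defined: converting 3 to bool yields true, and promoting that bool back to int yields 1, so the result always lies within std::numeric_limits<bool>. It deliberately says nothing about what happens after a byte of the object has been overwritten through a std::uint8_t pointer.

```cpp
#include <cassert>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: convert an integer to bool by value and return
// what operator<< writes for it (boolalpha is off by default).
std::string printedBool(int value)
{
    bool const b = static_cast<bool>(value); // non-zero -> true, zero -> false
    std::ostringstream out;
    out << b;
    return out.str();
}

// Hypothetical helper: the same range check as inLimits() in the question,
// written with int instead of std::int32_t.
bool promotedInLimits(bool b)
{
    int const promoted = b; // integral promotion: false -> 0, true -> 1
    return promoted >= static_cast<int>(std::numeric_limits<bool>::min())
        && promoted <= static_cast<int>(std::numeric_limits<bool>::max());
}

// Usage: every bool produced by a value conversion passes the check.
void exampleConversion()
{
    assert(printedBool(3) == "1");
    assert(printedBool(0) == "0");
    assert(promotedInLimits(static_cast<bool>(3)));
}
```

Calling exampleConversion() from main runs the checks. Under these assumptions, the range check can only fail for a bool that was not produced by a conversion in the first place.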

4 Answers

"in the case where the stored value is neither true nor false"

Why do you think that's the case? C++ does not restrict the binary representation of bool. On some compilers, true could be represented by 00000011, and other compilers could choose to represent false as 00000011.

But indeed, neither GCC nor MSVC uses that bit pattern for either bool value, which makes this Undefined Behavior. UB can never be a compiler bug. A bug is where the implementation doesn't work as it should, but UB specifically means that any actual behavior is acceptable.

MSalters
  • Well, C++ does not, but the stl does a little bit, since a boolean value has to lie between std::numeric_limits::min and max – Chris Apr 12 '17 at 16:34
  • @Chris: I think you mean "Standard Library". That's an integral part of C++ so you can't make a difference there. – MSalters Apr 13 '17 at 07:11
  • Sure, but I was not assuming anything about the binary representation of the bool anyway. My point was just to determine if this behavior is a bug or if the spec says it's UB. My code does not control the way the bool is created, but I do check if it's in the range of numeric_limits for other reasons (mainly because it's a template). So I was wondering if I should throw an error if the value is invalid for some reason (coding error in the code before me) or if it's UB (in that case I just assert). It's hard to make a small sample out of a huge library code to showcase the question :) – Chris Apr 13 '17 at 08:24
  • 1
    @Chris: Accessing a value using an expression of another type is only defined for a few special cases. `*pi=3` is valid if and only if `3` happens to be the binary representation of either `true` or `false`. Note that you can't say that `false` must be 0, that's how `static_cast` works. It converts values. `reinterpret_cast` converts bit patterns. – MSalters Apr 13 '17 at 08:45

The standard doesn't specify what the value representation of bool is. Compilers are free to make their own specification.

Your evidence suggests that VC++ requires true be represented as just the LSB set, whereas clang++ allows any non-zero representation to be true.

For VC++, your code causes undefined behaviour on the line bool const b2{ b };, specifically when it tries to read the value out of b. The bits set in the storage for b do not correspond to a value of b, and the standard doesn't define what happens in this situation, therefore it is undefined behaviour.

When undefined behaviour happens there are no guarantees whatsoever; all of the output of the program is meaningless. You cannot infer anything based on output statements that appear after this point (or even before it, actually).
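
As a rough sketch of how the concern from the question could be handled without ever reading a bool that holds an unexpected bit pattern: this assumes the untrusted value can be received as a byte (unsigned char) rather than as an already-formed bool. The names boolFromByte, firstByteOf and exampleBytes are hypothetical. If other code hands over a bool object whose storage already holds such a pattern, the reasoning above still applies and no check made after reading it is reliable.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: turn an untrusted byte into a bool by value
// conversion. The byte is compared as an integer, so no bool object
// ever holds an unexpected bit pattern.
bool boolFromByte(unsigned char raw)
{
    return raw != 0;
}

// Hypothetical helper: copy out the first byte of a bool's object
// representation. Copying the bytes of an object with memcpy is
// well-defined; which byte values stand for true and false is left
// to the implementation.
unsigned char firstByteOf(bool const& b)
{
    unsigned char raw = 0;
    std::memcpy(&raw, &b, 1);
    return raw;
}

// Usage: validate at the boundary, then work with ordinary bools.
void exampleBytes()
{
    assert(boolFromByte(3));
    assert(!boolFromByte(0));
    // Assumption: all true values share one representation, as on
    // mainstream compilers.
    assert(firstByteOf(boolFromByte(3)) == firstByteOf(true));
}
```

The design choice here is to do the validation on an integer type at the interface and to let the conversion produce the bool, instead of trying to detect a bad bool afterwards.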

M.M
  • That was exactly my question "For VC++, your code causes undefined behaviour on the line bool const b2{ b };", and I was looking for where the c++ specs tells this is UB (or not). – Chris Apr 13 '17 at 12:23
  • @Chris I think it is undefined by omission: the standard doesn't say what happens when a lvalue-to-rvalue conversion encounters a bit pattern not defined to mean any particular value, therefore it is undefined behaviour. A similar case would be, on common systems, if you set the bits in a `float` to something not covered by IEEE754 – M.M Apr 13 '17 at 13:09
  • Great thanks for the clarification, exactly what I was looking for. – Chris Apr 13 '17 at 13:20

Since I didn't really find information in the C++ standard on casts from pointer-to-bool (or equivalent), or on whether using them is defined, I was reluctant to post this as an answer. But, on second thought, I may as well post it; it may get elaborated upon by other people.

First of all, the C++14 standard defines bool as:

[basic.fundamental]

  1. Values of type bool are either true or false. [Note: There are no signed, unsigned, short, or long bool types or values. — end note] Values of type bool participate in integral promotions (4.5).

Since it participates in integral promotions, the following promotion is defined for it:

[conv.prom]

  1. A prvalue of type bool can be converted to a prvalue of type int, with false becoming zero and true becoming one.

And, since you are printing with std::ostream::operator<<, for bool, it is defined as follows:

[ostream.inserters.arithmetic]

  1. The classes num_get<> and num_put<> handle locale-dependent numeric formatting and parsing.

Since it uses num_put<> for actual output, the snippet of it, related to bool output is defined as:

[facet.num.put.virtuals]

  1. If (str.flags() & ios_base::boolalpha) == 0 returns do_put(out, str, fill, (int)val)

Since you don't use boolalpha in the example you show - typical integral promotion rules (described above) should apply.
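
A small sketch of what the quoted rules imply for well-formed bool values (the helper names format and exampleFormat are made up for illustration, and the expected strings assume the default "C" locale): without boolalpha the value is written as the integer 0 or 1, and with boolalpha it is written as text.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: format a bool the way operator<< does, with or
// without the boolalpha flag set on the stream.
std::string format(bool value, bool useBoolalpha)
{
    std::ostringstream out;
    if (useBoolalpha)
        out << std::boolalpha;
    out << value;
    return out.str();
}

// Usage: the numeric form goes through the integral promotion,
// so only "0" and "1" can appear for valid bool values.
void exampleFormat()
{
    assert(format(true, false) == "1");
    assert(format(false, false) == "0");
    assert(format(true, true) == "true");
    assert(format(false, true) == "false");
}
```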

In addition, I still can't explain why std::to_string(*pi) after *pi = 3 still prints 3 in both cases, but it may somehow be related to:

[expr.reinterpret.cast]

  1. [Note: The mapping performed by reinterpret_cast might, or might not, produce a representation different from the original value. — end note]

Algirdas Preidžius
  • The iostream was not really a concern, it was just here to print the value, and I was wondering why it's corrected in clang. But I do like your reference to c++14 concerning the bool (which I didn't find) and it kind-of makes me think it's a bug in Visual, since it should be constructed using either true or false, not something else! – Chris Apr 12 '17 at 16:38
  • @Chris I know, that `iostream` is not a concern, by itself, but I referenced it, to prove that, in the end, `bool` gets (or should get) casted to an `int`, when trying to print it, which should prompt integral conversion rules to kick-in. – Algirdas Preidžius Apr 12 '17 at 16:57
  • Exactly :) In fact the whole purpose of my question was to try to find an answer to "gets" (on some compilers) or "should get" (a bug if the compiler does not). I'll flag your post as the answer since it seems to be the one closest to what I was looking for, thanks again. – Chris Apr 13 '17 at 07:01

Not sure if this helps, but g++ exhibits the same behavior as Visual-C++.

This is the output I got:

b: 0 pb: 0 pi: 0
b: 3 pb: 3 pi: 3
b2: 3 b3: 3

From what I understand (I'm no expert on C++ compilers), reinterpret_cast instructs the compiler to treat the collection of bits as the new type. So when you are telling the compiler to reinterpret the address of the boolean as an 8-bit integer it is essentially converting the original boolean to an 8-bit integer as well (if that makes sense).

So if my interpretation is correct (it isn't), perhaps this is a "bug" in clang++, not Visual or g++. reinterpret_cast is not very well supported between compilers, so this behavior is definitely worth noting when deciding which to use, if this is necessary for whatever reason.

edit:

I just realized this doesn't explain why b2 and b3 are 3 (non-boolean) as well. I wouldn't imagine it would make sense to treat new booleans as the 8-bit integers as well, regardless of the reinterpret_cast, so take this for what it's worth from a guy with 1 rep :)

  • "So when you are telling the compiler to reinterpret the address of the boolean as an 8-bit integer " - the actual code interprets it as _the address of_ an 8 bit integer. – MSalters Apr 13 '17 at 08:47