3

I have read the answers to the question "Why is a char and a bool the same size in c++?" and ran an experiment to determine how many bytes a _Bool and a bool object occupy in memory in C (I know that bool is a macro for _Bool in stdbool.h, but I used it too for the sake of completeness), as well as a bool object in C++, on my implementation, Linux Ubuntu 12.4:

For C:

#include <stdio.h>
#include <stdbool.h>   // for "bool" macro.

int main()
{
    _Bool bin1 = 1;
    bool bin2 = 1; // just for the sake of completeness; bool is a macro for _Bool.    

    printf("the size of bin1 in bytes is: %lu \n",(sizeof(bin1)));
    printf("the size of bin2 in bytes is: %lu \n",(sizeof(bin2)));  

    return 0;
}

Output:

the size of bin1 in bytes is: 1
the size of bin2 in bytes is: 1

For C++:

#include <iostream>

int main()
{
    bool bin = 1;

    std::cout << "the size of bin in bytes is: " << sizeof(bin);

    return 0;
}

Output:

the size of bin in bytes is: 1 

So, objects of a boolean type, regardless of specifically C or C++, occupy 1 byte (8 bits) in memory, not just 1 bit.

My question is:

  • Why can objects of the types bool and _Bool in C and bool in C++ store only the values 0 or 1 if they occupy 1 byte in memory, which could hold 256 values?

Of course, their purpose is to represent only the values 0 and 1, or true and false, but which unit or macro decides that they can store only 0 or 1?

Additional, but not my main question:

  • And what would happen if the value of a boolean type is *accidentally modified in memory to a greater value since it can be stored in memory this way?

*By "accidentally" I mean either modification by "undetectable means" (see What are “undetectable means” and how can they change objects of a C/C++ program?) or an inappropriate assignment such as bool a; a = 25;.

  • 2
    C evaluates truthiness as ```0 is false, everything else is true```. Changing a boolean to something other than 1 or 0 will result in ```true```. – Michael Bianconi Jan 22 '20 at 15:50
  • @MichaelBianconi Does that mean I am able to assign a value of, for example, 240 to a boolean type object and it would be evaluated as `true` or `1`? – RobertS supports Monica Cellio Jan 22 '20 at 15:52
  • 1
    "*And what would happen if the value of a boolean type is accidentally modified in memory to a greater value?*" Have you [tried it](https://ideone.com/k0XyUr)? – scohe001 Jan 22 '20 at 15:53
  • 2
    It's the compiler who decides. Any code attempting to assign a non 0 or 1 to `bool` will generate instruction to convert this value to 0 or 1. If the memory is corrupted - well, that's UB. – Eugene Sh. Jan 22 '20 at 15:53
  • 2
    @RobertS-ReinstateMonica I don't know about C, but in C++ you can't "assign" that value to a `bool`. The value will first be implicitly converted to `bool`, resulting in either `true` or `false`. See https://stackoverflow.com/questions/23268357/why-does-bool-and-not-bool-both-return-true-in-this-case. – Max Langhof Jan 22 '20 at 15:54
  • 3
    A C++ `bool` is either `true` or `false`, not `0` or `1`. Sure, it can be converted to an integer with those values, but it's fundamentally wrong to think of a C++ `bool` in terms of integers, I'd say. And how large it is is an implementation detail that should usually not concern you. It's most likely 1 byte on most platforms since platforms that allow direct addressing of a single bit are rare. – Jesper Juhl Jan 22 '20 at 15:54
  • @RobertS-ReinstateMonica If the user is using ```if (variable)``` or ```if(!variable)```, yes. If the user is explicitly using ```if(variable == 1)``` for true, then no. – Michael Bianconi Jan 22 '20 at 15:55
  • @MichaelBianconi I'm not sure you and RobertS are referring to the same language, so your discussion is likely to result in misunderstanding. – François Andrieux Jan 22 '20 at 15:56
  • @FrançoisAndrieux What shall I alter? – RobertS supports Monica Cellio Jan 22 '20 at 16:05
  • @RobertS: "*the value of a boolean type is accidentally modified in memory*" There are no "accidents." If something happens, it's because you did something. Can you provide code which is valid C++ that modifies a `bool` value in such a way? – Nicol Bolas Jan 22 '20 at 16:07
  • 2
    Rolled back removal of C++ content since C++ answers were already given. – dbush Jan 22 '20 at 16:11
  • @NicolBolas I do not only mean modifications "internally" made inside of the program. I refer to another question of mine: [What are “undetectable means” and how can they change objects of a C/C++ program?](https://stackoverflow.com/questions/59811337/what-are-undetectable-means-and-how-can-they-change-objects-of-a-c-c-program) – RobertS supports Monica Cellio Jan 22 '20 at 16:18
  • @RobertS: That's kind of irrelevant. Whatever those "undetectable means" are, they must still put appropriate values into the object. – Nicol Bolas Jan 22 '20 at 16:51
  • Note that your code has UB. The correct format specifier for `sizeof` is `%zu`, not `%lu`. – phuclv Jan 23 '20 at 16:23

5 Answers

5

The C language limits what can be stored in a _Bool, even if it has the capacity to hold other values besides 0 and 1.

Section 6.3.1.2 of the C standard says the following regarding conversions to _Bool:

When any scalar value is converted to _Bool, the result is 0 if the value compares equal to 0; otherwise, the result is 1.

The C++17 standard has similar language in section 7.14:

A prvalue of arithmetic, unscoped enumeration, pointer, or pointer to member type can be converted to a prvalue of type bool. A zero value, null pointer value, or null member pointer value is converted to false; any other value is converted to true. For direct-initialization (11.6), a prvalue of type std::nullptr_t can be converted to a prvalue of type bool; the resulting value is false.

So even if you attempt to assign some other value to a _Bool the language will convert the value to either 0 or 1 for C and to true or false for C++. If you attempt to bypass this by writing to a _Bool via a pointer to a different type, you invoke undefined behavior.

dbush
3

Answer for C++:

So, objects of a boolean type, regardless of specifically C or C++, occupy 1 byte (8 bits) in memory, not just 1 bit.

That's simply because the fundamental storage unit in the C++ memory model is the byte.

Why can objects of the type [...] bool in C++ store only the values 0 or 1 if they occupy 1 byte in memory, which could hold 256 values?

But which unit or macro decides that it only can store 0 or 1?

The assumption here is wrong. In C++, a bool does not hold 0 or 1, it holds false or true: http://eel.is/c++draft/basic.fundamental#10.

How those two values are represented in memory is up to the implementation. An implementation could use 0 and 1, or 0 and 255, or 0 and <any nonzero value>, or anything it wants really. You are not guaranteed to find 0 or 1 when inspecting the memory of a bool, because...

It is the job of the compiler to ensure that the above two things hold true, regardless of how the bool values are represented in memory. Remember, C++ is specified on the abstract machine, and your program only has to behave as if executed on the abstract machine.

And what would happen if the value of a boolean type is accidentally modified in memory to a greater value?

Undefined behavior. See one of these:

Community
Max Langhof
1

(Answering for C.)

But which unit or macro decides that it only can store 0 or 1?

In typical C implementations, the compiler implements this. The compiler decides (or is designed to decide) which instructions to use when manipulating _Bool values. It might test a _Bool with an instruction that sets a condition code according to whether the byte is zero or non-zero, or it might test it with an instruction that sets a condition code according to whether the low bit (for example) is zero or non-zero. The C standard does not impose any requirements on this. Each C implementation is free to choose its own implementation.

And what would happen if the value of a boolean type is accidentally modified in memory to a greater value?

This depends on the C implementation. A greater value might be treated as 1, if the implementation is testing zero versus non-zero. A greater value might be treated according to its low bit, if the implementation is using that. A greater value might behave differently in different circumstances, if the implementation uses varied instructions according to circumstances. A greater value also might cause results that would be otherwise nonsensical. For example, given int x = 4; and some _Bool y that has been inappropriately modified by writing to its memory, int z = x + y; might set z to 10 even though only 4 or 5 would be possible if y were a proper _Bool. When you modify the representation of a type to something other than bits that represent a proper value as defined by the implementation, the resulting behavior is not defined by the C standard, or, generally, by the C implementation.

Would it even be possible and permissible to assign a greater value to a boolean type?

No, assignments convert the right operand to the type of the assignment expression (which is the type of the left operand, except as a value rather than an lvalue).

Eric Postpischil
1

Why can objects of the types bool and _Bool in C and bool in C++ store only the values 0 or 1 if they occupy 1 byte in memory, which could hold 256 values?

Because, in the end, the language specification doesn't say how large a bool is; it only defines what it can do. The C language specification says that _Bool can hold a 0 or a 1. The size of a bool data type is a detail of individual implementations, not a part of the specification itself. It's possible to have an implementation that actually allocates individual bits for a bool, and it's possible to have an implementation that allocates multiple bytes for a bool. So to stay in line with the specification, the important part isn't the size of the memory allocated, but that it performs according to the specification, which means it holds a 0 or a 1.

And what would happen if the value of a boolean type is accidentally modified in memory to a greater value since it can be stored in memory this way?

Undefined behavior I expect. I don't think that the specification says what happens, and as a result what happens is up to the implementor. One implementation may examine the first bit of the underlying memory and ignore the rest. Another implementation may examine the entire underlying memory location and if any of the bits are set, give a value of 1.

A word of caution...

You can write a program to see what your implementation does with such data, and write programs that work well for your implementation, but know that you're not testing what 'C' does; you're testing what that particular implementation/compiler does. Also, know that once you start treading into the waters of undefined behavior, you also start treading into the waters of things that will break programs for reasons that you may not understand. Compilers apply a wide variety of optimizations based on a number of assumptions. The compiler may produce a program that performs perfectly fine while you're doing a bunch of work. Then you finish, tell the compiler to create an optimized release version, and because you've been digging into undefined behavior you have broken an assumption the compiler made. It may then apply an optimization that suddenly breaks your code, and tracking it down may prove tremendously difficult. Always try to stick to well-defined behavior.

Darinth
1

Why can objects of the types bool and _Bool in C and bool in C++ store only the values 0 or 1 if they occupy 1 byte in memory, which could hold 256 values?

If bool could store the whole value range of a char, then why not just use char?


Of course, their purpose is to represent only the values 0 and 1, or true and false, but which unit or macro decides that they can store only 0 or 1?

The compiler handles the conversion when you assign a value to a bool variable. If the value is truthy, the variable will contain true. That behavior is defined in the C and C++ standards. That means bool a; a = 25; is completely valid and not "an inappropriate assignment" as you thought. After that, a will always contain true/1. You can never set a bool to anything other than 0 or 1 via normal variable assignment.

There's no problem using a char or an int as a bool, as was done before modern C and C++, but limiting the value range also allows the compiler to do a lot of optimizations. For example, bool x = !y; can be done with a simple XOR instruction, which would not work if y contained any value other than 0 or 1. If y is a normal integer type, then you need to normalize y to 0 or 1 first. See demo

As a matter of fact, not all bits in the representation have to participate in the value, and not all bit patterns have to be valid. C and C++ allow types to contain padding bits and trap representations, so a 32-bit type may have only 30 value bits, or be able to store only 2^32 - 4 different values. This is not to say that bool definitely contains padding bits; it only shows that a type is permitted to have a value range narrower than its storage could hold.

The only exception we are aware of is _Bool (as observed by Joseph Myers wrt GCC). It seems one could either (a) take non {0,1} values to be trap representations in the current sense, or (b) regard operations on non {0,1} values of that type as giving an unspecified value. The latter would bound possible misbehaviour, which would be good for programmers; the only possible downside we are aware of is that it could limit compilation via computed branch tables indexed by unchecked _Bool values.

N2091: Clarifying Trap Representations (Draft Defect Report or Proposal for C2x)

However, some implementations do consider them trap representations:

In fact, as implemented by GCC and Clang, the _Bool type has two values and 254 trap representations.

Trap representations and padding bits - Pascal Cuoq


And what would happen if the value of a boolean type is accidentally modified in memory to a greater value?

If you change the stored value of the bool to some other value directly via a pointer, then in C++ the behavior is undefined:

6.9.1 Fundamental types

Values of type bool are either true or false.50) [Note: There are no signed, unsigned, short, or long bool types or values. — end note] Values of type bool participate in integral promotions (7.6).

50) Using a bool value in ways described by this International Standard as “undefined”, such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false.

C++17

I couldn't find the reference in C99, but it will be undefined behavior if the value you set is a trap representation:

6.2.6 Representations of types

Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. If such a representation is produced by a side effect that modifies all or any part of the object by an lvalue expression that does not have character type, the behavior is undefined.41) Such a representation is called a trap representation.

There are already many questions regarding that "weird" behavior

Community
phuclv
  • "*If you manipulate the value of the bool to another value directly via a pointer then in C it looks like the value modulo 2 will be used,*": I don't understand what you try to say here. In what case do you think this happens? – walnut Jan 22 '20 at 18:12
  • @walnut `bool b; memset(&b, 2, 1)` for example. See the links at the end, there are tons of examples – phuclv Jan 22 '20 at 23:15
  • But the links are saying that (in both C and C++) it is undefined behavior (to use it afterwards), not that it will be interpreted taking the value as integer modulo 2. Your quoted passage is about type conversions and from what I can tell does not even apply to conversions to `_Bool`, because that is explicitly handled in the preceding paragraph and because of the wording "*other than `_Bool`*". – walnut Jan 22 '20 at 23:19
  • @walnut you're right. I couldn't find any references regarding the behavior in C99 and I glossed over the previous part so I thought that it's what was defined in the standard – phuclv Jan 23 '20 at 05:34