1

A bool takes 1 byte in C++. But, why does a bool[8] take 8 bytes instead of 1 byte? One byte has enough space for 8 bits.

I compiled this with GCC using -Os flag:

#include <iostream>

using namespace std;

class Foo
{
    public:
        bool m_bool[8];
};

int main ()
{
    cout << "Size: " << sizeof(Foo) << " byte(s) " << endl;

   return 0;
}

It returns "Size: 8 byte(s)".

Is there a way to optimize it?

Ry-
  • 218,210
  • 55
  • 464
  • 476
Lucas Lima
  • 1,465
  • 2
  • 16
  • 28
  • 4
    You can use a bitset if you *need* the size. Using a width of 1 bit is slower than just using a normal data type, though. – chris Aug 31 '12 at 02:13
  • 6
    A bool does not take up 1 bit in C++, only a vector can provide that compression. – Richard J. Ross III Aug 31 '12 at 02:14
  • 3
    No object can be smaller than 1 byte in C/C++. Putting them in an array doesn't "magically" compress them. – Mysticial Aug 31 '12 at 02:14
  • Ah, good point @RichardJ.RossIII. I forgot those behaved specially. – chris Aug 31 '12 at 02:15
  • 1
    I also think there is no requirement for a bool to be 1 byte in size - trying to find it – Adrian Cornish Aug 31 '12 at 02:17
  • @AdrianCornish, No, I've seen it 4 bytes to go hand in hand with a 32-bit processor and I'm pretty darn sure there's no width requirement. – chris Aug 31 '12 at 02:17
  • @chris - Thanks - I've never seen one either - cannot find it though in C++11 standard. – Adrian Cornish Aug 31 '12 at 02:19
  • @AdrianCornish, Try § 3.9.1. It has size requirements mentioned for other types, but nothing for `bool`. – chris Aug 31 '12 at 02:21
  • @chris: `73) sizeof(bool) is not required to be 1.` However, all compilers I know of use `1` though. – Jesse Good Aug 31 '12 at 02:23
  • @chris - Historically also the standard says things like 'must be 16bits wide` but never specifies a maximum – Adrian Cornish Aug 31 '12 at 02:23
  • @JesseGood Hearsay is a dangerous standard to code to – Adrian Cornish Aug 31 '12 at 02:23
  • @JesseGood, Ah, that's pretty clear, thanks. The other part is pretty explicit as well: *sizeof(bool), sizeof(char16_t), sizeof(char32_t), and sizeof(wchar_t) are implementation-defined.* – chris Aug 31 '12 at 02:25
  • @JesseGood Got it - you should have added the 'implementation-defined' I took your answer as that it is 1 :-) Sorry. – Adrian Cornish Aug 31 '12 at 02:27
  • @Mysticial: http://www.cplusplus.com/reference/stl/vector/ at the bottom. It seems to be standard, and a mistake from standard makers: http://en.wikipedia.org/wiki/Sequence_container_%28C%2B%2B%29#Specialization_for_bool – nhahtdh Aug 31 '12 at 02:36
  • 1
    @nhahtdh Indeed. A lot of people hate `vector` because it breaks several consistencies with vector behavior. I've honestly only seen one situation where it ended up being faster than `deque`. – Mysticial Aug 31 '12 at 02:38
  • If a bool didn't take up at least one byte, it would be impossible to have a pointer to a boolean. (Unless you added support for floating-point pointers. Hmm.... ;)) – Jeremy Friesner Aug 31 '12 at 03:23
  • @Mysticial: "no faster than `deque`" might be true but irrelevant. I thought the motivation for `vector` is that it's *smaller*, not that it's faster. Some users might expect it to therefore sometimes be faster, because of caching, but I don't think that's the intention. But anyway, regardless of whether it does what it was invented to do, it's *not a Container*, so I absolutely agree that it causes problems. One of the biggest being that even if you have no intention to "cleverly optimize" anything, if you write `vector` in generic code you risk someone might supply `T = bool`. – Steve Jessop Aug 31 '12 at 08:42
  • @Adrian: but the great thing about hearsay is that you can reach whatever conclusion you prefer, whereas the standard is mostly unambiguous. Compare Jesse's comment with http://stackoverflow.com/a/266902/13005 :-) – Steve Jessop Aug 31 '12 at 08:53

5 Answers5

7

The compiler has to allow for you to take the addresses of the individual bools, i.e. things like

Foo foo;
bool* p = &foo.m_bool[0];
bool* q = &foo.m_bool[1];

If the bools were packed what would p and q be?

user515430
  • 3,341
  • 2
  • 17
  • 13
  • If the standards committee wanted to get *crazy* they could define a bool* to be special, similar to a member function pointer. It would then be a structure containing a pointer to a byte and a bit offset. – Zan Lynx Mar 11 '13 at 17:20
  • @ZanLynx That would clutter the standard a lot, since `char` is basically defined as the minimum pointer stride, and `CHAR_BITS >= 8` must hold. You'd have to introduce a separate pointer type for bits to pull that off. –  Jul 28 '14 at 16:44
  • @Rhymoid Misprint, sizeof(char) = 1 by definition. – user515430 Jul 28 '14 at 16:48
  • @Rhymoid: A separate pointer type for bool was exactly where my comment was going. I did say it would be a crazy idea. – Zan Lynx Jul 30 '14 at 20:27
  • @Rhymoid: The pointer type would be `struct {unsigned char *p; size_t offset;}` and `*p = true` from the example above would have the compiler synthesize `*p.p |= (1< – Zan Lynx Jul 30 '14 at 20:29
  • @ZanLynx Something like that! But like I wrote, the *really* crazy part would be that everything a user knew about pointers would be wrong, so it wouldn't really be a pointer anymore. So where I wrote 'separate pointer type', I sort of meant 'separate pointer kind'. Maybe. I don't know the theory behind C's type system too well. –  Jul 31 '14 at 03:59
  • 1
    @Rhymoid: Everything a user knows about pointer types *IS* wrong if it is a pointer to class member. See? C++ *already has* crazy pointer types. – Zan Lynx Aug 11 '14 at 15:43
2

First off bool is not guaranteed to be a size of 1. Second when you group 8 ones together why would you still expect the result to be 1?

8 x 1 = 8

Adrian Cornish
  • 23,227
  • 13
  • 61
  • 77
  • 1 byte = 8 bits. If the compiler use 1 bit per bool should be possible. – Lucas Lima Aug 31 '12 at 02:27
  • 3
    Wrong - there is no requirement for a byte to be 8 bits. Thats why we have the macro CHAR_BIT – Adrian Cornish Aug 31 '12 at 02:28
  • Oh! Ok. I didn't know about the CHAR_BIT macro. In that case CHAR_BIT is 8. But, a byte is 8 bits, a 'char' is CHAR_BIT bits. – Lucas Lima Aug 31 '12 at 02:52
  • 1
    No a byte is a byte CRAY's I think used 10bit bytes - down to the CPU arch. char==byte but the size if not mandated. A byte is not a technically defined term. – Adrian Cornish Aug 31 '12 at 02:54
  • 1
    You are right. I researched here and i found the IEEE 1541-2002. It defines the 'byte' term but not the size, like you said. I really didn't know it. – Lucas Lima Aug 31 '12 at 03:15
  • @LucasNunes Cudos (and +1) on actually looking :-) – Adrian Cornish Aug 31 '12 at 03:18
  • @AdrianCornish According to IEEE 1541-2002, byte (abbreviated as 'B') *is* a technically defined term; it's "a set of adjacent bits operated on as a group". An octet (abbreviated as 'o') is such a group of eight bits. –  Jul 28 '14 at 16:48
1

Since I didn't see it mentioned in the comments above, I'll mention a concept in response to "Is there a way to optimize it?" in case you haven't worked with it yet. It's called bitmasking, where you essentially use an int as a series of bits, and use the bitwise operators to evaluate individual bits in the integer.

In order to easily set the bits in the string appropriately, it is common to define some constants which are semantically named and set to values of powers of 2 (so that they only "flip" one bit. You can easily use the bitshift operator to make it obvious which bit is being flipped:

#define IS_ADMIN = 1<<0;
#defing CLEAR_CACHE = 1<<1;

Then you test for admin like this:

if(userSettings & IS_ADMIN) { ...

Here's a starting-point wiki article

Chris Trahey
  • 18,202
  • 1
  • 42
  • 55
  • I have never tried it but I have a suspicion that bit twiddling is slower on modern Intel processors than just storing an overly large int value - especially with 64bit registers - to test bool it still needs a 64bit register – Adrian Cornish Aug 31 '12 at 02:32
  • I think the real gains come when you start comparing more than one bit at a time, for instance if the presence of any one of 3 different bits (think permissions) would indicate a "truthy" case for your business rules. – Chris Trahey Aug 31 '12 at 02:36
  • Possibly in that scenario. I am stuck with some bitfields I deal with at work which are 'flags' of sort and they vary between 1-4 bits in length - originally they were stored in a 32bit because we transmitted this data via a satellite feed. Today it adds unnecessary complication and computation power to decode them, since all our client have 100Mb connections minimum and that I data rate has increased exponentially since then ;-) – Adrian Cornish Aug 31 '12 at 02:40
  • @AdrianCornish: Bit twiddling is not significantly time consuming or slower. A bit shift or mask is even simpler than addition and takes 1 cycle. Two or three non-dependent bit operations in series can also take ... 1 cycle. In comparison a L1 cache access on Intel Nehalem is 4 cycles. If you can pack two data items in one byte you've doubled your speed when accessing main RAM, network or disk storage. – Zan Lynx Mar 11 '13 at 17:35
0

"Is there a way to optimize it?"

Are you sure optimization would be beneficial? You could use enums as flags instead of bools to store them all in a byte.

How to use enums as flags in C++?

Community
  • 1
  • 1
Steve Wellens
  • 20,506
  • 2
  • 28
  • 69
0

Is there a way to optimize it?

You can do that using a so-called bit-field. Bit-field values cannot have their address taken.

Quirin F. Schroll
  • 1,302
  • 1
  • 11
  • 25