Why does boolean consume more memory than char?

Question

Why does a Boolean consume 4 bytes and a char 2 bytes in the .NET Framework? A Boolean should take up 1 bit, or at least be smaller than a char.

out of curiosity, how much space do 2 booleans in a struct take up? — workmad3, Oct 15 '08 at 10:31
Just how many booleans are you expecting? Normally valuetypes will just be consumed by the stack, so unless you are dealing with a huge number of bools (like a string of chars), I would not worry. — leppie, Oct 15 '08 at 10:34
You can use one integer instead of 32 booleans. This saves more space... J/K — Yuval Peled, Oct 15 '08 at 10:35
Yuval, when dealing with an array of 'bits' that makes perfect sense. — leppie, Oct 15 '08 at 10:37
Use System.Collections.BitArray if you have a lot of binary values. — WOPR, Jan 12 '09 at 01:27
You're looking at the size of a boxed bool, not of a real one (see my answer for details)! The answer you chose is wrong. — Blaisorblade, Feb 05 '10 at 20:31
Because it is 32 bits wide, and a 32-bit processor typically works with 32-bit values. Working with smaller values involves longer instructions. — AminM, May 31 '13 at 16:51

score 51 · Accepted Answer · answered Oct 15 '08 at 10:32

It is a question of memory alignment. 4-byte variables work faster than 2-byte ones. This is the reason why you should use int instead of byte or short for counters and the like.

You should use 2-byte variables only when memory is a bigger concern than speed. And this is the reason why char (which is Unicode in .NET) takes two bytes instead of four.

Gorpik

You generally can't reference single bits of memory using standard architectures and doing so would be very inefficient. Bytes are usually the smallest addressable unit, and in this case a char is being considered which is 2 bytes. – workmad3 Oct 15 '08 at 11:01
Unboxed booleans take 1 byte - see below; this is simply not a valid answer to the questions. – Blaisorblade Oct 18 '11 at 13:06

score 17 · Answer 2 · edited Sep 03 '22 at 14:12

About boolean

Most other answers get it wrong: alignment and speed are why a programmer should stick to int for loop counters, not why the compiler can make a byte 4 bytes wide. All of that reasoning, in fact, applies to byte and short as well as to boolean.

In C# at least, bool (or System.Boolean) is a 1-byte-wide built-in structure. It can be automatically boxed, in which case you get an object (which needs at least two memory words to be represented, i.e. 8/16 bytes in 32/64-bit environments respectively) with a field (at least one byte), plus one memory word to point to it: at least 13/25 bytes in total.

That's indeed the 1st Google entry on "C# primitive types". http://msdn.microsoft.com/en-us/library/ms228360(VS.80).aspx

The quoted link (Link) also states that, by the CLI standard, a boolean takes 1 byte.

In practice, however, the only place where this is visible is in arrays of booleans: n booleans take n bytes. In the other cases, one boolean may take 4 bytes.

Inside a structure, most runtimes (including Java ones) align all fields to a 4-byte boundary for performance. The Monty JVM for embedded devices is wiser; I guess it reorders fields optimally.
On the local frame/operand stack of an interpreter, in most implementations one stack entry is one memory word wide, for performance (and on .NET it may have to be 64 bits wide to support double and long, which on .NET use just 1 stack entry instead of the 2 they use in Java). A JIT compiler can instead use 1 byte for boolean locals, while keeping other variables aligned by reordering fields without performance impact, if the additional overhead is worth it.

About char

A char is two bytes because, when support for internationalization is required, using two-byte characters internally is the safest bet. This is not directly related to the choice to support Unicode, but to the choice to stick to UTF-16 and to the Basic Multilingual Plane. In Java and C#, as long as you stay within the BMP, you can assume that one logical char fits into a variable of type char.

You *can* use characters outside of BMP in C#, and they are represented using two `char`s. Although it should be pretty rare. — svick, Jul 07 '11 at 21:25
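
svick's point can be checked directly. The sketch below uses Java rather than C#, on the assumption (made in the answer above) that both languages store a char as one 16-bit UTF-16 code unit; the class and method names are made up for illustration.

```java
// Illustrative only: class and method names are hypothetical.
class SurrogateDemo {
    // Number of 16-bit char units needed to store one Unicode code point.
    static int charUnits(int codePoint) {
        return Character.toChars(codePoint).length;
    }

    public static void main(String[] args) {
        // U+0041 (LATIN CAPITAL LETTER A) lies inside the BMP.
        System.out.println(charUnits(0x41));     // 1
        // U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the BMP.
        System.out.println(charUnits(0x1D11E));  // 2

        String clef = new String(Character.toChars(0x1D11E));
        System.out.println(clef.length());                          // 2 char units
        System.out.println(clef.codePointCount(0, clef.length()));  // 1 code point
    }
}
```

So a string's length counts 16-bit units, not logical characters, once text leaves the BMP.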

score 8 · Answer 3 · answered Oct 15 '08 at 10:35

That's because in a 32-bit environment, the CPU can handle 32-bit values quicker than 8-bit or 16-bit values, so this is a speed/size tradeoff. If you have to save memory and you have a large quantity of bools, just use uints and store your booleans as the bits of 4-byte uints. Chars are 2 bytes wide since they store 16-bit Unicode characters.
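
A minimal sketch of the bit-packing idea, written in Java as a stand-in for C#. Java has no uint, so a signed int is used here; the bitwise operations are the same. The class and method names are hypothetical.

```java
// Illustrative only: class and method names are hypothetical.
class PackedFlags {
    // Returns `word` with bit `index` (0..31) set to `value`.
    static int set(int word, int index, boolean value) {
        int mask = 1 << index;
        return value ? (word | mask) : (word & ~mask);
    }

    // Reads bit `index` (0..31) of `word` as a boolean.
    static boolean get(int word, int index) {
        return (word & (1 << index)) != 0;
    }

    public static void main(String[] args) {
        int flags = 0;                 // 32 booleans, all false, in 4 bytes
        flags = set(flags, 0, true);
        flags = set(flags, 31, true);
        flags = set(flags, 0, false);
        System.out.println(get(flags, 0));   // false
        System.out.println(get(flags, 31));  // true
    }
}
```

The price is an extra mask and shift on every access, which is the speed/size tradeoff this answer describes.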

score 3 · Answer 4 · answered Jan 12 '09 at 02:24

Regardless of the minor difference in memory storage, using Boolean for true/false yes/no values is important for developers (including yourself, when you have to revisit the code a year later), because it more accurately reflects your intent. Making your code more understandable is much more important than saving two bytes.

Making your code more accurately reflect your intent also reduces the likelihood that some compiler optimisation will have a negative effect. This advice transcends platforms and compilers.

score 2 · Answer 5 · answered Jan 12 '09 at 01:31

You should also use boolean to help write maintainable code. When I'm glancing at code, seeing that something is a boolean is worth more than the memory saved, compared with having to figure out that you're using chars as booleans.

answered Jan 12 '09 at 01:31

Jared

score 1 · Answer 6 · edited Aug 09 '22 at 14:03

I found this: "Actually, a Boolean is 4 bytes, not 2. The reason is that that's what the CLR supports for Boolean. I think that's what it does because 32 bit values are much more efficient to manipulate, so the time/space tradeoff is, in general, worth it. You should use the bit vector class (forget where it is) if you need to jam a bunch of bits together..."

It's written by Paul Wick at http://geekswithblogs.net/cwilliams/archive/2005/09/18/54271.aspx

edited Aug 09 '22 at 14:03

Glorfindel

answered Oct 15 '08 at 10:34

Ville Salonen

Huh! .NET should stop taking decisions for us. – Agnel Kurian Oct 15 '08 at 11:08
@Vulcan Eager: That's a joke right? The whole point of .NET is that it makes a lot of decisions for us (like garbage collection....) – Giovanni Galbo Oct 15 '08 at 11:28
I agree with Giovanni Galbo: if you want full control, you should do stuff in C or ASM. The beauty of .NET and C# is that problems like this are taken care of by people who are probably much smarter than me or you. – Tamas Czinege Oct 15 '08 at 13:31

Joe · Answer 7 · 2008-10-15T11:13:41.803

1

Memory is only a concern if you have a large array of bits, in which case you can use the System.Collections.BitArray class.
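
For illustration, here is how such a bit array behaves. This sketch uses Java's java.util.BitSet as an assumed rough counterpart of System.Collections.BitArray (the two APIs are not identical); the class and method names are hypothetical.

```java
import java.util.BitSet;

// Illustrative only: class and method names are hypothetical.
class BitSetDemo {
    // Builds a set of `n` flags in which every third flag (0, 3, 6, ...) is true.
    static BitSet everyThird(int n) {
        BitSet bits = new BitSet(n);
        for (int i = 0; i < n; i += 3) {
            bits.set(i);
        }
        return bits;
    }

    public static void main(String[] args) {
        BitSet bits = everyThird(1000);
        System.out.println(bits.get(999));              // true
        System.out.println(bits.get(998));              // false
        System.out.println(bits.cardinality());         // 334 flags are true
        System.out.println(bits.toByteArray().length);  // 125 bytes hold all 1000 flags
    }
}
```

An array of 1000 one-byte booleans would need 1000 bytes for the same data.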

edited Oct 15 '08 at 11:13

answered Oct 15 '08 at 10:52

score 1 · Answer 8 · answered Oct 15 '08 at 11:03

First of all, you should use a profiler to determine where you have a memory problem, IMHO.

Aen Sidhe

score 0 · Answer 9 · answered Oct 15 '08 at 11:15

It's because Windows and .NET have used Unicode (UTF-16) as their internal character set since inception. UTF-16 is a variable-width encoding: it uses 2 bytes per character, or a pair of 2-byte words when a character requires it.

"For characters in the Basic Multilingual Plane (BMP) the resulting encoding is a single 16-bit word. For characters in the other planes, the encoding will result in a pair of 16-bit words"

My guess regarding booleans is that they are four bytes because the default register is 32 bits wide, and this would be the minimum size on which .NET could perform a logical operation efficiently, unless using bitwise operations.
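
The UTF-16 rule quoted in this answer (one 16-bit word inside the BMP, a pair of words outside it) can be sketched as follows. This is a simplified illustration in Java, assuming the input is a valid code point; the class and method names are hypothetical, and real code would normally call a library routine such as Character.toChars instead.

```java
// Illustrative only: class and method names are hypothetical.
class Utf16Encoder {
    // Encodes one Unicode code point as one or two 16-bit UTF-16 code units.
    // Assumes a valid code point; no range or surrogate checks are done.
    static char[] encode(int codePoint) {
        if (codePoint < 0x10000) {
            // BMP character: a single 16-bit word.
            return new char[] { (char) codePoint };
        }
        // Supplementary character: split the 20-bit offset into two 10-bit halves.
        int offset = codePoint - 0x10000;
        char high = (char) (0xD800 + (offset >> 10));
        char low = (char) (0xDC00 + (offset & 0x3FF));
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        char[] units = encode(0x1F600);  // U+1F600 lies outside the BMP
        System.out.printf("%04X %04X%n", (int) units[0], (int) units[1]);  // D83D DE00
    }
}
```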
