106

Which value is better to use? Boolean true or Integer 1?

The above topic made me do some experiments with bool and int in if condition. So just out of curiosity I wrote this program:

int f(int i) 
{
    if ( i ) return 99;   //if(int)
    else  return -99;
}
int g(bool b)
{
    if ( b ) return 99;   //if(bool)
    else  return -99;
}
int main(){}

g++ intbool.cpp -S generates asm code for each functions as follows:

  • asm code for f(int)

    __Z1fi:
       LFB0:
             pushl  %ebp
       LCFI0:
              movl  %esp, %ebp
       LCFI1:
              cmpl  $0, 8(%ebp)
              je    L2
              movl  $99, %eax
              jmp   L3
       L2:
              movl  $-99, %eax
       L3:
              leave
       LCFI2:
              ret
    
  • asm code for g(bool)

    __Z1gb:
       LFB1:
              pushl %ebp
       LCFI3:
              movl  %esp, %ebp
       LCFI4:
              subl  $4, %esp
       LCFI5:
              movl  8(%ebp), %eax
              movb  %al, -4(%ebp)
              cmpb  $0, -4(%ebp)
              je    L5
              movl  $99, %eax
              jmp   L6
       L5:
              movl  $-99, %eax
       L6:
              leave
       LCFI6:
              ret
    

Surprisingly, g(bool) generates more asm instructions! Does it mean that if(bool) is little slower than if(int)? I used to think bool is especially designed to be used in conditional statement such as if, so I was expecting g(bool) to generate less asm instructions, thereby making g(bool) more efficient and fast.

EDIT:

I'm not using any optimization flag as of now. But even absence of it, why does it generate more asm for g(bool) is a question for which I'm looking for a reasonable answer. I should also tell you that -O2 optimization flag generates exactly same asm. But that isn't the question. The question is what I've asked.


Community
  • 1
  • 1
Nawaz
  • 353,942
  • 115
  • 666
  • 851
  • would depend if cmpb is faster than cmpl – Yet Another Geek Apr 23 '11 at 15:09
  • 36
    It's also an unfair test unless you compare them with reasonable optimizations enabled. – Daniel Pryden Apr 23 '11 at 15:11
  • 10
    @Daniel: I'm not using any optimization flags with either of them. But even absence of it, why does it generates more asm for `g(bool)` is a question for which I'm looking for a reasonable answer. – Nawaz Apr 23 '11 at 15:14
  • If there is any actual code in the curly braces following the if statement, then ... the answer is.... *they are the same.* – Cheeso Apr 23 '11 at 15:15
  • I suggest you use -O3, but make sure the compiler doesn't just inline and negate your code because it has no effect at the moment. :P – hiddensunset4 Apr 23 '11 at 15:16
  • @Nawaz I also think it depends on the architecture and compiler – Yet Another Geek Apr 23 '11 at 15:16
  • @Nawaz That's the problem. One version may, for extraneous reason, generate code that takes a couple more of instructions when compiled without optimizations, but a better measure of the intrinsic speed of testing int or bool would be to compare both compiled with -O. Without optimizations, you're only testing which one is more directly translated in the compiler's intermediate representation(s). – Pascal Cuoq Apr 23 '11 at 15:18
  • Also, is it moving the bool onto the %eax% register because it isn't word size? In comparison to int, which is word size therefore that is one less optimization; which would surely have been done if optimization flats were enabled. – hiddensunset4 Apr 23 '11 at 15:19
  • 8
    Why would you go to the trouble of reading the asm, *but not just running the program and timing the result*? The number of instructiosn doesn't really say much about performance. You need to factor in not just instruction lengths, but also dependencies and the types of instructions (are some of them decoded using the slower microcode path, which execution units do they require, what is the latency and throughput of the instruction, is it a branch? A memmory access? – jalf Apr 23 '11 at 15:34
  • 1
    @Nawaz: " why does it generates more asm for `g(bool)` is a question for which I'm looking for a reasonable answer" That's not what the title says. – GManNickG Apr 23 '11 at 22:40
  • a) For any reasonable work your code will ever do, the time difference between if (bool) or if (int) should be mostly irrelevant. Write readable code. b) Write a testing environment, where you just pass in 2 alternatives of a function, which provides you with empirical data from 1 to 10 program runs with 1000 to 1000000000 cases with different compiler settings and different compilers - by manufacture and version - and look at your micro benchmarks yourself. – user unknown Apr 24 '11 at 02:32
  • 2
    @user unknown,and @Malvolio: That is obviously; I'm not doing all these for production code. As I already mentioned in the beginning of my post that *"So just out of curiosity I wrote this program"*. So yeah, its a *purely hypothetical one*. – Nawaz Apr 24 '11 at 04:38
  • 3
    It's a legitimate question. They're either equivalent or one is faster. The ASM was probably posted in an attempt to be helpful or think out loud, so rather than use it as a way to dodge the question and say "just write readable code", just answer the question or STFU if you don't know or don't have anything useful to say ;) My contribution is that the question is answerable, and "just write readable code" is nothing but a dodging of the question. – Triynko Apr 27 '11 at 18:12

8 Answers8

107

Makes sense to me. Your compiler apparently defines a bool as an 8-bit value, and your system ABI requires it to "promote" small (< 32-bit) integer arguments to 32-bit when pushing them onto the call stack. So to compare a bool, the compiler generates code to isolate the least significant byte of the 32-bit argument that g receives, and compares it with cmpb. In the first example, the int argument uses the full 32 bits that were pushed onto the stack, so it simply compares against the whole thing with cmpl.

Sherm Pendley
  • 13,556
  • 3
  • 45
  • 57
  • 4
    I agree. This helps to illuminate that when choosing a variable type, you're choosing it for two potentially competing purposes, storage space vs. computational performance. – Triynko Apr 27 '11 at 18:20
  • 7
    Does this also apply to 64-bit processes, that `__int64` is faster than `int`? Or CPU deals 32-bit integer with 32-bit instruction sets separately? – Reci Apr 27 '11 at 21:22
  • 3
    @CrendKing maybe it's worth rolling another question? – Display Name Dec 28 '13 at 19:25
  • 1
    I wish you put more information in your answer; so what you mean the compiler would execute more instructions to cast the one byte `bool` to the stack instead of pushing 32-bit cell to the stack in one go, no casting required, is that what you mean ? that it's in this cast using `int` is faster ? – R1S8K May 27 '21 at 06:36
85

Compiling with -03 gives the following for me:

f:

    pushl   %ebp
    movl    %esp, %ebp
    cmpl    $1, 8(%ebp)
    popl    %ebp
    sbbl    %eax, %eax
    andb    $58, %al
    addl    $99, %eax
    ret

g:

    pushl   %ebp
    movl    %esp, %ebp
    cmpb    $1, 8(%ebp)
    popl    %ebp
    sbbl    %eax, %eax
    andb    $58, %al
    addl    $99, %eax
    ret

.. so it compiles to essentially the same code, except for cmpl vs cmpb. This means that the difference, if there is any, doesn't matter. Judging by unoptimized code is not fair.


Edit to clarify my point. Unoptimized code is for simple debugging, not for speed. Comparing the speed of unoptimized code is senseless.

gsamaras
  • 71,951
  • 46
  • 188
  • 305
Alexander Gessler
  • 45,603
  • 7
  • 82
  • 122
  • `-O2` is more than enough, as it generates the same asm (which I mentioned in the EDIT). – Nawaz Apr 23 '11 at 15:20
  • 8
    As much as I agree with your conclusion, I think you're skipping the interesting part. *Why* does it use `cmpl` for one and `cmpb` for the other? – jalf Apr 23 '11 at 15:35
  • 24
    @jalf: Because a `bool` is a single byte and an `int` is four. I don't think there's anything more special than that. – CB Bailey Apr 23 '11 at 15:38
  • 7
    I think other responses paid greater attention to the reasons: it's because the platform in question treats `bool` as a 8 bit type. – Alexander Gessler Apr 23 '11 at 15:40
  • 3
    @Charles: right, but imo that belongs in the answer. You can't really conclued that "there's no difference" if you find a difference and don't mention what its implications are. :) – jalf Apr 23 '11 at 15:40
  • @Charles Bailey: A `bool` is a single **bit**, not byte. – Nathan Moos Apr 23 '11 at 20:10
  • 9
    @Nathan: No. C++ has no bit data types. The smallest type is `char`, which is a byte by definition, and is the smallest addressable unit. `bool`'s size is implementation-defined, and may be 1, 4, or 8, or whatever. Compilers tend to make it one, though. – GManNickG Apr 23 '11 at 20:20
  • 6
    @Nathan: Well that's tricky in Java, too. Java says the data a boolean represents is the value of one bit, but how that bit is stored is still implementation defined. Pragmatic computers simply don't address bits. – GManNickG Apr 23 '11 at 20:24
  • I thought `-g` was for simple debugging. – tchrist Apr 24 '11 at 01:18
  • `-g` is actually good to make gcc emit debug information, but it gets more and more useless as the optimizer level is increased. The gcc manual puts it nicely: `GCC allows you to use -g with -O. The shortcuts taken by optimized code may occasionally produce surprising results`. – Alexander Gessler Apr 24 '11 at 01:24
27

When I compile this with a sane set of options (specifically -O3), here's what I get:

For f():

        .type   _Z1fi, @function
_Z1fi:
.LFB0:
        .cfi_startproc
        .cfi_personality 0x3,__gxx_personality_v0
        cmpl    $1, %edi
        sbbl    %eax, %eax
        andb    $58, %al
        addl    $99, %eax
        ret
        .cfi_endproc

For g():

        .type   _Z1gb, @function
_Z1gb:
.LFB1:
        .cfi_startproc
        .cfi_personality 0x3,__gxx_personality_v0
        cmpb    $1, %dil
        sbbl    %eax, %eax
        andb    $58, %al
        addl    $99, %eax
        ret
        .cfi_endproc

They still use different instructions for the comparison (cmpb for boolean vs. cmpl for int), but otherwise the bodies are identical. A quick look at the Intel manuals tells me: ... not much of anything. There's no such thing as cmpb or cmpl in the Intel manuals. They're all cmp and I can't find the timing tables at the moment. I'm guessing, however, that there's no clock difference between comparing a byte immediate vs. comparing a long immediate, so for all practical purposes the code is identical.


edited to add the following based on your addition

The reason the code is different in the unoptimized case is that it is unoptimized. (Yes, it's circular, I know.) When the compiler walks the AST and generates code directly, it doesn't "know" anything except what's at the immediate point of the AST it's in. At that point it lacks all contextual information needed to know that at this specific point it can treat the declared type bool as an int. A boolean is obviously by default treated as a byte and when manipulating bytes in the Intel world you have to do things like sign-extend to bring it to certain widths to put it on the stack, etc. (You can't push a byte.)

When the optimizer views the AST and does its magic, however, it looks at surrounding context and "knows" when it can replace code with something more efficient without changing semantics. So it "knows" it can use an integer in the parameter and thereby lose the unnecessary conversions and widening.

JUST MY correct OPINION
  • 35,674
  • 17
  • 77
  • 99
  • 1
    Haha, I like how the compiler simply returned 99, or 99+58 = 157 = -99 (overflow of signed 8bits)... interesting. – hiddensunset4 Apr 23 '11 at 15:29
  • @Daniel: Even I liked that. At first, I said "where is -99" and immediately I realized that its doing something very kinky. – Nawaz Apr 23 '11 at 15:39
  • 7
    `l` and `b` are suffixes used in AT&T syntax only. They just refer to versions of `cmp` using 4 byte (long) and 1 byte (byte) operands respectively. Where there is any ambiguity in intel syntax, conventionally the memory operand is tagged with `BYTE PTR`, `WORD PTR` or `DWORD PTR` instead of putting a suffix on the opcode. – CB Bailey Apr 23 '11 at 16:00
  • Timing tables: http://agner.org/optimize/ Both operand-sizes of `cmp` have the same performance, and there are no partial-register penalties for *reading* `%dil`. (But that doesn't stop clang from amusingly creating a partial-register stall by using byte-size `and` on AL as part of branchlessly case-flipping between 99 and -99.) – Peter Cordes May 04 '18 at 01:30
13

With GCC 4.5 on Linux and Windows at least, sizeof(bool) == 1. On x86 and x86_64, you can't pass in less than an general purpose register's worth to a function (whether via the stack or a register depending on the calling convention etc...).

So the code for bool, when un-optimized, actually goes to some length to extract that bool value from the argument stack (using another stack slot to save that byte). It's more complicated than just pulling a native register-sized variable.

Mat
  • 202,337
  • 40
  • 393
  • 406
  • From the C++03 standard, §5.3.3/1: "*`sizeof(bool)` and `sizeof(wchar_t)` are implementation-defined.*" So saying `sizeof(bool) == 1` is not strictly correct unless you're talking about a specific version of a specific compiler. – ildjarn Apr 23 '11 at 15:40
9

At the machine level there is no such thing as bool

Very few instruction set architectures define any sort of boolean operand type, although there are often instructions that trigger an action on non-zero values. To the CPU, usually, everything is one of the scalar types or a string of them.

A given compiler and a given ABI will need to choose specific sizes for int and bool and when, like in your case, these are different sizes they may generate slightly different code, and at some levels of optimization one may be slightly faster.

Why is bool one byte on many systems?

It's safer to choose a char type for bool because someone might make a really large array of them.

Update: by "safer", I mean: for the compiler and library implementors. I'm not saying people need to reimplement the system type.

DigitalRoss
  • 143,651
  • 25
  • 248
  • 329
  • 2
    +1 Imagine the overhead on x86 if `bool` were represented by bits; so byte will be a nice tradeoff for speed/data compactness in many implementations. – hardmath Apr 23 '11 at 23:57
  • 1
    @Billy: I think he wasn't saying "use `char` instead of `bool`" but instead simply used "`char` type" to mean "1 byte" when referring to the size the compiler chooses for `bool` objects. – Dennis Zickefoose Apr 24 '11 at 06:33
  • Oh, sure, I didn't mean that each program should choose, I was just advancing a rationale for why the system bool type is 1 byte. – DigitalRoss Apr 24 '11 at 06:35
9

Yeah, the discussion's fun. But just test it:

Test code:

#include <stdio.h>
#include <string.h>

int testi(int);
int testb(bool);
int main (int argc, char* argv[]){
  bool valb;
  int  vali;
  int loops;
  if( argc < 2 ){
    return 2;
  }
  valb = (0 != (strcmp(argv[1], "0")));
  vali = strcmp(argv[1], "0");
  printf("Arg1: %s\n", argv[1]);
  printf("BArg1: %i\n", valb ? 1 : 0);
  printf("IArg1: %i\n", vali);
  for(loops=30000000; loops>0; loops--){
    //printf("%i: %i\n", loops, testb(valb=!valb));
    printf("%i: %i\n", loops, testi(vali=!vali));
  }
  return valb;
}

int testi(int val){
  if( val ){
    return 1;
  }
  return 0;
}
int testb(bool val){
  if( val ){
    return 1;
  }
  return 0;
}

Compiled on a 64-bit Ubuntu 10.10 laptop with: g++ -O3 -o /tmp/test_i /tmp/test_i.cpp

Integer-based comparison:

sauer@trogdor:/tmp$ time /tmp/test_i 1 > /dev/null

real    0m8.203s
user    0m8.170s
sys 0m0.010s
sauer@trogdor:/tmp$ time /tmp/test_i 1 > /dev/null

real    0m8.056s
user    0m8.020s
sys 0m0.000s
sauer@trogdor:/tmp$ time /tmp/test_i 1 > /dev/null

real    0m8.116s
user    0m8.100s
sys 0m0.000s

Boolean test / print uncommented (and integer commented):

sauer@trogdor:/tmp$ time /tmp/test_i 1 > /dev/null

real    0m8.254s
user    0m8.240s
sys 0m0.000s
sauer@trogdor:/tmp$ time /tmp/test_i 1 > /dev/null

real    0m8.028s
user    0m8.000s
sys 0m0.010s
sauer@trogdor:/tmp$ time /tmp/test_i 1 > /dev/null

real    0m7.981s
user    0m7.900s
sys 0m0.050s

They're the same with 1 assignment and 2 comparisons each loop over 30 million loops. Find something else to optimize. For example, don't use strcmp unnecessarily. ;)

dannysauer
  • 3,793
  • 1
  • 23
  • 30
2

It will mostly depend on the compiler and the optimization. There's an interesting discussion (language agnostic) here:

Does "if ([bool] == true)" require one more step than "if ([bool])"?

Also, take a look at this post: http://www.linuxquestions.org/questions/programming-9/c-compiler-handling-of-boolean-variables-290996/

Community
  • 1
  • 1
Aleadam
  • 40,203
  • 9
  • 86
  • 108
0

Approaching your question in two different ways:

If you are specifically talking about C++ or any programming language that will produce assembly code for that matter, we are bound to what code the compiler will generate in ASM. We are also bound to the representation of true and false in c++. An integer will have to be stored in 32 bits, and I could simply use a byte to store the boolean expression. Asm snippets for conditional statements:

For the integer:

  mov eax,dword ptr[esp]    ;Store integer
  cmp eax,0                 ;Compare to 0
  je  false                 ;If int is 0, its false
  ;Do what has to be done when true
false:
  ;Do what has to be done when false

For the bool:

  mov  al,1     ;Anything that is not 0 is true
  test al,1     ;See if first bit is fliped
  jz   false    ;Not fliped, so it's false
  ;Do what has to be done when true
false:
  ;Do what has to be done when false

So, that's why the speed comparison is so compile dependent. In the case above, the bool would be slightly fast since cmp would imply a subtraction for setting the flags. It also contradicts with what your compiler generated.

Another approach, a much simpler one, is to look at the logic of the expression on it's own and try not to worry about how the compiler will translate your code, and I think this is a much healthier way of thinking. I still believe, ultimately, that the code being generated by the compiler is actually trying to give a truthful resolution. What I mean is that, maybe if you increase the test cases in the if statement and stick with boolean in one side and integer in another, the compiler will make it so the code generated will execute faster with boolean expressions in the machine level.

I'm considering this is a conceptual question, so I'll give a conceptual answer. This discussion reminds me of discussions I commonly have about whether or not code efficiency translates to less lines of code in assembly. It seems that this concept is generally accepted as being true. Considering that keeping track of how fast the ALU will handle each statement is not viable, the second option would be to focus on jumps and compares in assembly. When that is the case, the distinction between boolean statements or integers in the code you presented becomes rather representative. The result of an expression in C++ will return a value that will then be given a representation. In assembly, on the other hand, the jumps and comparisons will be based in numeric values regardless of what type of expression was being evaluated back at you C++ if statement. It is important on these questions to remember that purely logicical statements such as these end up with a huge computational overhead, even though a single bit would be capable of the same thing.

Artie
  • 493
  • 1
  • 7
  • 18