I want to start by saying that I agree with earlier comments that it's not really useful to optimize this function. We're talking about saving nanoseconds on user interaction that typically takes seconds or more. The processing time is probably less than the time it takes for the "enter" key to be released.
Having said that, here's my implementation. It's a pretty simple implementation, avoiding unnecessary calls to library functions, and giving the compiler enough freedom to optimize the code.
On my machine (Intel Core i7-6500U, compiled with gcc -O3
) this implementation is faster than all current answers.
int str_to_bool(const char *str)
{
if ((str[0] & 0xFE) == 48) { // ch == '0' or '1'
if (str[1] == '\0') {
return str[0] - 48;
}
} else if (str[0] == 't') {
if (str[1] == 'r' && str[2] == 'u' && str[3] == 'e' && str[4] == '\0') {
return 1;
}
} else if (str[0] == 'f') {
if (str[1] == 'a' && str[2] == 'l' && str[3] == 's' && str[4] == 'e' && str[5] == '\0') {
return 0;
}
}
return -1;
}
UPDATATED version
The following versions works with the updated requirements that were not mentioned in the question but in comments. This handles "true", "false", "yes", "no", "t", "f", "y", "n", "1" and "0" and the first letter may be uppercase as well. It's a bit more verbose but still very fast.
int str_to_bool(const char *str)
{
if ((str[0] & 0xFE) == 48) { // ch == '0' or '1'
if (str[1] == '\0') {
return str[0] - 48;
}
} else if ((str[0] | 32) == 't') {
if (str[1] == '\0') {
return 1;
}
if (str[1] == 'r' && str[2] == 'u' && str[3] == 'e' && str[4] == '\0') {
return 1;
}
} else if ((str[0] | 32) == 'f') {
if (str[1] == '\0') {
return 0;
}
if (str[1] == 'a' && str[2] == 'l' && str[3] == 's' && str[4] == 'e' && str[5] == '\0') {
return 0;
}
} else if ((str[0] | 32) == 'y') {
if (str[1] == '\0') {
return 1;
}
if (str[1] == 'e' && str[2] == 's' && str[3] == '\0') {
return 1;
}
} else if ((str[0] | 32) == 'n') {
if (str[1] == '\0') {
return 0;
}
if (str[1] == 'o' && str[2] == '\0') {
return 0;
}
}
return -1;
}
Q&A (explanation and background info)
Some additional information to answer questions that were asked in comments:
Q: Why is this faster than using memcmp()
? I've been told to use library functions when possible.
A: In general it's a good practice to make use of standard library functions such as memcmp()
. They're heavily optimized for their intended use and for the targeted platform. For example, on modern cpu architectures memory alignment heavily influences performance, so a memcmp()
implementation for such platform will make efforts to read data using the optimal memory alignment. Consequently, the start and end of the memory buffer may need to be handled differently, because they are not guaranteed to be aligned. This causes some overhead, making the implementation slower for small buffers and faster for large buffers. In this case only 1-5 bytes are compared, so using memcmp
is not really advantageous. Besides, using the function introduces some calling overhead as well. So, in this case, doing the comparison manually will be much more efficient.
Q: Isn't the use of a switch
statement faster than an if-else
ladder?
A: It could be, but there's no guarantee. First of all, it depends on the compiler how the switch
statement is translated. A common method is to use a jump table. However, this is only feasible if the values used in the case
statements are close too each other, otherwise the jump table would be too big to fit in memory. Also note that the jump table implementation is reasonably expensive to execute. My guess is that it's starting to be efficient to use if there at least five cases. Secondly, a good compiler is able to implement a jump table as separate if
statements, but it's also able to implement an if-else
ladder as a jump table if that would be more efficient. So it really shouldn't matter what you use in C, as long as you make sure that the compiler has enough information and freedom to make such optimizations. (For proof, compile this code for armv7-a using clang 10.0.0 and you'll see that it generates a jump table.)
Q: Isn't it bad to use strcmp()
if you already know the length of the string?
A: Well, that depends...
- If the length of the string is known in advance, using
memcmp()
would make more sense indeed, because it probably is slightly faster. However, this is not guaranteed, so you should really benchmark it to know for sure. I can think of a number of reasons why strcmp()
could be faster in this case.
- If the length of the string is not known, it must be determined (using
strlen()
) before you can use memcmp()
, or access the data otherwise. However, calling strlen()
is quite expensive. It could take more time than the above full function to execute.
- Note that executing
memcmp(Str, "false", 5)
is illegal if the buffer is less than 5 bytes. According to the C standard, this results in Undefined Behavior, meaning that the application could crash or give other unexpected results.
Finally, note that my algorithm basically works like a tree. It first checks the first character. If that's a valid character, it will continue with the second character. As soon as a character is found that's not valid, the function returns -1. So it reads every character only once (if the compiler does it's job correctly), in contrast to some of the other implementations that read the input data multiple times.