There are numerous problems with this code. We'll fix the badly named variables and functions and investigate the problems:
First, CharToInt()
should be renamed to the proper StringToInt()
since it operates on an string not a single character.
The function CharToInt()
[sic.] is unsafe. It doesn't check if the user accidentally passes in a NULL pointer.
It doesn't validate input, or more correctly, skip invalid input. If the user enters in a non-digit the result will contain a bogus value. i.e. If you enter in N
the code *(s+i) & 15
will produce 14 !?
Next, the nondescript temp
in CharToInt()
[sic.] should be called digit
since that is what it really is.
Also, the kludge return result / 10;
is just that -- a bad hack to work around a buggy implementation.
Likewise MAX
is badly named since it may appear to conflict with the standard usage. i.e. #define MAX(X,y) ((x)>(y))?(x):(y)
The verbose *(s+i)
is not as readable as simply *s
. There is no need to use and clutter up the code with yet another temporary index i
.
gets()
This is bad because it can overflow the input string buffer. For example, if the buffer size is 2, and you enter in 16 characters, you will overflow str
.
scanf()
This is equally bad because it can overflow the input string buffer.
You mention "when using scanf() function, the result is completely wrong because first character apparently has a -52 ASCII value."
That is due to an incorrect usage of scanf(). I was not able to duplicate this bug.
fgets()
This is safe because you can guarantee you never overflow the input string buffer by passing in the buffer size (which includes room for the NULL.)
getline()
A few people have suggested the C POSIX standard getline()
as a replacement. Unfortunately this is not a practical portable solution as Microsoft does not implement a C version; only the standard C++ string template function as this SO #27755191 question answers. Microsoft's C++ getline()
was available at least far back as Visual Studio 6 but since the OP is strictly asking about C and not C++ this isn't an option.
Misc.
Lastly, this implementation is buggy in that it doesn't detect integer overflow. If the user enters too large a number the number may become negative! i.e. 9876543210
will become -18815698
?! Let's fix that too.
This is trivial to fix for an unsigned int
. If the previous partial number is less then the current partial number then we have overflowed and we return the previous partial number.
For a signed int
this is a little more work. In assembly we could inspect the carry-flag, but in C there is no standard built-in way to detect overflow with signed int math. Fortunately, since we are multiplying by a constant, * 10
, we can easily detect this if we use an equivalent equation:
n = x*10 = x*8 + x*2
If x*8 overflows then logically x*10 will as well. For a 32-bit int overflow will happen when x*8 = 0x100000000 thus all we need to do is detect when x >= 0x20000000. Since we don't want to assume how many bits an int
has we only need to test if the top 3 msb's (Most Significant Bits) are set.
Additionally, a second overflow test is needed. If the msb is set (sign bit) after the digit concatenation then we also know the number overflowed.
Code
Here is a fixed safe version along with code that you can play with to detect overflow in the unsafe versions. I've also included both a signed
and unsigned
versions via #define SIGNED 1
#include <stdio.h>
#include <ctype.h> // isdigit()
// 1 fgets
// 2 gets
// 3 scanf
#define INPUT 1
#define SIGNED 1
// re-implementation of atoi()
// Test Case: 2147483647 -- valid 32-bit
// Test Case: 2147483648 -- overflow 32-bit
int StringToInt( const char * s )
{
int result = 0, prev, msb = (sizeof(int)*8)-1, overflow;
if( !s )
return result;
while( *s )
{
if( isdigit( *s ) ) // Alt.: if ((*s >= '0') && (*s <= '9'))
{
prev = result;
overflow = result >> (msb-2); // test if top 3 MSBs will overflow on x*8
result *= 10;
result += *s++ & 0xF;// OPTIMIZATION: *s - '0'
if( (result < prev) || overflow ) // check if would overflow
return prev;
}
else
break; // you decide SKIP or BREAK on invalid digits
}
return result;
}
// Test case: 4294967295 -- valid 32-bit
// Test case: 4294967296 -- overflow 32-bit
unsigned int StringToUnsignedInt( const char * s )
{
unsigned int result = 0, prev;
if( !s )
return result;
while( *s )
{
if( isdigit( *s ) ) // Alt.: if (*s >= '0' && *s <= '9')
{
prev = result;
result *= 10;
result += *s++ & 0xF; // OPTIMIZATION: += (*s - '0')
if( result < prev ) // check if would overflow
return prev;
}
else
break; // you decide SKIP or BREAK on invalid digits
}
return result;
}
int main()
{
int detect_buffer_overrun = 0;
#define BUFFER_SIZE 2 // set to small size to easily test overflow
char str[ BUFFER_SIZE+1 ]; // C idiom is to reserve space for the NULL terminator
printf(" Enter some numbers (no spaces): ");
#if INPUT == 1
fgets(str, sizeof(str), stdin);
#elif INPUT == 2
gets(str); // can overflows
#elif INPUT == 3
scanf("%s", str); // can also overflow
#endif
#if SIGNED
printf(" Entered number is: %d\n", StringToInt(str));
#else
printf(" Entered number is: %u\n", StringToUnsignedInt(str) );
#endif
if( detect_buffer_overrun )
printf( "Input buffer overflow!\n" );
return 0;
}