6

If a string contains non-numeric characters and you call atoi (I'm assuming wtoi will do the same), how will atoi treat the string?

Let's say, for example, that I have the following strings:

  1. "20234543"
  2. "232B"
  3. "B"

I'm sure that 1 will return the integer 20234543. What I'm curious about is whether 2 will return 232. [That's what I need to solve my problem.] Also, 3 should not return a value. Are these beliefs false? Also, if 2 does act as I believe, how does atoi handle an e character at the end of the string? [That's typically used in exponential notation.]

monksy

7 Answers

10

You can test this sort of thing yourself. I copied the code from the Cplusplus reference site. It looks like your intuition about the first two examples is correct, but the third example returns 0. 'E' and 'e' are treated just like 'B' in the second example.

So the rules are:

On success, the function returns the converted integral number as an int value. If no valid conversion could be performed, a zero value is returned. If the correct value is out of the range of representable values, INT_MAX or INT_MIN is returned.
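A test along those lines could look like the sketch below. The helper name `show_atoi` is made up for illustration. It assumes `int` is at least 32 bits wide, so every input used here is representable; out-of-range input is undefined behavior and cannot be meaningfully tested this way.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: prints the input and what atoi makes of it,
   and returns that value so the caller can inspect it. */
int show_atoi(const char *s)
{
    int value = atoi(s);
    printf("atoi(\"%s\") = %d\n", s, value);
    return value;
}
```

Calling it with "20234543", "232B", "B" and "232e5" should give 20234543, 232, 0 and 232 respectively, under the 32-bit `int` assumption above.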

Mike
  • 6
    –1. The behavior of `atoi` when the input cannot be represented as an integer is undefined, so you cannot test it yourself; any test that invokes undefined behavior is invalid. Cplusplus.com doesn't say that, but [cplusplus.com is a notoriously unreliable reference](http://stackoverflow.com/q/6520052/33732). What you've quoted are the rules for `strtol`, but adapted to apply to `int` instead of `long` (which means they don't really apply to any function at all). When you need an authoritative citation, use the standard. When you need a quick reference, use cppreference.com. – Rob Kennedy Jul 10 '14 at 15:46
  • atoi() returns 0 for the input "abc123". Why is that treated as 0, while the input "123abc" gives 123? Can someone please explain this? – Vishnu N K Aug 09 '16 at 16:27
  • 1
    Because POSIX defines `atoi` as having similar behavior as `strtol` which handles leading white-spaces (if any), then digits, then any unrecognized characters (if any) (http://pubs.opengroup.org/onlinepubs/009695399/functions/strtol.html). In your second example, `strtol` hits unrecognized characters and gives up. – gladed May 12 '17 at 00:32
10

According to the standard, "The functions atof, atoi, atol, and atoll need not affect the value of the integer expression errno on an error. If the value of the result cannot be represented, the behavior is undefined." (7.20.1, Numeric conversion functions in C99).

So, technically, anything could happen. Even the first case could fail: INT_MAX is only guaranteed to be at least 32767, and 20234543 is greater than that.

For better error checking, use strtol:

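/* needs <stdlib.h> for strtol and <errno.h> for errno */
const char *s = "232B";
char *eptr;
long value = strtol(s, &eptr, 10); /* 10 is the base */
/* now, value is 232, eptr points to "B" */

s = "20234543";
value = strtol(s, &eptr, 10);
/* now, value is 20234543, eptr points to the terminating '\0' */

s = "123456789012345";
errno = 0; /* reset first: strtol never clears errno itself */
value = strtol(s, &eptr, 10);
/* If there was no overflow, value will contain 123456789012345,
   otherwise, value will contain LONG_MAX and errno will be ERANGE */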
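
The checks above can be bundled into one function. The following is a minimal sketch, not a definitive implementation; the name `parse_int` and its return convention are assumptions made for illustration.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parses a base-10 int from s.
   Returns 1 and stores the value in *out on success, 0 otherwise.
   If rest is non-NULL, *rest receives the first unparsed character. */
int parse_int(const char *s, int *out, const char **rest)
{
    char *eptr;
    long value;

    errno = 0;                       /* strtol never clears errno itself */
    value = strtol(s, &eptr, 10);

    if (rest != NULL)
        *rest = eptr;
    if (eptr == s)                   /* no digits were consumed */
        return 0;
    if (errno == ERANGE)             /* value does not fit in a long */
        return 0;
    if (value < INT_MIN || value > INT_MAX)  /* fits in long, not in int */
        return 0;

    *out = (int)value;
    return 1;
}
```

With this, "232B" succeeds with the value 232 and `*rest` pointing at "B", while "B" fails because nothing was consumed.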

If you need to parse numbers with "e" in them (exponential notation), then you should use strtod. Of course, such numbers are floating-point, and strtod returns double. If you want to make an integer out of it, you can do a conversion after checking for the correct range.
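
As a rough sketch of that strtod route: the helper below parses a number that may use exponential notation, checks the range, and only then converts to int. The name `parse_int_from_double` is made up, and the range check assumes `int` is no wider than 32 bits so that INT_MIN and INT_MAX are exactly representable as double.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parses a floating-point number (which may use
   exponential notation) and converts it to int, truncating toward zero.
   Returns 1 on success, 0 if nothing was parsed or the value is unusable. */
int parse_int_from_double(const char *s, int *out)
{
    char *eptr;
    double d;

    errno = 0;
    d = strtod(s, &eptr);

    if (eptr == s || errno == ERANGE)   /* nothing parsed, or over/underflow */
        return 0;
    /* Written so that NaN fails the test as well (assumes int <= 32 bits). */
    if (!(d >= (double)INT_MIN && d <= (double)INT_MAX))
        return 0;

    *out = (int)d;
    return 1;
}
```

For example, "232e2" yields 23200, while "1e30" is rejected as out of range and a lone "e" is rejected because strtod consumes nothing.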

Alok Singhal
  • Fair enough, but according to MSDN, integers are 32-bit. http://msdn.microsoft.com/en-us/library/296az74e.aspx – monksy Jul 09 '10 at 17:15
  • @steven: it also says "Microsoft Specific" at the top. So if you only care about Microsoft specific code, you are right that you don't need to worry about overflow in the first case. But if you want portability, you need to. Your question wasn't tagged with any platform-specific tag, so I assumed you wanted portability :-). – Alok Singhal Jul 09 '10 at 17:19
  • Fair enough. Most systems I've written for are 32-bit, so that's what I'm used to seeing. [Well, the 16-bit was a long time ago.] – monksy Jul 09 '10 at 17:31
  • 1
    POSIX requires `sizeof(int)>=4` too. – R.. GitHub STOP HELPING ICE Jul 09 '10 at 17:52
  • To add to your mention of strtol: I find the special base value `0` the most convenient. It picks the base automatically, in particular base 10 for *normal* decimal numbers and hexadecimal if the number starts with `0x`. – Jens Gustedt Jul 09 '10 at 19:45
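
The base `0` behavior mentioned in the last comment can be tried with a tiny wrapper like the one below. The name `parse_auto_base` is hypothetical, and error checking is left out for brevity.

```c
#include <stdlib.h>

/* Hypothetical helper: lets strtol pick the base from the prefix.
   "0x"/"0X" selects hexadecimal, a leading "0" selects octal,
   anything else is decimal. No error checking, for brevity. */
long parse_auto_base(const char *s)
{
    return strtol(s, NULL, 0);
}
```

One thing to watch for: with base 0 a leading zero means octal, so "010" parses as 8, not 10.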
7

atoi skips any leading whitespace, accepts an optional '+' or '-' (which it uses to select the sign of the result), and then reads digits from the buffer until it can't any more. It stops at the first character that isn't a digit. It returns 0 if it saw no digits.

So to answer your specific questions: 1 returns 20234543. 2 returns 232. 3 returns 0. The character 'e' is not whitespace, a digit, '+' or '-' so atoi stops and returns if it encounters that character.
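
Those scanning rules can be written out as a short loop. This is only an illustrative sketch (the name `my_atoi` is made up), not how any particular C library implements atoi, and it does not guard against overflow.

```c
#include <ctype.h>

/* Hypothetical re-implementation of the scanning rules described above.
   It does not guard against overflow. */
int my_atoi(const char *s)
{
    int sign = 1;
    int value = 0;

    while (isspace((unsigned char)*s))      /* skip leading whitespace */
        s++;
    if (*s == '+' || *s == '-') {           /* optional sign */
        if (*s == '-')
            sign = -1;
        s++;
    }
    while (isdigit((unsigned char)*s)) {    /* accumulate digits */
        value = value * 10 + (*s - '0');
        s++;
    }
    return sign * value;                    /* 0 if no digits were seen */
}
```

The loop ends at 'B' in "232B" with 232 already accumulated, and never starts for "B" or "abc123", which is why those give 0.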

See also here.

moonshadow
4

If atoi encounters a non-number character, it returns the number formed up until that point.

pcent
0

I tried using atoi() in a project, but it wouldn't work if any non-digit characters came before the digits: it returns zero. It doesn't seem to mind if they come after the digits, for whatever reason.

Here's a pretty bare-bones string-to-int converter I wrote that doesn't seem to have that problem (bare-bones in that it doesn't work with negative numbers and has no error handling). It might be helpful in specific instances.

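#include <string>

// Builds an int from the decimal digits in the string, skipping every
// other character. No sign handling and no overflow check.
int stringToInt(const std::string& newIntString)
{
    unsigned int dataElement = 0;

    for (char c : newIntString)
    {
        if (c >= '0' && c <= '9')
        {
            // Shift the digits seen so far one place left, then append this one.
            dataElement = dataElement * 10 + static_cast<unsigned int>(c - '0');
        }
    }
    return static_cast<int>(dataElement);
}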
0

I ran into this atoi behaviour myself while learning to program: I was writing a program that computes the factorial of an integer passed as a command-line parameter.

atoi returns 0 if the value is not numeric, and "3asdf" returns 3. As we all know, C passes command-line parameters as an array of char pointers.

I was told that the book "Linux Hater's Handbook" has some discussion of why computer geeks don't really like atoi: it's considered foolish because there's no way to check the validity of the input.

Someone asked me why I don't bother to use strtol from stdlib.h, and he gave me an example adapted to my recursive factorial function. But I don't care that the factorial result can exceed the range of the int type when the input is too large; that just results in negative values in my program.

I solved my problem with atoi by first checking that the user's input parameter is truly numeric and, only if it is, calculating the factorial.

Here is the check, using isdigit() from ctype.h:

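#include <ctype.h>
#include <string.h>

/* Returns 1 if the string *str contains a non-digit character, 0 otherwise. */
int checkInput(char *str[]) {
    for (size_t x = 0; x < strlen(*str); ++x)
    {
        /* (*str)[x], not *str[x]: index into the string, not the pointer array */
        if (!isdigit((unsigned char)(*str)[x])) return 1;
    }
    return 0;
}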

A friend on another Linux programming forum told me that with strtol I could handle out-of-range values, or even parse the signed input into an unsigned long so that -0 and other negative values are not accepted.

It's important that the code above checks whether a character is not a digit. Without the negation, the function would report failure as soon as it reached the first digit in the string (or char array, in C).
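
For comparison, the strtol approach my friend suggested might look roughly like the sketch below. The name `parse_nonnegative` is hypothetical. Note that this version still accepts "-0", because strtol parses it as 0; rejecting it would need an explicit check for a '-' character.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: accepts only a complete, non-negative decimal
   number that fits in a long. Returns 1 on success, 0 otherwise. */
int parse_nonnegative(const char *s, unsigned long *out)
{
    char *eptr;
    long value;

    errno = 0;                        /* strtol never clears errno itself */
    value = strtol(s, &eptr, 10);

    if (eptr == s || *eptr != '\0')   /* empty input or trailing junk */
        return 0;
    if (errno == ERANGE || value < 0) /* out of range or negative */
        return 0;

    *out = (unsigned long)value;
    return 1;
}
```

Unlike atoi, this rejects "3asdf" outright instead of silently returning 3.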

Jere_Sumell
-1

Writing simple code and looking to see what it does is magical and illuminating.

On point #3, it won't return "nothing." It can't. It'll return something, but that something won't be useful to you.

http://www.cplusplus.com/reference/clibrary/cstdlib/atoi/

On success, the function returns the converted integral number as an int value.

If no valid conversion could be performed, a zero value is returned.

If the correct value is out of the range of representable values, INT_MAX or INT_MIN is returned.

dash-tom-bang
  • I knew it would return either 0 [or a set value] or null. But I wasn't sure. But my question was ... does it convert up to the next non-integer value or what? – monksy Jul 09 '10 at 17:05
  • What do you mean by "next non-integer value"? – dash-tom-bang May 31 '13 at 17:06
  • You're correct that it can't return "nothing," but that doesn't mean it will return "something." Behavior is undefined, so it's possible it won't return at all. – Rob Kennedy Jul 10 '14 at 15:53
  • @RobKennedy in what way is the behavior undefined? By the reference that I pasted right there it seems completely defined. That said, if you pass garbage to the function then you'll get garbage back. – dash-tom-bang Apr 19 '15 at 21:42
  • 2
    The reference you pasted comes from a notoriously *wrong* site. Check the standards, or check what the site happens to say today. – Rob Kennedy Apr 19 '15 at 21:46