20

In my limited experience, I've been on several projects that have had some sort of string utility class with methods to determine whether a given string is a number. The idea has always been the same, but the implementation has differed. Some surround a parse attempt with try/catch:

public boolean isInteger(String str) {
    try {
        Integer.parseInt(str);
        return true;
    } catch (NumberFormatException nfe) {}
    return false;
}

and others match with regex

public boolean isInteger(String str) {
    return str.matches("^-?[0-9]+(\\.[0-9]+)?$");
}

Is one of these methods better than the other? I personally prefer the regex approach, as it's concise, but will it perform on par if called while iterating over, say, a list of several hundred thousand strings?

Note: As I'm kinda new to the site I don't fully understand this Community Wiki business, so if this belongs there let me know, and I'll gladly move it.

EDIT: With all the TryParse suggestions, I ported Asaph's benchmark code (thanks for a great post!) to C# and added a TryParse method. TryParse wins hands down, while the try/catch approach took so long that I wondered whether I had done something wrong. I also updated the regex to handle negatives and decimal points.

Results for updated, C# benchmark code:

00:00:51.7390000 for isIntegerParseInt
00:00:03.9110000 for isIntegerRegex
00:00:00.3500000 for isIntegerTryParse

Using:

static bool isIntegerParseInt(string str) {
    try {
        int.Parse(str);
        return true;
    } catch (FormatException) {
        // not a number at all
    } catch (OverflowException) {
        // numeric, but outside the int range
    }
    return false;
}

static bool isIntegerRegex(string str) {
    return Regex.Match(str, "^-?[0-9]+(\\.[0-9]+)?$").Success;
}

static bool isIntegerTryParse(string str) {
    int bob;
    return Int32.TryParse(str, out bob);
}
ManoDestra
thorncp

19 Answers

13

I just ran some benchmarks on the performance of these 2 methods (on a MacBook Pro, OS X Leopard, Java 6). parseInt is faster. Here is the output:

This operation took 1562 ms.
This operation took 2251 ms.

And here is my benchmark code:


public class IsIntegerPerformanceTest {

    public static boolean isIntegerParseInt(String str) {
        try {
            Integer.parseInt(str);
            return true;
        } catch (NumberFormatException nfe) {}
        return false;
    }

    public static boolean isIntegerRegex(String str) {
        return str.matches("^[0-9]+$");
    }

    public static void main(String[] args) {
        long starttime, endtime;
        int iterations = 1000000;
        starttime = System.currentTimeMillis();
        for (int i=0; i<iterations; i++) {
            isIntegerParseInt("123");
            isIntegerParseInt("not an int");
            isIntegerParseInt("-321");
        }
        endtime = System.currentTimeMillis();
        System.out.println("This operation took " + (endtime - starttime) + " ms.");
        starttime = System.currentTimeMillis();
        for (int i=0; i<iterations; i++) {
            isIntegerRegex("123");
            isIntegerRegex("not an int");
            isIntegerRegex("-321");
        }
        endtime = System.currentTimeMillis();
        System.out.println("This operation took " + (endtime - starttime) + " ms.");
    }
}

Also, note that the digits-only regex used here rejects negative numbers (so "-321" fails the regex check), whereas the parseInt method accepts them.
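
To make that note concrete, here is a small sketch of inputs on which the two checks disagree. The class and method names (IntegerCheckComparison, viaParseInt, viaRegex) are made up for illustration and are not part of the benchmark; the pattern is the digits-only one from above, compiled once.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; names are illustrative and not from the benchmark above.
final class IntegerCheckComparison {
    // Digits only, as in the benchmark; compiled once and reused.
    private static final Pattern DIGITS = Pattern.compile("[0-9]+");

    // True when Integer.parseInt accepts the string.
    static boolean viaParseInt(String s) {
        try {
            Integer.parseInt(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // True when the whole string consists of one or more ASCII digits.
    static boolean viaRegex(String s) {
        return DIGITS.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"123", "-321", "99999999999", "not an int"};
        for (String s : samples) {
            System.out.println(s + ": parseInt=" + viaParseInt(s) + ", regex=" + viaRegex(s));
        }
    }
}
```

The two disagree in both directions: "-321" passes parseInt but not the regex, while "99999999999" passes the regex but not parseInt, because the value does not fit in an int.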

Asaph
  • What about int.TryParse()? Can we assume that it is exactly the same in terms of performance as the first try/catch? – Dan Atkinson Sep 02 '09 at 17:59
  • 7
    It's my understanding that you should compile the regex only once, and then call pattern.matcher(s).matches(). That should be faster than building the regex every time. Also, your test may behave differently depending on the input strings. If most of the time you do not receive ints, my guess is that the regex should be faster. – Fermin Silva Apr 26 '13 at 17:50
  • 1
    I don't see the minus sign `-` in the regex. Wouldn't it fail on `-321`? – Matthieu Aug 16 '13 at 04:31
  • It shows completely different results for JDK17: 1268 vs 747 ms (parseInt vs regex). Difference is much bigger when we use precompiled pattern: 1156 vs 183ms for regex. – kool79 Aug 14 '23 at 07:48
4

Here is our way of doing this:

public boolean isNumeric(String string)
{
   // null and empty strings are not numeric
   if (string == null || string.equals(""))
   {
      return false;
   }

   for (char c : string.toCharArray())
   {
      // stop at the first character that is not a digit
      if (!Character.isDigit(c))
      {
         return false;
      }
   }
   return true;
}
Jeremy Cron
3

If absolute performance is key, and if you are just checking for integers (not floating point numbers) I suspect that iterating over each character in the string, returning false if you encounter something not in the range 0-9, will be fastest.

RegEx is a more general-purpose solution so will probably not perform as fast for that special case. A solution that throws an exception will have some extra overhead in that case. TryParse will be slightly slower if you don't actually care about the value of the number, just whether or not it is a number, since the conversion to a number must also take place.

For anything but an inner loop that's called many times, the differences between all of these options should be insignificant.
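
As a rough illustration of the character-by-character idea, a minimal sketch might look like the following. The names DigitScan and isAllDigits are hypothetical, and the sketch assumes that only unsigned ASCII digits should count, with no sign, whitespace, or range handling.

```java
// Hypothetical utility; the name DigitScan is illustrative only.
final class DigitScan {
    // Returns true only if s is non-null, non-empty, and every char is in '0'..'9'.
    // Assumption: no sign, whitespace, or range check is wanted.
    static boolean isAllDigits(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return false; // bail out at the first non-digit
            }
        }
        return true;
    }
}
```

Because no conversion takes place, a digit string that is too long to fit in an int still passes this check, so it answers "is it made of digits?" rather than "will parseInt succeed?".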

Eric J.
3

I needed to refactor code like yours to get rid of NumberFormatException. The refactored Code:

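// requires: import java.util.Scanner;
public static Integer parseInteger(final String str) {
    if (str == null || str.isEmpty()) {
        return null;
    }
    final Scanner sc = new Scanner(str);
    // Guard nextInt() with hasNextInt(); an unguarded nextInt() throws
    // InputMismatchException when the next token is not an int.
    return sc.hasNextInt() ? Integer.valueOf(sc.nextInt()) : null;
}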

As a Java 1.4 guy, I didn't know about java.util.Scanner. I found this interesting article:

http://rosettacode.org/wiki/Determine_if_a_string_is_numeric#Java

I personally liked the solution with the scanner: very compact and still readable.
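
One thing to keep in mind is that a Scanner reads token by token, so by itself it only looks at the first token of the input. A stricter variant, sketched below under the assumption that the whole string must be exactly one int token, also checks that nothing follows the number. The names ScannerIntCheck and parseIntegerOrNull are hypothetical, and pinning Locale.ROOT is my own choice to keep the result independent of the default locale.

```java
import java.util.Locale;
import java.util.Scanner;

// Hypothetical helper; not taken from the Rosetta Code article.
final class ScannerIntCheck {
    // Returns the parsed value, or null if str is not exactly one int token.
    static Integer parseIntegerOrNull(String str) {
        if (str == null || str.isEmpty()) {
            return null;
        }
        // Assumption: Locale.ROOT, so number parsing does not vary with the default locale.
        try (Scanner sc = new Scanner(str).useLocale(Locale.ROOT)) {
            if (!sc.hasNextInt()) {
                return null; // first token is not an int; nextInt() would throw here
            }
            int value = sc.nextInt();
            // Reject inputs such as "12 abc" that carry extra tokens after the number.
            return sc.hasNext() ? null : Integer.valueOf(value);
        }
    }
}
```

Whitespace around the number is skipped by the Scanner, so " 42 " is accepted by this sketch; trim or reject it beforehand if that is not wanted.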

Heiner
  • Not sure what version of JDK you're using, but `Scanner.nextInt()` seems to throw `InputMismatchException`. (even in 1.5) – ebyrob Dec 20 '16 at 15:52
2

Some languages, like C#, have a TryParse (or equivalent) that works fairly well for something like this.

public bool IsInteger(string value)
{
  int i;
  return Int32.TryParse(value, out i);
}
Brandon
2

Personally I would do this if you really want to simplify it.

public bool isInteger(string myValue)
{
    int myIntValue;
    return int.TryParse(myValue, out myIntValue);
}
Mitchel Sellers
2

You could create an extension method for a string, and make the whole process look cleaner...

public static bool IsInt(this string str)
{
    int i;
    return int.TryParse(str, out i);
}

You could then do the following in your actual code...

if(myString.IsInt())....
RSolberg
1

Using .NET, you could do something like:

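// requires: using System.Linq;
private bool isNumber(string str)
{
    // true only when the string is non-empty and every character is a digit
    return str.Length > 0 && str.All(char.IsDigit);
}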
Cleiton
1

That's my implementation to check whether a string is made of digits:

public static boolean isNumeric(String string)
{
    if (string == null)
    {
        throw new NullPointerException("The string must not be null!");
    }
    final int len = string.length();
    if (len == 0)
    {
        return false;
    }
    for (int i = 0; i < len; ++i)
    {
        if (!Character.isDigit(string.charAt(i)))
        {
            return false;
        }
    }
    return true;
}
pschichtel
1

I like this code:

public static boolean isIntegerRegex(String str) {
    return str.matches("^[0-9]+$");
}

But it is better to compile the Pattern once, before using it:

public static Pattern patternInteger = Pattern.compile("^[0-9]+$");
public static boolean isIntegerRegex(String str) {
  return patternInteger.matcher(str).matches();
}

Running the test gives these results:

This operation isIntegerParseInt took 1313 ms.
This operation isIntegerRegex took 1178 ms.
This operation isIntegerRegexNew took 304 ms.

With:

import java.util.regex.Pattern;

public class IsIntegerPerformanceTest {
    // Compiled once and reused by isIntegerRegexNew.
    private static Pattern pattern = Pattern.compile("^[0-9]+$");

    public static boolean isIntegerParseInt(String str) {
        try {
            Integer.parseInt(str);
            return true;
        } catch (NumberFormatException nfe) {
        }
        return false;
    }

    public static boolean isIntegerRegexNew(String str) {
        return pattern.matcher(str).matches();
    }

    public static boolean isIntegerRegex(String str) {
        // String.matches compiles the pattern again on every call.
        return str.matches("^[0-9]+$");
    }

    public static void main(String[] args) {
        long starttime, endtime;
        int iterations = 1000000;
        starttime = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++) {
            isIntegerParseInt("123");
            isIntegerParseInt("not an int");
            isIntegerParseInt("-321");
        }
        endtime = System.currentTimeMillis();
        System.out.println("This operation isIntegerParseInt took " + (endtime - starttime) + " ms.");
        starttime = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++) {
            isIntegerRegex("123");
            isIntegerRegex("not an int");
            isIntegerRegex("-321");
        }
        endtime = System.currentTimeMillis();
        System.out.println("This operation isIntegerRegex took " + (endtime - starttime) + " ms.");
        starttime = System.currentTimeMillis();
        for (int i = 0; i < iterations; i++) {
            isIntegerRegexNew("123");
            isIntegerRegexNew("not an int");
            isIntegerRegexNew("-321");
        }
        endtime = System.currentTimeMillis();
        System.out.println("This operation isIntegerRegexNew took " + (endtime - starttime) + " ms.");
    }
}
Hong Xanh
1

I think it could be faster than the previous solutions if you do the following (Java):

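public final static boolean isInteger(String in)
{
    char c;
    int length = in.length();
    // Skip an optional leading minus sign.
    int i = length > 0 && in.charAt(0) == '-' ? 1 : 0;
    // Require at least one digit after the optional sign, so "" and "-" are rejected.
    boolean ret = length > i;
    for (; ret && i < length; i++)
    {
        c = in.charAt(i);
        ret = (c >= '0' && c <= '9');
    }
    return ret;
}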

I ran the same code that Asaph ran and the result was:

This operation took 28 ms.

A huge difference (against 1691 ms and 2049 ms on my computer). Take into account that this method does not check whether the string is null, so you should do that beforehand (along with any trimming of the string).

Barenca
1

I think people here are missing a point. Using the same pattern repeatedly allows a very easy optimization: keep a single precompiled instance of the pattern. With that change, in all my tests the try-catch approach never benchmarked better than the pattern approach. When the input is valid, try-catch takes twice as long; when it is invalid, it is 6 times slower.

public static final Pattern INT_PATTERN= Pattern.compile("^-?[0-9]+(\\.[0-9]+)?$");

public static boolean isInt(String s){
  return INT_PATTERN.matcher(s).matches();
}
user1844655
1
public static boolean CheckString(String myString) {

char[] digits;

    digits = myString.toCharArray();
    for (char div : digits) {// for each element div of type char in the digits collection (digits is a collection containing div elements).
        try {
            Double.parseDouble(myString);
            System.out.println("All are numbers");
            return true;
        } catch (NumberFormatException e) {

            if (Character.isDigit(div)) {
                System.out.println("Not all are chars");

                return false;
            }
        }
    }

    System.out.println("All are chars");
    return true;
}
Mark Hall
Naim
0

I use this but I liked Asaph's rigor in his post.

// requires: using System.Globalization;
public static bool IsNumeric(object expression)
{
    if (expression == null)
        return false;

    double number;
    return Double.TryParse(Convert.ToString(expression, CultureInfo.InvariantCulture),
        NumberStyles.Any, NumberFormatInfo.InvariantInfo, out number);
}
Daver
0

For long numbers use this: (JAVA)

public static boolean isNumber(String string) {
    try {
        Long.parseLong(string);
    } catch (Exception e) {
        return false;
    }
    return true;
}
ssamuel68
0
public static boolean isNumber(String str) {
    // The decimal point is optional, so plain integers match as well.
    return str.matches("[0-9]*\\.?[0-9]+");
}

This checks whether the string is a number (integer or floating point).

0

A modified version of my previous answer:

public static boolean isInteger(String in)
{
    if (in != null)
    {
        char c;
        int i = 0;
        int l = in.length();
        if (l > 0 && in.charAt(0) == '-')
        {
            i = 1;
        }
        if (l > i)
        {
            for (; i < l; i++)
            {
                c = in.charAt(i);
                if (c < '0' || c > '9')
                    return false;
            }
            return true;
        }
    }
    return false;
}
Barenca
0

I just added this class to my utils:

public class TryParseLong {
    private boolean isParseable;

    private long value;

    public TryParseLong(String toParse) {
        try {
            value = Long.parseLong(toParse);
            isParseable = true;
        } catch (NumberFormatException e) {
            // Exception set to null to indicate it is deliberately
            // being ignored, since the compensating action
            // of clearing the parsable flag is being taken.
            e = null;

            isParseable = false;
        }
    }

    public boolean isParsable() {
        return isParseable;
    }

    public long getLong() {
        return value;
    }
}

To use it:

TryParseLong valueAsLong = new TryParseLong(value);

if (valueAsLong.isParsable()) {
    ...
    // Do something with valueAsLong.getLong();
} else {
    ...
}

This only parses the value once.

It still makes use of the exception and control flow by exceptions, but at least it encapsulates that kind of code in a utility class, and code that uses it can work in a more normal way.

The problem with Java versus C# is that C# has out values and pass by reference, so it can effectively return two pieces of information: the flag indicating whether something is parsable, and the actual parsed value. When we return more than one value in Java, we need to create an object to hold them, so I took that approach and put the flag and the parsed value in an object.

Escape analysis is likely to handle this efficiently, and create the value and flag on the stack, and never create this object on the heap, so I think doing this will have minimal impact on performance.

To my thinking, this gives about the optimal compromise between keeping control-flow-by-exception out of your code, good performance, and not parsing the integer more than once.
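
On Java 8 or later, the same idea (one return value that carries both the flag and the parsed number) could also be expressed with java.util.OptionalLong from the standard library instead of a hand-written holder class. This is a swapped-in alternative rather than the class above, and the names LongParsing and tryParseLong are made up for this sketch:

```java
import java.util.OptionalLong;

// Hypothetical helper; an alternative to a custom holder class, assuming Java 8+.
final class LongParsing {
    // Returns the parsed value, or an empty OptionalLong when toParse is not a valid long.
    // Long.parseLong also throws NumberFormatException for null, so null yields empty too.
    static OptionalLong tryParseLong(String toParse) {
        try {
            return OptionalLong.of(Long.parseLong(toParse));
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }
}
```

A caller would check isPresent() and then read getAsLong(), which mirrors isParsable() and getLong(). The exception is still used internally, so this changes the shape of the API, not the cost of a failed parse.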

user2800708
0

public static boolean CheckIfNumber(String number){

    for(int i = 0; i < number.length(); i++){
        try{
            Double.parseDouble(number.substring(i));

        }catch(NumberFormatException ex){
            return false;
        }
    }
    return true;     
}

I had this problem before: when I entered a number followed by a character, the check would still return true, so I think this is the better way to do it. The loop parses the string again from every starting position, so any non-numeric character eventually ends up at the front of a substring and makes the parse fail. It is a little longer, but it takes care of the situation where a user enters something like "1abc". For some reason, when I tried try/catch without iterating, it still thought the input was a number.
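
One possible explanation for the behaviour described above is that Double.parseDouble is more permissive than "digits only": it accepts a trailing type suffix such as d or f, surrounding whitespace, exponents, and special values like NaN. The sketch below only probes that behaviour; the names ParseDoubleQuirks and parsesAsDouble are hypothetical, and the input is assumed to be non-null (a null argument throws NullPointerException, which is not caught here).

```java
// Hypothetical probe class; not part of the answer above.
final class ParseDoubleQuirks {
    // True when Double.parseDouble accepts the (non-null) string.
    static boolean parsesAsDouble(String s) {
        try {
            Double.parseDouble(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] samples = {"1abc", "1d", "1f", " 42 ", "1e5", "NaN"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + parsesAsDouble(s));
        }
    }
}
```

Of these samples, only "1abc" is rejected by a single parse. Inputs such as "1d" are accepted by one parse but rejected by the suffix-by-suffix loop, which may be the difference that was observed.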