This question is a follow-up to "Does java have a int.tryparse that doesn't throw an exception for bad data?", which is marked as a duplicate of "Java: Good way to encapsulate Integer.parseInt()".

Both questions seem to be more about how to catch the NumberFormatException thrown by Integer.parseInt(), or how to provide a better API that avoids exceptions by encapsulating Integer.parseInt().

But neither specifically addresses the performance cost of Java's Integer.parseInt() throwing an exception when the input is not parseable as an int. If most of your input consists of valid ints, this won't matter. But if a large share of your input may not be valid ints and you still need to parse all of it, Integer.parseInt() will be inefficient, because every failed parse creates a NumberFormatException.

So this specific question is about how to parse integers efficiently, given that the input can consist of lots of valid integers but also lots of invalid ones.
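
For reference, here is a minimal sketch of the exception-based pattern whose cost is in question. The class and method names are hypothetical and only illustrate the shape of the problem: every invalid input takes the catch branch, which means a NumberFormatException has been created and thrown.

```java
import java.util.OptionalInt;

// Hypothetical wrapper, shown only to illustrate the exception-based approach.
final class ExceptionBasedParser {

    // Returns the parsed value, or an empty OptionalInt if s is not a valid int.
    static OptionalInt parse(String s) {
        try {
            return OptionalInt.of(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            // Reached once per invalid input (including null); the exception
            // has already been constructed by the time we get here.
            return OptionalInt.empty();
        }
    }
}
```

With mostly valid input the catch branch is rarely taken; the concern is input where it is taken often.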

ebruchez
  • If you're expecting a whole lot of things that aren't integers mixed in, it might be helpful to explain your data stream a bit more. A simple approach would be to match against a regex of something like `-?\d+`. – chrylis -cautiouslyoptimistic- Jan 30 '16 at 06:42
  • You can rewrite parseInt: use the OpenJDK implementation of parseInt as a basis and change it so that it does not throw an exception when the int is not valid. This question, https://stackoverflow.com/questions/299068/how-slow-are-java-exceptions , is relevant to the performance of throwing exceptions. – njzk2 Jan 30 '16 at 06:42
  • I put my own answer below. The purpose of creating this new question was to address aspects that hadn't been addressed properly by the other two questions I mention. There is no need to rewrite `parseInt()` because Google did it for us ;) – ebruchez Jan 30 '16 at 06:43
  • I am not sure if I understand, what is the issue with throwing an exception? There is nothing inefficient about handling it if it occurs? – Ashwin Gupta Jan 30 '16 at 06:44
  • @AshwinGupta See the question I linked: throwing an exception is very heavy. – njzk2 Jan 30 '16 at 06:46
  • @njzk2 Throwing exceptions in Java is actually extremely lightweight. Filling in the stack trace takes up nearly all of the overhead, and it's possible to disable it for custom exception types (but there's no API to do so for built-in exception classes). – chrylis -cautiouslyoptimistic- Jan 30 '16 at 06:56
  • @njzk2 I don't know your exact situation, of course, but perhaps the most lightweight way is, instead of handling exceptions when parsing, to first check that what you are parsing is a number, then parse without fear. See this: http://stackoverflow.com/questions/1102891/how-to-check-if-a-string-is-numeric-in-java . It still requires exception handling, but maybe less than is needed for parsing. – Ashwin Gupta Jan 30 '16 at 07:14
  • `Integer.parseInt()` does throw a `NumberFormatException`, and it will fill the stack trace. So yes, it's costly. – ebruchez Jan 30 '16 at 07:49
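
The regex pre-check suggested in the comments could be sketched roughly as follows. The class and method names are hypothetical. Note that `-?\d+` only checks the shape of the input, so a digit string that is too large for an int still makes Integer.parseInt() throw; the sketch keeps a catch for that case.

```java
import java.util.regex.Pattern;

// Hypothetical helper illustrating a regex pre-check before Integer.parseInt().
final class RegexPrecheck {

    // Compiled once; optional minus sign followed by one or more ASCII digits.
    private static final Pattern INT_SHAPE = Pattern.compile("-?\\d+");

    static boolean looksLikeInt(String s) {
        return s != null && INT_SHAPE.matcher(s).matches();
    }

    // Returns the parsed value, or fallback if s is not a valid int.
    static int parseOrDefault(String s, int fallback) {
        if (!looksLikeInt(s)) {
            return fallback; // rejected without any exception
        }
        try {
            return Integer.parseInt(s);
        } catch (NumberFormatException e) {
            return fallback; // right shape, but out of int range
        }
    }
}
```

Whether this is faster than catching the exception depends on the data; the regex match has a cost of its own on every input, valid or not.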

2 Answers

Here is a good article about efficiently parsing integers. The original site is down, so I have linked to the copy on the Wayback Machine.

Returning an Integer instead of an int can make your code much slower, because each result has to be boxed. You could take the code in the article and return Integer.MIN_VALUE, or zero, or some other sentinel value depending on your scenario:

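// Parses a decimal int without throwing. Returns Integer.MIN_VALUE for invalid
// input, so the valid input "-2147483648" cannot be told apart from a failure.
public static int parseInt( final String s )
{
    if ( s == null || s.isEmpty( ) )
        return Integer.MIN_VALUE;

    // Check for a sign. The value is accumulated as a negative number so that
    // the full negative range fits; sign is the factor applied at the end.
    int num  = 0;
    int sign = -1;
    final int len  = s.length( );
    final char ch  = s.charAt( 0 );
    if ( ch == '-' )
    {
        if ( len == 1 )
            return Integer.MIN_VALUE;
        sign = 1;
    }
    else
    {
        final int d = ch - '0';
        if ( d < 0 || d > 9 )
            return Integer.MIN_VALUE;
        num = -d;
    }

    // Build the number. max is the most negative value num may reach.
    final int max = (sign == -1) ?
        -Integer.MAX_VALUE : Integer.MIN_VALUE;
    final int multmax = max / 10;
    int i = 1;
    while ( i < len )
    {
        int d = s.charAt(i++) - '0';
        if ( d < 0 || d > 9 )
            return Integer.MIN_VALUE;
        // Multiplying by 10 would overflow.
        if ( num < multmax )
            return Integer.MIN_VALUE;
        num *= 10;
        // Subtracting the digit would overflow.
        if ( num < (max+d) )
            return Integer.MIN_VALUE;
        num -= d;
    }

    return sign * num;
}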
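
If the input may legitimately contain Integer.MIN_VALUE, one possible variation (a sketch, not taken from the article) is to return a long and use a sentinel outside the int range. This still avoids boxing. The class name, method name, and INVALID constant below are hypothetical.

```java
// Hypothetical variant: returns a long so the sentinel cannot collide with a valid int.
final class SentinelIntParser {

    // Outside the int range, so no valid result can equal it.
    static final long INVALID = Long.MIN_VALUE;

    // Returns the parsed int widened to long, or INVALID if s is not a valid int.
    static long parseIntOrInvalid(String s) {
        if (s == null || s.isEmpty()) {
            return INVALID;
        }
        final boolean negative = s.charAt(0) == '-';
        int i = negative ? 1 : 0;
        if (i == s.length()) {
            return INVALID; // just "-"
        }
        long value = 0;
        for (; i < s.length(); i++) {
            final int d = s.charAt(i) - '0';
            if (d < 0 || d > 9) {
                return INVALID;
            }
            value = value * 10 + d;
            // Bail out as soon as the magnitude exceeds |Integer.MIN_VALUE|;
            // this also keeps the long accumulator far from overflowing.
            if (value > 2147483648L) {
                return INVALID;
            }
        }
        if (negative) {
            return -value;
        }
        return value > Integer.MAX_VALUE ? INVALID : value;
    }
}
```

A caller would compare the result against INVALID and cast to int otherwise, for example `long r = SentinelIntParser.parseIntOrInvalid(s); if (r != SentinelIntParser.INVALID) { int n = (int) r; }`.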
Jay Askren

The best answer I have found so far without writing my own integer parsing code or using flaky regexes is to use Guava's Ints.tryParse(String string) or Longs.tryParse(String string, int radix) method.

Longs.tryParse() is basically a reimplementation of the standard Java Long.parseLong(), but returning null or a Long instead of throwing a NumberFormatException. This seems to me to be the best approach: it is as efficient at parsing as Long.parseLong() but handles the error case more efficiently.
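
To illustrate the idea, here is a rough plain-Java sketch of a parser in that style: the same accumulate-and-check loop as a throwing parser, but returning null on failure. This is not Guava's actual code, and the class and method names are hypothetical. With Guava on the classpath you would simply call Ints.tryParse(s) or Longs.tryParse(s, radix) and check the result for null.

```java
// Hypothetical sketch of a null-returning parser; not Guava's implementation.
final class TryParseSketch {

    // Returns the parsed value, or null if s is not a valid long in the given radix.
    static Long tryParseLong(String s, int radix) {
        if (radix < Character.MIN_RADIX || radix > Character.MAX_RADIX) {
            throw new IllegalArgumentException("radix out of range: " + radix);
        }
        if (s == null || s.isEmpty()) {
            return null;
        }
        final boolean negative = s.charAt(0) == '-';
        int i = negative ? 1 : 0;
        if (i == s.length()) {
            return null; // just "-"
        }
        // Accumulate negatively so that Long.MIN_VALUE is representable.
        final long limit = negative ? Long.MIN_VALUE : -Long.MAX_VALUE;
        final long multmin = limit / radix;
        long acc = 0;
        for (; i < s.length(); i++) {
            final char ch = s.charAt(i);
            // Assumption: accept ASCII digits and letters only.
            final int d = ch < 128 ? Character.digit(ch, radix) : -1;
            if (d < 0 || acc < multmin) {
                return null; // bad digit, or multiplying would overflow
            }
            acc *= radix;
            if (acc < limit + d) {
                return null; // subtracting the digit would overflow
            }
            acc -= d;
        }
        return negative ? acc : -acc;
    }
}
```

The failure path here is a plain return, so invalid input costs no more than the characters scanned before the first bad one.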

ebruchez