
I came up with a question that I initially planned to post as a self-answered Q&A.

The original question:

How much does a higher scale in BigDecimal#divide() affect performance?

So, I created this SSCCE:

import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.concurrent.TimeUnit;

public class Test {

    public static void main(String[] args) {
        int[] scales = new int[] {1, 10, 50, 100, 500, 1000, 5000, 100000, 1000000};
        for (int scale : scales) {
            // Time a single 1/7 division at this scale.
            long start = System.nanoTime();
            BigDecimal.ONE.divide(BigDecimal.valueOf(7), scale, RoundingMode.HALF_UP);
            long end = System.nanoTime();
            long elapsed = end - start;
            String elapsed_str = String.format("%d mins, %d secs, %d millis, %d nanos", 
                TimeUnit.NANOSECONDS.toMinutes(elapsed),
                TimeUnit.NANOSECONDS.toSeconds(elapsed) - TimeUnit.MINUTES.toSeconds(TimeUnit.NANOSECONDS.toMinutes(elapsed)),
                TimeUnit.NANOSECONDS.toMillis(elapsed) - TimeUnit.SECONDS.toMillis(TimeUnit.NANOSECONDS.toSeconds(elapsed)),
                elapsed - TimeUnit.MILLISECONDS.toNanos(TimeUnit.NANOSECONDS.toMillis(elapsed))
            );
            System.out.println("Time for scale = " + scale + ": " + elapsed_str);
        }
    }
}

The output was thus:

Time for scale = 1: 0 mins, 0 secs, 2 millis, 883903 nanos
Time for scale = 10: 0 mins, 0 secs, 0 millis, 13995 nanos
Time for scale = 50: 0 mins, 0 secs, 1 millis, 138727 nanos
Time for scale = 100: 0 mins, 0 secs, 0 millis, 645636 nanos
Time for scale = 500: 0 mins, 0 secs, 1 millis, 250220 nanos
Time for scale = 1000: 0 mins, 0 secs, 4 millis, 38957 nanos
Time for scale = 5000: 0 mins, 0 secs, 15 millis, 66549 nanos
Time for scale = 100000: 0 mins, 0 secs, 500 millis, 873987 nanos
Time for scale = 1000000: 0 mins, 50 secs, 183 millis, 686684 nanos

As the scale increases by orders of magnitude, the running time grows dramatically. But what had me scratching my head were these lines:

Time for scale = 1: 0 mins, 0 secs, 2 millis, 883903 nanos
Time for scale = 10: 0 mins, 0 secs, 0 millis, 13995 nanos
Time for scale = 50: 0 mins, 0 secs, 1 millis, 138727 nanos
Time for scale = 100: 0 mins, 0 secs, 0 millis, 645636 nanos
Time for scale = 500: 0 mins, 0 secs, 1 millis, 250220 nanos

It appears that a scale of 10 is optimal for BigDecimal#divide()? And a scale of 100 is faster than 50? I thought this might just be an anomaly, so I ran it again (this time, omitting the highest two scales because I didn't want to wait 50 seconds :)) and this is the result:

Time for scale = 1: 0 mins, 0 secs, 3 millis, 440903 nanos
Time for scale = 10: 0 mins, 0 secs, 0 millis, 10263 nanos
Time for scale = 50: 0 mins, 0 secs, 0 millis, 833169 nanos
Time for scale = 100: 0 mins, 0 secs, 0 millis, 487492 nanos
Time for scale = 500: 0 mins, 0 secs, 0 millis, 802846 nanos
Time for scale = 1000: 0 mins, 0 secs, 2 millis, 475715 nanos
Time for scale = 5000: 0 mins, 0 secs, 16 millis, 646117 nanos

Again, 10 is considerably faster than 1, and 100 is again faster than 50.

I tried again and again, and 100 was always faster than 50, and a scale of 1 was always slower than every other scale below 1000.

Anyone have an explanation?

ryvantage
  • Your benchmarking methodology is extremely suspect. You _can't_ get useful results without running each method thousands of times; your results can be completely the opposite of what they ought to be otherwise. – Louis Wasserman Dec 19 '13 at 00:05
  • Your benchmark is not fair. Read this answer: http://stackoverflow.com/a/513259/155137 – Martijn Courteaux Dec 19 '13 at 00:10
  • Also, the fast ones are powers of ten -- there is a `private static final long[] LONG_TEN_POWERS_TABLE` declared in `BigDecimal`. I assume that this is for performance reasons. – Jesan Fafon Dec 19 '13 at 00:15
  • Discover [JMH](http://openjdk.java.net/projects/code-tools/jmh/). – leventov Dec 19 '13 at 06:37
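
For reference, a JMH-based version of this benchmark might look roughly like the sketch below (the class name and the parameter values are only illustrative, and it assumes the JMH annotations are available on the classpath). JMH handles the warm-up and measurement iterations itself, so no single cold call is ever timed:

import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class DivideBenchmark {

    // Each scale becomes a separate benchmark configuration; JMH runs them in turn.
    @Param({"1", "10", "50", "100", "500", "1000"})
    int scale;

    private static final BigDecimal SEVEN = BigDecimal.valueOf(7);

    @Benchmark
    public BigDecimal divide() {
        // Returning the result keeps the JIT from eliminating the division as dead code.
        return BigDecimal.ONE.divide(SEVEN, scale, RoundingMode.HALF_UP);
    }
}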

1 Answer


Java code is optimised dynamically, but the first time you run it, it has to be loaded. To avoid confusing results caused by the code being re-compiled as you run it, I suggest:

  • doing your longest run first, not last;
  • running the test for at least 2 seconds;
  • ignoring the first of at least 3 to 5 runs (you only have one run in this case).

I would keep the scale simple so you can compare all the results. In your case, I would do each one at least 1000 times and print the average in microseconds.
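
A minimal sketch of that approach, reusing the question's division (the class name, iteration count, and the volatile sink field are illustrative choices, not something prescribed by the answer):

import java.math.BigDecimal;
import java.math.RoundingMode;

public class DivideTiming {

    // Writing each result here keeps the JIT from discarding the divisions as dead code.
    static volatile BigDecimal sink;

    public static void main(String[] args) {
        int[] scales = {1000, 500, 100, 50, 10, 1}; // longest run first
        for (int scale : scales) {
            // A few untimed warm-up passes so the measured pass runs compiled code.
            for (int i = 0; i < 3; i++) {
                averageNanos(scale);
            }
            long avg = averageNanos(scale);
            System.out.printf("scale = %d: %.3f microseconds per divide%n", scale, avg / 1000.0);
        }
    }

    // Times many divisions and returns the average cost of one call, in nanoseconds.
    static long averageNanos(int scale) {
        final int iterations = 1000;
        BigDecimal seven = BigDecimal.valueOf(7);
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink = BigDecimal.ONE.divide(seven, scale, RoundingMode.HALF_UP);
        }
        return (System.nanoTime() - start) / iterations;
    }
}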

Peter Lawrey
  • I'm not the downvoter, but to me it makes no sense to do your longest run first. Why did you say that? – Martijn Courteaux Dec 19 '13 at 00:17
  • @MartijnCourteaux The longest run means your code is more likely to be warmed up. In this case, one operation with 100000 decimal places might be enough to warm up the code. – Peter Lawrey Dec 19 '13 at 00:46