Numerical Streams Example Optimization

Question

I am reading "Java 8 In Action" (by Raoul-Gabriel Urma, Mario Fusco and Alan Mycroft), section 5.6.3, pages 116 and 117. The code that is presented handles calculating so-called "Pythagorean Triples". Page 116 shows the first attempt, and page 117 shows an improved attempt to generate these triples, where both use the ".rangeClosed()" method.

I have found some optimizations that go beyond the book and I would like to share them here. I did some simple "System.currentTimeMillis()" calculations to see if my modifications were improvements, and they appeared to be marginally better than that found in the book. Can you provide better improvements, explanations or metrics for this code?

    public void test() {

    long time1 = System.currentTimeMillis();

    /*
     * From text, page 116
     */
    IntStream.rangeClosed(1,  100).boxed()
        .flatMap(a -> IntStream.rangeClosed(a, 100)
                .filter(b -> Math.sqrt(a*a + b*b) % 1 == 0)
                .mapToObj(b -> new int[]{a, b, (int)Math.sqrt(a*a + b*b)})
        )
        .forEach(c -> System.out.println("["+c[0]+" "+c[1]+" "+c[2]+"]"));

    long time2 = System.currentTimeMillis();

    System.out.println();

    long time3 = System.currentTimeMillis();

    /*
     * From text, page 117, I added "map(...)" so that end result are ints, not doubles
     */
    IntStream.rangeClosed(1, 100).boxed()
        .flatMap(a -> IntStream.rangeClosed(a, 100)
                .mapToObj(b -> new double[]{a, b, Math.sqrt(a*a + b*b)})
                .filter(t -> t[2] % 1 == 0)
                .map(b -> new int[]{(int)b[0], (int)b[1], (int)b[2]})
        )
        .forEach(c -> System.out.println("["+c[0]+" "+c[1]+" "+c[2]+"]"));

    long time4 = System.currentTimeMillis();

    System.out.println();

    long time5 = System.currentTimeMillis();

    /*
     * My optimization attempt #1: now mapToObj(...) has c%1!=0 conditional, filter checks array element not null
     */
    IntStream.rangeClosed(1, 100).boxed()
    .flatMap(a -> IntStream.rangeClosed(a, 100)
                .mapToObj(b -> {
                    double c = Math.sqrt(a*a + b*b);
                    return new Object[]{a, b, c % 1 == 0 ? (int)c : null};
                })
                .filter(d -> d[2] != null)
                .map(e -> new int[]{(int)e[0], (int)e[1], (int)e[2]})
    )
    .forEach(f -> System.out.println("["+f[0]+" "+f[1]+" "+f[2]));

    long time6 = System.currentTimeMillis();

    System.out.println();

    long time7 = System.currentTimeMillis();

    /*
     * My optimization attempt #2: now mapToObj(...) has c%1!=0 conditional, filter checks "array element" not 0
     */
    IntStream.rangeClosed(1, 100).boxed()
        .flatMap(a -> IntStream.rangeClosed(a, 100)
                .mapToObj(b -> {
                    double c = Math.sqrt(a*a + b*b);
                    return new int[]{a, b, c % 1 == 0 ? (int)c : 0 };
                })
                .filter(t -> t[2] != 0)
        )
        .forEach(d -> System.out.println("["+d[0]+" "+d[1]+" "+d[2]+"]"));

    long time8 = System.currentTimeMillis();

    System.out.println();

    long time9 = System.currentTimeMillis();

    /*
     * My optimization attempt #3: now mapToObj(...) has c%1!=0 conditional, filter checks "collection element" not null
     */
    IntStream.rangeClosed(1, 100).boxed()
        .flatMap(a -> IntStream.rangeClosed(a, 100)
                .mapToObj(b -> {
                    double c = Math.sqrt(a*a + b*b);
                    return (c % 1 != 0) ? null : new int[]{a, b, (int)c};
                })
                .filter(t -> t != null)
        )
        .forEach(d -> System.out.println("["+d[0]+" "+d[1]+" "+d[2]+"]"));

    long time10 = System.currentTimeMillis();

    System.out.println();

    long timeDelta1 = time2 - time1;
    long timeDelta2 = time4 - time3;
    long timeDelta3 = time6 - time5;
    long timeDelta4 = time8 - time7;
    long timeDelta5 = time10 - time9;

    System.out.println("timeDelta1: " + timeDelta1 + ", timeDelta2: " + timeDelta2 + ", timeDelta3: " + timeDelta3 + ", timeDelta4: " + timeDelta4 + ", timeDelta5: " + timeDelta5);

}

public static void main(String[] args){
    ReduceTest reduceTest = new ReduceTest();
    reduceTest.test();
}

NOTE: It appears that you can use "return;" in a ".forEach()" method, but not in a ".mapToInt()" method. Using "return;" in the lambda passed into the ".mapToInt()" method would remove the need to have the ".filter()" method. It seems like that would be an improvement to the streams api.

I think this question is better discussed at: http://codereview.stackexchange.com/ — Flown, Jan 08 '17 at 21:54
@Alexis, thanks, that stream tag should instead be java-stream, you are right of course. — Steve T, Jan 09 '17 at 01:33
You can use `return` in every lambda expression, but, of course` if the function is supposed to return a value, your `return` statement must return a value. — Holger, Jan 09 '17 at 10:35
@Holger, understood, but what I mean is that it would be nice if using a return statement without a value would be nice in the situations when the conditional expression doesn't have a value to return in certain iterations. — Steve T, Jan 22 '17 at 01:29
@Holger, or actually, I mean that continue and break statements are missing. — Steve T, Jan 22 '17 at 01:48
It would be strange to support `continue` here, without having any loop in scope. You are thinking too much in terms of imperative programming… — Holger, Jan 23 '17 at 10:01

score 3 · Answer 1 · edited May 23 '17 at 11:54

First, you “benchmark” is heavily flawed. It will most likely always appear that the last variants are faster, because more of the shared code (e.g. Stream API methods) has been compiled/optimized when they are executed. See “How do I write a correct micro-benchmark in Java?”

Besides that, “%1” is unnecessary when you want to test whether a number is a whole number. You can cast to int, which will remove the fraction part, and compare that with the original double. Further, you don’t need to map to an int[] array, when you already know that, what you are going to print, will be a String:

IntStream.rangeClosed(1, 100).boxed()
    .flatMap(a -> IntStream.rangeClosed(a, 100)
            .mapToObj(b -> new double[]{a, b, Math.sqrt(a*a + b*b)})
            .filter(t -> ((int)t[2]) == t[2])
            .map(arr -> String.format("[%.0f %.0f %.0f]", arr[0], arr[1], arr[2]))
    )
    .forEach(System.out::println);

But, of course, if you know that you need an expensive function like sqrt several times, computing it in advance can be beneficial, especially, when there is a possibility to prepare without using that expensive function or even floating point arithmetic in general:

int[] square = new int[20001];
IntStream.rangeClosed(1, 141).forEach(i -> square[i*i]=i);
IntStream.rangeClosed(1, 100).boxed()
    .flatMap(a -> IntStream.rangeClosed(a, 100)
            .filter(b -> square[a*a+b*b]!=0)
            .mapToObj(b -> String.format("[%d %d %d]", a, b, square[a*a+b*b]))
    )
    .forEach(System.out::println);

Note that this is one of the few cases, where a nested forEach would be an alternative to flatMap:

int[] square=new int[20001];
IntStream.rangeClosed(1, 141).forEach(i -> square[i*i]=i);
IntStream.rangeClosed(1, 100)
    .forEach(a -> IntStream.rangeClosed(a, 100)
            .filter(b -> square[a*a+b*b]!=0)
            .forEach(b -> System.out.printf("[%d %d %d]%n", a, b, square[a*a+b*b]))
    );

@Holger, thank you very much for the comments. I appreciate your insight. I am unfamiliar with benchmarks, so I will have to differ to you on that. I wonder if your cast from double to int in the filter() method is more intensive than the %1 comparison, and your final map() call with the String.format(), while a good way to do it, is not better than using my method with array values. In fact that map() call is an extra step. That said, I still appreciate your approach. — Steve T, Jan 22 '17 at 01:26
Of course, a cast from `double` to `int` is *much* cheaper than a modulo operation, but you can still hope, that the JVM’s optimizer is capable of converting your specific `…%1` to something more efficient, perhaps exactly to a cast to `int`. I don’t see, how the `map` to `String` is an “extra step”, compared to your original code having a `map` to an array *and* the conversion of that array to a `String`. However, the last example comes without any mapping step… — Holger, Jan 23 '17 at 10:05
@Holger, on second thought, your approaches are very good. Thank you for them. — Steve T, Apr 22 '17 at 04:12

Numerical Streams Example Optimization

1 Answers1