54

I would like to know when I can use IntStream.range effectively. I have three reasons why I am not sure how useful IntStream.range is.

(Please think of start and end as integers.)

  1. If I want an array, [start, start+1, ..., end-2, end-1], the code below is much faster.

    int[] arr = new int[end - start];
    int index = 0;
    for(int i = start; i < end; i++)
        arr[index++] = i;
    

    This is probably because toArray() in IntStream.range(start, end).toArray() is very slow.

  2. I use MersenneTwister to shuffle arrays. (I downloaded MersenneTwister class online.) I do not think there is a way to shuffle IntStream using MersenneTwister.

  3. I do not think just getting int numbers from start to end-1 is useful. I can use for(int i = start; i < end; i++), which seems easier and not slow.

Could you tell me when I should choose IntStream.range?

Kevin Panko
  • 4
    What you can do with `IntStream.range()` is pass the resulting stream to a different method as a parameter. You can't do that with `for`. – biziclop Aug 17 '16 at 14:05
  • 1
    An interview question would have best been solved by an IntStream.range: [Array list algorithm - Interview](http://stackoverflow.com/a/38899680/984823) – Joop Eggen Aug 17 '16 at 14:20
  • 2
    *This is probably because toArray() [...] is very slow.*. How did you measure this? Can you post or link to a valid benchmark? And what does "very slow" mean? – Tunaki Aug 17 '16 at 17:43
  • `public static int[] range1(int begin, int end, int skip){ return IntStream.range(begin, end).filter(i -> (i-begin)%skip == 0).toArray(); }` `public static IntStream range2(int begin, int end, int skip){ return IntStream.range(begin, end).filter(i -> (i-begin)%skip == 0); }` –  Aug 18 '16 at 01:49
  • I just compared these two. range1 was about 10 times slower. I meant converting IntStream to Array is slow. –  Aug 18 '16 at 03:08
  • 4
    @Nickel it's pretty likely that your benchmark is flawed. Measuring Java performance is not like comparing two timestamps. – Tagir Valeev Aug 18 '16 at 03:26
  • 2
    @Nickel: your `range2` method doesn’t do anything. Of course, writing values into an array needs more time than doing nothing, but where’s the relevance to your claim that `toArray` was slower than a `for` loop doing the same? – Holger Aug 18 '16 at 11:02
  • 1
    @Holger Sorry, I made a mistake... I failed to test the speed of `toArray()` properly. These are my functions. `public static int[] range1(int start, int end, int step){ int[] arr = new int[(int)Math.ceil((double)(end-start)/step)]; int idx = 0; for(int i=start; i (i-start)%step == 0).toArray();}` Then, repeated `Arrays.stream(range1(start, end, step)).sum();` and `Arrays.stream(range2(start, end, step)).sum();` about a million times. –  Aug 18 '16 at 12:17
  • 1
    @Nickel (Ugh. Too much code in comments.) These different functions do different things. The `range1` version fills an array of known size, whereas the stream `toArray` version populates an array with an unknown number of results, so it has to do copying. (Of course we humans can compute the number of results, but the stream can't.) Sending a fixed-size stream into an array with `toArray` is as fast as a for-loop. – Stuart Marks Aug 19 '16 at 05:05

7 Answers

46

There are several uses for IntStream.range.

One is to use the int values themselves:

IntStream.range(start, end).filter(i -> isPrime(i))....
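
As a rough sketch of the first use, the following fills in the elided part of the pipeline with toArray(). The isPrime helper is hypothetical (the answer does not define it); a simple trial-division test is assumed here.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

final class PrimeRangeDemo {
    // Hypothetical helper (not from the answer): trial division, fine for small inputs.
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Collects the primes in [start, end) by filtering the range.
    static int[] primesBetween(int start, int end) {
        return IntStream.range(start, end)
                        .filter(PrimeRangeDemo::isPrime)
                        .toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(primesBetween(1, 30)));
    }
}
```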

Another is to do something N times:

IntStream.range(0, N).forEach(this::doSomething);

Your case (1) is to create an array filled with a range:

int[] arr = IntStream.range(start, end).toArray();

You say this is "very slow" but, like other respondents, I suspect your benchmark methodology. For small arrays there is indeed more overhead with stream setup, but this should be so small as to be unnoticeable. For large arrays the overhead should be negligible, as filling a large array is dominated by memory bandwidth.
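
To make the comparison concrete, here is a small sketch that builds the same array both ways and checks that the results match. It assumes start <= end and deliberately does no timing; a meaningful speed comparison would need a proper harness such as JMH rather than timestamps around a loop.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

final class RangeToArrayDemo {
    // The loop from the question: fills a pre-sized array.
    static int[] withLoop(int start, int end) {
        int[] arr = new int[end - start];
        int index = 0;
        for (int i = start; i < end; i++) {
            arr[index++] = i;
        }
        return arr;
    }

    // The stream version: the range knows its size, so toArray() can size the array up front.
    static int[] withStream(int start, int end) {
        return IntStream.range(start, end).toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.equals(withLoop(3, 12), withStream(3, 12)));
    }
}
```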

Sometimes you need to fill an existing array. You can do that this way:

int[] arr = new int[end - start];
IntStream.range(0, end - start).forEach(i -> arr[i] = i + start);

There's a utility method Arrays.setAll that can do this even more concisely:

int[] arr = new int[end - start];
Arrays.setAll(arr, i -> i + start);

There is also Arrays.parallelSetAll, which can fill an existing array in parallel. Internally, it simply uses an IntStream and calls parallel() on it. This should provide a speedup for large arrays on a multicore system.
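
A minimal sketch of the two fill methods side by side. The fill method and its parallel flag are illustrative names only; both calls should produce the same contents, and whether the parallel one is actually faster depends on array size and hardware.

```java
import java.util.Arrays;

final class SetAllDemo {
    // Fills arr[i] = i + start, either sequentially or in parallel.
    static int[] fill(int start, int end, boolean parallel) {
        int[] arr = new int[end - start];
        if (parallel) {
            Arrays.parallelSetAll(arr, i -> i + start);
        } else {
            Arrays.setAll(arr, i -> i + start);
        }
        return arr;
    }

    public static void main(String[] args) {
        int[] sequential = fill(10, 1_000_010, false);
        int[] parallel = fill(10, 1_000_010, true);
        System.out.println(Arrays.equals(sequential, parallel));
    }
}
```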

I've found that a fair number of my answers on Stack Overflow involve using IntStream.range. You can search for them using these search criteria in the search box:

user:1441122 IntStream.range

One application of IntStream.range I find particularly useful is to operate on elements of an array, where the array indexes as well as the array's values participate in the computation. There's a whole class of problems like this.

For example, suppose you want to find the locations of increasing runs of numbers within an array. The result is an array of indexes into the first array, where each index points to the start of a run.

To compute this, observe that a run starts at a location where the value is less than the previous value. (A run also starts at location 0). Thus:

    int[] arr = { 1, 3, 5, 7, 9, 2, 4, 6, 3, 5, 0 };
    int[] runs = IntStream.range(0, arr.length)
                          .filter(i -> i == 0 || arr[i-1] > arr[i])
                          .toArray();
    System.out.println(Arrays.toString(runs));

    [0, 5, 8, 10]

Of course, you could do this with a for-loop, but I find that using IntStream is preferable in many cases. For example, it's easy to store an unknown number of results into an array using toArray(), whereas with a for-loop you have to handle copying and resizing, which distracts from the core logic of the loop.

Finally, it's much easier to run IntStream.range computations in parallel.
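
As an illustration of that last point, switching a range computation to parallel is a one-call change. The sum-of-squares workload below is only an assumed example; for a task this cheap the parallel version may well not be faster.

```java
import java.util.stream.IntStream;

final class ParallelRangeDemo {
    // Sums i*i for i in [0, n), optionally in parallel.
    static long sumOfSquares(int n, boolean parallel) {
        IntStream range = IntStream.range(0, n);
        if (parallel) {
            range = range.parallel();   // the only change needed
        }
        return range.mapToLong(i -> (long) i * i).sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1000, false));
        System.out.println(sumOfSquares(1000, true));
    }
}
```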

Stuart Marks
  • 3
    Why do you use `forEach` to write into a pre-allocated array in your third example rather than using a clean `int[] arr = IntStream.range(start, end).toArray();`? – Holger Aug 18 '16 at 10:59
  • 2
    @Holger Sometimes you want to fill a pre-existing array. The OP claimed that `IntStream.range(start, end).toArray()` was too slow so he clearly knew about that. But I should clarify the context anyway. Also, only `Arrays.parallelSetAll` uses an `IntStream` so I'll adjust that too. – Stuart Marks Aug 18 '16 at 15:17
  • @Holger Thank you for the suggestion. Sorry, I was really confused, and made mistakes on my benchmark. However, this is true that on my laptop (Core i7 4710MQ, Java8u92), using a pre-made array is faster than using toArray() –  Aug 18 '16 at 23:42
  • `int[] arr = new int[end - start]; // Arrays.setAll(i -> i + start);` doesn't look right - missing an argument to Arrays.setAll()? – MyStackRunnethOver Apr 23 '18 at 21:41
  • 2
    @MyStackRunnethOver Yep, missing arg, thanks. Fixed. – Stuart Marks Apr 23 '18 at 23:09
7

IntStream.range returns a range of integers as a stream so you can do stream processing over it.

For example, to take the square of each element:

IntStream.range(1, 10).map(i -> i * i);  
tonakai
  • 1
    Similar to C#'s Enumerable.Range https://msdn.microsoft.com/en-us/library/system.linq.enumerable.range(v=vs.110).aspx – JonH Aug 17 '16 at 15:46
7

Here's an example:

public class Test {

    public static void main(String[] args) {
        System.out.println(sum(LongStream.of(40,2))); // call A
        System.out.println(sum(LongStream.range(1,100_000_000))); //call B
    }

    public static long sum(LongStream in) {
        return in.sum();
    }

}

So, let's look at what sum() does: it computes the sum of an arbitrary stream of numbers. We call it in two different ways: once with an explicit list of numbers, and once with a range.

If you only had call A, you might be tempted to put the two numbers into an array and pass it to sum(), but that's clearly not a good option for call B: an array of roughly 100 million long values would need about 800 MB and could easily exhaust the heap. Likewise, you could just pass the start and end for call B, but then you couldn't support call A.

So to sum it up, ranges are useful here because:

  • We need to pass them around between methods
  • The target method doesn't just work on ranges but any stream of numbers
  • But it only operates on individual numbers of the stream, reading them sequentially. (This is why shuffling with streams is a terrible idea in general.)

There is also the readability argument: code using streams can be much more concise than loops, and thus more readable, but I wanted to show an example where a solution relying on IntStreams is functionally superior too.

I used LongStream to emphasise the point, but the same goes for IntStream.

And yes, for simple summing this may look like a bit of an overkill, but consider, for example, reservoir sampling.
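
A minimal sketch of reservoir sampling (the classic "Algorithm R") over an arbitrary IntStream, to show a method that reads elements one at a time and works equally well on a range or an explicit list. The class and method names are made up for this example, and it assumes the stream has fewer than Integer.MAX_VALUE elements.

```java
import java.util.Arrays;
import java.util.PrimitiveIterator;
import java.util.Random;
import java.util.stream.IntStream;

final class ReservoirDemo {
    // Picks up to k elements uniformly at random from the stream in one sequential pass.
    static int[] sample(IntStream in, int k, Random rnd) {
        int[] reservoir = new int[k];
        int seen = 0;                                // elements consumed so far
        PrimitiveIterator.OfInt it = in.iterator();
        while (it.hasNext()) {
            int value = it.nextInt();
            if (seen < k) {
                reservoir[seen] = value;             // fill the reservoir first
            } else {
                int j = rnd.nextInt(seen + 1);       // uniform in [0, seen]
                if (j < k) {
                    reservoir[j] = value;            // replace with probability k / (seen + 1)
                }
            }
            seen++;
        }
        // If the stream was shorter than k, return only what was seen.
        return seen < k ? Arrays.copyOf(reservoir, seen) : reservoir;
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        System.out.println(Arrays.toString(sample(IntStream.range(0, 1_000_000), 5, rnd)));
        System.out.println(Arrays.toString(sample(IntStream.of(40, 2), 5, rnd)));
    }
}
```

The sample method never needs the whole input in memory, which is why it can take a very large range as easily as a short explicit list.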

biziclop
3

Here are a few differences that come to mind between IntStream.range and traditional for loops:

  • An IntStream is lazily evaluated: the pipeline is traversed only when a terminal operation is called. A for loop executes its body eagerly, one iteration at a time.
  • IntStream provides some functions that are commonly applied to a range of ints, such as sum() and average().
  • IntStream lets you code multiple operations over a range of ints in a functional way, which reads more fluently, especially if you have a lot of operations.

So basically use IntStream when one or more of these differences are useful to you.
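
The laziness point can be observed directly. In this sketch (names are illustrative), a counter inside peek() stays at zero until the terminal operation sum() runs; main also shows the built-in sum() and average().

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

final class LazyRangeDemo {
    // Returns {elements seen before the terminal operation, elements seen after it}.
    static int[] evaluationCounts(int start, int end) {
        AtomicInteger seen = new AtomicInteger();
        IntStream pipeline = IntStream.range(start, end)
                                      .peek(i -> seen.incrementAndGet());
        int before = seen.get();   // building the pipeline traverses nothing
        pipeline.sum();            // the terminal operation triggers the traversal
        int after = seen.get();
        return new int[] { before, after };
    }

    public static void main(String[] args) {
        int[] counts = evaluationCounts(0, 10);
        System.out.println(counts[0] + " before, " + counts[1] + " after");
        System.out.println(IntStream.range(1, 101).sum());
        System.out.println(IntStream.range(1, 101).average().getAsDouble());
    }
}
```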

But please bear in mind that shuffling a Stream sounds quite strange: a Stream is not a data structure, so it does not really make sense to shuffle it (in case you were planning on building a special IntSupplier). Shuffle the result instead.
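
A sketch of "shuffle the result": build the array from the range, then apply a Fisher-Yates shuffle. java.util.Random stands in here for the asker's MersenneTwister class, which is not part of the JDK; any generator offering an equivalent nextInt(bound) method could presumably be substituted.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.stream.IntStream;

final class ShuffleRangeDemo {
    // Builds [start, end) as an array, then shuffles it in place (Fisher-Yates).
    static int[] shuffledRange(int start, int end, Random rnd) {
        int[] arr = IntStream.range(start, end).toArray();
        for (int i = arr.length - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1);   // uniform index in [0, i]
            int tmp = arr[i];
            arr[i] = arr[j];
            arr[j] = tmp;
        }
        return arr;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(shuffledRange(0, 10, new Random())));
    }
}
```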

As for performance, while there may be some overhead, you still iterate N times in both cases, so you should not worry about it much.

Jean-François Savard
2

Basically, if you want Stream operations, you can use the range() method. For example, if you want concurrency, or want to use map() or reduce(), you are better off with IntStream.

For example:

IntStream.range(1, 5).parallel().forEach(i -> heavyOperation());

Or:

IntStream.range(1, 5).reduce(1, (x, y) -> x * y)  
// > 24

You can achieve the second example also with a for-loop, but you need intermediate variables etc.

Also, if you want only the first match, for example, you can use findFirst() and its short-circuiting relatives (such as findAny() and anyMatch()) to stop consuming the rest of the Stream.
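
A small sketch of that short-circuiting behaviour. The counter passed in (an illustrative parameter, not part of any API) records how many elements the pipeline examined; with a sequential stream it should stop shortly after the first match rather than walking the whole range.

```java
import java.util.OptionalInt;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

final class FindFirstDemo {
    // Finds the first multiple of divisor in [start, end), counting examined elements.
    static OptionalInt firstMultiple(int start, int end, int divisor, AtomicInteger examined) {
        return IntStream.range(start, end)
                        .peek(i -> examined.incrementAndGet())
                        .filter(i -> i % divisor == 0)
                        .findFirst();
    }

    public static void main(String[] args) {
        AtomicInteger examined = new AtomicInteger();
        OptionalInt first = firstMultiple(1, 1_000_000, 7, examined);
        System.out.println(first + " found after examining " + examined.get() + " elements");
    }
}
```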

Rob Audenaerde
  • *better* here is not really appropriate in my opinion. – Jean-François Savard Aug 17 '16 at 14:02
  • @Jean-FrançoisSavard please explain your opinion? – Rob Audenaerde Aug 17 '16 at 14:04
  • Well, the second is equivalent to `for(int i = 1; i < 5; i++) result *= i;`. Both are different, but not *better*. The intermediate variable is not really a concern (there will be a lot of overhead using the stream approach anyway if all you do is multiplying). The first one seems to lack a little explanation. Sure, parallelism can be a good approach, but in your example I'd rather create a thread pool of 5 threads which execute heavyOperation; `i` is actually not used in the example anyway. – Jean-François Savard Aug 17 '16 at 14:17
  • 2
    To make it clear - I am not arguing that traditional loops are better or worse. Both are fine, but I'd rather use `IntStream` if I know that I may have a lot to do in my loop, so that my code reads more fluently (in a functional manner). – Jean-François Savard Aug 17 '16 at 14:21
2

It totally depends on the use case. However, the stream API adds a lot of easy one-liners that can definitely replace conventional loops.

IntStream is really helpful, and works as syntactic sugar, in some cases:

IntStream.range(1, 101).sum();
IntStream.range(1, 101).average();
IntStream.range(1, 101).filter(i -> i % 2 == 0).count();
//... and so on

Whatever you can do with IntStream you can also do with conventional loops, but a one-liner is more concise and easier to understand and maintain.

Still, IntStream#range cannot be used for descending loops; it only counts upward in increments of one. So the following cannot be expressed with it directly:

for(int i = 100; i > 1; i--) {
    // Negative loop
}
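
That said, a descending sequence can be derived from an ascending range by mapping each index. The helper below is a sketch with made-up names; it yields the same values as the loop above (fromInclusive down to toExclusive + 1).

```java
import java.util.Arrays;
import java.util.stream.IntStream;

final class DescendingRangeDemo {
    // Yields fromInclusive, fromInclusive - 1, ..., toExclusive + 1.
    // If fromInclusive <= toExclusive, the stream is empty.
    static IntStream descending(int fromInclusive, int toExclusive) {
        return IntStream.range(0, fromInclusive - toExclusive)
                        .map(i -> fromInclusive - i);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(descending(10, 1).toArray()));
    }
}
```
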
  • Case 1: Yes, a conventional loop is much faster in this case, as toArray has a bit of overhead.

  • Case 2: I don't know anything about it, my apologies.

  • Case 3: IntStream is not slow at all; IntStream.range and a conventional loop are almost the same in terms of performance.

See :

akash
  • 1
    Alternatively, one could do `IntStream.iterate(from - 1, i -> i - 1).limit(from - to)` – Jean-François Savard Aug 17 '16 at 14:48
  • 1
    IntStream is as fast as conventional for loops (unless calling parallel()), but it is more memory-efficient and requires shorter code. Is this right? –  Aug 18 '16 at 02:19
  • 3
    @Nickel: for most use cases, a sequential `IntStream` will be as fast as a conventional loop though it depends on a lot of factors. It’s best to say that the order of magnitude is the same and the slight, unpredictable differences are irrelevant. It will tend to require *more* memory, but that’s a *temporary* memory usage which might even stay unnoticed. The most important difference is that you can indeed express a lot of tasks in much simpler code when using an `IntStream`. – Holger Aug 18 '16 at 10:48
  • 2
    @Jean-François Savard: don’t underestimate the HotSpot optimizer. A `%2`, applied on an `int`, is easy to recognize. And due to the way it’s implemented, `range(…).filter(…)` may be on par or even more efficient than `.iterate(…).limit(…)`. – Holger Aug 18 '16 at 10:53
0

You could implement your Mersenne Twister as an Iterator and stream from that.
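
One possible shape for this, sketched with an IntSupplier and IntStream.generate rather than an Iterator, for brevity. java.util.Random is only a stand-in for the Mersenne Twister class (which is not in the JDK), and the method name is made up; the resulting stream is infinite, so it is limited to count elements.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.IntSupplier;
import java.util.stream.IntStream;

final class GeneratorStreamDemo {
    // Wraps a random generator as an IntSupplier and streams `count` values in [0, bound).
    static IntStream randomInts(Random rnd, int bound, long count) {
        IntSupplier next = () -> rnd.nextInt(bound);   // stand-in for a Mersenne Twister call
        return IntStream.generate(next).limit(count);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(randomInts(new Random(), 100, 5).toArray()));
    }
}
```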

OldCurmudgeon
  • 1
    I was thinking something similar, but implementing MT as `IntSupplier` and then `IntStream#generate`. – bradimus Aug 17 '16 at 14:39