3

Suppose I have a LongStream with a range and I want to place the long values in a map as keys and a function result as value.

For instance:

Map<Long, Long> m = LongStream.range(1, 20) ...

long someFunction(long n) {
  return n * n;
}

The map should then contain 1 to 20 mapped to the squares of these values. I've looked at collect and Collectors but I can't seem to find the right solution.

Edit: I got the following to work.

Map<Long, Long> map = LongStream
        .range(1, 20)
        .boxed()
        .collect(toMap(identity(), AmicablePairs::properDivsSum));

Apart from not calling boxed, I also got the calling notation for the function wrong. I can't use the double colon of course, because I need to pass an argument.

shmosel
  • 49,289
  • 6
  • 73
  • 138
fwend
  • 1,813
  • 2
  • 15
  • 17
  • "_I can't use the double colon of course, because I need to pass an argument_" - what? Of course you can, something else is wrong. If `someFunction` is an instance method, you need to use `this::someFunction` so specify the instance to call it on. Further, use `Function.identity()` (as in my answer) in preference to `n -> n` as it 1) has clearer intent and 2) doesn't cause the creation of a lambda. – Boris the Spider Feb 21 '16 at 02:40
  • It's not an instance method. It's static, and I'm calling from main. – fwend Feb 21 '16 at 02:51
  • Full code is here: http://rosettacode.org/wiki/Amicable_pairs#Java – fwend Feb 21 '16 at 10:51
  • In that case you need to use a static method reference - `Class::someFunction`. If you are going to use Java 8 feature, I would strongly suggest learning about the three different types of method references - they are much more readable than lambdas and can also be faster. – Boris the Spider Feb 21 '16 at 12:26
  • I've tried that but I can't get it to work. It's says: "argument mismatch; bad return type in lambda expression. Long is not a functional interface". (n -> AmicablePairs::properDivsSum(n)) – fwend Feb 21 '16 at 14:01
  • `(n -> AmicablePairs::properDivsSum(n))` is **not** the correct syntax. This is some sort of conflation of lambda and method reference syntax. – Boris the Spider Feb 21 '16 at 14:01
  • Right, so how do I pass the argument? – fwend Feb 21 '16 at 14:03
  • Please read [the tutorial](https://docs.oracle.com/javase/tutorial/java/javaOO/methodreferences.html). The syntax in this case is `.map(AmicablePairs::properDivsSum)` as the method is `static` and takes a `long` and returns a `long` - this means that the method can implement `Function`. – Boris the Spider Feb 21 '16 at 14:04
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/104096/discussion-between-fwend-and-boris-the-spider). – fwend Feb 21 '16 at 14:19

2 Answers2

9

You'll need to box the values:

Map<Long, Long> m = LongStream.range(1, 20)
        .boxed()
        .collect(toMap(identity(), this::magic));

Slightly ugly, yes. But Java collections only support objects.

Boris the Spider
  • 59,842
  • 6
  • 106
  • 166
2

Note that in most cases you can retrace what the built-in collectors do and provide these functions directly to a stream of primitive values, e.g.:

Map<Long, Long> map = LongStream.rangeClosed(1, limit).parallel()
  .collect(HashMap::new, (m,l)->m.put(l, properDivsSum(l)), Map::putAll);

This only differs in the treatment of key collisions, but since we know that there won’t be any collisions, that’s irrelevant here.


However, you should ask yourself, why are you storing long values in a Map? That’s a really bad data structure for this task. Instead, consider:

public class AmicablePairs {

    public static void main(String[] args) {
        final int limit = 20_000;
        long[] map = LongStream.rangeClosed(1, limit).parallel()
                .map(AmicablePairs::properDivsSum).toArray();
        IntStream.rangeClosed(1, limit).parallel()
                 .forEach(n -> {
                     long m = map[n-1];
                     if(m > n && m <= limit && map[(int)m-1] == n)
                         System.out.printf("%s %s %n", n, m);
                 });
    }

    public static Long properDivsSum(long n) {
        return LongStream.rangeClosed(1, (n+1)/2).filter(i -> n%i == 0).sum();
    }
}

Note that, since the range streams have a predictable size, the array generation will be much more efficient than the toMap collector which doesn’t know the expected size. That’s especially relevant for the parallel processing as with a known size, the toArray operation doesn’t require intermediate storage, that has to be merged afterwards. Plus, there’s no boxing conversion required.

By the way, the second operation, which will print the values, is unlikely to become accelerated by parallel processing as the internal synchronation of System.out.printf will negate most potential benefit of parallel processing. I’d remove the .parallel() from it.

Another option is to separate the arithmetic, which could benefit from parallel processing, from the printing, i.e.

long[] map = LongStream.rangeClosed(1, limit).parallel()
        .map(AmicablePairs::properDivsSum).toArray();
int[] found = IntStream.rangeClosed(1, limit).parallel()
    .filter(n -> {
        long m = map[n-1];
        return m > n && m <= limit && map[(int)m-1] == n;
    }).toArray();
Arrays.stream(found).forEach(n -> System.out.printf("%s %s %n", n, map[n-1]));

but I don’t know whether it will improve the performance, as the operations of the second stream are possibly too simple to compensate the initial overhead of parallel processing.

Holger
  • 285,553
  • 42
  • 434
  • 765
  • Interesting. Using an array in stead of map is slower on my system: when I set the limit to 200_000, I get 24 seconds vs 20. You're right that the parallel on the second part has no effect, I'll remove it. Moving the printf out of the stream makes no difference either. – fwend Feb 22 '16 at 21:22
  • On my system, the array variant is faster. There’s only a slightly higher first-time initialization overhead as it uses the class `IntStream` additional to `LongStream`, so that class and its internally used classes have to be loaded and initialized. And for the second operation, it depends on whether sequential or parallel processing is used. There, using `.unordered().parallel()` may improve the speed and will mimic the behavior of your `Map` solution of unspecified result order. Generally, you need multiple runs to get a useful result. See also http://stackoverflow.com/questions/504103/ – Holger Feb 23 '16 at 10:13