50

I have a List<Item> collection. I need to convert it into Map<Integer, Item> The key of the map must be the index of the item in the collection. I can not figure it out how to do this with streams. Something like:

items.stream().collect(Collectors.toMap(...));

Any help?

As this question is identified as possible duplicate I need to add that my concrete problem was - how to get the position of the item in the list and put it as a key value

Tagir Valeev
  • 97,161
  • 19
  • 222
  • 334
Nikolay
  • 1,111
  • 2
  • 12
  • 16
  • 7
    [Maybe](http://stackoverflow.com/questions/20363719/java-8-listv-into-mapk-v) or [maybe](http://www.leveluplunch.com/java/examples/convert-list-to-map/) or [maybe](http://www.leveluplunch.com/java/examples/convert-java-util-stream-to-map/) – MadProgrammer Sep 30 '15 at 06:15
  • 3
    `EntryStream.of(items).toMap();` using my free [StreamEx](https://github.com/amaembo/streamex) library. JavaDoc is [here](http://amaembo.github.io/streamex/javadoc/javax/util/streamex/EntryStream.html#of-java.util.List-). – Tagir Valeev Sep 30 '15 at 06:35
  • 1
    doing a little research for this question, I learned that there is amazingly no `zip` in java 8 streams. – njzk2 Sep 30 '15 at 14:51
  • @njzk2, that's simply because you cannot parallelize zipped stream. Having random access sources (for example, two `ArrayList`) it's not very difficult to zip them via `IntStream.range(0,list1.size()).mapToObj(idx -> doSomethingWith(list1.get(idx), list2.get(idx)))` and the result will be parallel-friendly – Tagir Valeev Oct 01 '15 at 03:41

6 Answers6

50

You can create a Stream of the indices using an IntStream and then convert them to a Map :

Map<Integer,Item> map = 
    IntStream.range(0,items.size())
             .boxed()
             .collect(Collectors.toMap (i -> i, i -> items.get(i)));
Eran
  • 387,369
  • 54
  • 702
  • 768
  • that stops working as soon as `items` is no longer a `List`. And is highly inefficient if `items` is, for example, a `LinkedList` instead of an `ArrayList`. – njzk2 Sep 30 '15 at 15:03
  • @njzk2 See [my answer](http://stackoverflow.com/a/32918035/1441122) if you don't have a `List` or if it's expensive to access by index. – Stuart Marks Oct 03 '15 at 01:34
13

One more solution just for completeness is to use custom collector:

public static <T> Collector<T, ?, Map<Integer, T>> toMap() {
    return Collector.of(HashMap::new, (map, t) -> map.put(map.size(), t), 
            (m1, m2) -> {
                int s = m1.size();
                m2.forEach((k, v) -> m1.put(k+s, v));
                return m1;
            });
}

Usage:

Map<Integer, Item> map = items.stream().collect(toMap());

This solution is parallel-friendly and does not depend on the source (you can use list without random access or Files.lines() or whatever).

Tagir Valeev
  • 97,161
  • 19
  • 222
  • 334
  • this works iff a/ the combiner is guaranteed to be called with m1 and m2 in the right order b/ each accumulating map is called on a continuous sequence of items. This breaks if for example, odd values are accumulated in a map and even values in another one. I haven't found any source that suggests this could not happen. – njzk2 Sep 30 '15 at 15:12
  • 1
    @njzk2, nevertheless this could not happen if your stream is ordered. This way all the existing collectors (like `toList()`) actually work. – Tagir Valeev Sep 30 '15 at 15:16
  • I guess that makes sense. I will be researching into how collectors guarantee the order of the stream after parallelization happens. – njzk2 Sep 30 '15 at 15:28
  • @njzk2, the Collector contract is described in [API docs](https://docs.oracle.com/javase/8/docs/api/java/util/stream/Collector.html). I fulfill the contract. When correct map and new element is passed to the accumulator, it produces new correct map. When two correct maps are passed to combiner, it produces new correct map. Just fulfill the contract, and you'll get the correct result. That's the beauty of interfaces. – Tagir Valeev Sep 30 '15 at 15:47
  • thanks. It appears that the stream is partitioned in substrings, not subsequences, so that works indeed! – njzk2 Sep 30 '15 at 16:00
11

Don't feel like you have to do everything in/with the stream. I would just do:

AtomicInteger index = new AtomicInteger();
items.stream().collect(Collectors.toMap(i -> index.getAndIncrement(), i -> i));

As long as you don't parallelise the stream this will work and it avoids potentially expensive and/or problematic (in the case of duplicates) get() and indexOf() operations.

(You cannot use a regular int variable in place of the AtomicInteger because variables used from outside a lambda expression must be effectively final. Note that when uncontested (as in this case), AtomicInteger is very fast and won't pose a performance problem. But if it worries you you can use a non-thread-safe counter.)

Pepijn Schmitz
  • 2,143
  • 1
  • 17
  • 18
  • 4
    You call `List.get()` expensive but recommend using an `AtomicInteger`? – Holger Sep 30 '15 at 10:09
  • 4
    @Holger No. I call `List.get()` *potentially* expensive. `AtomicInteger` is O(1), `List.get()` could be anything up to O(n). – Pepijn Schmitz Sep 30 '15 at 10:12
  • 3
    Only if you use `LinkedList` which is not really useful. On the other hand, `AtomicInteger` being `O(1)` isn’t really relevant when it comes to the hidden cost of thread safety in an operation that you admit yourself, doesn’t work in parallel. If you start your answer with “Don't feel like you have to do everything in/with the stream”, why don’t you provide a stream-less alternative, like a straight-forward loop? That would be better than presenting a discouraged stream usage… – Holger Sep 30 '15 at 10:18
  • 2
    @Holger The OP doesn't specify the `List` implementation. You seem to be prejudiced against `LinkedList`, but of course in reality there is nothing wrong with it and the `List` could easily be one, or possibly even another implementation which is even more expensive. Why second guess it? This way will always be the fastest. – Pepijn Schmitz Sep 30 '15 at 10:27
  • 1
    @Holger The reason for the stream is that the OP specifically asked for it: *" I can not figure it out how to do this with streams."* Regarding the `AtomicInteger`: when uncontested it is practically as fast as incrementing a regular variable; there is no "hidden cost". – Pepijn Schmitz Sep 30 '15 at 10:30
  • 2
    I’m not *prejudicing* against `LinkedList` as it already exists for more than fifteen years now, which is enough time to determine that it is not useful in real life. The theoretical advantage is only one operation, inserting at an arbitrary index, but since it has to allocate memory for that and update half a dozen node references, that advantage doesn’t really materialize. It would require very large lists to outplay an `ArrayList`, however for large lists, the insane memory overhead of `LinkedList` will counter-act it. `LinkedList` wins only in `O(…)` comparisons which ignore memory effects – Holger Sep 30 '15 at 10:45
  • @Holger That's just more theoretical and irrelevant second guessing. The point is that it shouldn't matter which implementation of `List` the OP uses, and with this solution, it doesn't. – Pepijn Schmitz Sep 30 '15 at 10:51
  • LinkedList.size() is O(1). – Reinstate Monica Sep 30 '15 at 14:51
  • @PepijnSchmitz You said "List.get() could be anything up to O(n)" and Holger replied "Only if you used LinkedList", which I thought implied (s)he thought LinkedList.get() was O(n). In fact it is, but more specifically it's also O(1). If I misunderstood, apologies. – Reinstate Monica Sep 30 '15 at 17:07
  • @Solomonoff'sSecret LinkedList.get() *is* O(n). But you said size(), not get(). – Pepijn Schmitz Sep 30 '15 at 17:20
  • @PepijnSchmitz Of course. Obviously I repeatedly can't tell the difference between the words "size" and "get"... – Reinstate Monica Sep 30 '15 at 17:31
7

This is updated answer and has none of the problems mentioned in comments.

Map<Integer,Item> outputMap = IntStream.range(0,inputList.size()).boxed().collect(Collectors.toMap(Function.identity(), i->inputList.get(i)));
akhil_mittal
  • 23,309
  • 7
  • 96
  • 95
1

Using a third party library (protonpack for example, but there are others) you can zip the value with its index and voila:

StreamUtils.zipWithIndex(items.stream())
    .collect(Collectors.toMap(Indexed::getIndex, Indexed::getValue));

although getIndex returns a long, so you may need to cast it using something similar to:

i -> Integer.valueOf((int) i.getIndex())
njzk2
  • 38,969
  • 7
  • 69
  • 107
  • 1
    Using @TagirValeev 's library, it would be as simple as : `EntryStream.of(items).toMap();`. – Jean-François Savard Sep 30 '15 at 15:57
  • 1
    @Jean-FrançoisSavard, not to mention that zipWithIndex creates the stream which cannot be parallelized at all. – Tagir Valeev Oct 01 '15 at 01:22
  • @TagirValeev and you are confident that `EntryStream` can? – njzk2 Oct 01 '15 at 01:29
  • 1
    @njzk2, surely it is. It however relies on fast random access as [documentation](http://amaembo.github.io/streamex/javadoc/javax/util/streamex/EntryStream.html#of-java.util.List-) says. Internally it's similar to Eran solution (which is also parallelizable and works with reasonable speed only for random access source). In contrast protonpack solution need no random access (roughly speaking it's closer to Pepijn Schmitz answer). – Tagir Valeev Oct 01 '15 at 02:04
  • @TagirValeev the same comment applies, then: it only works with `List`, and its complexity depends on the access time of `get(i)` of the list's implementation. – njzk2 Oct 01 '15 at 02:09
1

Eran's answer is usually the best approach for random-access lists.

If your List isn't random access, or if you have a Stream instead of a List, you can use forEachOrdered:

Stream<Item> stream = ... ;
Map<Integer, Item> map = new HashMap<>();
AtomicInteger index = new AtomicInteger();
stream.forEachOrdered(item -> map.put(index.getAndIncrement(), item));

This is safe, if the stream is parallel, even though the destination map is thread-unsafe and is operated upon as a side effect. The forEachOrdered guarantees that items are processed one-at-a-time, in order. For this reason it's unlikely that any speedup will result from running in parallel. (There might be some speedup if there are expensive operations in the pipeline before the forEachOrdered.)

Community
  • 1
  • 1
Stuart Marks
  • 127,867
  • 37
  • 205
  • 259
  • 1
    Why so complex? With `forEachOrdered` you don't need `AtomicInteger`, simply use `stream.forEachOrdered(item -> map.put(map.size(), item))`. Reading non-volatile field `HashMap.size` which is updated anyways is no worse than using CAS in `AtomicInteger`. – Tagir Valeev Oct 03 '15 at 02:30
  • @TagirValeev I suppose so. My main point is the use of `forEachOrdered` which hadn't been mentioned. – Stuart Marks Oct 03 '15 at 04:33
  • Sure it's quite short solution, though using the `Collector` proposed in my version is more conceptually correct. Actually the best solution would be to use `toList()` and write a special adapter (based on `AbstractMap`) which adapts a `List` to `Map`. Storing them into `HashMap` is just a waste of time and memory. – Tagir Valeev Oct 03 '15 at 07:24