
I want something similar to the Scala `grouped` function: basically, pick 2 elements at a time and process them. Here is a reference for the same:

Split list into multiple lists with fixed number of elements

The Stream API does provide things like `groupingBy` and `partitioningBy`, but none of them seems to do the same as the `grouped` function in Scala. Any pointers would be appreciated.

vamosrafa

6 Answers


You can use the Guava library's `Lists.partition`, which splits a list into consecutive sublists of the given size (the last one may be smaller):

List<Integer> bigList = ...
List<List<Integer>> smallerLists = Lists.partition(bigList, 10);

Alidad
haki

It sounds like a problem that is better handled as a low-level Stream operation, just like the ops provided by the Stream API itself. A (relatively) simple solution may look like:

public static <T> Stream<List<T>> chunked(Stream<T> s, int chunkSize) {
    if(chunkSize<1) throw new IllegalArgumentException("chunkSize=="+chunkSize);
    if(chunkSize==1) return s.map(Collections::singletonList);
    Spliterator<T> src=s.spliterator();
    long size=src.estimateSize();
    if(size!=Long.MAX_VALUE) size=(size+chunkSize-1)/chunkSize;
    int ch=src.characteristics();
    ch&=Spliterator.SIZED|Spliterator.ORDERED|Spliterator.DISTINCT|Spliterator.IMMUTABLE;
    ch|=Spliterator.NONNULL;
    return StreamSupport.stream(new Spliterators.AbstractSpliterator<List<T>>(size, ch)
    {
        private List<T> current;
        @Override
        public boolean tryAdvance(Consumer<? super List<T>> action) {
            if(current==null) current=new ArrayList<>(chunkSize);
            while(current.size()<chunkSize && src.tryAdvance(current::add));
            if(!current.isEmpty()) {
                action.accept(current);
                current=null;
                return true;
            }
            return false;
        }
    }, s.isParallel());
}

Simple test:

chunked(Stream.of(1, 2, 3, 4, 5, 6, 7), 3)
  .parallel().forEachOrdered(System.out::println);
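If it helps to see the lazily produced chunks materialized into a list, here is a self-contained sketch; it repeats the chunked method from above so the file compiles on its own, and the class name ChunkedDemo is made up for the example:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class ChunkedDemo {
    // same method as in the answer above, repeated to keep the example self-contained
    public static <T> Stream<List<T>> chunked(Stream<T> s, int chunkSize) {
        if (chunkSize < 1) throw new IllegalArgumentException("chunkSize==" + chunkSize);
        if (chunkSize == 1) return s.map(Collections::singletonList);
        Spliterator<T> src = s.spliterator();
        long size = src.estimateSize();
        if (size != Long.MAX_VALUE) size = (size + chunkSize - 1) / chunkSize;
        int ch = src.characteristics();
        ch &= Spliterator.SIZED | Spliterator.ORDERED | Spliterator.DISTINCT | Spliterator.IMMUTABLE;
        ch |= Spliterator.NONNULL;
        return StreamSupport.stream(new Spliterators.AbstractSpliterator<List<T>>(size, ch) {
            private List<T> current;
            @Override
            public boolean tryAdvance(Consumer<? super List<T>> action) {
                if (current == null) current = new ArrayList<>(chunkSize);
                // fill the current chunk from the source until it is full or the source is exhausted
                while (current.size() < chunkSize && src.tryAdvance(current::add));
                if (!current.isEmpty()) {
                    action.accept(current);
                    current = null;
                    return true;
                }
                return false;
            }
        }, s.isParallel());
    }

    public static void main(String[] args) {
        List<List<Integer>> chunks =
            chunked(Stream.of(1, 2, 3, 4, 5, 6, 7), 2).collect(Collectors.toList());
        System.out.println(chunks); // [[1, 2], [3, 4], [5, 6], [7]]
    }
}
```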

The advantage is that you do not need a fully materialized collection of all items before subsequent stream processing can start, e.g.

chunked(
    IntStream.range(0, 1000).mapToObj(i -> {
        System.out.println("processing item "+i);
        return i;
    }), 2).anyMatch(list->list.toString().equals("[6, 7]"));

will print:

processing item 0
processing item 1
processing item 2
processing item 3
processing item 4
processing item 5
processing item 6
processing item 7
true

rather than processing a thousand items of IntStream.range(0, 1000). This also enables using infinite source Streams:

chunked(Stream.iterate(0, i->i+1), 2).anyMatch(list->list.toString().equals("[6, 7]"));

If you are interested in a fully materialized collection rather than applying subsequent Stream operations, you may simply use the following operation:

List<Integer> list=Arrays.asList(1, 2, 3, 4, 5, 6, 7);
int listSize=list.size(), chunkSize=2;
List<List<Integer>> list2=
    IntStream.range(0, (listSize-1)/chunkSize+1)
             .mapToObj(i->list.subList(i*=chunkSize,
                                       listSize-chunkSize>=i? i+chunkSize: listSize))
             .collect(Collectors.toList()); // [[1, 2], [3, 4], [5, 6], [7]]
Holger
  • I have used the last lambda expression. Seems pretty concise and has worked for me! – vamosrafa Feb 02 '15 at 13:35
  • Once `tryAdvance` returns false, it will return it always thereafter, so why do you need to cache the list across invocations? With normal usage that means that the Spliterator always holds on to a list after it is consumed. – Marko Topolnik Aug 18 '15 at 11:34
  • @Marko Topolnik: honestly, I don’t remember. Maybe I encountered a spliterator not behaving properly, maybe it’s an artifact of a previous implementation attempt or a `forEachRemaining` method… but it doesn’t hold a reference after consumption as it is explicitly `null`ed. – Holger Aug 18 '15 at 12:09
  • I took that into account: it nulls it out whenever it is about to return `true`, but the last invocation is always the one returning `false` and in this case the list is retained. – Marko Topolnik Aug 18 '15 at 12:11
  • @Marko Topolnik: but then the list is empty and not consumed. I will see whether I find out if there was a reason for doing it this way, otherwise I’ll edit it… – Holger Aug 18 '15 at 12:15
  • I'll be interested to learn the reason if you find it. So far I was assuming a spliterator could never change its mind and return `true` after having returned `false`. – Marko Topolnik Aug 18 '15 at 12:18
  • @Holger I wish I could upvote more: it took just a few minutes to find a solution that works like a charm, thank you! May I ask a question even though it's a two-year-old answer? Would it be possible to re-implement your solution as something that could fit a stream method chain fluently? Having it `static` requires breaking the streams' fluent interface chain, and I was thinking of something like an "`unflatMap`" method, or a special collector producing a stream of lists. Would it be possible? I don't believe so, but it would be nice to have some light shed on it. Thank you in advance. – Lyubomyr Shaydariv Jan 20 '17 at 18:22
  • @Lyubomyr Shaydariv: unfortunately, there is no way to add methods in a simple way. The only solution would be to create and implement an extended Stream interface, but you would have to implement all existing methods too (can be as simple as delegating to a wrapped stream, still impractical due to the number of methods). – Holger Jan 23 '17 at 09:28
  • @Holger Ah, sorry, I was not clear asking the question. I wanted to ask, whether it's possible to create a collector thus to be "chainable" to regular streams? Let's say, something like `static Collector collectToSlicedLists(int sliceSize)`. If this idea would not violate the streams contracts, of course... – Lyubomyr Shaydariv Jan 23 '17 at 09:36
  • @Lyubomyr Shaydariv: indeed, that’s a different question. The focus of this answer was to provide an operation that returns a Stream that can be used to chain more Stream operations, keeping the laziness. A `collect` operation is a terminal operation, initiating the actual processing. Such a `Collector` should be possible; I’m quite sure that such a solution already exists here on SO. – Holger Jan 23 '17 at 10:06

You can create your own collector. Something like this:

class GroupingCollector<T> implements Collector<T, List<List<T>>, List<List<T>>> {
    private final int elementCountInGroup;

    public GroupingCollector(int elementCountInGroup) {
        this.elementCountInGroup = elementCountInGroup;
    }

    @Override
    public Supplier<List<List<T>>> supplier() {
        return ArrayList::new;
    }

    @Override
    public BiConsumer<List<List<T>>, T> accumulator() {
        return (lists, element) -> {
            if (!lists.isEmpty()) {
                List<T> lastGroup = lists.get(lists.size() - 1);
                if (lastGroup.size() < elementCountInGroup) {
                    lastGroup.add(element);
                    return;
                }
            }

            List<T> list = new ArrayList<>();
            list.add(element);
            lists.add(list);
        };
    }

    @Override
    public BinaryOperator<List<List<T>>> combiner() {
        return (lists, lists2) -> {
            List<List<T>> r = new ArrayList<>();
            r.addAll(lists);
            r.addAll(lists2);
            return r;
        };
    }

    @Override
    public Function<List<List<T>>, List<List<T>>> finisher() {
        return lists -> lists;
    }

    @Override
    public Set<Characteristics> characteristics() {
        return Collections.emptySet();
    }
}

And then you can use it in a way like this:

    List<List<Integer>> collect = Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).collect(new GroupingCollector<>(3));
    System.out.println(collect);

Will print:

[[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]

Ιναη ßαbαηιη
  • This collector will work incorrectly for parallel streams, thus `combiner()` implementation is useless (better to throw `UnsupportedOperationException`). You cannot perform effective parallel collection for this task without knowing the indices of source elements. – Tagir Valeev Jun 23 '15 at 09:07
  • Yes, I know. The author didn't mention anything about parallelism. – Ιναη ßαbαηιη Jun 23 '15 at 13:02

A recursive solution that transforms the list into a list of lists would also be possible:

int chunkSize = 2;

private <T> List<List<T>> process(List<T> list) {
    if (list.size() > chunkSize) {
        List<T> chunk = list.subList(0, chunkSize);
        List<T> rest = list.subList(chunkSize, list.size());
        List<List<T>> lists = process(rest);
        return concat(chunk, lists);
    } else {
        ArrayList<List<T>> retVal = new ArrayList<>();
        retVal.add(list);
        return retVal;
    }
}

private <T> List<List<T>> concat(List<T> chunk, List<List<T>> rest) {
    rest.add(0, chunk);
    return rest;
}
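For illustration, here is a self-contained run of the two methods above (the class name RecursiveChunkDemo is made up for the sketch; the methods are made static so they can be called from `main`):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RecursiveChunkDemo {
    static int chunkSize = 2;

    public static <T> List<List<T>> process(List<T> list) {
        if (list.size() > chunkSize) {
            // split off the first chunk and recurse on the remainder
            List<T> chunk = list.subList(0, chunkSize);
            List<T> rest = list.subList(chunkSize, list.size());
            return concat(chunk, process(rest));
        } else {
            // base case: the whole remainder fits in one chunk
            ArrayList<List<T>> retVal = new ArrayList<>();
            retVal.add(list);
            return retVal;
        }
    }

    public static <T> List<List<T>> concat(List<T> chunk, List<List<T>> rest) {
        rest.add(0, chunk);
        return rest;
    }

    public static void main(String[] args) {
        System.out.println(process(Arrays.asList(1, 2, 3, 4, 5, 6, 7)));
        // [[1, 2], [3, 4], [5, 6], [7]]
    }
}
```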
mariatsji

You could write your own collector finisher (via `Collectors.collectingAndThen`), similar to:

final List<String> strings = Arrays.asList("Hello", "World", "I", "Am", "You");
final int size = 3;

final List<List<String>> stringLists = strings.stream()
        .collect(Collectors.collectingAndThen(Collectors.toList(), new Function<List<String>, List<List<String>>>() {
            @Override
            public List<List<String>> apply(List<String> strings) {
                final List<List<String>> result = new ArrayList<>();
                int counter = 0;
                List<String> stringsToAdd = new ArrayList<>();

                for (final String string : strings) {
                    if (counter == 0) {
                        result.add(stringsToAdd);
                    } else {
                        if (counter == size) {
                            stringsToAdd = new ArrayList<>();
                            result.add(stringsToAdd);
                            counter = 0;
                        }
                    }

                    ++counter;
                    stringsToAdd.add(string);
                }

                return result;
            }
        }));

System.out.println("stringLists = " + stringLists); // stringLists = [[Hello, World, I], [Am, You]]
Smutje
  • Thanks for the reply. I have done something along these lines. Just wanted to know, is this the best that we can do using lambdas? I was wondering whether there's a more elegant way of doing this. – vamosrafa Jan 29 '15 at 09:30

A simple version with the Java 8 Stream API (this assumes a static import of `java.math.RoundingMode.CEILING`):

static <T> List<List<T>> partition(List<T> list, Integer partitionSize) {
    int numberOfLists = BigDecimal.valueOf(list.size())
        .divide(BigDecimal.valueOf(partitionSize), 0, CEILING)
        .intValue();

    return IntStream.range(0, numberOfLists)
        .mapToObj(it -> list.subList(it * partitionSize, Math.min((it+1) * partitionSize, list.size())))
        .collect(Collectors.toList());
}
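A quick self-contained sanity check of the method (the class name StreamPartitionDemo is only for the sketch; `RoundingMode.CEILING` is spelled out here instead of statically imported):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class StreamPartitionDemo {
    public static <T> List<List<T>> partition(List<T> list, Integer partitionSize) {
        // ceil(list.size() / partitionSize) gives the number of sublists
        int numberOfLists = BigDecimal.valueOf(list.size())
            .divide(BigDecimal.valueOf(partitionSize), 0, RoundingMode.CEILING)
            .intValue();

        return IntStream.range(0, numberOfLists)
            .mapToObj(it -> list.subList(it * partitionSize,
                     Math.min((it + 1) * partitionSize, list.size())))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(partition(Arrays.asList(1, 2, 3, 4, 5, 6, 7), 2));
        // [[1, 2], [3, 4], [5, 6], [7]]
    }
}
```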
ndr_brt