
How can I limit groupBy by each entry?

For example (based on this example: stream groupBy):

studentClasses.add(new StudentClass("Kumar", 101, "Intro to Web"));
studentClasses.add(new StudentClass("White", 102, "Advanced Java"));
studentClasses.add(new StudentClass("Kumar", 101, "Intro to Cobol"));
studentClasses.add(new StudentClass("White", 101, "Intro to Web"));
studentClasses.add(new StudentClass("White", 102, "Advanced Web"));
studentClasses.add(new StudentClass("Sargent", 106, "Advanced Web"));
studentClasses.add(new StudentClass("Sargent", 103, "Advanced Web"));
studentClasses.add(new StudentClass("Sargent", 104, "Advanced Web"));
studentClasses.add(new StudentClass("Sargent", 105, "Advanced Web"));

This returns a simple grouping:

   Map<String, List<StudentClass>> groupByTeachers = studentClasses
            .stream().collect(
                    Collectors.groupingBy(StudentClass::getTeacher));

What if I want to limit the returned collections? Let's assume I want only the first N classes for every teacher. How can it be done?

Tunaki
yossico
  • What do you mean by first? Do you mean the classes with the lowest class number, the lowest names ASCIIbetically, or any random selection of N classes? Note: the set of classes may be unordered. – Peter Lawrey Nov 22 '15 at 10:29
  • @PeterLawrey You are right, I didn't mention that. For me the order is irrelevant, but if we want a more thorough and general solution, I'll be happy if you add a sorting example (by one of the fields). – yossico Nov 22 '15 at 11:45

4 Answers


It would be possible to introduce a new collector that limits the number of elements in the resulting list.

This collector retains the head elements of the list (in encounter order). Once the limit has been reached, the accumulator and the combiner discard any further elements. The combiner code is a little tricky, but it has the advantage that no extra elements are added only to be thrown away later.

private static <T> Collector<T, ?, List<T>> limitingList(int limit) {
    return Collector.of(
                ArrayList::new, 
                (l, e) -> { if (l.size() < limit) l.add(e); }, 
                (l1, l2) -> {
                    l1.addAll(l2.subList(0, Math.min(l2.size(), Math.max(0, limit - l1.size()))));
                    return l1;
                }
           );
}

And then use it like this:

Map<String, List<StudentClass>> groupByTeachers = 
       studentClasses.stream()
                     .collect(groupingBy(
                          StudentClass::getTeacher,
                          limitingList(2)
                     ));
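To see the collector end to end, here is a minimal self-contained sketch. The StudentClass stand-in (its fields and every getter other than getTeacher) and the helper names sampleData and firstNPerTeacher are assumptions made for illustration; only limitingList and the sample rows come from the posts above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class LimitingListDemo {

    // Hypothetical stand-in for the question's StudentClass.
    public static final class StudentClass {
        private final String teacher;
        private final int number;
        private final String title;

        public StudentClass(String teacher, int number, String title) {
            this.teacher = teacher;
            this.number = number;
            this.title = title;
        }

        public String getTeacher() { return teacher; }
        public int getNumber() { return number; }
        public String getTitle() { return title; }
    }

    // Keeps at most `limit` elements per list, in encounter order.
    public static <T> Collector<T, ?, List<T>> limitingList(int limit) {
        return Collector.of(
                ArrayList::new,
                // Accumulator: ignore elements once the list is full.
                (list, e) -> { if (list.size() < limit) list.add(e); },
                // Combiner: top up the left list from the head of the right one.
                (left, right) -> {
                    int room = Math.max(0, limit - left.size());
                    left.addAll(right.subList(0, Math.min(right.size(), room)));
                    return left;
                });
    }

    // The sample rows from the question.
    public static List<StudentClass> sampleData() {
        return Arrays.asList(
                new StudentClass("Kumar", 101, "Intro to Web"),
                new StudentClass("White", 102, "Advanced Java"),
                new StudentClass("Kumar", 101, "Intro to Cobol"),
                new StudentClass("White", 101, "Intro to Web"),
                new StudentClass("White", 102, "Advanced Web"),
                new StudentClass("Sargent", 106, "Advanced Web"),
                new StudentClass("Sargent", 103, "Advanced Web"),
                new StudentClass("Sargent", 104, "Advanced Web"),
                new StudentClass("Sargent", 105, "Advanced Web"));
    }

    public static Map<String, List<StudentClass>> firstNPerTeacher(
            List<StudentClass> classes, int n) {
        return classes.stream()
                .collect(Collectors.groupingBy(StudentClass::getTeacher, limitingList(n)));
    }

    public static void main(String[] args) {
        // Map iteration order is unspecified, so the teachers may print in any order.
        firstNPerTeacher(sampleData(), 2).forEach((teacher, list) ->
                list.forEach(c -> System.out.println(
                        teacher + ": " + c.getNumber() + " " + c.getTitle())));
    }
}
```

With a limit of 2, each teacher's list holds the first two classes in the order they appear in the source list. Because the combiner fills the left list from the head of the right one, the same collector should keep encounter order on a parallel stream as well.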
Tunaki

You could use collectingAndThen to define a finisher operation on the resulting list. This way you can limit, filter, or sort the lists:

int limit = 2;

Map<String, List<StudentClass>> groupByTeachers =
    studentClasses.stream()
                  .collect(
                       groupingBy(
                           StudentClass::getTeacher,
                           collectingAndThen(
                               toList(),
                               l -> l.stream().limit(limit).collect(toList()))));
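For the sorting example the asker mentions in the comments, the same finisher can sort before it limits. The following is only a sketch: the StudentClass stand-in, its getNumber getter, and the method name lowestNumbersPerTeacher are assumptions for illustration, and sorting by class number is just one possible meaning of "first".

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class SortThenLimitDemo {

    // Hypothetical stand-in for the question's StudentClass.
    public static final class StudentClass {
        private final String teacher;
        private final int number;
        private final String title;

        public StudentClass(String teacher, int number, String title) {
            this.teacher = teacher;
            this.number = number;
            this.title = title;
        }

        public String getTeacher() { return teacher; }
        public int getNumber() { return number; }
        public String getTitle() { return title; }
    }

    // Groups by teacher, then keeps the `limit` classes with the lowest numbers.
    public static Map<String, List<StudentClass>> lowestNumbersPerTeacher(
            List<StudentClass> classes, int limit) {
        return classes.stream().collect(
                Collectors.groupingBy(
                        StudentClass::getTeacher,
                        Collectors.collectingAndThen(
                                Collectors.toList(),
                                list -> list.stream()
                                        // sorted() is stable for ordered streams,
                                        // so ties keep their encounter order.
                                        .sorted(Comparator.comparingInt(StudentClass::getNumber))
                                        .limit(limit)
                                        .collect(Collectors.toList()))));
    }

    public static void main(String[] args) {
        List<StudentClass> classes = Arrays.asList(
                new StudentClass("White", 102, "Advanced Java"),
                new StudentClass("White", 101, "Intro to Web"),
                new StudentClass("Sargent", 106, "Advanced Web"),
                new StudentClass("Sargent", 103, "Advanced Web"),
                new StudentClass("Sargent", 104, "Advanced Web"));
        lowestNumbersPerTeacher(classes, 2).forEach((teacher, list) ->
                list.forEach(c -> System.out.println(teacher + ": " + c.getNumber())));
    }
}
```

Like the finisher above, this still collects every element of a group before trimming it, so it trades some memory for flexibility.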
eee
  • This would still be filtering the values after they're already added to the map, but the best answer so far. – Razvan Manolescu Nov 22 '15 at 11:26
  • The idea of a finisher is nice but there's no need for an O(n) cost in the finisher. You can do something like `list -> list.size() <= limit ? list : list.subList(0, limit))` instead. But I still far prefer Tunaki's solution, which doesn't require sticking the extra elements in the list at all. – Brian Goetz Nov 23 '15 at 02:25

For this you need to call .stream() on the entries of the resulting Map. You can do it like this:

// Part that comes from your example
Map<String, List<StudentClass>> groupByTeachers = studentClasses
            .stream().collect(
                    Collectors.groupingBy(StudentClass::getTeacher));

// Create a new stream and limit the result
groupByTeachers =
    groupByTeachers.entrySet().stream()
        .limit(N) // The actual limit
        .collect(Collectors.toMap(
            e -> e.getKey(),
            e -> e.getValue()
        ));

This isn't the most efficient way to do it. But if you call .limit() on the initial list, the grouping results will be incorrect. This is the safest way to guarantee the limit.

EDIT:

As stated in the comments, this limits the number of teachers, not the number of classes per teacher. In that case you can do:

groupByTeachers =
        groupByTeachers.entrySet().stream()
            .collect(Collectors.toMap(
                e -> e.getKey(),
                e -> e.getValue().stream().limit(N).collect(Collectors.toList()) // Limit the classes PER teacher
            ));
Albert Bos

This would give you the desired result, but it still categorizes all the elements of the stream:

final int N = 10;
final HashMap<String, List<StudentClass>> groupByTeachers = 
        studentClasses.stream().collect(
            groupingBy(StudentClass::getTeacher, HashMap::new,
                collectingAndThen(toList(), list -> list.subList(0, Math.min(list.size(), N)))));
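One detail worth keeping in mind: List.subList returns a view backed by the original list, so the full list stays reachable for as long as the view is kept. If that matters, the finisher can copy the head into a fresh list. Below is a small generic sketch of that variant; the names copyHead and firstNPerKey are hypothetical helpers, not part of any library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class CopyHeadDemo {

    // Finisher that copies at most n head elements into a new, independent list.
    public static <T> Function<List<T>, List<T>> copyHead(int n) {
        return list -> new ArrayList<>(list.subList(0, Math.min(list.size(), n)));
    }

    // Groups items by key and keeps the first n items of each group.
    public static <T, K> Map<K, List<T>> firstNPerKey(
            List<T> items, Function<T, K> key, int n) {
        return items.stream().collect(
                Collectors.groupingBy(
                        key,
                        Collectors.collectingAndThen(
                                Collectors.toList(),
                                CopyHeadDemo.<T>copyHead(n))));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "avocado", "apricot", "banana");
        // Group by first character, keep two words per group.
        Map<Character, List<String>> grouped =
                firstNPerKey(words, w -> w.charAt(0), 2);
        System.out.println(grouped.get('a').size());
        System.out.println(grouped.get('b').size());
    }
}
```

The copy costs O(N) per group rather than O(1), so whether it is worth it depends on how large the discarded tails are.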
muued