4

I have a List of objects that look like this:

{
    value=500
    category="GROCERY"
},
{
    value=300
    category="GROCERY"
},
{
    value=100
    category="FUEL"
},
{
    value=300
    category="SMALL APPLIANCE REPAIR"
},
{
    value=200
    category="FUEL"
}

I would like to transform that into a List of objects that looks like this:

{
    value=800
    category="GROCERY"
},
{
    value=300
    category="FUEL"
},
{
    value=300
    category="SMALL APPLIANCE REPAIR"
}

Basically add up all the values with the same category.

Should I be using flatMap? Reduce? I don't understand the nuances of these to figure it out.

Help?

EDIT:

There are close duplicates of this question: "Is there an aggregateBy method in the stream Java 8 api?" and "Sum attribute of object with Stream API".

But in both cases, the end result is a map, not a list.

The final solution I used, based on answers by @AndrewTobilko and @JBNizet was:

List<MyClass> myClassList = list.stream()
    .collect(Collectors.groupingBy(MyClass::getCategory,
                    Collectors.summingInt(MyClass::getValue)))
    .entrySet().stream()
    .map(e -> new MyClass(e.getKey(), e.getValue()))
    .collect(Collectors.toList());
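A note on ordering: the two-argument groupingBy makes no promise about the iteration order of the map it returns, so the resulting list may not follow the order in which categories first appear. If that order matters, one option is the groupingBy overload that takes a map factory, with LinkedHashMap::new passed in. The sketch below assumes a hypothetical MyClass with a (category, value) constructor, since the real class is not shown in the question.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class OrderedCategoryTotals {

    // Hypothetical stand-in for the asker's class; the real definition is not shown.
    static class MyClass {
        private final String category;
        private final int value;

        MyClass(String category, int value) {
            this.category = category;
            this.value = value;
        }

        String getCategory() { return category; }
        int getValue() { return value; }
    }

    static List<MyClass> sumByCategory(List<MyClass> list) {
        // LinkedHashMap keeps keys in insertion order, which for a sequential
        // stream is the order in which each category is first encountered.
        Map<String, Integer> totals = list.stream()
                .collect(Collectors.groupingBy(
                        MyClass::getCategory,
                        LinkedHashMap::new,
                        Collectors.summingInt(MyClass::getValue)));

        // Turn each (category, total) entry back into an object.
        return totals.entrySet().stream()
                .map(e -> new MyClass(e.getKey(), e.getValue()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<MyClass> input = Arrays.asList(
                new MyClass("GROCERY", 500),
                new MyClass("GROCERY", 300),
                new MyClass("FUEL", 100),
                new MyClass("SMALL APPLIANCE REPAIR", 300),
                new MyClass("FUEL", 200));

        for (MyClass c : sumByCategory(input)) {
            System.out.println(c.getCategory() + " = " + c.getValue());
        }
    }
}
```

With the sample data from the question, this lists GROCERY, FUEL and SMALL APPLIANCE REPAIR in that order.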
Somaiah Kumbera

3 Answers

5

The Collectors class provides a 'groupingBy' collector that lets you perform a 'group by' operation on a stream (similar to GROUP BY in SQL databases). Assuming the elements of your list are of type 'MyObjects', the following code should work:

Map<String, Integer> valueByCategory = myObjects.stream().collect(Collectors.groupingBy(MyObjects::getCategory, Collectors.summingInt(MyObjects::getValue)));

The code groups the stream elements by category and runs a downstream Collector on each group that sums the return values of getValue() for the elements in that group. The result maps each category to its total. See https://docs.oracle.com/javase/8/docs/api/java/util/stream/Collectors.html
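For a version that can be run as-is, here is a minimal sketch using the data from the question. The MyObjects class and its constructor are assumptions for illustration; only the getCategory() and getValue() accessors are implied by the code above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingBySketch {

    // Hypothetical element type; the question does not show the real class.
    static class MyObjects {
        private final int value;
        private final String category;

        MyObjects(int value, String category) {
            this.value = value;
            this.category = category;
        }

        int getValue() { return value; }
        String getCategory() { return category; }
    }

    // Groups by category and sums the values within each group.
    static Map<String, Integer> sumByCategory(List<MyObjects> myObjects) {
        return myObjects.stream()
                .collect(Collectors.groupingBy(
                        MyObjects::getCategory,
                        Collectors.summingInt(MyObjects::getValue)));
    }

    public static void main(String[] args) {
        List<MyObjects> myObjects = Arrays.asList(
                new MyObjects(500, "GROCERY"),
                new MyObjects(300, "GROCERY"),
                new MyObjects(100, "FUEL"),
                new MyObjects(300, "SMALL APPLIANCE REPAIR"),
                new MyObjects(200, "FUEL"));

        Map<String, Integer> valueByCategory = sumByCategory(myObjects);
        System.out.println(valueByCategory.get("GROCERY")); // prints 800
        System.out.println(valueByCategory.get("FUEL"));    // prints 300
    }
}
```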

2

With static import of the Collectors class:

list.stream().collect(groupingBy(Class::getCategory, summingInt(Class::getValue)));

The result is a Map<String, Integer>. Class needs getValue and getCategory methods for the method references to compile, something like

public class Class {
    private String category;
    private int value;

    public String getCategory() { return category; }
    public int getValue() { return value; }
}
Andrew Tobilko
0

Reduce-based method:

List<Obj> values = list.stream().collect(
        Collectors.groupingBy(Obj::getCategory, Collectors.reducing((a, b) -> new Obj(a.getValue() + b.getValue(), a.getCategory())))
).values().stream().map(Optional::get).collect(Collectors.toList());

The downside is the secondary stream() call needed to unwrap the results from Optional<Obj>, plus the intermediate Map<String, Optional<Obj>> object.
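One way to sidestep the Optional wrapper, offered here as a different technique rather than a variant of reducing, is Collectors.toMap with a merge function: the map then holds Obj values directly and its values() can be copied into a list. This is only a sketch; the Obj class below is a hypothetical stand-in with the (value, category) constructor used in the snippet above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ToMapSketch {

    // Hypothetical stand-in for Obj; only the members used here are defined.
    static class Obj {
        private final int value;
        private final String category;

        Obj(int value, String category) {
            this.value = value;
            this.category = category;
        }

        int getValue() { return value; }
        String getCategory() { return category; }
    }

    static List<Obj> sumByCategory(List<Obj> list) {
        Map<String, Obj> byCategory = list.stream()
                .collect(Collectors.toMap(
                        Obj::getCategory,
                        // copy each element so the result does not share input objects
                        o -> new Obj(o.getValue(), o.getCategory()),
                        // merge function: called when two elements have the same category
                        (a, b) -> new Obj(a.getValue() + b.getValue(), a.getCategory()),
                        // keeps categories in first-seen order
                        LinkedHashMap::new));
        return new ArrayList<>(byCategory.values());
    }

    public static void main(String[] args) {
        List<Obj> list = Arrays.asList(
                new Obj(500, "GROCERY"),
                new Obj(300, "GROCERY"),
                new Obj(100, "FUEL"),
                new Obj(300, "SMALL APPLIANCE REPAIR"),
                new Obj(200, "FUEL"));

        for (Obj o : sumByCategory(list)) {
            System.out.println(o.getCategory() + " = " + o.getValue());
        }
    }
}
```

There is still an intermediate map, but no Optional and no second stream.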

I can suggest an alternative (less readable) variant using sorting:

List<Obj> values2 = list.stream()
    .sorted((o1, o2) -> o1.getCategory().compareTo(o2.getCategory()))
    .collect(
        LinkedList<Obj>::new,
        (ll, obj) -> {
            Obj last = null;
            if(!ll.isEmpty()) {
                last = ll.getLast();
            }

            if (last == null || !last.getCategory().equals(obj.getCategory())) {
                ll.add(new Obj(obj.getValue(), obj.getCategory())); //deep copy here
            } else {
                last.setValue(last.getValue() + obj.getValue());
            }
        },
        (list1, list2) -> {
            // for parallel execution do a simple merge join here
            throw new RuntimeException("parallel evaluation not supported");
        }
    );

Here we sort the list of Objs by category and then process it sequentially, squashing consecutive objects from the same category into one.

Unfortunately, there is no method in Java to do this without manually keeping track of the last element or of the list of elements (see also "Collect successive pairs from a stream").

Working example with both snippets can be checked here: https://ideone.com/p3bKV8

Nikolay
  • `summingInt` is more straightforward (and efficient) than using `reduce`. – Brian Goetz Jul 05 '16 at 22:09
  • @BrianGoetz ... can't argue with that. I just focused on keeping the data structure and missed the `Map`-based variant. What would you say about the second method? Can sorting and consecutive summing be more efficient than `grouping`+`summingInt` (a sort-based approach compared to a hash-based one)? – Nikolay Jul 05 '16 at 22:40
  • @Nikolay Sorting + summing can only be faster if the hash-based collection needs `O(n)` for lookup. If there is a proper hash function (and there is), then lookup takes `O(1)`, which answers your question. – Flown Jul 06 '16 at 09:26
  • **UPD:** removed excessive qualifiers from answer due to comments. – Nikolay Jul 06 '16 at 09:52