I have a list of Settlement objects. The class has the following attributes:

public class Settlement {
    private String contractNo;
    private String smartNo;
    private String dealTrackNo;
    private String buySellFlag;
    private String cashFlowType;
    private String location;
    private String leaseNo;
    private String leaseName;
    private double volume;
    private double price;
    private double settleAmount;

    // getters and setters
}

Now I would like to group the list of Settlement objects by smartNo (a String) and sum settleAmount within each group, so that the sum becomes the new settleAmount for each smartNo.

Since I am using Java 8, stream should be the way to go.

Grouping should be quite straightforward using the following code:

Map<String, List<Settlement>> map = list.stream()
              .collect(Collectors.groupingBy(Settlement::getSmartNo));
System.out.println(map.values()); // Map has values(), not getValues()

What if I want to get a new list after grouping by smartNo and summing over settleAmount? Most of the examples out there only show how to print the sums. What I am interested in is how to get the aggregated list.

Tagir Valeev
ddd
  • `stream` the entrySet, `map` it to your result object, and `collect` it to a `List`. – teppic Feb 12 '17 at 05:10
  • @teppic Can you be more specific? I am quite new to Java 8. Not sure how to `map` it to my result object. – ddd Feb 12 '17 at 05:12
  • What type of object do you require in your result list? Is a simple tuple with your `SmartNo` and sum adequate? – teppic Feb 12 '17 at 05:15
  • @teppic these two and two others: volume and price – ddd Feb 12 '17 at 05:19
  • You want to sum volume, price, and settlementAmount for each smartNo? – teppic Feb 12 '17 at 05:22
  • @teppic, Actually volume is always same for `Settlement`s with same SmartNo, no need for price in new list. So `SmartNo`, `volume` and sum of `settleAmount` is sufficient. – ddd Feb 12 '17 at 05:24

2 Answers


I think a reasonably simple approach is to open a new stream on each list in the values() of your map and then apply map() and reduce(). I map to a new class AggregatedSettlement with just the three fields smartNo, volume and settleAmount (the last holds the sum), and then reduce by summing the settleAmounts.

    List<AggregatedSettlement> aggregatedList = list.stream()
            .collect(Collectors.groupingBy(Settlement::getSmartNo))
            .values()
            .stream()
            .map(innerList -> innerList.stream()
                    .map(settlm -> new AggregatedSettlement(settlm.getSmartNo(), 
                            settlm.getVolume(), settlm.getSettleAmount()))
                    .reduce((as1, as2) -> {
                        if (as1.getVolume() != as2.getVolume()) {
                            throw new IllegalStateException("Different volumes " + as1.getVolume() 
                                    + " and " + as2.getVolume() + " for smartNo " + as1.getSmartNo());
                        }
                        return new AggregatedSettlement(as1.getSmartNo(), as1.getVolume(), 
                                as1.getSettleAmount() + as2.getSettleAmount());
                    })
                    .get()
            )
            .collect(Collectors.toList());
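
The snippet above relies on an AggregatedSettlement class that is not shown in the answer. A minimal sketch of what it might look like follows; the constructor order and getter names are assumptions inferred from how the stream code calls them.

```java
// Hypothetical result class assumed by the stream code above; not part of the original post.
final class AggregatedSettlement {
    private final String smartNo;
    private final double volume;
    private final double settleAmount; // sum of settleAmount over one smartNo group

    AggregatedSettlement(String smartNo, double volume, double settleAmount) {
        this.smartNo = smartNo;
        this.volume = volume;
        this.settleAmount = settleAmount;
    }

    String getSmartNo() { return smartNo; }
    double getVolume() { return volume; }
    double getSettleAmount() { return settleAmount; }

    @Override
    public String toString() {
        return "AggregatedSettlement{smartNo=" + smartNo
                + ", volume=" + volume
                + ", settleAmount=" + settleAmount + "}";
    }
}
```

Keeping the class immutable means the reduce() step always builds a fresh object and never touches the source Settlement instances.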

I am not too happy about the call to get() on the Optional<AggregatedSettlement> returned by reduce(); usually you should avoid get(). In this case I know that the original grouping only produces lists of at least one element, so reduce() cannot give an empty Optional, and the call to get() will work. A possible refinement would be orElseThrow() with a more explanatory exception.
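
The orElseThrow() refinement could be sketched as follows. The Settlement and AggregatedSettlement classes here are trimmed-down stand-ins (only the three fields that matter), the class name OrElseThrowSketch and the exception message are made up for illustration, and the volume consistency check is left out to keep the example short.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Trimmed stand-in for the question's Settlement (assumption: only these fields are needed).
class Settlement {
    private final String smartNo;
    private final double volume;
    private final double settleAmount;

    Settlement(String smartNo, double volume, double settleAmount) {
        this.smartNo = smartNo;
        this.volume = volume;
        this.settleAmount = settleAmount;
    }

    String getSmartNo() { return smartNo; }
    double getVolume() { return volume; }
    double getSettleAmount() { return settleAmount; }
}

// Hypothetical result class with the three fields named in the answer.
class AggregatedSettlement {
    private final String smartNo;
    private final double volume;
    private final double settleAmount;

    AggregatedSettlement(String smartNo, double volume, double settleAmount) {
        this.smartNo = smartNo;
        this.volume = volume;
        this.settleAmount = settleAmount;
    }

    String getSmartNo() { return smartNo; }
    double getVolume() { return volume; }
    double getSettleAmount() { return settleAmount; }
}

class OrElseThrowSketch {
    static List<AggregatedSettlement> aggregate(List<Settlement> list) {
        return list.stream()
                .collect(Collectors.groupingBy(Settlement::getSmartNo))
                .entrySet()
                .stream()
                .map(entry -> entry.getValue().stream()
                        .map(s -> new AggregatedSettlement(
                                s.getSmartNo(), s.getVolume(), s.getSettleAmount()))
                        // Sum the amounts pairwise; the volume check is omitted in this sketch.
                        .reduce((a1, a2) -> new AggregatedSettlement(
                                a1.getSmartNo(), a1.getVolume(),
                                a1.getSettleAmount() + a2.getSettleAmount()))
                        // Replaces get(): fails with a message naming the offending key.
                        .orElseThrow(() -> new IllegalStateException(
                                "Empty group for smartNo " + entry.getKey())))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<AggregatedSettlement> result = aggregate(Arrays.asList(
                new Settlement("A", 1.0, 2.0),
                new Settlement("A", 1.0, 3.0),
                new Settlement("B", 4.0, 7.0)));
        result.forEach(a -> System.out.println(a.getSmartNo() + " " + a.getSettleAmount()));
    }
}
```

Streaming the entrySet() rather than values() is what makes the key available for the exception message.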

I am sure there’s room for optimization. I am producing many more AggregatedSettlement objects than we need in the end. As always, don’t optimize until you know you need to.

Edit: If only for the exercise, here’s a version that doesn’t construct superfluous AggregatedSettlement objects. Instead it creates two streams on each list from your map, and it’s 5 lines longer:

    List<AggregatedSettlement> aggregatedList = list.stream()
            .collect(Collectors.groupingBy(Settlement::getSmartNo))
            .entrySet()
            .stream()
            .map(entry -> {
                double volume = entry.getValue()
                        .stream()
                        .mapToDouble(Settlement::getVolume)
                        .reduce((vol1, vol2) -> {
                            if (vol1 != vol2) {
                                throw new IllegalStateException("Different volumes " + vol1 
                                        + " and " + vol2 + " for smartNo " + entry.getKey());
                            }
                            return vol1;
                        })
                        .getAsDouble();
                double settleAmountSum = entry.getValue()
                        .stream()
                        .mapToDouble(Settlement::getSettleAmount)
                        .sum();
                return new AggregatedSettlement(entry.getKey(), volume, settleAmountSum);
            })
            .collect(Collectors.toList());

Pick the one you find easier to read.

Edit 2: It seems from this answer that in Java 9 I will be able to avoid the call to Optional.get() if I use flatMap() instead of map() and stream() instead of get(). It will be a few characters longer, but I may still prefer it. I haven’t tried Java 9 yet, though (now I know what I’m going to do today :-) The advantage of get() is of course that it would catch a programming error where the inner list comes out empty.
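
For reference, the Java 9 variant described in Edit 2 might look like the sketch below. It needs Java 9 or later because it uses Optional.stream(). The Settlement and AggregatedSettlement classes are trimmed-down stand-ins, the class name FlatMapSketch is made up, and the volume check is again omitted for brevity.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Trimmed stand-in for the question's Settlement (assumption: only these fields are needed).
class Settlement {
    private final String smartNo;
    private final double volume;
    private final double settleAmount;

    Settlement(String smartNo, double volume, double settleAmount) {
        this.smartNo = smartNo;
        this.volume = volume;
        this.settleAmount = settleAmount;
    }

    String getSmartNo() { return smartNo; }
    double getVolume() { return volume; }
    double getSettleAmount() { return settleAmount; }
}

// Hypothetical result class with the three fields named in the answer.
class AggregatedSettlement {
    private final String smartNo;
    private final double volume;
    private final double settleAmount;

    AggregatedSettlement(String smartNo, double volume, double settleAmount) {
        this.smartNo = smartNo;
        this.volume = volume;
        this.settleAmount = settleAmount;
    }

    String getSmartNo() { return smartNo; }
    double getVolume() { return volume; }
    double getSettleAmount() { return settleAmount; }
}

class FlatMapSketch {
    static List<AggregatedSettlement> aggregate(List<Settlement> list) {
        return list.stream()
                .collect(Collectors.groupingBy(Settlement::getSmartNo))
                .values()
                .stream()
                // flatMap + Optional.stream(): an empty Optional simply contributes no element.
                .flatMap(innerList -> innerList.stream()
                        .map(s -> new AggregatedSettlement(
                                s.getSmartNo(), s.getVolume(), s.getSettleAmount()))
                        .reduce((a1, a2) -> new AggregatedSettlement(
                                a1.getSmartNo(), a1.getVolume(),
                                a1.getSettleAmount() + a2.getSettleAmount()))
                        .stream())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<AggregatedSettlement> result = aggregate(Arrays.asList(
                new Settlement("A", 1.0, 2.0),
                new Settlement("A", 1.0, 3.0),
                new Settlement("B", 4.0, 7.0)));
        result.forEach(a -> System.out.println(a.getSmartNo() + " " + a.getSettleAmount()));
    }
}
```

The trade-off is the one noted in Edit 2: an unexpectedly empty group would be dropped silently here, whereas get() would fail loudly.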

Ole V.V.

If I understand the question correctly, you need a toMap collector with custom merger like this:

list.stream().collect(Collectors.toMap(
       Settlement::getSmartNo,
       Function.identity(),
       (s1, s2) -> s1.addAmount(s2.getSettleAmount())));

With a helper method inside the Settlement class:

Settlement addAmount(double addend) {
    this.settleAmount += addend;
    return this;
}
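
If modifying the source objects is a concern, one possible variant keeps the same toMap approach but has the merge function return a new object. This is only a sketch: withAddedAmount is a hypothetical helper, NonMutatingToMapSketch is a made-up class name, and Settlement is a trimmed-down stand-in with three fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Trimmed stand-in for the question's Settlement (assumption: only these fields are needed).
class Settlement {
    private final String smartNo;
    private final double volume;
    private final double settleAmount;

    Settlement(String smartNo, double volume, double settleAmount) {
        this.smartNo = smartNo;
        this.volume = volume;
        this.settleAmount = settleAmount;
    }

    String getSmartNo() { return smartNo; }
    double getVolume() { return volume; }
    double getSettleAmount() { return settleAmount; }

    // Hypothetical helper: returns a copy with the amount added, leaving this object untouched.
    Settlement withAddedAmount(double addend) {
        return new Settlement(smartNo, volume, settleAmount + addend);
    }
}

class NonMutatingToMapSketch {
    static List<Settlement> aggregate(List<Settlement> list) {
        Map<String, Settlement> bySmartNo = list.stream().collect(Collectors.toMap(
                Settlement::getSmartNo,
                Function.identity(),
                // The merge function creates a new Settlement instead of changing s1.
                (s1, s2) -> s1.withAddedAmount(s2.getSettleAmount())));
        return new ArrayList<>(bySmartNo.values());
    }

    public static void main(String[] args) {
        List<Settlement> result = aggregate(Arrays.asList(
                new Settlement("A", 1.0, 2.0),
                new Settlement("A", 1.0, 3.0),
                new Settlement("B", 4.0, 7.0)));
        result.forEach(s -> System.out.println(s.getSmartNo() + " " + s.getSettleAmount()));
    }
}
```

A group with a single element still holds a reference to the original object, but nothing in the source list is ever changed.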
Tagir Valeev
  • You really recommend a reduction function that modifies its argument? – Holger Feb 13 '17 at 16:15
  • @Holger, sure. Java is not Haskell. Sometimes it's a good solution. – Tagir Valeev Feb 14 '17 at 04:48
  • I thought, you know that this is *broken*. You are modifying arbitrary objects of the source list, without any control over which are modified and which not. I.e. with a parallel stream, different objects might get modified in each execution. – Holger Feb 14 '17 at 10:41
  • @Holger, please show me a counterexample when it could fail or show me a spec statement which states that my solution is *broken*. – Tagir Valeev Feb 14 '17 at 11:19
  • You *know* how the stream processing works, e.g. how the workload is split into chunks which are processed individually before the partial results are merged. In other words, instead of `f.apply(f.apply(a, b), c)`, the implementation may call `f.apply(a, f.apply(b, c))`. Of course, the result in `a` will have the desired sum, but whether `b` will be modified is unpredictable. Granted, the question doesn’t say anything about the desired behavior regarding modifications made to the source objects, but I’m a bit surprised that you are recommending such a solution without any note about it – Holger Feb 14 '17 at 11:31