JAVA 8 distinctByKey

Question

public List<XYZ> getFilteredList(List<XYZ> l1) {
        return l1
                .stream()
                .filter(distinctByKey(XYZ::getName))
                .filter(distinctByKey(XYZ::getPrice))
                .collect(Collectors.toList());
    }

private static <T> Predicate<T> distinctByKey(Function<? super T, Object> keyExtractor) {
    Map<Object, Boolean> seen = new ConcurrentHashMap<>();
    return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
}

Can anyone please explain what this line means?
return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;

Why is the result of putIfAbsent inside the lambda compared to null?

WJS · Answer 1 · 2022-06-10T13:09:14.363

Your question revolves around the following:

return t -> seen.putIfAbsent(keyExtractor.apply(t),
                Boolean.TRUE) == null;

First, the return statement returns the entire lambda (from t -> ... onward). The lambda captures seen as a closure, so it can still use the map after distinctByKey has returned, even though the local variable seen is then out of scope.
The keyExtractor retrieves the key (either the name or the price in your example) via the getters provided as method references (e.g. XYZ::getName).
putIfAbsent tries to add the Boolean value true to the map under the supplied key (here, the name or price returned by keyExtractor). If the key was already present, it leaves the map unchanged and returns the existing value, which is true. Since true is not equal to null, the predicate returns false and the filter rejects the element. If the key was not present, putIfAbsent adds it and returns null. Since null == null is true, the predicate returns true and the element passes through the filter (i.e. it is distinct so far).
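
To see the return values of putIfAbsent in isolation, here is a minimal sketch. The class name PutIfAbsentDemo and the helper firstTime are made up for this illustration and are not part of the original code; the helper just wraps the same expression the lambda uses.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PutIfAbsentDemo {
    // Hypothetical helper: same expression as the lambda body in distinctByKey.
    // Returns true only the first time a given key is offered to the map.
    static boolean firstTime(Map<Object, Boolean> seen, Object key) {
        return seen.putIfAbsent(key, Boolean.TRUE) == null;
    }

    public static void main(String[] args) {
        Map<Object, Boolean> seen = new ConcurrentHashMap<>();

        // Key "B" is absent: the mapping is added and null is returned.
        System.out.println(seen.putIfAbsent("B", Boolean.TRUE)); // prints null

        // Key "B" is present: the map is unchanged and the old value is returned.
        System.out.println(seen.putIfAbsent("B", Boolean.TRUE)); // prints true

        // The comparison with null turns that into "have I seen this key before?"
        System.out.println(firstTime(seen, "C")); // prints true
        System.out.println(firstTime(seen, "C")); // prints false
    }
}
```

So the == null check is just a compact way of asking whether this call was the one that inserted the key.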

Here is an example of how this works. It uses a simple record and applies the filter only on name.

record XYZ(String getName, String other){
    @Override
    public String toString() {
        return String.format("[name=%s, other=%s]", getName, other);
    }
}
    
public static void main(String[] args) {
    List<XYZ> l1 = List.of(
            new XYZ("A","B"),
            new XYZ("B","B"),
            new XYZ("B","C"),
            new XYZ("B","D"),
            new XYZ("C","B"));

    
    List<XYZ> result = l1.stream()
            .filter(distinctByKey(XYZ::getName))
            .collect(Collectors.toList());
    System.out.println(result);
}

prints

[[name=A, other=B], [name=B, other=B], [name=C, other=B]]

Notice that only the first element with name B was allowed through the filter; the others were blocked.

private static <T> Predicate<T>
        distinctByKey(Function<? super T, Object> keyExtractor) {
    Map<Object, Boolean> seen = new ConcurrentHashMap<>();
    return t -> seen.putIfAbsent(keyExtractor.apply(t),
            Boolean.TRUE) == null;
}

Thank you @WJS for the useful explanation. Can you please give me an idea of how I can unit test this function (distinctByKey())? Also, why do we have <T> before Predicate? Thanks. - BSL, Sep 10 '22 at 23:50
The second question is best answered in the section on [Generic Methods](https://docs.oracle.com/javase/tutorial/java/generics/methods.html) in the Java Tutorials. As far as JUnit testing goes, one way would be to have a test that filters a predefined data structure (e.g., a `list`) and returns the contents. Those contents would then be compared to another `list` of the expected results; the test passes if the two are equal and fails otherwise. - WJS, Sep 11 '22 at 01:33
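
As a rough sketch of the testing approach described in the comment above, the check below filters a fixed list and compares the result to the expected list. It uses plain Java rather than JUnit so that it runs on its own; in a JUnit test the comparison would typically be an assertEquals call. The class name DistinctByKeyCheck, the helper distinctByLength, and the choice of String::length as the key are assumptions made for this example, and distinctByKey is copied here without the private modifier so the check can call it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class DistinctByKeyCheck {
    // Copy of the method under test (package-private instead of private).
    static <T> Predicate<T> distinctByKey(Function<? super T, Object> keyExtractor) {
        Map<Object, Boolean> seen = new ConcurrentHashMap<>();
        return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }

    // Hypothetical helper: keeps the first string of each distinct length.
    static List<String> distinctByLength(List<String> input) {
        return input.stream()
                .filter(distinctByKey(String::length))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("aa", "bb", "ccc", "d", "eee");
        List<String> expected = Arrays.asList("aa", "ccc", "d");

        // Each call to distinctByKey creates a fresh map, so running the
        // filter twice on the same input should give the same result.
        for (int run = 0; run < 2; run++) {
            List<String> actual = distinctByLength(input);
            if (!expected.equals(actual)) {
                throw new AssertionError("expected " + expected + " but got " + actual);
            }
        }
        System.out.println("ok");
    }
}
```

The <T> in front of Predicate<T> is what lets the same method be called here with String elements and in the answer with XYZ elements.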
