
The Javadoc for distinct() says: "Returns a stream consisting of the distinct elements (according to Object.equals(Object)) of this stream."

I have a list of custom objects with some duplicates. When I call distinct() on the streamed list, I still get the original list back. Why are the duplicates not removed even though I defined an equals method in the custom class?

Code:

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class CustomType {
    private String data1;

    public CustomType(String data1) { this.data1 = data1; }
    public String getData1() { return data1; }

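    @Override
    public boolean equals(Object other){
        // Guard against null and unrelated types before casting
        if (!(other instanceof CustomType)) return false;
        CustomType otherC = (CustomType) other;
        return this.getData1().equals(otherC.getData1());
    }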

    @Override
    public String toString(){
        return "[" + data1 + "]";
    }
}

public class StreamDistinctTest {
    public static void main(String [] args){
        List<CustomType> data = Arrays.asList(
            new CustomType("a"),
            new CustomType("b"),
            new CustomType("a"),
            new CustomType("c")
        );

        List<CustomType> filtered = data.stream().distinct().collect(Collectors.toList());
        filtered.forEach(System.out::println);
    }
}

Output:

[a]
[b]
[a]
[c]

BTW, I put a breakpoint in CustomType.equals(arg) and noticed that distinct() never even calls it.

armani

1 Answer


You should always override hashCode when overriding equals; otherwise your class breaks the contract that equal objects must return equal hash codes:

@Override
public int hashCode() {
    return data1.hashCode();
}

This works, which suggests that distinct() first compares hashCode values and calls equals only when the hash codes are the same.
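For completeness, here is a minimal sketch of the whole fix in one runnable file. The names FixedCustomType, DistinctWithHashCodeDemo and dedupe are made up for this illustration and are not from the question; the use of java.util.Objects and the instanceof guard are my own choices rather than anything the Javadoc requires.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical demo class (name is illustrative only).
class DistinctWithHashCodeDemo {
    // Removes duplicates; for an ordered stream the first occurrence is kept.
    static List<FixedCustomType> dedupe(List<FixedCustomType> input) {
        return input.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<FixedCustomType> data = Arrays.asList(
            new FixedCustomType("a"),
            new FixedCustomType("b"),
            new FixedCustomType("a"),
            new FixedCustomType("c"));
        // Prints [a], [b], [c] on separate lines
        dedupe(data).forEach(System.out::println);
    }
}

// Same idea as CustomType from the question, with equals and hashCode
// both based on data1 so that equal objects share a hash code.
final class FixedCustomType {
    private final String data1;

    FixedCustomType(String data1) { this.data1 = data1; }

    String getData1() { return data1; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof FixedCustomType)) return false;
        return Objects.equals(data1, ((FixedCustomType) other).data1);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(data1);
    }

    @Override
    public String toString() { return "[" + data1 + "]"; }
}
```

Using the same field in both methods is what keeps them consistent: any two objects that equals treats as equal are then guaranteed to produce the same hash code.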

Hovercraft Full Of Eels
  • @JaeheonShim: no problem, and even if the API for Stream's distinct method doesn't mention this, the API for Object's `equals(...)` method sure does mention that the contracts must be obeyed. – Hovercraft Full Of Eels Mar 20 '20 at 02:08
  • Just now found a [duplicate](https://stackoverflow.com/questions/21333646/stream-and-the-distinct-operation) – Hovercraft Full Of Eels Mar 20 '20 at 02:12
  • Thanks. So, is Object.hashCode() called if I don't override the hashCode() as well ? – armani Mar 20 '20 at 05:49
  • @armani: every object has a hashCode method, and hash-based collections usually call it *before* equals because comparing two int hash values is cheaper than calling equals. If the hashCode values differ, there is no need to call the more expensive equals method. If you don't override hashCode, the default Object.hashCode() is called, which typically returns a different value for each distinct object (this is not guaranteed), so two instances with the same content are never even compared with equals. – Hovercraft Full Of Eels Mar 20 '20 at 10:47
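The hash-first behavior described in the comments can be observed directly with a small experiment. This is only a sketch: it uses a plain java.util.HashSet as a stand-in, on the assumption that distinct() relies on a hash-based set in a similar way, and the Probe class with its injected hash value and call counter is a hypothetical helper that deliberately violates the equals/hashCode contract in order to show the effect.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class (name is illustrative only).
class HashThenEqualsDemo {
    // Adds two probes with identical data but the given hash codes.
    // Returns { resulting set size, number of equals calls observed }.
    static int[] addPair(int firstHash, int secondHash) {
        Probe.equalsCalls = 0;
        Set<Probe> set = new HashSet<>();
        set.add(new Probe("a", firstHash));
        set.add(new Probe("a", secondHash));
        return new int[] { set.size(), Probe.equalsCalls };
    }

    public static void main(String[] args) {
        int[] different = addPair(1, 2);
        int[] same = addPair(1, 1);
        System.out.println("different hashes: size=" + different[0]
            + ", equals calls=" + different[1]);
        System.out.println("same hashes: size=" + same[0]
            + ", equals calls=" + same[1]);
    }
}

// Test helper whose hash code is supplied by the caller. With differing
// hashes it breaks the equals/hashCode contract on purpose, which mimics
// what happens when hashCode is not overridden.
final class Probe {
    static int equalsCalls = 0;

    private final String data;
    private final int hash;

    Probe(String data, int hash) {
        this.data = data;
        this.hash = hash;
    }

    @Override
    public boolean equals(Object other) {
        equalsCalls++; // record that the set consulted equals
        return other instanceof Probe && data.equals(((Probe) other).data);
    }

    @Override
    public int hashCode() {
        return hash;
    }
}
```

When the two hash codes differ, the set keeps both objects and never consults equals, which matches the breakpoint observation in the question. When they are the same, equals is called and the duplicate is rejected.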