36

I have a list/collection of objects that may or may not have the same property values. What's the easiest way to get a distinct list of the objects with equal properties? Is one collection type best suited for this purpose? For example, in C# I could do something like the following with LINQ.

var recipients = (from recipient in recipientList
                 select recipient).Distinct();

My initial thought was to use lambdaj (link text), but it doesn't appear to support this.

Brian
  • 13,412
  • 10
  • 56
  • 82
javacavaj
  • 2,901
  • 5
  • 39
  • 66
  • 1
    Use this LINQ-like library (github.com/nicholas22/jpropel-light) and do List recipientList; recipientList.distinct(); This is exactly what the code you posted does. – NT_ Oct 08 '11 at 10:26

9 Answers9

51
return new ArrayList(new HashSet(recipients));
Chase Seibert
  • 15,703
  • 8
  • 51
  • 58
32

Use an implementation of the interface Set<T> (class T may need a custom .equals() method, and you may have to implement that .equals() yourself). Typically a HashSet does it out of the box : it uses Object.hashCode() and Object.equals() method to compare objects. That should be unique enough for simple objects. If not, you'll have to implement T.equals() and T.hashCode() accordingly.

See Gaurav Saini's comment below for libraries helping to implement equals and hashcode.

glmxndr
  • 45,516
  • 29
  • 93
  • 118
  • 2
    HashSet also uses equals if there's a hash collision. – Steve Kuo Jun 19 '09 at 20:35
  • 1
    This is incorrect - Object.hashCode() checks for identity , not meaningful equality. For 2 objects - different references - which are meaningfully equal Object.hashCode() will return false. Always implement hashCode() and equals() for objects which will be used in Sets or as keys in Maps. – Robert Munteanu Jun 19 '09 at 21:41
  • Not quite correct. Let me give you an example: If hashCode() returns 1 (perfectly legal), then this results in a hash collision and thus equals will then be called by HashSet (actually it's backed by a HashMap). – Steve Kuo Jun 19 '09 at 22:07
  • @Robert : No need to downvote me for wrong reasons, read the javacode from Object.hashCode() : If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result. – glmxndr Jun 19 '09 at 22:12
  • @Steve : Ok thanks. Added mention to .equals in the description of HashSet's comparison. – glmxndr Jun 19 '09 at 22:18
  • 3
    Alternatively, use standard way of overriding hashCode() and equals() method using HashCodeBuilder and EqualsBuilder provided by Apache's Commons. HashCodeBuilder: http://commons.apache.org/lang/api-2.3/org/apache/commons/lang/builder/HashCodeBuilder.html EqualsBuilder: http://commons.apache.org/lang//api-2.4/org/apache/commons/lang/builder/EqualsBuilder.html – Gaurav Saini Jun 21 '09 at 07:06
  • @subtenante: My point was about hashCode _and_ equals - which should both be implemented if you plan to add objects in Sets/Maps. And let's not talk about public int hashCode() { return 1; } ;-) I'm sure there are plenty of places where this is villified. – Robert Munteanu Jun 22 '09 at 14:45
25

Place them in a TreeSet which holds a custom Comparator, which checks the properties you need:

SortedSet<MyObject> set = new TreeSet<MyObject>(new Comparator<MyObject>(){

    public int compare(MyObject o1, MyObject o2) {
         // return 0 if objects are equal in terms of your properties
    }
});

set.addAll(myList); // eliminate duplicates
Kingpin2k
  • 47,277
  • 10
  • 78
  • 96
Robert Munteanu
  • 67,031
  • 36
  • 206
  • 278
  • 3
    Only works if the comparator is consistent with equals(), at which point you're better off with a HashSet anyway. – Michael Myers Jun 19 '09 at 20:31
  • 3
    Why should it be consistent with equals() ? That it the usually the case but at the moment we need to remove some duplicates based on a custom condition. A Comparator is the most unobstrusive way I can think of. – Robert Munteanu Jun 19 '09 at 21:04
  • 1
    Interestingly, the TreeSet docs indicate that there will be problems if the Comparator is inconsistent with equals(), while the docs for TreeMap (which TreeSet is based on) merely say that equals() will be ignored in that case. I now think this is the best answer. +1 – Michael Myers Jun 19 '09 at 21:09
  • The problem with inconsistent comparator is when you use a TreeSet, pass it around as a Set - which the consumer expects to obey hashCode() and equals(). Then all hell breaks loose :-) – Robert Munteanu Jun 19 '09 at 21:38
16

Java 8:

recipients = recipients.stream()
    .distinct()
    .collect(Collectors.toList());

See java.util.stream.Stream#distinct.

Radiodef
  • 37,180
  • 14
  • 90
  • 125
Dexter Legaspi
  • 3,192
  • 1
  • 35
  • 26
11

order preserving version of the above response

return new ArrayList(new LinkedHashSet(recipients));
rudnev
  • 2,281
  • 3
  • 22
  • 31
7

If you're using Eclipse Collections, you can use the method distinct().

ListIterable<Integer> integers = Lists.mutable.with(1, 3, 1, 2, 2, 1);
Assert.assertEquals(
    Lists.mutable.with(1, 3, 2),
    integers.distinct());

The advantage of using distinct() instead of converting to a Set and then back to a List is that distinct() preserves the order of the original List, retaining the first occurrence of each element. It's implemented by using both a Set and a List.

MutableSet<T> seenSoFar = Sets.mutable.with();
int size = list.size();
for (int i = 0; i < size; i++)
{
    T item = list.get(i);
    if (seenSoFar.add(item))
    {
        targetCollection.add(item);
    }
}
return targetCollection;

If you cannot convert your original List into an Eclipse Collections type, you can use ListAdapter to get the same API.

MutableList<Integer> distinct = ListAdapter.adapt(integers).distinct();

Note: I am a committer for Eclipse Collections.

Donald Raab
  • 6,458
  • 2
  • 36
  • 44
Craig P. Motlin
  • 26,452
  • 17
  • 99
  • 126
4

You can use a Set. There's couple of implementations:

  • HashSet uses an object's hashCode and equals.
  • TreeSet uses compareTo (defined by Comparable) or compare (defined by Comparator). Keep in mind that the comparison must be consistent with equals. See TreeSet JavaDocs for more info.

Also keep in mind that if you override equals you must override hashCode such that two equals objects has the same hash code.

Steve Kuo
  • 61,876
  • 75
  • 195
  • 257
3

The ordinary way of doing this would be to convert to a Set, then back to a List. But you can get fancy with Functional Java. If you liked Lamdaj, you'll love FJ.

recipients = recipients
             .sort(recipientOrd)
             .group(recipientOrd.equal())
             .map(List.<Recipient>head_());

You'll need to have defined an ordering for recipients, recipientOrd. Something like:

Ord<Recipient> recipientOrd = ord(new F2<Recipient, Recipient, Ordering>() {
  public Ordering f(Recipient r1, Recipient r2) {
    return stringOrd.compare(r1.getEmailAddress(), r2.getEmailAddress());
  }
});

Works even if you don't have control of equals() and hashCode() on the Recipient class.

Apocalisp
  • 34,834
  • 8
  • 106
  • 155
2

Actually lambdaj implements this feature through the selectDistinctArgument method

http://lambdaj.googlecode.com/svn/trunk/html/apidocs/ch/lambdaj/Lambda.html#selectDistinctArgument(java.lang.Object,%20A)

Mario Fusco
  • 13,548
  • 3
  • 28
  • 37