17

Possible Duplicate:
Prevent duplicate entries in arraylist

I have an arraylist of a particular class C.

List<C> myList = new ArrayList<C>();

Class C has two attributes viz.

String str1;
String str2;

Whenever I add an object of type C to the ArrayList myList, I want to check whether the list already contains an object whose str1 and str2 match the str1 and str2 of the object I am about to add.

Is there an efficient way to do this without iterating through the complete list every time and comparing the fields?

London guy
  • 3
    Use a java.util.Set interface with preferably TreeSet implementation instead of List. A set automatically takes care of your requirement. You just have to add the string values to the set. Search the net or read a book on data structures for how Set works. – Kiran Mohan Jan 07 '13 at 08:54
  • 1
    here: [**helpful link**](http://stackoverflow.com/questions/3951547/java-array-finding-duplicates?answertab=votes#tab-top) – Grijesh Chauhan Jan 07 '13 at 08:54

3 Answers

40

When you need to check for duplicates or ensure unique values, consider using a Set-like data structure rather than a List.

You can choose from one of the below -

  • HashSet

    • Faster access - O(1) access roughly speaking.
    • not sorted
    • Hash table used as base storage.
  • TreeSet

    • Slower access (relative to HashSet) - O(log(n))
    • values sorted automatically.
    • Red-Black tree used as base storage.

A Set only holds unique values. Calling add with a value that is already present leaves the set unchanged and returns false.

Note that for this to work with a HashSet you will need to override equals and hashCode to tell the Set how to compare your objects; a TreeSet instead relies on compareTo (or a Comparator). This step is better explained at What issues should be considered when overriding equals and hashCode in Java?
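As a minimal sketch of the HashSet approach, assuming class C is reduced to its two string fields (the constructor and the wrapper class `UniqueDemo` are hypothetical names added for illustration, not part of the question):

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class UniqueDemo {

    // Stand-in for the question's class C; the constructor is an assumption.
    static final class C {
        private final String str1;
        private final String str2;

        C(String str1, String str2) {
            this.str1 = str1;
            this.str2 = str2;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof C)) return false; // also rejects null
            C that = (C) o;
            // Objects.equals tolerates null fields.
            return Objects.equals(str1, that.str1)
                && Objects.equals(str2, that.str2);
        }

        @Override
        public int hashCode() {
            // Must be derived from the same fields that equals compares.
            return Objects.hash(str1, str2);
        }
    }

    public static void main(String[] args) {
        Set<C> seen = new HashSet<>();
        System.out.println(seen.add(new C("a", "b"))); // true: first occurrence
        System.out.println(seen.add(new C("a", "b"))); // false: equal content already present
        System.out.println(seen.add(new C("b", "a"))); // true: fields differ
        System.out.println(seen.size());               // 2
    }
}
```

The return value of add doubles as the duplicate check, so no separate contains call is needed before inserting.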

Karthik T
  • 3
    It should be pointed out that a HashSet treats two objects as duplicates only when their `hashCode` values are equal and `equals` returns true. When you want two different objects with equal content to be considered duplicates, you need to override both methods so that they are calculated from all relevant fields of the object. – Philipp Jan 07 '13 at 08:58
  • So, if I use a HashSet (I don't want a sorted set), and if I try to add an object, would it basically check if there exists another object in the set with the same values for the parameters? – London guy Jan 07 '13 at 08:58
  • @Philipp Thanks Philipp. This is exactly what I want. I want to check if there exists another object with the same equal content, and NOT if the two objects are equal according to any different criterion. How do you override the hashCode method? – London guy Jan 07 '13 at 09:01
  • @AbhishekShivkumar as Philipp mentions, you would need to override `equals` and `hashCode` to tell the `Set` how to compare your objects – Karthik T Jan 07 '13 at 09:02
  • 1
    @AbhishekShivkumar To override the hashCode method, create a method `@Override public int hashCode()` in your class C and implement it in a way that it calculates an integer based on the contents of str1 and str2. When you aren't experienced with writing good hash functions, you could just take the hashCode method of str1 and str2 and combine the values somehow. You could XOR or add them, while also bit-shifting or multiplying one of the values (so that they are not interchangeable). – Philipp Jan 07 '13 at 09:10
  • Brilliant references. Thanks! – London guy Jan 07 '13 at 09:16
  • @AbhishekShivkumar you are welcome, they explain better than I would have been able to. – Karthik T Jan 07 '13 at 09:24
21

You need to override the equals method in Class C.

e.g.

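@Override
public boolean equals(Object c) {
    // Anything that is not a C (including null) can never be equal.
    if (!(c instanceof C)) {
        return false;
    }

    C that = (C) c;
    // Equal only when both string fields match.
    return this.str1.equals(that.getStr1()) && this.str2.equals(that.getStr2());
}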

Then you can call myList.contains(c), where c is the object you are about to add, to see if the list already contains an equal object.

This is untested; you may want some additional error handling, such as null checks on str1 and str2.

If you do override the equals method like this, you should also make sure you override the hashCode() method. See: http://www.technofundo.com/tech/java/equalhash.html

Edit: As pointed out in the comments, the Set implementation is going to be more efficient, though you will still need to override the equals and hashCode methods, so the above example may be best used in conjunction with Karthik's answer above.
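One way to combine the two answers, sketched here under assumptions: keep the ArrayList for ordering and indexing, and keep a HashSet next to it purely for the membership check. The helper `addIfAbsent` and the class `UniqueListDemo` are hypothetical names; the demo uses String elements for brevity, and with class C it would depend on the overridden equals and hashCode.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UniqueListDemo {

    // Appends item to list only if it has not been seen before.
    // Returns true when the item was added, false when it was a duplicate.
    static <T> boolean addIfAbsent(List<T> list, Set<T> seen, T item) {
        if (seen.add(item)) {   // hash lookup instead of a full list scan
            list.add(item);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> myList = new ArrayList<>();
        Set<String> seen = new HashSet<>();

        System.out.println(addIfAbsent(myList, seen, "x")); // true
        System.out.println(addIfAbsent(myList, seen, "y")); // true
        System.out.println(addIfAbsent(myList, seen, "x")); // false
        System.out.println(myList);                         // [x, y]
    }
}
```

The trade-off is extra memory for the set, and the two collections must be updated together (for example on removal). If only insertion order matters and index access is not needed, a LinkedHashSet alone may be enough.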

cowls
  • 2
    It should be pointed out that the `contains` method of most List-implementing classes scales badly for large lists, because every single entry of the list needs to be checked. Most Set implementations are superior in this case. – Philipp Jan 07 '13 at 08:55
  • 1
    This will iterate through the entire list though. As inefficient as the handwritten version OP wanted to avoid, but cleaner code – Karthik T Jan 07 '13 at 08:56
  • @Karthik, good point, updated my answer with a note :) – cowls Jan 07 '13 at 09:05
  • Thanks, I accepted your answer because the length of my list was not large, and time optimization was not a strict constraint. I feel this is a nice tiny way of writing the equals method and simply using the contains check to see if it is already there. It works! – London guy Jan 07 '13 at 09:14
13
if (yourList.contains(object)) // object is the instance you are about to add
{
    // do not add
}
Romczyk
  • 2
    It should be pointed out that the contains method of most List-implementing classes scales badly for large lists, because every single entry of the list needs to be checked. Most Set implementations are superior in this case. – Philipp Jan 07 '13 at 08:55