127

How to create a list of unique/distinct objects (no duplicates) in Java?

Right now I am using HashMap<String, Integer> for this: since adding an existing key just overwrites its entry, HashMap.keySet() ends up holding only unique keys. But I am sure there must be a better way to do this, as the value part is wasted here.

Basil Bourque

7 Answers

201

You can use a Set implementation:

Some info from the Javadoc:

A collection that contains no duplicate elements. More formally, sets contain no pair of elements e1 and e2 such that e1.equals(e2), and at most one null element. As implied by its name, this interface models the mathematical set abstraction.

Note: Great care must be exercised if mutable objects are used as set elements. The behavior of a set is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is an element in the set. A special case of this prohibition is that it is not permissible for a set to contain itself as an element.
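As a minimal sketch of the "no duplicate elements" rule quoted above (the class name UniqueStrings and the method distinct are made up for illustration), a HashSet<String> can stand in for the HashMap from the question, with no wasted value part:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; only java.util types are real API.
class UniqueStrings {
    // Collects the distinct values of the input; equal strings are kept once.
    static Set<String> distinct(List<String> input) {
        Set<String> seen = new HashSet<>();
        for (String s : input) {
            // add() returns false (and changes nothing) if an equal element exists
            seen.add(s);
        }
        return seen;
    }

    public static void main(String[] args) {
        Set<String> names = new HashSet<>();
        System.out.println(names.add("Amy")); // true: newly added
        System.out.println(names.add("Amy")); // false: already present
        System.out.println(distinct(Arrays.asList("a", "b", "a")).size()); // 2
    }
}
```

The boolean returned by add() is a cheap way to detect whether a value had been seen before.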

These are the implementations:

  • HashSet

    This class offers constant time performance for the basic operations (add, remove, contains and size), assuming the hash function disperses the elements properly among the buckets. Iterating over this set requires time proportional to the sum of the HashSet instance's size (the number of elements) plus the "capacity" of the backing HashMap instance (the number of buckets). Thus, it's very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.

    When iterating a HashSet the order of the yielded elements is undefined.

  • LinkedHashSet

    Hash table and linked list implementation of the Set interface, with predictable iteration order. This implementation differs from HashSet in that it maintains a doubly-linked list running through all of its entries. This linked list defines the iteration ordering, which is the order in which elements were inserted into the set (insertion-order). Note that insertion order is not affected if an element is re-inserted into the set. (An element e is reinserted into a set s if s.add(e) is invoked when s.contains(e) would return true immediately prior to the invocation.)

    So, the output of the code below...

     Set<Integer> linkedHashSet = new LinkedHashSet<>();
     linkedHashSet.add(3);
     linkedHashSet.add(1);
     linkedHashSet.add(2);
    
     for (int i : linkedHashSet) {
         System.out.println(i);
     }
    

    ...will necessarily be

    3
    1
    2
    
  • TreeSet

    This implementation provides guaranteed log(n) time cost for the basic operations (add, remove and contains). By default the elements returned on iteration are sorted by their "natural ordering", so the code below...

     Set<Integer> treeSet = new TreeSet<>();
     treeSet.add(3);
     treeSet.add(1);
     treeSet.add(2);
    
     for (int i : treeSet) {
         System.out.println(i);
     }
    

    ...will output this:

    1
    2
    3
    

    (You can also pass a Comparator instance to a TreeSet constructor, making it sort the elements in a different order.)

    Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all element comparisons using its compareTo (or compare) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.
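To make the comparator remarks above concrete, here is a small sketch (the class TreeSetOrdering and its methods are illustrative names, not from the answer). It shows a TreeSet sorted in reverse order, and a TreeSet whose comparator is inconsistent with equals: with String.CASE_INSENSITIVE_ORDER, "amy" and "AMY" count as the same element even though equals() says they differ.

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class; only java.util types are real API.
class TreeSetOrdering {
    // TreeSet that iterates from largest to smallest.
    static Set<Integer> descending(int... values) {
        Set<Integer> set = new TreeSet<>(Comparator.<Integer>reverseOrder());
        for (int v : values) {
            set.add(v);
        }
        return set;
    }

    // TreeSet whose notion of "duplicate" ignores case (inconsistent with equals).
    static Set<String> caseInsensitive(String... values) {
        Set<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        for (String v : values) {
            set.add(v);
        }
        return set;
    }

    public static void main(String[] args) {
        System.out.println(descending(3, 1, 2));                  // [3, 2, 1]
        System.out.println(caseInsensitive("amy", "AMY", "Bob")); // [amy, Bob]
    }
}
```

In the second call "AMY" is rejected because the comparator reports it equal to "amy", which is the well-defined but non-Set-contract behavior the Javadoc describes.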

brandizzi
Frank
  • Now I am confused, which one shall I use? I just need to maintain a list of unique strings. So basically even when an existing string is added it should actually get added. –  Nov 06 '12 at 21:25
  • 2
    The choice is yours... HashSet is universal and fast, TreeSet is ordered, LinkedHashSet keeps insertion order... – Frank Nov 06 '12 at 21:31
  • 10
    This is not a LIST... so, not all the LIST interface methods are available. – marcolopes Mar 04 '16 at 04:06
  • 2
    A set is not a list, I cannot look up elements by index in a set in O(1) time (random access). – wilmol Sep 18 '19 at 01:32
  • Almost every set operates over a map inside. So there will be no memory benefits in changing a map to a set. – Vitaliy Tsirkunov Dec 29 '22 at 10:51
18

I want to clarify some things here for the original poster which others have alluded to but haven't explicitly stated. When you say that you want a Unique List, that is the very definition of an Ordered Set. Another key difference between the Set interface and the List interface is that List allows you to specify the insert index. So, the question is: do you really need the List interface (e.g. for compatibility with a 3rd party library), or can you redesign your software to use the Set interface? You also have to consider what you are doing with the interface. Is it important to find elements by their index? How many elements do you expect in your set? If you are going to have many elements, is ordering important?

If you really need a List which just has a unique constraint, there is the Apache Commons Collections class org.apache.commons.collections.list.SetUniqueList, which will provide you with the List interface and the unique constraint. Mind you, this breaks the contract of the List interface. You will, however, get better performance from this if you need to seek into the list by index. If you can deal with the Set interface, and you have a smaller data set, then LinkedHashSet might be a good way to go. It just depends on the design and intent of your software.

Again, there are certain advantages and disadvantages to each collection. Some have fast inserts but slow reads, some have fast reads but slow inserts, etc. It makes sense to spend a fair amount of time with the collections documentation to fully learn about the finer details of each class and interface.
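For readers who want index access plus uniqueness without a third-party dependency, the trade-off described above can be sketched with plain JDK classes. UniqueArrayList below is a hypothetical helper written for this illustration; it is not SetUniqueList and not part of the JDK. It assumes elements are never removed and pairs an ArrayList (for O(1) lookup by index) with a HashSet (for O(1) duplicate checks), at the cost of storing every element twice.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: an append-only list that silently rejects duplicates.
class UniqueArrayList<E> {
    private final List<E> items = new ArrayList<>(); // keeps insertion order, indexable
    private final Set<E> seen = new HashSet<>();     // fast membership test

    // Appends e unless an equal element is already stored; returns true if added.
    boolean add(E e) {
        if (!seen.add(e)) {
            return false;
        }
        items.add(e);
        return true;
    }

    // Random access by position, which a plain Set cannot offer.
    E get(int index) {
        return items.get(index);
    }

    int size() {
        return items.size();
    }

    public static void main(String[] args) {
        UniqueArrayList<String> list = new UniqueArrayList<>();
        list.add("b");
        list.add("a");
        list.add("b"); // ignored, "b" is already present
        System.out.println(list.size()); // 2
        System.out.println(list.get(0)); // b
    }
}
```

If index access is not needed, a LinkedHashSet alone gives the same ordering and uniqueness with less code.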

Paul Connolly
  • 3
    This does not provide an answer to the question. To critique or request clarification from an author, leave a comment below their post - you can always comment on your own posts, and once you have sufficient [reputation](http://stackoverflow.com/help/whats-reputation) you will be able to [comment on any post](http://stackoverflow.com/help/privileges/comment). – Zach Saucier Dec 14 '14 at 19:34
  • 3
    It does provide an answer, actually. If he just wants a list that acts like a Set, use org.apache.commons.collections.list.SetUniqueList, but as a programmer, he/we should be more careful than that and should think more about the problem. If this makes my answer better, "How to create a Unique List in Java?" List uniqueList = new SetUniqueList();, that's how.... – Paul Connolly Dec 14 '14 at 20:29
  • 7
    And Zach, I'm not trying to be a jerk, but did you even read my answer before your comment? Or do you just not understand it? If you don't understand it, that's OK - let me know and I'll expand on the topic. I don't think I should have to write a treatise on data structures in order to give a friendly answer to somebody's question. Nor do I care to go about some meek way of building up my comment reputation when I know the answer and nobody else has really provided it. – Paul Connolly Dec 14 '14 at 20:48
  • 3
    And by the way, I was neither critiquing or requesting clarification from the author, I was just saying that he can either A) quickly use the class I gave him, or B) take the time to really understand the differences between these classes and relate them to his needs. B obviously takes longer, but will result in better code in the long term. – Paul Connolly Dec 14 '14 at 20:54
13

Use new HashSet<String>. An example:

import java.util.HashSet;
import java.util.Set;

public class MainClass {
  public static void main(String[] args) {
    // Three recipient lists with some names in common
    String[] name1 = { "Amy", "Jose", "Jeremy", "Alice", "Patrick" };
    String[] name2 = { "Alan", "Amy", "Jeremy", "Helen", "Alexi" };
    String[] name3 = { "Adel", "Aaron", "Amy", "James", "Alice" };

    // The set keeps each name once, however often it is added
    Set<String> letter = new HashSet<String>();

    for (String name : name1) {
      letter.add(name);
    }
    for (String name : name2) {
      letter.add(name);
    }
    for (String name : name3) {
      letter.add(name);
    }

    System.out.println(letter.size() + " letters must be sent to: " + letter);
  }
}
tim_a
  • 2
    Just adding out put of above program--> 11 letters must be sent to: [Aaron, Alice, James, Adel, Jose, Jeremy, Amy, Alan, Patrick, Helen, Alexi] – Ammad Apr 05 '16 at 18:45
6

I do not know how efficient this is, but it worked for me in a simple context.

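// Generic type arguments must be reference types, so Integer rather than int
List<Integer> uniqueNumbers = new ArrayList<>();

public void addNumberToList(int num) {
    // contains() scans the whole list, so each call costs O(n)
    if (!uniqueNumbers.contains(num)) {
        uniqueNumbers.add(num);
    }
}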
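Since the efficiency of this approach is left open above: List.contains is a linear scan, so building a list of n unique values this way takes roughly O(n²) comparisons. As a hedged alternative sketch (DedupeList and its method names are invented for this example), an existing list can be de-duplicated in one pass while keeping first-occurrence order, either through a LinkedHashSet or through Stream.distinct():

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; only java.util types are real API.
class DedupeList {
    // Copies the input through a LinkedHashSet, which drops repeats
    // and remembers insertion order.
    static <T> List<T> viaLinkedHashSet(List<T> input) {
        return new ArrayList<>(new LinkedHashSet<T>(input));
    }

    // Same result using the Stream API (Java 8+).
    static <T> List<T> viaStream(List<T> input) {
        return input.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(3, 1, 3, 2, 1);
        System.out.println(viaLinkedHashSet(numbers)); // [3, 1, 2]
        System.out.println(viaStream(numbers));        // [3, 1, 2]
    }
}
```

Both variants rely on equals() and hashCode() of the elements, just like the contains check does on equals().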
Zapnologica
4

You could just use a HashSet<String> to maintain a collection of unique objects. If the Integer values in your map are important, then you can instead use the containsKey method of maps to test whether your key is already in the map.
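The question does not say what the Integer values mean, so the following is only a sketch under the assumption that they count how often each key was added (KeyCounts and count are hypothetical names). It uses containsKey to decide between inserting a new key and updating an existing one:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; assumes the map values are occurrence counts.
class KeyCounts {
    static Map<String, Integer> count(String... keys) {
        Map<String, Integer> counts = new HashMap<>();
        for (String key : keys) {
            if (counts.containsKey(key)) {
                // Key already known: update its value instead of losing it
                counts.put(key, counts.get(key) + 1);
            } else {
                counts.put(key, 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count("a", "b", "a");
        System.out.println(counts.get("a"));      // 2
        System.out.println(counts.keySet().size()); // 2 unique keys
    }
}
```

On Java 8 and later, counts.merge(key, 1, Integer::sum) expresses the same update in a single call.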

Ted Hopp
3

HashSet<String> (or any other Set implementation) may do the job for you. Sets don't allow duplicates.

Here is javadoc for HashSet.

kosa
1

You may want to use one of the implementing classes of the java.util.Set<E> interface, e.g. the java.util.HashSet<String> collection class.

A collection that contains no duplicate elements. More formally, sets contain no pair of elements e1 and e2 such that e1.equals(e2), and at most one null element. As implied by its name, this interface models the mathematical set abstraction.

Yogendra Singh