I wanted to get a list
of unique elements from a list
with duplicate elements and the order of elements occurring in the list should be maintained.
To achieve this, I could have written an algorithm like:
private ArrayList<T> getUnique(ArrayList<T> list)
{
// maintain a hashmap of numbers and a uniqueList to be returned(ArrayList<T>)
// Add element in result list and the hashmap if the element isn't already present in the hashmap, else just add in the hashmap
HashMap<T, Boolean> map = new HashMap<>();
ArrayList<T> uniqueList = new ArrayList<>();
for (T t: list)
{
if (map.get(t) == null)
{
// t wasn't present so, adding them in map as well as in the list
map.put(t, true);
uniqueList.add(t);
}
}
return uniqueList;
}
This algorithm will take O(n)
time with O(n)
extra space (for HashMap).
Or simply, I could have used the below syntax:
Set<T> set = new LinkedHashSet<>(list);
The above syntax in Java is used for getting a set
of unique elements from list
with occurrence order of elements same as that of list
.
Then convert this set in a list. (ArrayList<T> uniqueList = new ArrayList<>(set);
)
I am assuming the time complexity here is also O(n)
. I wanted to know what algorithm Java uses for it.
I see the class is named LinkedHashSet, so I thought that they might be using some LinkedList
concepts for achieving this, So I looked into the source code, and found these stuff:
- In
LinkedHashSet.java
, the constructor looks like:
143: public LinkedHashSet(Collection<? extends T> c)
144: {
145: super(c);
146: }
here is the source.
- So, I looked at the parent class constructor i.e.
HashSet
, I found:
public HashSet(Collection<? extends E> c) {
map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
addAll(c);
}
- Next, I searched for
addAll
method, I found it inAbstractCollection
class(which is the grandparent ofHashSet
class), the definition of the function is:
public boolean addAll(Collection<? extends E> c) {
boolean modified = false;
for (E e : c)
if (add(e))
modified = true;
return modified;
}
This is calling add
which is like:
public boolean add(E e) {
throw new UnsupportedOperationException();
}
here.
I couldn't understand this. What algorithm do they use for this task?