0

I wanted to 'remove' elements in ArrayList by creating a new ArrayList that contains all elements except the duplicates. Duplicates are selected by an comparator and the ordering should be maintained. The algorithm is just iterating over the ArrayList and adding each element that isn't already in the new list. The work is done in a method that is going through the entire ArrayList and checking for equality. Here is the code:

public static <T> ArrayList<T> noDups(Comparator<T> cmp, ArrayList<T> l) {

    ArrayList<T> noDups = new ArrayList<T>();
    for(T o : l) {
        if(!isIn(cmp, o, l)) 
            noDups.add(o);
    }
    return noDups;
}

public static <T> boolean isIn(Comparator<T> cmp, T o, ArrayList<T> l) {
    Iterator<T> i = l.iterator();
    if (o==null) {
        while (i.hasNext())
            if (i.next()==null)
                 return true;
    } else {
        while (!(i.hasNext()))
            if (o.equals(i.next()))
                return true;
    }
    return false;
}

I'm curious about the time complexity this algorithm has. My thoughts are: For the first step there is no work, because the ArrayList noDups is empty. For the second step there is one check, for the third 3 and so on to n-1. So we have 1+2+...+n-1. Is that correct and what time complexity would that be?

  • 1
    This is `O(n^2)`. For each element in the array you have to check every single one added before for a possible duplicate. Creating a `HashSet` from `ArrayList` would be `O(n*log(n))`, which is better in terms of time-complexity – Fureeish Jan 18 '18 at 20:33
  • As far as I know HashSets don't maintain the ordering of the ArrayList, which I don't want. –  Jan 18 '18 at 20:35
  • Did you mean "which I want to maintain"? If you care about the order, consider `LinkedHashSet`, which additionally keeps a linked list that represents the order of the added elements – Fureeish Jan 18 '18 at 20:36
  • Yes, that's what I meant. I am not very familiar with HashSets, how would this work with a LinkedHashSet? –  Jan 18 '18 at 20:41
  • Let me write an answer then – Fureeish Jan 18 '18 at 20:41
  • Not critical to the answer but ... I want to point out that you should usually use the _interface_, so `public static List noDups(Comparator cmp, List l) {` — you will take a `List<>` - you don't care what kind, it doesn't have to be specifically an `ArrayList`; and you will return a `List<>`, the caller doesn't care what the actual list is, ArrayList or LinkedList etc. – Stephen P Jan 18 '18 at 21:41

2 Answers2

3

The time-complexity of your algorithm is O(n^2), since for each element in the array you have to compare it with every single one added before. You can improve that to O(n*log(n)) actually O(n) using a Set.

You mentioned that you care about the order and as far as you know, the HashSet does not maintain it. It's true, but there is a way to make it work. Use LinkedHashSet like so:

ArrayList<Integer> array = 
    new ArrayList<>(Arrays.asList(1, 5, 4, 2, 2, 0, 1, 4, 2));
LinkedHashSet<Integer> set = new LinkedHashSet<>(array);

This will construct a LinkedHashSet, which time-complexity of insertion, deletion and finding is O(1). The important part is the difference between this data structure and your typical HashSet. LinkedHashSet additionally creates a linked list that indicates the order of the elements in terms of insertion.

If you run the above code with this loop:

for(Integer i : set) {
    System.out.print(i + " ");
}

The output will be: 1 5 4 2 0 - as you wanted.

Fureeish
  • 12,533
  • 4
  • 32
  • 62
  • This helped a lot, thanks. However I am wondering on how to do this with any equality check given by a comparator. I think your code works only for already implemented equality checks like '==' for ints, correct me if I'm wrong. –  Jan 18 '18 at 20:58
  • 1
    If you wish to use `ArrayList` and convert it into `LinkedHashSet` you should implement `equals` and `hashCode` in `YourClass`. – Fureeish Jan 18 '18 at 21:07
  • Simply `List x = Arrays.asList(1, 5, 4, 2, 2, 0, 1, 4, 2);` will create that initial list, no `new ArrayList()` required. You don't really care what _kind_ of list it is; ArrayList or LinkedList etc. – Stephen P Jan 18 '18 at 21:44
  • I completely disagree. We **do** care what kind of list it is, since OP specifially mentioned `ArrayList` in his question and we should not assume that *it might not matter*. – Fureeish Jan 18 '18 at 21:49
  • The OP specification that states what needs to be accomplished will work on anything that implements the List interface; there is no reason to limit it to one specifically chosen implementation of a List. One can (and I have) return the same kind of thing that was passed in, **if** that's even a requirement (so pass in a LinkedList, get a LinkedList back; pass in an ArrayList get an ArrayList back). If you absolutely _must have_ an ArrayList (for some totally bizarre reason) you can `new ArrayList<>(noDups(origList))` – Stephen P Jan 18 '18 at 22:40
  • Comment section isn't for arguing and I'm not going to continue this discussion, but it is to be considered that OP specifically mentions `ArrayList` **six** times in his question. The fact that what he's doing with it can be accomplished by using the `List` interface instead of sticking with the ArrayList does not mean that the answer and/or question should be altered. He asked specifically for X and got an answer using X. The fact that X implements Y does not mean we all should stop considering X, even though Y would also work. I do agree though that it's worth to mention what has been said – Fureeish Jan 18 '18 at 22:48
0

You are correct that the time complexity of this would be O(N^2). However, "a list without duplicates" is the definition of a set. This question has some details on implementation and corner cases that might cover what you're interested in.

Brian Ellis
  • 70
  • 1
  • 5