Removing duplicates in an arraylist

Question

Just a disclaimer: I repeated my java mod for the second time so my questions may tend to be a bit simple and hopefully I don't sound too stupid.

Write a method removeDuplicates that takes as a parameter a sorted ArrayList of Strings and that eliminates any duplicates from the list. For example, suppose that a variable called list contains the following values: {"be", "be", "is", "not", "or", "question", "that", "the", "to", "to"} After calling removeDuplicates(list); the list should store the following values: {"be", "is", "not", "or", "question", "that", "the", "to"}

Because the values will be sorted, all of the duplicates will be grouped together.

My attempt at this:

public static void removeDuplicates(ArrayList <String>a){
    for(int i=0;i<a.size();i++){
        String word=a.get(i);
        String word2=a.get(i+1);

        if(word.equals(word2)){
            a.remove(word);

        }
        else{
            System.out.print(word);

        }
    }
}

The issue is that when I call it with:

["duplicate", "duplicate", "duplicate", "duplicate", "duplicate"]

it returns indexoutofbound. I understand it has something to do with i=i-1 with reference to the remove method. Tried inserting it here and there but it doesn't work. But I am very puzzled as this works with my code. When I call it with:

["be", "be", "is", "not", "or", "question", "that", "the", "to", "to"]

it works.

Does this help? http://stackoverflow.com/questions/13429119/get-unique-values-from-arraylist-in-java — Aiias, Apr 07 '13 at 05:54
Why cant you use a Set for the same. This will not allow you to enter duplicate values at first place itself. — Ankur Shanbhag, Apr 07 '13 at 05:54
Hi aiias,it doesn't really help.I am not similar with the set method.Will prefer something not related to set since it is not taught yet — user2179615, Apr 07 '13 at 05:56
As Ankur noted, you should use a `Set` for uniqueness. If you still want it to be ordered, then use a `SortedSet` (e.g., `TreeSet`). — pickypg, Apr 07 '13 at 05:57
@user2179615 : It is similar to a List. Not much difference from implementation point of view. Just refer some sample example you can easily understand it (given you understand list :-)) — Ankur Shanbhag, Apr 07 '13 at 06:01
@user2179615 it could be better if you use Set for your implementation but if you have List check answer. — Sumit Singh, Apr 07 '13 at 07:55
possible duplicate of [Java - Removing duplicates in an ArrayList](http://stackoverflow.com/questions/2435156/java-removing-duplicates-in-an-arraylist) — AZ_, Sep 17 '13 at 07:53

score 1 · Answer 1 · edited Apr 07 '13 at 06:01

1

Your implementation has flaws.

String word=a.get(i);
String word2=a.get(i+1);

will give out of bound when u reach last element.

Secondly you are removing elements as you are iterating directly from arraylist , this will not work. You iterator instead.

edited Apr 07 '13 at 06:01

Sumit Singh

15,743
6
59
89

answered Apr 07 '13 at 05:59

Lokesh

7,810
6
48
78

score 1 · Answer 2 · answered Apr 07 '13 at 07:55

I would suggest you change return type to ArrayList<String> and use Set to eliminate duplicates . Here's how :

public static ArrayList<String> removeDuplicates(ArrayList <String>a){
    return new ArrayList<String>(new HashSet<String>(a));
}

Or , in your current code change upper limit of for loop to a.size()-1:

for(int i=0;i<a.size()-1;i++) // this should prevent arrayindexoutofbound exception.

BobTheBuilder · Answer 3 · 2013-04-07T06:30:50.503

0

You have two errors: First error is you try to access non-existing object. When i = a.size(), String word2=a.get(i+1) which does not exist!

Another error is removing elements while iterating over the list.

You should use iterator instead.

A way to fix it without using iterator is: Use:

for(int i=0;i<a.size() - 1;i++){

and:

if(word.equals(word2)){
    a.remove(word);
    i--;
}

edited Apr 07 '13 at 06:30

answered Apr 07 '13 at 05:59

BobTheBuilder

18,858
6
40
61

Hmm,it does not work for ["duplicate", "duplicate", "duplicate", "duplicate", "duplicate"] – user2179615 Apr 07 '13 at 06:04
new error when i call for ["duplicate", "duplicate", "duplicate", "duplicate", "duplicate"],it gives back 3 [duplicate, duplicate, duplicate] – user2179615 Apr 07 '13 at 06:17
That's because you remove elements while you iterate over them. Ill fix my answer – BobTheBuilder Apr 07 '13 at 06:19

score 0 · Answer 4 · answered Apr 07 '13 at 05:59

0

You can use Set which will remove duplicate elements when adding to it

answered Apr 07 '13 at 05:59

thoitbk

269
2
9
21

This might remove too many elements, per the constraints. Elements that appear such as [A, B, B, A] would be considered an invalid input and shouldn't be subject to duplicate removal. – Makoto Apr 07 '13 at 06:56

score 0 · Answer 5 · edited Jan 10 '14 at 05:26

0

Your for loop should be i < a.size() - 1.

Let your size is 4. When you iterate for i = 3, you will get indexoutofbound for word2, trying to access value for index 4 which will be actually from 0 to 3rd index.

edited Jan 10 '14 at 05:26

Aniket Kulkarni

12,825
9
67
90

answered Apr 07 '13 at 06:02

Mohan Raj B

1,015
7
14

score 0 · Answer 6 · answered Apr 07 '13 at 06:16

void unique (ArrayList<String> a)
{
  if( a.length() == 0 )
      return;

  int result = 0;
  int first = 0;
  int last = a.length();
  while (++first<last)
  {
      String r = a.get(result);
      String cur = a.get(first);
      if( !cur.euqals(r) )
          a.set(++result,cur);
  }
  a.removeRange(++result,last);
}

I hope this code block can help you.

score 0 · Answer 7 · answered Apr 07 '13 at 06:55

Alright, so I'm going to throw syntax and a fair bit of list iteration concepts at you. Brace yourself and have your Java 7 API handy.

A solution to this problem is as follows, in plain steps:

Iterate through the list.
Check adjacent elements in the list.
- If they match, remove it.
- Otherwise, let it alone.
Return the resultant non-duplicated list.

Assumption made:

A list that has a duplicate element that isn't adjacent to other duplicate elements is presumed to exhibit non-Set behavior - that is, if I had input [A, B, B, A] I would expect [A, B A] as output. This is why I do not recommend using a Set for this.

There is a word of caution on just using remove() - if this list is accessed concurrently, then you will run into ConcurrentModificationException! A preferred and slightly cleaner approach would be to use the Iterator or ListIterator interfaces instead.

We have four cases to encounter before we iterate:

Empty (no elements) - should be disallowed, since we're [virtually] guaranteed that we won't have an empty list
Singleton (only one element, no duplicates)
Binary (two elements, one might be a duplicate)
Poly (n > 2 elements)

There is an edge case we have to account for - more than three repeated elements. What this means is, while we're iterating, we have to look at both the previous and the next element to determine if we should remove it.

Take, for instance, this example input:

[A, A, A, B, C, D]

If we iterate in the naive method (looking at i+1 while advancing), then we'll skip an element entirely. The above resultant without looking both left and right would be:

[A, A, B, C, D]

To get around this, we use ListIterator, which supports previous operations.

One run through this with the naive previous approach would yield worse results than before - because we've reset our current cursor's position, and we've already examined it, the next node we'll advance to will be considered a duplicate in error! (Ironically, you wouldn't get rid of the first couple of duplicates.)

To resolve that, we reset the cursor to the original spot we were at before we looked left.

Here's the solution. It works on any size list, with respect to our defined constraints and expected behavior above.

public List<String> removeDuplicates(final ArrayList<String> dupeList) {
    if(dupeList.size() == 0) {
        throw new IllegalArgumentException("Zero-length list == evil");
    }
    ListIterator<String> li = dupeList.listIterator();
    String w1;
    String w2;
    if(dupeList.size() == 1) {
        return dupeList;
    } else if(dupeList.size() == 2) {
        w1 = li.next();
        w2 = li.next();
        if(w1.equals(w2)) {
            li.remove();
        }
    } else {
        while(li.hasNext()) {
            if(li.hasPrevious()) {
                w1 = li.previous();
                li.next(); // explained a bit above
            } else {
                w1 = li.next();
            }
            if(li.hasNext()) {
                w2 = li.next();
                if(w1.equals(w2)) {
                    li.remove();
                }
            }
        }
    }
    return dupeList;
}

Removing duplicates in an arraylist

7 Answers7