3

I am performing some maintenance tasks on an old system. I have an ArrayList that contains the following values:

a,b,12
c,d,3
b,a,12
d,e,3
a,b,12

I used the following code to remove duplicate values from the ArrayList:

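ArrayList<String> arList;
  public static void removeDuplicate(ArrayList<String> arlList)
  {
   // A HashSet keeps one copy of each exactly equal string (iteration order is not preserved)
   HashSet<String> h = new HashSet<String>(arlList);
   arlList.clear();
   arlList.addAll(h);
  }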

It works fine when the duplicates are exactly identical. However, if you look at my data carefully, some entries are duplicates whose fields appear in a different order. For example, a,b,12 and b,a,12 are the same entry in a different order.

How can I remove this kind of duplicate entry from the ArrayList?

Thanks

Tweet

5 Answers

3

Assuming the entries are Strings, you can sort the characters of each entry to build a normalized key and run the duplicate check on that key. Store each entry in a map under its key and use containsKey(key) to see whether an equivalent entry already exists.

EDIT: added a complete code example.

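import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class Test {

    public static void main(String[] args) {
        Test test = new Test();
        List<String> someList = new ArrayList<String>();
        someList.add("d,e,3");
        someList.add("a,b,12");
        someList.add("c,d,3");
        someList.add("b,a,12");
        someList.add("a,b,12");
        // A TreeMap iterates in sorted key order, so the result is ordered by
        // normalized key rather than by position in the input list.
        Map<String,String> dupMap = new TreeMap<String,String>();
        String key = null;
        for(String some:someList){
            key = test.sort(some);
            // Keep only the first entry seen for each normalized key
            if(key!=null && key.trim().length()>0 && !dupMap.containsKey(key)){
                dupMap.put(key, some);
            }
        }
        List<String> uniqueList = new ArrayList<String>(dupMap.values());
        for(String unique:uniqueList){
            System.out.println(unique);
        }

    }

    // Returns the characters of the entry in sorted order, or null for a blank entry.
    // Caveat: this compares characters, not comma-separated fields, so entries such
    // as "a,b,12" and "a,b,21" produce the same key.
    private String sort(String key) {
      if(key!=null && key.trim().length()>0){
        char[] keys = key.toCharArray();
        Arrays.sort(keys);
        return String.valueOf(keys);
      }
      return null;
   }
}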

Prints:

a,b,12
c,d,3
d,e,3
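
Sorting individual characters treats any two entries built from the same characters as equal, so a,b,12 and a,b,21 would collapse into one. Below is a minimal sketch of a token-based variant. It assumes the entries are comma-separated and that the first occurrence should stay at its original position. The class name `TokenKeyDedup` and its method names are hypothetical, and a `LinkedHashMap` is swapped in for the `TreeMap` so that insertion order is preserved:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TokenKeyDedup {

    // Builds an order-independent key by sorting the comma-separated tokens
    // (assumption: fields are separated by commas and contain none themselves).
    public static String normalize(String entry) {
        String[] tokens = entry.split(",");
        Arrays.sort(tokens);
        return Arrays.toString(tokens);
    }

    // Keeps the first entry seen for each normalized key, in input order.
    public static List<String> removeDuplicates(List<String> entries) {
        Map<String, String> firstSeen = new LinkedHashMap<String, String>();
        for (String entry : entries) {
            String key = normalize(entry);
            if (!firstSeen.containsKey(key)) {
                firstSeen.put(key, entry);
            }
        }
        return new ArrayList<String>(firstSeen.values());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("d,e,3", "a,b,12", "c,d,3", "b,a,12", "a,b,12");
        System.out.println(removeDuplicates(input)); // [d,e,3, a,b,12, c,d,3]
    }
}
```

This does one split and one small sort per entry, so the work grows linearly with the list size. Whether that is fast enough for a few hundred thousand records is something to measure rather than assume.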

CoolBeans
  • The first solution is expensive in terms of performance, and it can also cause some disorder problems. The second solution does not work because contains(key) will not match entries in a different order. – Tweet Jan 24 '11 at 03:26
  • Why do you think String.split is too expensive? I don't think it could be done faster. What disorder problems? Do you care about the order of items in your strings or not? Shouldn't you use List> or better Set> instead of List? Your duplicate removal destroys the order anyway, so why use List? – maaartinus Jan 24 '11 at 03:31
  • I do care about the order within the string. This is a small part of the main processing module of the software; if I start splitting, it may cost me performance. All the ArrayLists contain a couple of hundred thousand records. – Tweet Jan 24 '11 at 03:34
  • I am open to using any data structure other than ArrayList if it can solve my problem. Could you provide pseudocode? – Tweet Jan 24 '11 at 03:36
  • @Tweety - I will provide you code details. It's useful to know that you do care about the order. – CoolBeans Jan 24 '11 at 03:38
  • @Tweety - added code sample. Hopefully this gives you a start. – CoolBeans Jan 24 '11 at 04:06
2

Wrap each element in a Foo class instead of using String; the rest of the removeDuplicate code stays the same:

import java.util.ArrayList;
import java.util.List;

public class Foo {
    private final String s1;
    private final String s2;
    private final String s3;

    public Foo(String s1, String s2, String s3) {
        this.s1 = s1;
        this.s2 = s2;
        this.s3 = s3;
    }

    @Override
    public int hashCode() {
        // Must not depend on field order, otherwise 'a,b,12' and 'b,a,12'
        // get different hash codes and a HashSet never compares them.
        // Addition is commutative, so the sum is order-independent.
        return hash(s1) + hash(s2) + hash(s3);
    }

    private static int hash(String s) {
        return (s == null) ? 0 : s.hashCode();
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        Foo other = (Foo) obj;
        // Notice here: 'a,b,12' and 'b,a,12' will be the same.
        // Match each field against the other's fields, removing it once matched,
        // so repeated values such as 'a,a,b' and 'a,b,b' are not treated as equal.
        List<String> remaining = other.fieldsAsList();
        for (String field : fieldsAsList()) {
            if (!remaining.remove(field)) {
                return false;
            }
        }
        return true;
    }

    private List<String> fieldsAsList() {
        ArrayList<String> l = new ArrayList<String>(3);
        l.add(s1);
        l.add(s2);
        l.add(s3);
        return l;
    }
}

Then arList becomes an ArrayList<Foo>.

卢声远 Shengyuan Lu
1

Try this simple solution...(No Set interface used)

https://stackoverflow.com/a/19434592/369035

Community
CarlJohn
1

Create a class that wraps a row string (triplet) and provides your equality semantics by implementing the equals() and hashCode() methods. Then use the HashSet approach to remove duplicates.
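
A minimal sketch of that idea follows. It assumes rows are comma-separated and that two rows count as equal when they hold the same fields in any order. The names `RowDedup` and `Row` are hypothetical, and a `LinkedHashSet` is used in place of a plain `HashSet` so that the first occurrence of each row keeps its position:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class RowDedup {

    // Hypothetical wrapper: remembers the original text but compares by sorted fields.
    static final class Row {
        private final String original;
        private final List<String> sortedTokens;

        Row(String original) {
            this.original = original;
            String[] tokens = original.split(",");
            Arrays.sort(tokens);
            this.sortedTokens = Arrays.asList(tokens);
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof Row && sortedTokens.equals(((Row) obj).sortedTokens);
        }

        @Override
        public int hashCode() {
            // Derived from the same sorted fields that equals() compares.
            return sortedTokens.hashCode();
        }

        @Override
        public String toString() {
            return original;
        }
    }

    // Returns the rows with order-insensitive duplicates removed, first occurrence kept.
    public static List<String> removeDuplicates(List<String> rows) {
        Set<Row> unique = new LinkedHashSet<Row>();
        for (String row : rows) {
            unique.add(new Row(row)); // add() leaves an existing equal element in place
        }
        List<String> result = new ArrayList<String>(unique.size());
        for (Row row : unique) {
            result.add(row.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a,b,12", "c,d,3", "b,a,12", "d,e,3", "a,b,12");
        System.out.println(removeDuplicates(rows)); // [a,b,12, c,d,3, d,e,3]
    }
}
```

Because equals() and hashCode() are both computed from the same sorted field list, equal rows always share a hash code, which is what the set relies on.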

Konstantin Komissarchik
-1

ArrayList itself offers no direct way to remove duplicate elements. We can achieve it with sets, because sets don't allow duplicates, so it is better to use the HashSet or LinkedHashSet classes. See reference.
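
As a rough illustration of the set-based approach, the sketch below uses a `LinkedHashSet` so the surviving elements keep their original order. The class name `ExactDedup` is hypothetical. Note that this removes only exact duplicates: b,a,12 is still treated as different from a,b,12.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

public class ExactDedup {

    // Removes exact duplicates in place, keeping the first occurrence of each element.
    public static <T> void removeDuplicates(List<T> list) {
        LinkedHashSet<T> unique = new LinkedHashSet<T>(list);
        list.clear();
        list.addAll(unique);
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<String>(
                Arrays.asList("a,b,12", "c,d,3", "b,a,12", "d,e,3", "a,b,12"));
        removeDuplicates(list);
        System.out.println(list); // [a,b,12, c,d,3, b,a,12, d,e,3]
    }
}
```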

Eric Platon