2

I have a List which contains a list of objects and I want to remove from this list all the elements which have the same values in two of their attributes. I had though about doing something like this:

List<Class1> myList;
....
Set<Class1> mySet = new HashSet<Class1>();
mySet.addAll(myList);

and overriding hash method in Class1 so it returns a number which depends only in the attributes I want to consider.

The problem is that I need to do a different filtering in another part of the application so I can't override hash method in this way (I would need two different hash methods).

What's the most efficient way of doing this filtering without overriding hash method?

Thanks

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
Javi
  • 19,387
  • 30
  • 102
  • 135
  • @Bark K. No, I can't create two classes it's the same class but from different points of views I need to consider different information to determinate whether it's the same object or not. Indeed no one of both hash() method would be correct, they would be just there to make the trick so it't not a good practice. For that reason I wanted to use something which doen't take into account hash method. – Javi Apr 20 '10 at 11:39
  • For clarity:Do you mean you want the final 'set' to have no two instances with the same values of both attr1 and attr2? – DJClayworth Apr 20 '10 at 13:39
  • @ DJClayworth exactly, that's it – Javi Apr 20 '10 at 15:29

5 Answers5

4

Overriding hashCode and equals in Class1 (just to do this) is problematic. You end up with your class having an unnatural definition of equality, which may turn out to be other for other current and future uses of the class.

Review the Comparator interface and write a Comparator<Class1> implementation to compare instances of your Class1 based on your criteria; e.g. based on those two attributes. Then instantiate a TreeSet<Class>` for duplicate detection using the TreeSet(Comparator) constructor.

EDIT

Comparing this approach with @Tom Hawtin's approach:

  • The two approaches use roughly comparable space overall. The treeset's internal nodes roughly balance the hashset's array and the wrappers that support the custom equals / hash methods.

  • The wrapper + hashset approach is O(N) in time (assuming good hashing) versus O(NlogN) for the treeset approach. So that is the way to go if the input list is likely to be large.

  • The treeset approach wins in terms of the lines of code that need to be written.

Stephen C
  • 698,415
  • 94
  • 811
  • 1,216
3

Let your Class1 implements Comparable. Then use TreeSet as in your example (i.e. use addAll method).

Roman
  • 64,384
  • 92
  • 238
  • 332
  • 1
    TreeSet was also my first thought. Though implements Comparable is not required. The possibility of a custom comparator in the TreeSet constructor makes it the perfect tool for the job. – extraneon Apr 20 '10 at 11:33
2

As an alternative to what Roman said you can have a look at this SO question about filtering using Predicates. If you use Google Collections anyway this might be a good fit.

Community
  • 1
  • 1
extraneon
  • 23,575
  • 2
  • 47
  • 51
1

I would suggest introducing a class for the concept of the parts of Class1 that you want to consider significant in this context. Then use a HashSet or HashMap.

Tom Hawtin - tackline
  • 145,806
  • 30
  • 211
  • 305
0

Sometimes programmers make things too complicated trying to use all the nice features of a language, and the answers to this question are an example. Overriding anything on the class is overkill. What you need is this:

class MyClass {
  Object attr1;
  Object attr2;
}

List<Class1> list;
Set<Class1> set=....
Set<MyClass> tempset = new HashSet<MyClass>;

for (Class1 c:list) {
  MyClass myc = new MyClass();
  myc.attr1 = c.attr1;
  myc.attr2 = c.attr2;

  if (!tempset.contains(myc)) {
    tempset.add(myc);
    set.add(c);
  }
}

Feel free to fix up minor irregulairites. There will be some issues depending on what you mean by equality for the attributes (and obvious changes if the attributes are primitive). Sometimes we need to write code, not just use the builtin libraries.

DJClayworth
  • 26,349
  • 9
  • 53
  • 79