
I searched for this "problem" and only came across questions asking how to remove real duplicates from a list. What I want is to remove every object that is equal to another object in the list according to a custom .equals() method.

Here is an example class with the equals() method overridden:

    private static class Test {

        int x;
        float[] data;

        public Test(int x, float[] data) {
            this.x = x;
            this.data = data;
        }

        @Override
        public boolean equals(Object obj) {
            if (obj instanceof Test) {
                Test compare = (Test) obj;
                if (
                        compare.x == this.x &&
                        Arrays.equals(compare.data, this.data)
                ) {
                    return true;
                }
            }
            return false;
        }

    }

Of course, the following two objects are not the same instance (so they are not duplicates that a HashSet, for example, would eliminate on its own):

    Test test1 = new Test(3, new float[]{0.1f, 0.4f});
    Test test2 = new Test(3, new float[]{0.1f, 0.4f});

But in my case they are duplicates, and I want to keep only one of them.

I came up with this approach:

    Test test1 = new Test(3, new float[]{0.1f, 0.4f});
    Test test2 = new Test(3, new float[]{0.1f, 0.4f});
    Test test3 = new Test(2, new float[]{0.1f, 0.5f});

    List<Test> list = new ArrayList<>();
    list.add(test1);
    list.add(test2);
    list.add(test3);

    Set<Test> noDuplicates = new HashSet<>();

    for (Test testLoop : list) {

        boolean alreadyIn = false;

        for (Test testCheck : noDuplicates) {
            if (testLoop.equals(testCheck)) {
                alreadyIn = true;
                break;
            }
        }

        if (!alreadyIn) {
            noDuplicates.add(testLoop);
        }

    }

This works fine, but it performs poorly: every element is compared against every element already kept, so the cost grows roughly quadratically with the list size. That matters in my case because the list can be big.

Now my question: Is there a more convenient approach to achieve this?

Jakob

3 Answers


I may have totally misunderstood what you need, but I think you just need to also override hashCode() so that it produces the same hash code whenever equals() returns true.

So you need a method that generates a hash code from x and data, the same fields equals() compares. If you do this, you can just add all the elements to a HashSet to remove duplicates.

Remember the rule: if you override equals(), you must also override hashCode().
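To make the rule concrete, here is a minimal sketch that assumes the Test class from the question. The wrapper class HashCodeSketch and the helper removeDuplicates are hypothetical names used only for illustration, and the hash formula is one reasonable choice among many, not the only valid one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical wrapper class, only here to keep the sketch self-contained.
class HashCodeSketch {

    static class Test {
        final int x;
        final float[] data;

        Test(int x, float[] data) {
            this.x = x;
            this.data = data;
        }

        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof Test)) {
                return false;
            }
            Test other = (Test) obj;
            return other.x == x && Arrays.equals(other.data, data);
        }

        // Built from the same fields as equals(), so equal objects share a hash code.
        @Override
        public int hashCode() {
            return 31 * x + Arrays.hashCode(data);
        }
    }

    // Hypothetical helper: the HashSet drops elements that are equal to one already present.
    static Set<Test> removeDuplicates(List<Test> list) {
        return new HashSet<>(list);
    }

    public static void main(String[] args) {
        List<Test> list = new ArrayList<>();
        list.add(new Test(3, new float[]{0.1f, 0.4f}));
        list.add(new Test(3, new float[]{0.1f, 0.4f}));
        list.add(new Test(2, new float[]{0.1f, 0.5f}));

        System.out.println(removeDuplicates(list).size()); // prints 2
    }
}
```

Each add() now costs roughly constant time on average, instead of a scan over everything already kept.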

MTilsted

By definition, a set doesn't allow duplicates.

    Set<Test> noDuplicates = new HashSet<>();
    noDuplicates.addAll(list);

EDIT: for this to work, you must define hashCode() too, not just equals().
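One thing the snippet above does not address is ordering: a HashSet makes no promise about iteration order. If you need the result as a list in the original order (an assumption on my part, the question does not say so), a LinkedHashSet is a possible variant. The class name OrderedDedupSketch and the helper dedupe are hypothetical, and Test is assumed to define both equals() and hashCode().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical wrapper class, only here to keep the sketch self-contained.
class OrderedDedupSketch {

    static class Test {
        final int x;
        final float[] data;

        Test(int x, float[] data) {
            this.x = x;
            this.data = data;
        }

        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof Test)) {
                return false;
            }
            Test other = (Test) obj;
            return other.x == x && Arrays.equals(other.data, data);
        }

        @Override
        public int hashCode() {
            return 31 * x + Arrays.hashCode(data);
        }
    }

    // Keeps the first occurrence of each element and preserves insertion order.
    static <T> List<T> dedupe(List<T> items) {
        return new ArrayList<>(new LinkedHashSet<>(items));
    }
}
```

Calling OrderedDedupSketch.dedupe(list) on the three objects from the question would return test1 followed by test3.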

Olivier

A HashSet uses hashCode() to pick the bucket for an object and then equals() to compare it with the entries in that bucket, so two objects are treated as duplicates only if both methods agree.

So you will want to override hashCode() in your Test class.

This will look like:

    private static class Test {

        int x;
        float[] data;

        ...

        @Override
        public int hashCode() {
            int hash = Arrays.hashCode(data);
            hash = hash * 31 + x;
            return hash;
        }
    }

Now if you add elements to a HashSet that holds Test objects, it will detect duplicates properly:

    Test test1 = new Test(3, new float[]{0.1f, 0.4f});
    Test test2 = new Test(3, new float[]{0.1f, 0.4f});
    Test test3 = new Test(2, new float[]{0.1f, 0.5f});

    Set<Test> noDuplicates = new HashSet<>();

    noDuplicates.add(test1);
    noDuplicates.add(test2);
    noDuplicates.add(test3);

Keep in mind that you will have to update the hashCode() function in Test every time that you add a member variable you want included in the equality check.
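If keeping the two methods in sync by hand feels error-prone, one option is to build the hash with java.util.Objects.hash, so that adding a field means adding one argument. This is only a sketch: the wrapper class ObjectsHashSketch and the helper countDistinct are hypothetical names, and Objects.hash is a swapped-in alternative to the multiply-by-31 formula, not the method used above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical wrapper class, only here to keep the sketch self-contained.
class ObjectsHashSketch {

    static class Test {
        final int x;
        final float[] data;

        Test(int x, float[] data) {
            this.x = x;
            this.data = data;
        }

        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof Test)) {
                return false;
            }
            Test other = (Test) obj;
            // Every field listed here must also feed into hashCode().
            return other.x == x && Arrays.equals(other.data, data);
        }

        @Override
        public int hashCode() {
            // The array goes through Arrays.hashCode: passing data directly
            // would hash the array reference, not its contents.
            return Objects.hash(x, Arrays.hashCode(data));
        }
    }

    // Hypothetical helper: number of elements left after the set removes duplicates.
    static int countDistinct(Test... items) {
        Set<Test> set = new HashSet<>(Arrays.asList(items));
        return set.size();
    }
}
```

With the three objects from the question, countDistinct(test1, test2, test3) would give 2.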

Credit goes to Jon Skeet for the method of combining hash codes I used above.

agillgilla