1

I have

 Array<Array<Array<CustomObject>>> 

and I need to eliminate the duplicates of

Array<Array<CustomObject>>

The order in the arrays doesn't matter, so I don't compare by the index, I compare the elements inside the arrays. The final result needs to keep the 3D array format. So I don't need to flatten the arrays into only one big array.

One way that I thought it might help, was by sorting the arrays. I got to sort the arrays like this:

  • the most nested (1D) array (Array<CustomObject>) -> alphabetically by name

  • the next (2D) array (Array<Array<CustomObject>>) -> alphabetically but just by the first element of the most nested (1D) array (Array<CustomObject>)

  • the outer most (3D) array (Array<Array<Array<CustomObject>>>) is sorted by the size of the 2D array (Array<Array<CustomObject>>)

By these three methods, the duplicates should stand one next to the other, but I don't know how to eliminate them without creating too many nested loops to compare the elements.

Example of output for the current implementation:

[[v1], [v2, v3]]

[[v1, v2], [v3]]

[[v1, v2], [v3]]

[[v1, v2], [v2, v3]]

[[v1], [v2, v3]]

[[v1, v2], [v2, v3]]

[[v1], [v2], [v3]]

[[v1], [v2], [v3]]

[[v1], [v2], [v3]]

[[v1], [v2], [v3]]

[[v1], [v2], [v3]]

[[v1], [v2], [v3]]

I already tried using Arrays.equal and deepequal but it gives me an error stating that the method cannot be applied to my array which is of CustomObjects.

The CustomObjects class implements Comparable and I override equals, hashCode and compareTo.

Any suggestions are welcome. Thanks!

Lara
  • 781
  • 3
  • 8
  • 21
  • See [this question](http://stackoverflow.com/questions/203984/how-do-i-remove-repeated-elements-from-arraylist). – No Name Feb 14 '16 at 03:28
  • Does your final answer have to have three dimensions? Or is it okay if the arrays are flattened? Also, does it matter which array the duplicates are removed from? EDIT: by Array>> do you just mean CustomObject[][][] ? – Austin Feb 14 '16 at 04:10
  • Yes, I need the arrays to stay in this form. I can't flatten them. I have to check if the 2D arrays contain the same elements (no matter the order). So as you can see in the example, the last 6 arrays are identical so I need to eliminate the duplicates. The same goes for arrays 2 and 3. I need to keep just one appearance of the 2D arrays. – Lara Feb 14 '16 at 13:58

3 Answers3

2

You have to find a fast answer to question "are they different".

if x.length != y.length then they are different

Fast and clean answer. So your idea to sort by size is really good.

Can we make another "simple" field to make fast answer "are different"? Yes, lets count a hashcode for each and compare them. If hashcodes are different -> objects are different.

To count a hashcode we can use https://docs.oracle.com/javase/7/docs/api/java/util/Arrays.html#deepHashCode(java.lang.Object[]) .. but it will break in your case if order within array is not important, [a,b,c] == [b,a,c] in your case, but will have different results from deepHashCode.

To fight it - sort elements of arrays before using deepHashCode.

So:

  • sort all L1 arrays,
  • sort all L2 arrays (I think you should not stop on first element in your comparator)
  • make a Map[Integer,List[Element]] and populate it for each L3 deepHashCode -> List[L3]

And later iterate:

List<Element> result = ...

for (List<Element> l : map.values()) {
   if ( l.size() > 1 ) {
       final Iterator i = l.iterator();
       Element head = i.next();
       while (i.hasNext()) {
          Element cur = i.next();
          if (comparator.compare(head,cur) == 0) {
             i.remove();
          }
          head = cur;
       }
   }
   result.addAll(l);
 }
 return result.toArray(..);
ya_pulser
  • 2,620
  • 2
  • 16
  • 21
  • 1
    It worked! I used the hashCode idea. Firstly I transformed the ArrayList>> into CustomObject[][][] and then I applied deepHashCode on CustomObject[][] and stored this into a HashMap . In the end I transformed the CustomObject[][] from the hashmap back to ArrayLists (so I returned to the needed format). – Lara Feb 14 '16 at 15:22
  • 1
    @Stranger I am happy that it helped! You got the idea from rough sketch and made working code. Great! BTW: if you have had nested ArrayList's (not arrays) as incoming structure than you have had "list.hashCode" method available and as far as I remember this hashCode in array list is a "deep hash code" and will traverse all elements in the list. But you still have to sort elements in them (Collection.sort()) to have consistent hashCode'ing. – ya_pulser Feb 14 '16 at 15:28
2

If you are simply looking for distinct elements, a simple approach is to flatten the 3D array. In Java-8, you can flatten a 3D array into a Stream<CustomObject> using the following method:

public static Stream<Object> flatten(Object[][][] arr){
    return Arrays.stream(arr).parallel()
    .flatMap(Arrays::stream)
    .flatMap(Arrays::stream);
}

Then simply call:

 CustomObject[] noDupes = flatten(arr).distinct().toArray();

EDIT: If you are you are using an earlier JVM and a 1D array is acceptable, use a Set to remove duplicates:

public <T> T[] distinct(T[][][] arr){
    Set<T> noDupes = new HashSet<>();

    for(T[][] oll: arr){
        for(T[] ol: oll){
            noDupes.addAll(Arrays.asList(ol));
        }
    }
    return (T[]) noDupes.toArray();
}
Austin
  • 8,018
  • 2
  • 31
  • 37
  • I need to compare the whole 2D arrays if they have the same elements, so I need to compare the 1D arrays in each 2D array with the other 1D arrays in the other 2D arrays. For example: in the examples given in the question the last 6 2D arrays need to be pruned because they are identical, but on the other hand the first 2D array and the second (first and second line) are not the same 2D arrays. – Lara Feb 14 '16 at 15:28
-2

Have you tried using the Collections class List, which doesn't allow duplicates? Or ArrayList, which does allow duplicates and has the method to test if equal?