0

So, I have two multi-dimension arrays.

    double[][] combinations = new double[10000][3];
    double[][] uniqueCombinations = new double[100][3];

Array values example:

[[1.233, 1.333, 0.76], [1.1, 1.333, 1.333], [0.9, 1.1, 0.9], [1.1, 1.333, 1.333]]

Here is what I want

[[1.233, 1.333, 0.76], [1.1, 1.333, 1.333], [0.9, 1.1, 0.9]]

I want to get all the unique arrays from combinations and populate uniqueCombinations with that.

I tried with this function, but it populates with only 5 arrays, weird!

public static double[][] removeDuplicate(double[][] matrix) {
    double[][] newMatrix = new double[matrix.length][matrix[0].length];
    int newMatrixRow = 1;

    for (int i = 0; i < matrix[0].length; i++)
        newMatrix[0][i] = matrix[0][i];

    for (int j = 1; j < matrix.length; j++) {
        List<Boolean> list = new ArrayList<>();
        for (int i = 0; newMatrix[i][0] != 0; i++) {
            boolean same = true;
            for (int col = 2; col < matrix[j].length; col++) {
                if (newMatrix[i][col] != matrix[j][col]) {
                    same = false;
                    break;
                }
            }
            list.add(same);
        }

        if (!list.contains(true)) {
            for (int i = 0; i < matrix[j].length; i++) {
                newMatrix[newMatrixRow][i] = matrix[j][i];
            }
            newMatrixRow++;
        }
    }

    int i;
    for(i = 0; newMatrix[i][0] != 0; i++);

    double finalMatrix[][] = new double[i][newMatrix[0].length];
    for (i = 0; i < finalMatrix.length; i++) {
        for (int j = 0; j < finalMatrix[i].length; j++)
            finalMatrix[i][j] = newMatrix[i][j];
    }

    return finalMatrix;
}
Tahmid
  • 315
  • 3
  • 12
  • 1
    [Be careful with `==` and `!=` when looking at `double`s.](https://stackoverflow.com/questions/322749/retain-precision-with-double-in-java) – BeUndead Nov 30 '18 at 14:20
  • 1
    TLDR; Do you know that combination is not ordered? I.e. `[1.5, 2.0]` is the same as `[2.0, 1.5]`? – zlakad Nov 30 '18 at 14:22
  • 1
    no it's not ordered. It requires to be not ordered as it's a combination. – Tahmid Nov 30 '18 at 14:26
  • 1
    O.K. check this sample of [code](https://stackoverflow.com/questions/53174866/algorithm-for-generating-lists-of-shuffled-integers-from-one-original-list-of-in/53190248#53190248), it maybe useful to you... – zlakad Nov 30 '18 at 14:29

1 Answers1

0

You can try hash table based algorithm, i.e. calculate hash for each matrix vector and save vector index in hash map with hash key. Then construct a resulting matrix based on hash table index values. For example:

   import static org.junit.Assert.assertArrayEquals;

import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import org.junit.Test;

import com.google.common.hash.HashFunction;
import com.google.common.hash.Hasher;
import com.google.common.hash.Hashing;

public class ArraysCombination {

    private static double[][] COMBINATIONS = { 
            {1.233, 1.333, 0.76 }, 
            { 1.1, 1.333, 1.333 }, 
            { 0.9, 1.1, 0.9 },
            { 1.1, 1.333, 1.333 } };


    private static double[][] uniqieCombinations(double[][] all) {
        final Map<Integer,Integer> uniqueIdx = new HashMap<>();
        // hashing can be replaced with Arrays.hashCode(all[i])
        final HashFunction hashFunction = Hashing.murmur3_32(all.length);
        for (int i = 0; i < all.length; i++) {
            final Hasher hasher = hashFunction.newHasher();
            for (int j = 0; j < all[i].length; j++) {
                hasher.putDouble(all[i][j]);
            }
            final Integer hash = hasher.hash().asInt();
            if( !uniqueIdx.containsKey(hash) ) {
                uniqueIdx.put(hash, Integer.valueOf(i));
            } 
        }
        double[][] arr = new double[uniqueIdx.size()][];
        Iterator<Integer> it = uniqueIdx.values().iterator();
        for (int i=0; i < arr.length; i++ ) {
            int idx = it.next();
            arr[i] = Arrays.copyOf( all[ idx ], all[idx].length  );
        }
        return arr;
    }



    @Test
    public void shouldFindUniqueCombinations() {
        double [][] uniqueCombination = uniqieCombinations(COMBINATIONS);
        for (double[] ds : uniqueCombination) {
            System.out.println(Arrays.toString(ds));
        }
        double[][] expected  = {{1.233, 1.333, 0.76}, {1.1, 1.333, 1.333}, {0.9, 1.1, 0.9}};
        for (int i = 0; i < expected.length; i++) {
            assertArrayEquals("Wrong unique combinations", expected[i] , uniqueCombination[i], 0 );
        }
    }

}

It is still a possibility on hash miss on huge matrix, so MurMur3A is provided by Google Guava is used instead of Arrays.hashCode(all[i])

Victor Gubin
  • 2,782
  • 10
  • 24