1

I need to find non similar rows in matrix and return set of such rows.

A rows is said to be similar if the sets of numbers occurring in these rows coincide.

Example: origin:

1 2 2 4 4
4 2 1 4
3 2 4 1 5 8

expected result:

1 2 2 4 4
3 2 4 1 5 8

My ideas:

Clean duplicates from each row via convert two dimensional array to List>

Create new set of int[] and add row then if row was added its means that row is non similar.then record number of row. return created new set of rows of origin matrix. I know that I can check if element was added via Boolean return value of Add method of Set. But there is problem by forEach, that don't provide index. And I can't use expressions inside forEach. What should I do?

My code:

class NonSimilar {
    private int[][] matrix;
    private List<Set<Integer>> rows = new ArrayList<>();

    public NonSimilar (int[][] matrix) {
        this.matrix = matrix;
        for (int i = 0; i < matrix.length; i++) {
            rows.add(Arrays.stream(matrix[i]).boxed().collect(Collectors.toSet()));
        }
    }

    public Set<int[]> getNonSimilarRows() {
        Set<Set<Integer>> nonSimularRows = new HashSet<>();
        rows.forEach(item -> nonSimularRows.add(item));
        // Now I have to check successfully added rows numbers and construct new Set from Origin matrix
        return new HashSet<int[]>();
    }
}

Ok. I replaced forEach with for iteration and now all works correctly.

  public Set<int[]> getNonSimilarRows() {
        Set<Set<Integer>> nonSimularRows = new HashSet<>();
        //rows.forEach(item -> nonSimularRows.add(item));
        int index = -1;
        ArrayList<Integer> indexes = new ArrayList<>();
        for (Set<Integer> item : rows) {
            index++;
            if (nonSimularRows.add(item)) {
                indexes.add(index);
            }
        }
        HashSet<int[]> newSet = new HashSet<int[]>();
        for (Integer item : indexes) {
            newSet.add(matrix[item]);
        }
        return newSet;
    }

Anyway code looks very ugly and I want to get advice how I can refactor code with modern approaches like forEach and Stream API.

Dmitry Sokolov
  • 383
  • 2
  • 12
  • why is the expected result `1 2 2 4 4` and `3 2 4 1 5 8` but not `4 2 1 4` ? – Ousmane D. Dec 30 '18 at 15:00
  • 1
    @Aomine I believe `1 2 2 4 4` contain the same elements as `4 2 1 4` when broken down to a set. Both would be `{1, 2, 4}`. I'm assuming the OP wants to the keep the first one found. Correct me If I'm wrong. – RoadRunner Dec 30 '18 at 15:02
  • Because first and second rows are similar, but first and third are not. I need to get not similar rows. – Dmitry Sokolov Dec 30 '18 at 15:03
  • Please define how the rows are similar. Is the similarity based on what RoadRunner said? – IPat Dec 30 '18 at 15:05
  • A rows is said to be similar if the sets of numbers occurring in these rows coincide – Dmitry Sokolov Dec 30 '18 at 15:12
  • @Dmitry Sokolov would the result with the rows `1 2 4` and `3 2 4 1 5 8` be accepted also? Or do they have to be original rows? If yes, would the result with the rows `4 2 1 4` and `3 2 4 1 5 8` be accepted or does it have to be the first row encountered? – Ricola Dec 30 '18 at 15:30
  • You might have more luck asking in https://codereview.stackexchange.com/ if your code is working but could be better written. – Eric Duminil Dec 30 '18 at 15:30
  • It's hard to answer what is accepted because I don't have so much examples. From Unit test: matrix1 = {{},{1, 2, 2, 4, 4}, {4, 2, 1, 4}, {3, 2, 4, 1, 5, 8}, {2, 3, 1, 4, 1, 5, 8}, {1, 8, 1, 1, 8}}; Set NonSimilar = new MatrixNonSimilarRows(matrix1).getNonSimilarRows(); contains(matrix1[0])); contains(matrix1[1])); contains(matrix1[3])); contains(matrix1[5])); – Dmitry Sokolov Dec 30 '18 at 15:36
  • And: int[][] matrix2 = {{1, 2, 2, 4, 4}, {4, 5, 1, 4}, {2, 4, 1}, {2, 4, 1, 4, 1, 2}, {1, 8, 1, 1, 8}}; Set nonSimilarRows2 = new NonSimilar (matrix2).getNonSimilarRows(); contains(matrix2[0])) contains(matrix2[1])); contains(matrix2[4])); – Dmitry Sokolov Dec 30 '18 at 15:36
  • Well if you would like us to help you you have to know precisely what your method can do and cannot do. – Ricola Dec 30 '18 at 15:37

4 Answers4

2

You only need 2 lines of code to remove all "similar" rows:

Set<Set<Integer>> sets = new HashSet<>();

List<int[]> nonSimilar = Arrays.stream(matrix)
    .filter(row -> sets.add(Arrays.stream(row).boxed().collect(Collectors.toSet())))
    .collect(Collectors.toList());

The add() method of Set returns true if the set was changed - ie if the element being added is not already in the set, so we can use that as a filter.

List is chosen as the output of the stream to preserve order (a requirement that seems to be implied by the example data).

I leave it to the reader to convert List<int[]> to whatever output is required, because that's unimportant to the question/answer.


Some test code:

int[][] matrix = {{1, 2, 2, 4, 4},{4, 2, 1, 4}, {3, 2, 4, 1, 5, 8}};
Set<Set<Integer>> sets = new HashSet<>();

List<int[]> nonSimilar = Arrays.stream(matrix)
    .filter(row -> sets.add(Arrays.stream(row).boxed().collect(Collectors.toSet())))
    .collect(Collectors.toList());

nonSimilar.stream().map(Arrays::toString).forEach(System.out::println);

Output:

[1, 2, 2, 4, 4]
[3, 2, 4, 1, 5, 8]

See live demo.

Bohemian
  • 412,405
  • 93
  • 575
  • 722
1

Let's just say that it you need to give the first non-duplicate rows of the existing matrix. Then instead of keeping the indexes in a separate list, you could use a Map for which the unique key is the set of numbers in a row and the value is the row itself. Here is the complete class with the main method to test it :

public class NonSimilar {
    private final int[][] matrix;

    public NonSimilar(int[][] matrix) {
        this.matrix = matrix;
    }

    public Set<int[]> getNonSimilarRows() {
        Map<Set<Integer>, int[]> map = new HashMap<>();
        for (int[] row : matrix) {
            map.putIfAbsent(convertRowToSet(row), row);
        }
        return new HashSet<>(map.values());
    }

    public Set<Integer> convertRowToSet(int[] row){
        return Arrays.stream(row).boxed().collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        int[][] matrix = {{1, 2, 2, 4, 4}, {4, 2, 1, 4}, {3, 2, 4, 1, 5, 8}};
        Set<int[]> result = new NonSimilar(matrix).getNonSimilarRows();

        result.forEach(row -> System.out.println(Arrays.toString(row)));
    }
}

Now you might say that it prints

3 2 4 1 5 8
1 2 2 4 4

instead of

1 2 2 4 4
3 2 4 1 5 8

That's because the result is a Set and a set doesn't have the concept of order. If you really want it to be printed in the correct order, you can use a LinkedHashMap and return a LinkedHashSet.


NOTE : you can even make it shorter by using Collectors.toMap:

public Set<int[]> getNonSimilarRows() {
    Map<Set<Integer>, int[]> map = Arrays.stream(matrix)
            .collect(Collectors.toMap(this::convertRowToSet, Function.identity(), (r1, r2) -> null));
    return new HashSet<>(map.values());
}

(r1, r2) -> r1 is to state that you accept duplicate keys and that you should keep the first value encountered. In the case of you want to keep the last value encountered, you can replace it by (r1, r2) -> r2.

Ricola
  • 2,621
  • 12
  • 22
1

With this

you could write it like this:

public class NonSimilarRowsTest {
  @Test
  public void test() {
    int[][] matrix = {{1, 2, 2, 4, 4}, {4, 2, 1, 4}, {3, 2, 4, 1, 5, 8}};
    int[][] expected = {{1, 2, 2, 4, 4}, {3, 2, 4, 1, 5, 8}};
    assertEquals(expected, nonSimilarRows(matrix));
  }

  int[][] nonSimilarRows(int[][] matrix) {
    Set<Set<Integer>> rows = new HashSet<>();
    int[][] result = new int[matrix.length][];
    int length = 0;

    for (int[] row : matrix) {
      if (rows.add(toSet(row))) {
        result[length++] = row;
      }
    }

    return Arrays.copyOf(result, length);
  }

  Set<Integer> toSet(int[] array) {
    return Arrays.stream(array).boxed().collect(Collectors.toSet());
  }
}
G. Fiedler
  • 664
  • 4
  • 12
0

Here's another solution that maintains an unordered set that keeps tracks of duplicate rows and also maintains order by storing the results in list:

import java.util.Set;
import java.util.HashSet;
import java.util.List;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.stream.Collectors;

public class Test {

    private static final int[][] rows = new int[][] {
        { 1, 2, 2, 4, 4 },
        { 4, 2, 1, 4 },
        { 3, 2, 4, 1, 5, 8 }
    };

    private static Set<Set<Integer>> seenRows = new HashSet<>();
    private static List<int[]> uniqueRows = new ArrayList<>();

    public static void main(String[] args) {
        for (int[] row : rows) {
            Set<Integer> uniqueNumbers = Arrays.stream(row).boxed().collect(Collectors.toSet());
            if (!seenRows.contains(uniqueNumbers)) {
                uniqueRows.add(row);
                seenRows.add(uniqueNumbers);
            }
        }

        for (int[] row : uniqueRows) {
            System.out.println(Arrays.toString(row));
        }
    }
}

Output:

[1, 2, 2, 4, 4]
[3, 2, 4, 1, 5, 8]
RoadRunner
  • 25,803
  • 6
  • 42
  • 75