Eliminating array rows that do not meet a matching criterion

Question

Consider an array, M, made up of pairs of elements. (I've used extra spaces below to emphasize that we are dealing with element PAIRS.) The actual arrays will have a large number of rows and 4, 6, 8, or 10 columns.

import numpy as np

M = np.array([[1,3,  2,1,  4,2,  3,3],
              [3,5,  6,9,  5,1,  3,4],
              [1,3,  2,4,  3,4,  7,2],
              [4,5,  1,2,  2,1,  2,3],
              [6,4,  4,1,  6,1,  4,7],
              [6,7,  7,6,  9,7,  6,2],
              [5,3,  1,5,  3,3,  3,3]])

PROBLEM: I want to eliminate every row of M that contains an element pair sharing no element with any of the other pairs in that row.

In array M, the 2nd row and the 4th row should be eliminated. Here's why:
2nd row: the pair (6,9) has no common element with (3,5), (5,1), or (3,4)
4th row: the pair (4,5) has no common element with (1,2), (2,1), or (2,3)
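For reference, the criterion above can be written as a plain loop. This is only a sketch to pin down the rule, not a fast solution; the helper name `row_is_valid` and the variable `keep` are hypothetical and not part of the original post.

```python
import numpy as np

def row_is_valid(row):
    # Hypothetical helper: split a flat row into consecutive (x, y) pairs,
    # stored as sets so that "shares an element" becomes a set intersection.
    pairs = [set(row[i:i + 2]) for i in range(0, len(row), 2)]
    # Every pair must share at least one element with some *other* pair.
    return all(
        any(p & q for j, q in enumerate(pairs) if j != i)
        for i, p in enumerate(pairs)
    )

M = np.array([[1, 3, 2, 1, 4, 2, 3, 3],
              [3, 5, 6, 9, 5, 1, 3, 4],
              [1, 3, 2, 4, 3, 4, 7, 2],
              [4, 5, 1, 2, 2, 1, 2, 3],
              [6, 4, 4, 1, 6, 1, 4, 7],
              [6, 7, 7, 6, 9, 7, 6, 2],
              [5, 3, 1, 5, 3, 3, 3, 3]])

# Boolean mask over rows; the 2nd and 4th rows come out False.
keep = np.array([row_is_valid(r) for r in M.tolist()])
print(M[keep])
```

The loop visits every row in Python, so it will be slow on large arrays, but it is handy for checking a vectorized version against.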

I'm sure there's a nice broadcasting solution, but I can't see it.

score 1 · Accepted Answer · answered Nov 13 '20 at 01:53

This is a broadcasting solution. I hope it is self-explanatory:

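# View each row as a sequence of pairs: shape (rows, pairs, 2)
a = M.reshape(M.shape[0],-1,2)

# False on the diagonal, so a pair is never compared with itself
mask = ~np.eye(a.shape[1], dtype=bool)[...,None]

# Compare every pair with every other pair, element by element:
# first in the same order, then with the second pair reversed.
# A row is valid when every pair matches at least one other pair.
is_valid = (((a[...,None,:]==a[:,None,...])&mask).any(axis=(-1,-2))
            |((a[...,None,:]==a[:,None,:,::-1])&mask).any(axis=(-1,-2))
           ).all(-1)

M[is_valid]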

Output:

array([[1, 3, 2, 1, 4, 2, 3, 3],
       [1, 3, 2, 4, 3, 4, 7, 2],
       [6, 4, 4, 1, 6, 1, 4, 7],
       [6, 7, 7, 6, 9, 7, 6, 2],
       [5, 3, 1, 5, 3, 3, 3, 3]])
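To see what the broadcast is doing, it can help to split the one-liner into named steps and look at the shapes. The following is a sketch on just the first two rows of M; the intermediate names (`left`, `right`, `same`, `swapped`, `off_diag`, `shares`) are mine and do not appear in the answer.

```python
import numpy as np

# First two rows of M from the question: one valid row, one invalid row.
M = np.array([[1, 3, 2, 1, 4, 2, 3, 3],
              [3, 5, 6, 9, 5, 1, 3, 4]])

a = M.reshape(M.shape[0], -1, 2)        # (rows, pairs, 2)      -> (2, 4, 2)
left = a[..., None, :]                  # pair i along axis 1   -> (2, 4, 1, 2)
right = a[:, None, ...]                 # pair j along axis 2   -> (2, 1, 4, 2)

same = left == right                    # x_i == x_j, y_i == y_j -> (2, 4, 4, 2)
swapped = left == a[:, None, :, ::-1]   # x_i == y_j, y_i == x_j -> (2, 4, 4, 2)

# False on the diagonal so that i == j comparisons are ignored.
off_diag = ~np.eye(a.shape[1], dtype=bool)[..., None]   # (4, 4, 1)

# shares[r, i] is True when pair i of row r shares an element with another pair.
shares = ((same | swapped) & off_diag).any(axis=(-1, -2))  # (2, 4)
is_valid = shares.all(-1)

print(shares)
print(is_valid)
```

Combining `same | swapped` before applying the mask gives the same result as the two separate `.any(...)` calls joined with `|` in the answer, since the mask and the reduction axes are identical in both terms.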

A variation of this problem was posted on Dec 1 by user109387. — user109387, Dec 01 '20 at 22:10

Akshay Sehgal · Answer 2 · 2020-12-02T00:14:45.217

1

Another way to solve this is the following:

M = np.array([[1,3,  2,1,  4,2,  3,3],
              [3,5,  6,9,  5,1,  3,4],
              [1,3,  2,4,  3,4,  7,2],
              [4,5,  1,2,  2,1,  2,3],
              [6,4,  4,1,  6,1,  4,7],
              [6,7,  7,6,  9,7,  6,2],
              [5,3,  1,5,  3,3,  3,3]])

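# View each row as a sequence of pairs: shape (rows, pairs, 2)
MM = M.reshape(M.shape[0],-1,2)

# matches_M[r, i, j] is True when pair i and pair j of row r share an element
matches_M = np.any(MM[:,:,None,:,None] == MM[:,None,:,None,:], axis=(-1,-2))
# False on the diagonal, so a pair's match with itself is ignored
mask = ~np.eye(MM.shape[1], dtype=bool)[None,:]

# Keep rows in which every pair matches at least one other pair
is_valid = np.all(np.any(matches_M&mask, axis=-1), axis=-1)
M[is_valid]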

Output:

array([[1, 3, 2, 1, 4, 2, 3, 3],
       [1, 3, 2, 4, 3, 4, 7, 2],
       [6, 4, 4, 1, 6, 1, 4, 7],
       [6, 7, 7, 6, 9, 7, 6, 2],
       [5, 3, 1, 5, 3, 3, 3, 3]])

edited Dec 02 '20 at 00:14

answered Dec 01 '20 at 23:58


Nice. I'll do some time tests to compare the methods. – user109387 Dec 02 '20 at 00:01
Mine would be slightly slower, since I extract the 'match' matrix (7,3,3) so that it can handle not just this test but also the one here: https://stackoverflow.com/questions/65099156/eliminating-array-rows-that-fail-to-meet-two-conditions – Akshay Sehgal Dec 02 '20 at 00:08
Simplified the code a little, taking inspiration from @Quang Hoang's more efficient answer. – Akshay Sehgal Dec 02 '20 at 00:15
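The time tests mentioned in the comments could be set up roughly as below. This is a sketch only: the wrapper names `valid_rows_a` and `valid_rows_b`, the random test array `big`, and its size are assumptions of mine, and no timings are claimed here since they depend on the machine and NumPy version.

```python
import timeit
import numpy as np

def valid_rows_a(M):
    # Hypothetical wrapper around the accepted answer's code.
    a = M.reshape(M.shape[0], -1, 2)
    mask = ~np.eye(a.shape[1], dtype=bool)[..., None]
    return (((a[..., None, :] == a[:, None, ...]) & mask).any(axis=(-1, -2))
            | ((a[..., None, :] == a[:, None, :, ::-1]) & mask).any(axis=(-1, -2))
            ).all(-1)

def valid_rows_b(M):
    # Hypothetical wrapper around the second answer's code.
    MM = M.reshape(M.shape[0], -1, 2)
    matches = np.any(MM[:, :, None, :, None] == MM[:, None, :, None, :],
                     axis=(-1, -2))
    mask = ~np.eye(MM.shape[1], dtype=bool)[None, :]
    return np.all(np.any(matches & mask, axis=-1), axis=-1)

# Random stand-in for the "large number of rows" case (assumed size and value range).
rng = np.random.default_rng(0)
big = rng.integers(1, 10, size=(10_000, 8))

# Both methods should select exactly the same rows before timing is meaningful.
agree = np.array_equal(valid_rows_a(big), valid_rows_b(big))
print("methods agree:", agree)

for f in (valid_rows_a, valid_rows_b):
    seconds = timeit.timeit(lambda: f(big), number=3) / 3
    print(f.__name__, "mean seconds per call:", seconds)
```

Checking agreement first guards against comparing the speed of two functions that do not compute the same mask.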
