I have an input matrix of unknown n x m dimensions, populated by 1s and 0s.
For example, a 5x4 matrix:
A = array(
[[1, 0, 0, 0],
[1, 0, 0, 0],
[0, 1, 1, 0],
[0, 1, 1, 0],
[1, 0, 1, 1]])
Goal
I need to create a 1 : 1 map between as many rows and columns as possible, where a row and a column can only be paired if the element at their intersection is 1.
What I mean by a 1 : 1 map is that each column and row can be used once at most.
The ideal solution has the most mappings possible, i.e. the most rows and columns used. It should also avoid exhaustive combinations or operations that do not scale well with larger matrices (in practice the dimensions should top out around 100x100, but there is no declared limit, so they could go higher).
Here's a possible outcome for the 5x4 matrix above:
array([[ 1., 0., 0., 0.],
[ 0., 0., 0., 0.],
[ 0., 0., 1., 0.],
[ 0., 1., 0., 0.],
[ 0., 0., 0., 1.]])
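As far as I can tell, the goal described above is the maximum bipartite matching problem: rows on one side, columns on the other, and an edge wherever the matrix holds a 1. Below is a minimal sketch of the augmenting-path approach (often called Kuhn's algorithm) in plain Python. The names `max_matching` and `matching_to_matrix` are made up for this illustration and do not come from any library. It is a sketch under those assumptions, not a tuned implementation.

```python
def max_matching(matrix):
    """Return a dict {row: col} pairing as many rows and columns as possible."""
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if n_rows else 0
    # col_owner[c] is the row currently matched to column c, or -1 if free
    col_owner = [-1] * n_cols

    def try_row(r, seen):
        # Try to find a column for row r, re-homing other rows if needed.
        for c in range(n_cols):
            if matrix[r][c] == 1 and not seen[c]:
                seen[c] = True
                if col_owner[c] == -1 or try_row(col_owner[c], seen):
                    col_owner[c] = r
                    return True
        return False

    for r in range(n_rows):
        try_row(r, [False] * n_cols)

    return {r: c for c, r in enumerate(col_owner) if r != -1}


def matching_to_matrix(matrix, pairs):
    """Build an output matrix with a 1 at every matched (row, col)."""
    out = [[0] * len(matrix[0]) for _ in matrix]
    for r, c in pairs.items():
        out[r][c] = 1
    return out


A = [[1, 0, 0, 0],
     [1, 0, 0, 0],
     [0, 1, 1, 0],
     [0, 1, 1, 0],
     [1, 0, 1, 1]]

pairs = max_matching(A)
result = matching_to_matrix(A, pairs)
for row in result:
    print(row)
print(len(pairs), "mappings")
```

For the 5x4 example this finds 4 mappings, which is the most possible because the first two rows can only use column 0. Each row triggers at most one scan over the 1s in the matrix, so a dense 100x100 input is cheap. The recursion depth is bounded by the number of rows, so much larger inputs might need an iterative version.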
Some more examples:
input:
0 1 1
0 1 0
0 1 1
output (one of several possible ones):
0 0 1
0 1 0
0 0 0
Another example (this shows one problem that can arise):
input:
0 1 1 1
0 1 0 0
1 1 0 0
a good output (again, one of several):
0 0 1 0
0 1 0 0
1 0 0 0
a bad output (still valid, but has fewer mappings)
0 1 0 0
0 0 0 0
1 0 0 0
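To make "valid" and "fewer mappings" concrete, here is a small sketch of a checker. It assumes the rules stated under Goal: at most one 1 per row and per column of the output, and only where the input has a 1. The function name `count_mappings` is hypothetical, chosen just for this example.

```python
def count_mappings(original, output):
    """Return the number of mappings in `output`.

    Raises ValueError if `output` is not a valid 1:1 map of `original`.
    """
    used_cols = set()
    count = 0
    for r, row in enumerate(output):
        ones = [c for c, v in enumerate(row) if v == 1]
        if len(ones) > 1:
            raise ValueError(f"row {r} is used more than once")
        for c in ones:
            if original[r][c] != 1:
                raise ValueError(f"({r}, {c}) is not a 1 in the input")
            if c in used_cols:
                raise ValueError(f"column {c} is used more than once")
            used_cols.add(c)
            count += 1
    return count


original = [[0, 1, 1, 1],
            [0, 1, 0, 0],
            [1, 1, 0, 0]]
good = [[0, 0, 1, 0],
        [0, 1, 0, 0],
        [1, 0, 0, 0]]
bad = [[0, 1, 0, 0],
       [0, 0, 0, 0],
       [1, 0, 0, 0]]

print(count_mappings(original, good))  # 3
print(count_mappings(original, bad))   # 2
```

Both outputs pass the validity checks. The "bad" one simply uses column 1 for the first row, which leaves the second row with nothing to map to.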
To better show how there can be multiple outputs:
input:
0 1 1
1 1 0
one possible output:
0 1 0
1 0 0
a second possible output:
0 0 1
0 1 0
a third possible output:
0 0 1
1 0 0
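For an input this small, the full set of best outputs can be listed by brute force. The sketch below does that purely as an illustration. It is exactly the kind of exhaustive search that would not scale to 100x100, and the name `all_maximum_maps` is made up for this example.

```python
from itertools import product


def all_maximum_maps(matrix):
    """List every largest 1:1 map as a tuple: entry r is row r's column or None."""
    # Each row may stay unused (None) or pick any column where it has a 1.
    options = [[None] + [c for c, v in enumerate(row) if v == 1]
               for row in matrix]
    best_size, best = 0, []
    for choice in product(*options):
        cols = [c for c in choice if c is not None]
        if len(cols) != len(set(cols)):
            continue  # some column is used twice, so this is not a 1:1 map
        if len(cols) > best_size:
            best_size, best = len(cols), [choice]
        elif len(cols) == best_size:
            best.append(choice)
    return best


for choice in all_maximum_maps([[0, 1, 1],
                                [1, 1, 0]]):
    print(choice)
```

This prints three tuples, one for each of the three outputs shown above: `(1, 0)`, `(2, 0)` and `(2, 1)`.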
What have I done?
I have a really dumb way of handling it right now, which is not at all guaranteed to work. Basically, I build a filter matrix out of an identity matrix (because it's the perfect map: every row and every column is used exactly once), then randomly swap two of its columns, up to n times, filtering the original matrix with it after each swap and recording the filter matrix with the best result.
My [non] solution:
import random
import numpy as np

# the example input matrix
A = np.array(
    [[1, 0, 0, 0],
     [1, 0, 0, 0],
     [0, 1, 1, 0],
     [0, 1, 1, 0],
     [1, 0, 1, 1]])

# add a dummy column of ones to make it square
new_col = np.ones([5, 1])
A = np.append(A, new_col, axis=1)

# make an identity matrix (the perfect map)
imatrix = np.diag([1] * 5)

# randomly swap columns on the identity matrix until they match.
n = 1000

# these will hold the map that works the best, and its score
best_map_so_far = imatrix.copy()
best_score_so_far = (imatrix * A).sum()

for i in range(n):
    # swap two randomly chosen columns
    a, b = random.sample(range(5), 2)
    imatrix[:, [a, b]] = imatrix[:, [b, a]]

    # is this map better than the previous best?
    score = (imatrix * A).sum()
    if score > best_score_so_far:
        # copy, otherwise later swaps would overwrite the recorded best
        best_map_so_far = imatrix.copy()
        best_score_so_far = score

    # could it be? a perfect map??
    if score == A.shape[0]:
        break
    # jk.

# did we still fail
if best_score_so_far != 5:
    print('haha')

# drop the dummy column
output = best_map_so_far * A
print(output[:, :-1])

#... wow. it actually kind of works.