0

Given a matrix of 1's and 0's, I want to find a combination of rows and columns with least or none 0's, maximizing the n_of_rows * n_of_columns picked.

For example, rows (0,1,2) and columns (0,1,3) have only one zero in col #0 row #1, and the rest 8 values are 1's.

1 1 0 1 0
0 1 1 1 0
1 1 0 1 1
0 0 1 0 0

Pracical task is to search over 1000's to 1000000's of rows and columns, finding the maximal biclique in a bipartite graph – rows and cols can be viewed as verticles, and values as connections.

The problem in NP-complete, as far as I learned.

Please advice an approach / algorithm that would speed up the task and reduce requirements to CPU and memory.

Serge
  • 1,531
  • 2
  • 21
  • 44
  • The requirement is not really clear. If you want minimal number of 0s or the maximum number of columns+rows then problem is trivial. Do you want to maximise the number of 1s-0s selected? Or are you given a number K representing the maximum number of 0's you can select? – Dinu Sorin Aug 09 '17 at 14:44
  • Even though computers are man-made, it is very difficult to tell what would be faster if the same complexity. – Tony Tannous Aug 09 '17 at 14:45
  • @DinuSorin this is a way to find insights out of data. One task would be to maximize `rows * cols` while having only 1's, no 0's accepted. Another task would be to allow K 0's in the selection (or better K%), and again, maximize `rows * cols`. – Serge Aug 09 '17 at 15:14

2 Answers2

0

Not sure you could minimise thism However, easy way to work this out would be...

  1. Multiple your matrix by a 1 column and n rows full of 1's. This will give you number of ones in each row. Next do a 1 row by n columns multiplcation (at frot of) your matrix full of 1's. This will give you totals of 1's for each column, From there it's a pretty easy compairson........

ie original matrix... 1 0 1 0 1 1 0 0 0

do 1 0 1 x 1 = 2 (row totals) o 1 1 1 2 0 0 0 1 0

do 1 1 1 x 1 0 1 = 1 (Column totals) 0 1 1 2 0 0 0 0

nb max sum is 2 (which you would keep track of as you work it out.

Starmage
  • 1
  • 3
  • Thanks! I was doing this "weight" calculation for every row and every column, however it helps only a little, as within a large number of columns and rows it appears that the maximal selection contains only a few rows and columns, relative to the total number of each. It can happen that the row with max number of 1's is not included in to the final selection at all. – Serge Aug 09 '17 at 15:19
  • For storage complexity would save them as bits> There are then plenty of discussions around the quickest way to work out how many active bits there are. See... https://stackoverflow.com/questions/109023/how-to-count-the-number-of-set-bits-in-a-32-bit-integer But given there isn't much constraint to your matrix size.... – Starmage Aug 09 '17 at 17:37
0

Actually given the following assumptions: 1. You don't care how many 0's are in each row or column 2. You don't need to keep track of their order....

Then you only really need to store values to count the total in each row/column as you read the values in and don't actually store the matrix itself.

If you are given the number of rows and columns prior to reading in the matrix you can do the following heuristics to reduce computational time...

  1. Keep track of the current max. If the current row cannot reach this potential max stop counting for the row (but continue in the columns). Vice versa is true for the columns

But you still have a worst case scenario in which all rows and columns have sme number of 1's and 0's.... :)

Starmage
  • 1
  • 3