2

Given a 2D Boolean array, want to find maximal bicliques (complete bipartite graphs) – set of rows×cols having all cells set to true.

The task is NP-complete and I managed to solve it imperfectly.

Now it gets more interesting:

  1. want to add some "fuzziness": allow a row to be included into a set, if it only misses up to X cells. We'll consider "as if" they're "on" if that helps grow the biclique bigger. How more complex would that make the algorithm? So far I can only see that every found biclique should test all data set again for possible "fakes" that would help grow. Feels wrong. Please advice an algorithm.

  2. the size of the data gets bigger. Maybe if you're fluent with Machine Learning techniques – please advice some "approximate" approach to the problem. With bigger datasets approximate results are sufficient. Please suggest a technique applicable to this problem.

Illustrations of the problem

pic.1 - added cells

Blue cells are true, white false. The "fuzzy" maximal biclique is framed red. Each row has received 3 "fantom" cells, colored grey. Order of rows/cols does not matter. The maximal biclique doesn't have to be a continuous rectangle, here it's only for the sake of illustrating.

pic.2 - tests screenshot

In this test light-blue cells were cancelled out by an optimization phase. The rest form two bi-cliques, rows×cols: (3,5)×(1,3,4,6) and (3,4,5)×(1,3).

However, considering fuzziness=2, row 4 can have cols (4,6) added, and rows (3,5) can have col 5 added, resulting in one bigger biclique (3,4,5)×(1,3,4,5,6).

Serge
  • 1,531
  • 2
  • 21
  • 44
  • 1
    Could you explain what the exact definition of biclique is in your case? I am a bit confused by your second example which seems to allow skipping rows/columns. Also your definition doesn't seem to have the constraint that edges aren't allowed to have both endpoints in the same subset (meaning one of the two defining subsets of the biclique). – H W Jul 01 '15 at 12:56
  • 1
    "The task is NP-complete" -- why? Given a _bipartie_ graph, find its maximal (in terms of number of vertices) subgraph that is a complete _bipartie_ graph is not NP-complete, is it? If you invert (on<->off) all the edges, all you need is an independent set, which has a simple polynomial solution in bipartie graphs. – Petr Jul 01 '15 at 14:21
  • 1
    @HW consider *rows* as one set and *columns* as another. A biclique would contain those rows and columns where every row-col crossing is "on". Order of rows/columns does not matter. Nor any two of them are required to be adjacent. – Serge Jul 01 '15 at 16:20
  • 1
    @Petr [Proof](http://www.sciencedirect.com/science/article/pii/S0166218X03003330) and a [related question](http://stackoverflow.com/questions/29868621/how-to-find-pattern-groups-in-boolean-array). – Serge Jul 01 '15 at 16:29
  • 1
    @Serge, ok, that's if you define the size of biclique to be the number of cells in it. Do you indeed need to maximize the total number of cells? As because if you need to maximize the number of rows + columns, it is (as stated in the same paper) polynomial, and that's what I thought about initially. – Petr Jul 01 '15 at 19:04
  • 1
    @Petr's suggestion is very smart, and will probably find good bicliques quickly in large graphs. In fact if it happens to find a biclique with row and column counts that differ by at most 1, you know this is also the optimal biclique in terms of total cell count. If not, you can use the knowledge that no solution can contain a superset of these rows *and* a strict superset of these columns (or vice versa) to do branch-and-bound: assuming there are more rows than columns in the Petr-solution, choose each row in turn and branch to Petr-solve the subproblem in which this row has been deleted. – j_random_hacker Jul 01 '15 at 21:14
  • 1
    Again using Petr's technique, you could "greedily grow" fuzzy bicliques having at most k gaps per row from the Petr-optimal exact biclique by repeatedly looking for vertices that have <= k neighbours in the independent set. This won't guarantee to find the optimal such fuzzy biclique, but it might be good enough. – j_random_hacker Jul 01 '15 at 21:25

0 Answers0