a minimal partition of a rectangular array into rectangular subarrays of constant value

Question

let A be an MxN matrix with entries a_ij in {0, ..., n-1}.

we can think of the entries in A as a rectangular grid that has been n-colored.

I am interested in partitioning each colored region into rectangles, in such a way that the number of rectangles is minimized. That is, I want to produce n sets of quadruples

L_k = {(i, j, w, h) | a_xy = k forall i <= x < i + w, j <= y < j + h}

satisfying the condition that every a_ij belongs to exactly one rectangle and all of the rectangles are disjoint. Furthermore, the sum L_0 + ... + L_(n-1) is minimized.

Obviously, minimizing each of the L_k can be done independently, but there is also a requirement that this happen extremely fast. Assume this is a real-time application. It may be the case that since the sets are disjoint, sharing information between the L_ks speeds things up more than doing everything in parallel. n can be small (say, <100) and M and N can be large.

I assume there is a dynamic programming approach to this, or maybe there is a way to rephrase it as a graph problem, but it is not immediately obvious to me how to approach this.

EDIT:

There seems to be some confusion about what I mean. Here is a picture to help illustrate.

Imagine this is a 10x10 matrix with red = 0, green = 1, and blue = 2. draw the black boxes like so, minimizing the number of boxes. The output here would be

L_0 = {(0,8,2,2),(1,7,2,1),(2,8,1,1),(4,5,4,2),(6,7,2,2)}

L_1 = {(0,0,4,4),(4,0,6,2),(6,2,2,3),(8,8,2,2)}

L_2 = {(0,4,4,3),(0,7,1,1),(2,9,6,1),(3,7,3,2),(4,2,2,4),(8,2,2,6)}

Re “I want to produce n sets of pairs”: Do you mean sets of quadruples? — Eric Postpischil, Apr 11 '21 at 13:26
Re “the sum L_0 + ... + L_(n-1)”: The L_i are sets of quadruples. What do you mean by their sum? Do you mean the sum of the cardinalities (the numbers of elements) of tje sets? — Eric Postpischil, Apr 11 '21 at 13:28
Re “if a_ij = k, it belongs to… no rectangle in L_l where l != k”: Isn’t that redundant? L_i has only elements with value i. — Eric Postpischil, Apr 11 '21 at 13:31
If I understand the problem, each colored region is independent, and the task reduces to finding a minimum-number-of-rectangles partition of each region. — Eric Postpischil, Apr 11 '21 at 13:34
yes, but as each entry is visited and its value discovered, this information is relevant to every case. I imagine the first step of the algorithm walking one by one through each entry to build up initial information. After that step, everything could be parallelized, but before that point you would only want to do the initial work once. That's why I am specifying multiple colors. — shelljump, Apr 11 '21 at 13:38
Does [this](https://stackoverflow.com/questions/5919298/algorithm-for-finding-the-fewest-rectangles-to-cover-a-set-of-rectangles-without) or [this](https://www.semanticscholar.org/paper/Minimum-rectangular-partition-problem-for-simple-Liou-Tan/fbeb8e4299875b4327bd839eba1198b34a521f1a?p2df) solve your problem for a single region? — Eric Postpischil, Apr 11 '21 at 15:01
this seems like it could be an answer. My initial thought was to scan left to right and detect when a value changes. if you also check above you should be able to find corners, then add those to a list of active boxes, updating the width and height as you go. the links seem significantly more complicated than what I had in mind, but I will try to read through it and reply once I get a grasp on the solution. it could be my idea cant be fitted to give a minimal solution in all cases (I dont see how to fit it all together it which is why I posted) — shelljump, Apr 11 '21 at 15:31

score 0 · Answer 1 · answered Apr 12 '21 at 00:57

One thing to immediately do is to note that you can separate the problem into individual instances of connected regions of colors. From there, the post linked in the article explains how you can use a maximal matching to construct the optimal solution.

But it's probably quite hard to implement that (and judging by your tag of C, even more hard). So I recommend one of two strategies: Backtracking, or greedy.

To do backtracking, you will recurse on the set of tiles which are not yet covered (I assume this makes sense, as you have listed all integer coordinates. This changes but not massively otherwise). Take the highest, leftmost uncovered tile, and loop over all possible rectangles which contain it (there are only ~n^2 of them, and hopefully less). Then recurse.

To optimize this, you will want to prune. An easy way to prune this is to stop recursing if you have already seen a solution with a better answer. This pruning is known as branch and bound.

You can also just quit early in the backtracking, if you only need an approximate answer. Since you mentioned "real-time application," it might be okay if you're only off by a little.

Continuing with the idea of approximation, you could also do something similar by just greedily picking the largest rectangle that you can at the moment. This is annoying to implement, but doable. You can also combine this with recursion and backing out early.

a minimal partition of a rectangular array into rectangular subarrays of constant value

1 Answers1