
I have an intensity map of an image, and I would like to select sub-regions with a large average value. To do this, I want to find the sub-regions that maximize the sum of the intensity-map pixels they cover. To prevent an excessive number of returned sub-regions, a penalty is applied for each additional sub-region returned. It is fine if two sub-regions overlap, but overlapping pixels only count once: the objective is computed over the union of the sub-regions.

More formally, suppose you have a matrix A containing non-negative values with dimensions m x n. You would like to cover the matrix with square sub-matrices with dimension s x s such that the sum of the values of A covered by the union of the area of the squares is maximized. For each square you add to the solution, a constant penalty p is subtracted from the objective value of the solution.

For instance, consider the following matrix:

0 0 0 0 0 0
0 1 2 2 1 0
0 1 2 2 2 0
0 0 0 0 0 0
0 3 0 0 0 0

with parameters p = -4 and s = 2. The optimal solution consists of the two squares S1 = [1, 2; 1, 2] and S2 = [2, 1; 2, 2], with coordinates (2:3,2:3) and (2:3,4:5) respectively (in Matlab notation). Note that in this example the greedy approach of incrementally adding the square with maximum value until no square can be added (without decreasing the objective value) fails.
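For concreteness, here is a short script checking the objective values in this example (my own sketch; I use 0-based top-left corners, and the convention that p < 0 is added once per square):

```python
import numpy as np

def objective(A, squares, s, p):
    """Sum of A over the union of the s x s squares, plus p per square.

    `squares` is a list of (row, col) top-left corners (0-based);
    p is negative, as in the example (p = -4).
    """
    covered = np.zeros_like(A, dtype=bool)
    for r, c in squares:
        covered[r:r + s, c:c + s] = True
    return A[covered].sum() + p * len(squares)

A = np.array([[0, 0, 0, 0, 0, 0],
              [0, 1, 2, 2, 1, 0],
              [0, 1, 2, 2, 2, 0],
              [0, 0, 0, 0, 0, 0],
              [0, 3, 0, 0, 0, 0]])

# Greedy picks the single densest square (sum 8) and then stops, since
# every further square adds less than the |p| = 4 penalty.
greedy = objective(A, [(1, 2)], s=2, p=-4)           # 8 - 4 = 4

# The two squares from the example cover a union of 13 with two penalties.
optimal = objective(A, [(1, 1), (1, 3)], s=2, p=-4)  # 13 - 8 = 5
```

So the greedy solution scores 4 while the two-square solution scores 5, which is why greedy fails here.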

One brute-force way of solving it would be to check all possible combinations of exactly k squares. Starting from k = 1, you would compute the optimal combination with exactly k squares, increment k, and repeat until the objective value stops increasing. This is clearly very expensive.

You can precompute the sums of all (m-s+1)*(n-s+1) possible squares in O(mn) time using an integral image.
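The integral-image precomputation can be sketched like this (my own implementation, using a summed-area table with a zero-padded border):

```python
import numpy as np

def square_sums(A, s):
    """Sum of every s x s square of A via a summed-area table (integral image).

    Returns an (m-s+1) x (n-s+1) array whose entry (i, j) is the sum of
    A[i:i+s, j:j+s]. Building the table and reading off all sums is O(mn).
    """
    m, n = A.shape
    # I[i, j] = sum of A[:i, :j]; padded with a leading zero row/column.
    I = np.zeros((m + 1, n + 1), dtype=A.dtype)
    I[1:, 1:] = A.cumsum(axis=0).cumsum(axis=1)
    # Inclusion-exclusion over the four corners of each square.
    return I[s:, s:] - I[:-s, s:] - I[s:, :-s] + I[:-s, :-s]

A = np.array([[0, 0, 0, 0, 0, 0],
              [0, 1, 2, 2, 1, 0],
              [0, 1, 2, 2, 2, 0],
              [0, 0, 0, 0, 0, 0],
              [0, 3, 0, 0, 0, 0]])

sums = square_sums(A, 2)  # 4 x 5 array of all 2x2 square sums
```

On the example matrix, `sums[1, 2]` is 8 (the densest square) and `sums[1, 1]` and `sums[1, 3]` are 6 and 7 (the two squares in the optimal solution).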

Is there an efficient solution to this?

Michael Costa
  • If the entries of A are nonnegative as you say, then if A is also square, the optimal solution is the single square A itself. (You mention "large average value", but don't seem to incorporate this into your objective function.) – j_random_hacker Nov 07 '14 at 15:16
  • To clarify, the squares are sub-matrices of fixed size s x s (s is a given parameter of the problem) where s <= min(m,n). In the case of the example where s=2 and p=-4, if the example matrix was padded with an extra row of zeroes on the bottom (in order to make it square), the optimal solution would not have changed. When I mentioned "large average value", it is equivalent to maximize the sum (of new elements) in the case of a fixed size covering square. The penalty constant p < 0 prevents covering the matrix with many small squares (if the squares do not sufficiently add to the objective). – Michael Costa Nov 08 '14 at 02:06
  • Apologies, I totally missed the part where you said "of dimension s x s". It's clear now. – j_random_hacker Nov 08 '14 at 17:51

1 Answer


The problem is NP-hard. This can be proven by reduction from planar minimum vertex cover. The proof for the special case s = 3, p = 2, and A having only values 0 or 1 is identical to the proof for another SO question.

As for the brute-force solution, it could be made more efficient if, instead of trying all combinations with increasing k, you add squares incrementally. When the objective value of the partial solution plus the sum of the not-yet-covered values is not greater than the best-so-far objective value, roll back to the last valid combination by removing the recently added square(s) and try other squares. Avoid adding squares that add zero to the objective value. Also avoid adding sub-optimal squares: if, in the example from the OP, the partial solution contains the square [1, 2; 1, 2], do not add the square [2, 2; 2, 2], because [2, 1; 2, 2] would always be at least as good. Finally, reorder the squares so that a good enough solution is found quickly; this allows all further attempts to be terminated sooner.
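This pruned incremental search can be sketched as a small branch and bound (the function and variable names are mine, not from the answer; p < 0 is the penalty convention from the question, and the pruning bound is exactly the "partial value plus not-yet-covered sum" test described above):

```python
import numpy as np

def best_cover(A, s, p):
    """Branch-and-bound sketch of the pruned incremental search.

    Returns (best objective value, list of 0-based top-left corners).
    p is negative: it is added to the objective once per chosen square.
    """
    m, n = A.shape
    cands = [(r, c) for r in range(m - s + 1) for c in range(n - s + 1)]
    # Order squares by covered sum, best first, so that a good solution
    # is found early and tightens the pruning bound quickly.
    cands.sort(key=lambda rc: -A[rc[0]:rc[0] + s, rc[1]:rc[1] + s].sum())

    best = {'value': 0, 'squares': []}  # the empty cover scores 0

    def search(i, covered, value, chosen):
        if value > best['value']:
            best['value'], best['squares'] = value, list(chosen)
        # Bound: partial value plus everything still uncoverable cannot
        # beat the incumbent (future penalties only make it worse).
        if value + A[~covered].sum() <= best['value']:
            return
        for j in range(i, len(cands)):
            r, c = cands[j]
            block = ~covered[r:r + s, c:c + s]
            gain = A[r:r + s, c:c + s][block].sum() + p
            if gain <= 0:
                # A square that cannot pay for its penalty now never
                # helps later in this branch (coverage only grows).
                continue
            new_cov = covered.copy()
            new_cov[r:r + s, c:c + s] = True
            chosen.append((r, c))
            search(j + 1, new_cov, value + gain, chosen)
            chosen.pop()

    search(0, np.zeros((m, n), dtype=bool), 0, [])
    return best['value'], best['squares']

A = np.array([[0, 0, 0, 0, 0, 0],
              [0, 1, 2, 2, 1, 0],
              [0, 1, 2, 2, 2, 0],
              [0, 0, 0, 0, 0, 0],
              [0, 3, 0, 0, 0, 0]])

value, squares = best_cover(A, s=2, p=-4)
```

On the OP's example this finds the two-square solution with objective 5, while the greedy single square only reaches 4. This is still exponential in the worst case (the problem is NP-hard), but the bound and the ordering prune most branches in practice.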

Evgeny Kluev