Find rectangle with the largest sum of integers that lies on its border in C++

Question

I'm currently learning and practicing the trial and error method from this document, where I stumbled upon this problem:

Given a grid of integers with m rows and n columns. Find the rectangle whose sides are parallel to the edge of the grids (the length of each sides are greater than 1) and the sum of the numbers that lies on its border is the largest.

Input:
The first line contains 2 integers m and n.
The next m lines: each line contains n integers describing a row of the grid.

Output:
The largest sum that we found.

Constraints:
2 <= m, n <= 500
Runtime < 3 s

For example:

Input:
5 4
  9 -2 -1  3
-10 -5  1 -4
  1 -1  2 -2
  3  0  0 -1
  2  2 -1  2
Output:
8
Explanation:

Here, the rectangle with the largest sum of numbers that lies on its border is:

1 -1 2
3 0 0
2 2 -1

which has a sum of 1 + -1 + 2 + 0 + -1 + 2 + 2 + 3 = 8

3rd party tough input for trying to capitalise on a maximum sum rectangle:

5 5
0  1  0  1  0
0  0 -1  0  0
0  0  0  0  0
0  0 -4  0  0
0  1  0  2  0

I've tried creating a prefix sum p[m][n] of the array and can calculate the border sum of a rectangle with the bottom right position as (i, j) and the width and height of the rectangle as (w, h):

sum = (p[i][j]     - p[i][j-w]   - p[i-h][j]   + p[i-h][j-w])
    - (p[i-1][j-1] - p[i][j-w+1] - p[i-h+1][j] + p[i-h+1][j-w+1])

Here, (p[i][j] - p[i][j-w] - p[i-h][j] + p[i-h][j-w]) calculates the sum of all the integers of the rectangle and (p[i-1][j-1]—p[i][j-w+1] - p[i-h+1][j] + p[i-h+1][j-w+1]) calculates the inside sum of the rectangle (not including the border).

I have to use four nested for loops to find all possible values of i, j, w, and h and update the maximum sum accordingly which gave me a time limit exceeded (TLE), as the constraints are runtime less than 1 second.

As a user suggested, I've created helper functions to help me calculate the border sum of a rectangle with the top left position as (x0, y0) and bottom right position as (x1, y1), which helped me fix some of the cases:

int sum(int x0, int y0, int x1, int y1)
{
    if (x0 < x1 && y0 < y1)
    {
        return p[x1][y1] - p[x0][y1] - p[x1][y0] + p[x0][y0];
    }
    else if (x0 == x1 && y0 == y1)
    {
        return a[x0][y0];
    }
    else
    {
        return 0;
    }
}

int border(int x0, int y0, int x1, int y1)
{
    return sum(x0, y0, x1, y1) - sum(x0+1, y0+1, x1-1, y1-1);
}

But still, I have to find all possible values of x0, y0, x1, y1, which gave me a TLE.

It turns out the time limit is 3 seconds.

**Comments have been [moved to chat](https://chat.stackoverflow.com/rooms/253729/discussion-on-question-by-tyghn-find-rectangle-with-the-largest-sum-of-integers); please do not continue the discussion here.** Before posting a comment below this one, please review the [purposes of comments](/help/privileges/comment). Comments that do not request clarification or suggest improvements usually belong as an [answer](/help/how-to-answer), on [meta], or in [chat]. Comments continuing discussion may be removed. — Samuel Liew, May 19 '23 at 01:45

lastchance · Accepted Answer · 2023-05-19T13:57:10.923

OK, I submit this as a possible dynamic-programming solution. I believe it is O(N³) (or, strictly mn(m+n) if m and n are very different). For a 500x500 random matrix it takes a second or so (by my watch). For a 100x100 random matrix it gives the same result as brute force.

The idea is to construct a temporary - formally B[i][j][w], but if done carefully, then only needing to store B[j][w] for two successive rows - where this represents the best sum for a rectangle of width w ending (bottom-right) at [i][j]. This requires a loop over i,j,w - hence O(N³).

Constructing it row by row from the previous best I believe that there are only two candidates each time (apologies if this diagram doesn't make too much sense)

   //                 *-----*                 
   //                 |     |                         
   // Either          |     |                 or
   //                 |     |                         
   //                 o     o                              o-----o
   //                 *-----*                              *-----*

The below code tests this for (optionally) Motomotes's examples 1 and 2 and for a large random matrix. For testing there is also a brute force solver (but don't bother to use this on large matrices). Note that if the matrix is wider than it is deep then I transpose it first (for convenience).

The relevant routine is int bestRectangleSum( vector<vector> &M )

#include <iostream>
#include <iomanip>
#include <vector>

#include <cstdlib>
#include <ctime>

using namespace std;

//------------------------------------------------------------------------------

template<typename T>
ostream &operator << ( ostream &out, const vector<vector<T>> &M )
{
   for ( auto &row : M )
   {
      for ( auto e : row ) out << setw( 10 ) << e << "  ";
      out << '\n';
   }
   return out;
}

//------------------------------------------------------------------------------

template<typename T>
vector<vector<T>> transpose( const vector<vector<T>> &M )
{
   int m = M.size(), n = M[0].size();
   vector<vector<int>> result( n, vector<int>( m ) );

   for ( int i = 0; i < m; i++ )
   {
      for ( int j = 0; j < n; j++ ) result[j][i] = M[i][j];
   }

   return result;
}

//------------------------------------------------------------------------------

vector<vector<int>> fillRand( int m, int n, int lower, int higher )
{
   vector<vector<int>> M( m, vector<int>( n ) );

   for ( int i = 0; i < m; i++ )
   {
      for ( int j = 0; j < n; j++ ) M[i][j] = lower + rand() % ( higher - lower + 1 );
   }

   return M;
}

//------------------------------------------------------------------------------

int bestRectangleSum( vector<vector<int>> &M )
{
   int m = M.size(), n = M[0].size();

   // Transpose if necessary to work with smaller maximum width
   if ( m < n )
   {
      M = transpose( M );
      int t = m;   m = n;  n = t;
   }

   // sumLeft[i][j] holds the cumulative sum to the left of (and including) [i][j])
   vector<vector<int>> sumLeft( m, vector<int>( n, 0 ) );
   for ( int i = 0; i < m; i++ )
   {
      sumLeft[i][0] = M[i][0];
      for ( int j = 1; j < n; j++ ) sumLeft[i][j] = sumLeft[i][j-1] + M[i][j];
   }

   // Row by row, B[j][w] will hold the best rectangle sum for a rectangle of width w finishing (bottom-right) at (i,j)
   int best = M[0][0] + M[0][1] + M[1][0] + M[1][1];
   vector<vector<int>> B( n, vector<int>( n + 1, 0 ) );

   // First non-zero B will be for second row (i=1)      This part: O(N^2)
   for ( int j = 1; j < n; j++ )
   {
      for ( int w = 2; w <= j; w++ ) 
      {
         B[j][w] = sumLeft[0][j] - sumLeft[0][j-w] + sumLeft[1][j] - sumLeft[1][j-w];
         if ( B[j][w] > best ) best = B[j][w];
      }
      B[j][j+1] = sumLeft[0][j] + sumLeft[1][j];
      if ( B[j][j+1] > best ) best = B[j][j+1];
   }

   // Subsequent rows - each w will give rise to two candidates:          This part O(N^3)
   //                 *-----*                 
   //                 |     |                         
   // Either          |     |                 or
   //                 |     |                         
   //                 o     o                              o-----o
   //                 *-----*                              *-----*
   //
   for ( int i = 2; i < m; i++ )
   {
      auto previous = B;
      for ( int j = 1; j < n; j++ )
      {
         for ( int w = 2; w <= j + 1; w++ )
         {
            int oldBase = sumLeft[i-1][j];
            int newBase = sumLeft[i  ][j];
            if ( w <= j ) 
            {
               oldBase -= sumLeft[i-1][j-w];
               newBase -= sumLeft[i  ][j-w];
            }
            int oldBaseMinusEnds = oldBase - M[i-1][j] - M[i-1][j-w+1];
            int candidate1 = previous[j][w] + newBase - oldBaseMinusEnds;
            int candidate2 = oldBase + newBase;
            B[j][w] = max( candidate1, candidate2 );
            if ( B[j][w] > best ) best = B[j][w];
         }
      }
   }
   return best;
}

//------------------------------------------------------------------------------

int bruteForce( const vector<vector<int>> &M )
{
   int m = M.size(), n = M[0].size();
   vector<vector<int>> sumLeft( m, vector<int>( n, 0 ) ), sumAbove( m, vector<int>( n, 0 ) );
   for ( int i = 0; i < m; i++ )
   {
      sumLeft[i][0] = M[i][0];
      for ( int j = 1; j < n; j++ ) sumLeft[i][j] = sumLeft[i][j-1] + M[i][j];
   }
   for ( int j = 0; j < n; j++ )
   {
      sumAbove[0][j] = M[0][j];
      for ( int i = 1; i < m; i++ ) sumAbove[i][j] = sumAbove[i-1][j] + M[i][j];
   }

   int best = M[0][0] + M[0][1] + M[1][0] + M[1][1];
   for ( int i1 = 0; i1 < m - 1; i1++ )
   {
      for ( int i2 = i1 + 1; i2 < m; i2++ )
      {
         for ( int j1 = 0; j1 < n - 1; j1++ )
         {
            for ( int j2 = j1 + 1; j2 < n; j2++ )
            {
               int candidate = sumLeft[i1][j2] + sumLeft[i2][j2];
               if ( j1 > 0 ) candidate -= ( sumLeft[i1][j1-1] + sumLeft[i2][j1-1] );
               if ( i2 - 1 > i1 ) candidate += ( sumAbove[i2-1][j1] - sumAbove[i1][j1] + sumAbove[i2-1][j2] - sumAbove[i1][j2] );
               if ( candidate > best ) best = candidate;
            }
         }
      }
   }
   return best;
}

//------------------------------------------------------------------------------

int main()
{
// Example 1 (first example of Motomotes)
   vector<vector<int>> M1 = { {  2, -8, -9, -2,  8 },
                              {  6,  0,  2,  7,  4 },
                              { -6, -8, -4, -4,  1 },
                              {  0, -8, -1,  4,  4 } };

// Example 2 (second example of Motomotes)
   vector<vector<int>> M2 = { {   9,  -2,  -1,   3 },
                              { -10,  -5,   1,  -4 },
                              {   1,  -1,  12,  -2 },
                              {   3,   8,   0,  -1 },
                              {   2,   2,  -1,   2 } };

// Example 3 (large random matrix)
   const int m = 100, n = 100, lower = -100, higher = 100;
   srand( time( 0 ) );
   auto M3 = fillRand( m, n, lower, higher );

   auto M = M1;

   cout << M << '\n';                                               // <==== comment out for large matrices
   cout << "Main solver: " << bestRectangleSum( M ) << '\n';        // <==== main solution
   cout << "Brute force: " << bruteForce( M ) << '\n';              // <==== don't use for large matrices
}

//------------------------------------------------------------------------------

Output for Monomotes example 1:

         2          -8          -9          -2           8  
         6           0           2           7           4  
        -6          -8          -4          -4           1  
         0          -8          -1           4           4  

Main solver: 22
Brute force: 22

(For better reproduction, output the seed and optionally accept it specified.) — greybeard, May 19 '23 at 13:34
Hi @Greybeard. At the moment it's actually testing a fixed matrix - so randomness won't cut in. I think everyone will want to generate random numbers in a different way, so I'll leave that to the user's preference. I've just realised that with the order of construction I only need to store two i-levels of B (updating the best sum as I go), so I could probably cut the memory usage substantially. — lastchance, May 19 '23 at 13:40
OK, I've edited the code to remove the profligate use of memory: now I only store B and its previous value for each row of the matrix. Choose the relevant type of matrix to check with the ```M=``` line in ```int main()```. Brute force can be used as a check for 100x100 or so if you want to use a random matrix. — lastchance, May 19 '23 at 14:01
This is a very interesting approach. But may I ask: why do we only need to loop through `i`, `j`, and `w`? What about the height of the rectangle? — tyghn, May 20 '23 at 06:06
@tyghn, the idea of dynamic programming here is that you refer the best at one level to the previous one and so reel off the global optimum in the final line. The best of width w in line i either extends the same width rectangle in line i-1 (candidate 1) or starts afresh with a rather squat rectangle (candidate 2). Height h is obviously embodied in this but you don’t have to loop over it as you are already holding the best at each stage. — lastchance, May 20 '23 at 06:55
Oh, I see, very nice approach indeed. I was too focused on Kadane's without considering other approaches:) — tyghn, May 20 '23 at 14:03
Are you certain that this works? In dynamic programming one normally needs a proof of certain properties to ensure that it generates the correct solution. — Hans Olsson, May 23 '23 at 06:57
Hello @Hans Olsson. Thank-you for looking at the proposed code. If I define B(i,j,w) as the "best" sum for a rectangle of width w ending at (i,j) then this rectangle must be one of the 2 candidate forms that I have shown. Candidate 1 is a rectangle including the best from the previous row of the same width B(i-1,j,w). If there were a top that gave a bigger sum for B(i,j,w) then it would also have to give a bigger sum for B(i-1,j,w) - this leads to a proof by contradiction. I have tried many, many randomly generated matrices and the solution gives the same as brute force. — lastchance, May 23 '23 at 09:00
If you can give a counter-example then obviously I would love to see it. Other than allowing the sum to overflow an int (in which case you could switch to long long) I haven't managed to break it. — lastchance, May 23 '23 at 09:02

Matt · Answer 2 · 2023-05-23T19:54:08.927

Here is an O(m*m*n) in time w/ O(m*n) memory algorithm. (If n<m then the transpose grid is used instead, guaranteeing that m<=n.)

First, calculate two arrays that give prefix sums for the rows and columns:

// prefix_row : (i,j) -> sum of values from (i,0) to (i,j) inclusive
// prefix_col : (i,j) -> sum of values from (0,j) to (i,j) inclusive

Using these, the perimeter sum can be calculated in O(1) time given the opposite corners (i0,j0) and (i1,j1):

perimeter_sum(i0,j0,i1,j1) =
    prefix_col[i1][j0] - prefix_col[i0][j0] // left
  + prefix_col[i1][j1] - prefix_col[i0][j1] // right
  + prefix_row[i0][j1] - prefix_row[i0][j0] // top
  + prefix_row[i1][j1] - prefix_row[i1][j0] // bottom
  + grid[i0][j0] - grid[i1][j1] // compensate missing and double-counted

Next, group these into two sums: those that depend on j0 and those that depend on j1. Let's name them x and y respectively:

perimeter_sum(i0,j0,i1,j1) = x(i0,i1,j0) + y(i0,i1,j1)

where

x(i0,i1,j0) = prefix_col[i1][j0] - prefix_col[i0][j0] -
              prefix_row[i1][j0] - prefix_row[i0][j0] + grid[i0][j0]

y(i0,i1,j1) = prefix_col[i1][j1] - prefix_col[i0][j1] +
              prefix_row[i1][j1] + prefix_row[i0][j1] - grid[i1][j1]

The strategy now is to loop over i0 and i1 with i0<i1. For a given i0 and i1, max(x+y) over varying j0<j1 can now be calculated in O(n) time given that x and y are independent O(1) functions of j0 and j1. Overall this results in O(m*m*n) time to find the final maximum value for x+y.

Below is the working C++ code.

#include <iostream>
#include <limits>
#include <vector>

// Prepare 2 arrays:
// prefix_row : (i,j) -> sum of values from (i,0) to (i,j) inclusive
// prefix_col : (i,j) -> sum of values from (0,j) to (i,j) inclusive

int m, n;
std::vector<std::vector<int> > grid;
std::vector<std::vector<int> > prefix_row;
std::vector<std::vector<int> > prefix_col;

void set_grid() {
  std::cin >> m >> n;
  if (m <= n) {
    grid.resize(m, std::vector<int>(n));
    for (int i = 0; i < m; ++i) {
      for (int j = 0; j < n; ++j) {
        std::cin >> grid[i][j];
      }
    }
  } else {
    std::swap(m, n); // import transposed grid so that m < n.
    grid.resize(m, std::vector<int>(n));
    for (int j = 0; j < n; ++j) {
      for (int i = 0; i < m; ++i) {
        std::cin >> grid[i][j];
      }
    }
  }
}

void set_prefix_row() {
  prefix_row.resize(m, std::vector<int>(n));
  for (int i = 0; i < m; ++i) {
    int prefix_sum = 0;
    for (int j = 0; j < n; ++j) {
      prefix_sum += grid[i][j];
      prefix_row[i][j] = prefix_sum;
    }
  }
}

void set_prefix_col() {
  prefix_col.resize(m, std::vector<int>(n));
  for (int j = 0; j < n; ++j) {
    int prefix_sum = 0;
    for (int i = 0; i < m; ++i) {
      prefix_sum += grid[i][j];
      prefix_col[i][j] = prefix_sum;
    }
  }
}

int x(int const i0, int const i1, int const j0) {
  return prefix_col[i1][j0] - prefix_col[i0][j0] -
         prefix_row[i1][j0] - prefix_row[i0][j0] + grid[i0][j0];
}

int y(int const i0, int const i1, int const j1) {
  return prefix_col[i1][j1] - prefix_col[i0][j1] +
         prefix_row[i1][j1] + prefix_row[i0][j1] - grid[i1][j1];
}

int get_max_sum() {
  int max_sum = std::numeric_limits<int>::min();
  for (int i0 = 0; i0 < m - 1; ++i0) {
    for (int i1 = i0 + 1; i1 < m; ++i1) {
      int max_x = std::numeric_limits<int>::min();
      for (int j = 1; j < n; ++j) {
        max_x = std::max(max_x, x(i0, i1, j - 1)); // O(1)
        max_sum = std::max(max_sum, max_x + y(i0, i1, j)); // O(1)
      }
    }
  }
  return max_sum;
}

int main() {
  set_grid();                        // O(m*n)
  set_prefix_row();                  // O(m*n)
  set_prefix_col();                  // O(m*n)
  int const max_sum = get_max_sum(); // O(m*m*n)
  std::cout << max_sum << std::endl;
  return 0;
}

Credit: n. m. linked to this answer of which this algorithm is based on with some modifications/corrections.

Note: If you copy and paste sample grid data into an input file from this page, it may not work due to non-whitespace blank characters. For std::cin to read in int values, they must be separated by "regular" whitespace like spaces, tabs, or newlines.

(Very clear textual description. A good thing, probably, that this isn't CodeReview.) — greybeard, May 19 '23 at 21:22
I haven't thought of using prefix sum for rows and columns. Nice approach! It was a competitive programming problem so the input are separated by "regular" whitespace. — tyghn, May 20 '23 at 00:24

score 0 · Answer 3 · answered May 24 '23 at 10:32

There are paradigms: I can't see how to apply divide and conquer to advantage.

There are tactics like having the innermost loop access memory sequentially
or having it do the most iterations to incur loop setup overhead less frequently (lastchance suggests transposition here):
assuming handling vertical stripes (fixed left and right column) from here.

I think it's simpler to iterate left and right than width (w) and right (j).
Stripping down lastchance's rev 3 bestRectangleSum():

/** Return the maximum of all rect border value sums in M.
 *  M may get transposed.
 *  A rect is a rectangle with sides of length > 1 parallel to the axes.
 *  Derived from lastchance's answer's rev 3 bestRectangleSum()
 *  in https://stackoverflow.com//revisions/76289514/3
 */
int bestRectangleSum(vector<vector<int>> M)
{
    // Transpose if necessary to work with large #rows
    if (M.size() < M[0].size())
        transpose(M);
    const int m = M.size(), n = M[0].size();

    // sumLeft[col][row] holds the cumulative
    // sum to the left of (excluding) M[row][col])
    vector<vector<int>> sumLeft(n+1, vector<int>(m, 0));
    for (int row = 0, sum ; row < m ; row++) {
        sumLeft[0][row] = sum = 0;
        for (int col = 0 ; col < n ; col++) 
            sumLeft[col+1][row] = sum += M[row][col];
    }

    int best = M[0][0] + M[0][1] + M[1][0] + M[1][1];
    // each subsequent row will give rise to two candidates:  This part O(N^3)
    //                 *----*                 
    //                 |    |                         
    // Either          |    |               or
    //                 |    |                         
    //                 o    o                              o----o
    //                 *----*                              *----*
    for (int left = 0 ; left < n - 1 ; left++) {
        const vector<int> leftPrefix = sumLeft[left];
        for (int right = left + 1, old ; right < n ; right++) {
            const vector<int> rightPrefix = sumLeft[right+1];
            int border = rightPrefix[0] - leftPrefix[0]          // top *----*
                + (old = rightPrefix[1] - leftPrefix[1]);        //     *----*
            for (int bottom = 1 ; ; ) {
                if (best < border)
                    best = border;
                border += M[bottom][right] + M[bottom][left];    // add |    | 
                if (m <= ++bottom)
                    break;
                int current = rightPrefix[bottom] - leftPrefix[bottom];
                border = max(old, border - old) + current;  // + bottom *----*
                old = current;
            }
        }
    }
    return best;
}

Blissfully ignorant of C++ multidimensional array storage organisation (OK, row major),
I have an inkling that with a vector of vectors it would be advantageous to process the transpose of the above - for one thing, one gets to determine sumLeft[left] and …right+1 outside the inner loop, using just the row as an index there.

One not overly promising strategy:
Precompute the prefix sum of the sum of the two largest values per column T2 in O(mn) time,
same for "maximum sub-column sum" MC.
With the score for a candidate established, disregard stripes where the sum of MC in each border column (upper limit on vertical borders) + the sum of the T2s in between (upper limit on "horizontal connections") doesn't exceed the candidate sum.
This would in turn suggest to handle stripes not say increasing left in the outer loop and right in the nested one, but decreasing width (assuming total sum positive) in the outer loop and iterating left(&right=left+width) in the nested one.
A candidate to start with may be sought using the columns with highest MC.

The transposing is mainly to keep memory use down (maximum j and w, and hence the size of B(j,w)). It doesn't actually seem to change the execution time. If w were moved to be the outer loop then it would still work and could be parallelised (which is probably cheating!) — lastchance, May 24 '23 at 10:47
`transposing [to keep additional] memory use down` is doing only so much where the prefix sums take O(mn) memory (says someone going to lengths to get rid of of `B` entirely). — greybeard, May 24 '23 at 15:36

lastchance · Answer 4 · 2023-05-25T10:35:07.523

I hope that @Greybeard will and @tyghn will permit me a second solution - because it is recursive and different from the others. It is unashamedly influenced by Greybeard's prompt to look for "Divide-and-Conquer" algorithms: things like mergeSort.

NOTE: I had hoped to end up with an O(N^2.log(N)) solution, but, sadly, each subdivision still ends up doing O(N^2) work, so the overall algorithm is still O(N^3). In fact, it is slower than my O(N^3) dynamic-programming solution. I leave the solution here in the hope that others might be able to see how to make the "conquer" bit of divide-and-conquer do less work on a smaller set of rows.

The idea, like many divide-and-conquer techniques, is to repeatedly divide the domain into 2 and then merge optimals.

Here, we successively divide the matrix into upper (1) and lower (2) layers, with top and bottom rows being i1 and i2:

--------------------- i1

   upper (1)

.....................

   lower (2)

--------------------- i2

In EACH layer we calculate "best" arrays B[][], U[][], N[][], L[][] of different shapes, where B[j1][j2] etc. means the best starting in column j1 and finishing in column j2. The shapes of these are:

   +---+             +   +             +---+             +   +
B: |   |          U: |   |          N: |   |          L: |   |
   +---+             +---+             +   +             +   +

Note that U and N do not need "drop lines" - they could consist of just horizontal lines. However, they must extend to the top or bottom of layer, whilst L extends through the whole depth.

EDIT: actually, for B we only need to retain a scalar from each layer, because it is not necessary to merge it across layers; this is done in the edited code below. (Reduces the number of arrays and makes code slightly simpler and faster.)

When we come to "merge" the layers then there are only a small number of candidates. e.g. for B

-------------------------------------------------
       +---+
Either |B1 |                             +---+
       +---+                             |   |
.........................................|   | N1 + U2
                       +---+             |   |
             or        |B2 |     or      +---+
                       +---+
------------------------------------------------

and similarly for the other shapes.

When you have divided down to a single row then the evaluation of B, U, N, L for each is trivial.

Below is the code, with some checking routines and a random solution.

#include <iostream>
#include <iomanip>
#include <vector>
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include <utility>
#include <limits>
using namespace std;

using matrix = vector<vector<int>>;
const int LARGE_NEGATIVE = numeric_limits<int>::min();

//------------------------------------------------------------------------------

ostream &operator << ( ostream &out, const matrix &M )
{
   for ( auto &row : M )
   {
      for ( auto e : row ) out << setw( 10 ) << e << "  ";
      out << '\n';
   }
   return out;
}

//------------------------------------------------------------------------------

matrix transpose( const matrix &M )
{
   int m = M.size(), n = M[0].size();
   matrix result( n, vector<int>( m ) );

   for ( int i = 0; i < m; i++ )
   {
      for ( int j = 0; j < n; j++ ) result[j][i] = M[i][j];
   }

   return result;
}

//------------------------------------------------------------------------------

matrix fillRand( int m, int n, int lower, int higher )
{
   matrix M( m, vector<int>( n ) );

   for ( int i = 0; i < m; i++ )
   {
      for ( int j = 0; j < n; j++ ) M[i][j] = lower + rand() % ( higher - lower + 1 );
   }

   return M;
}

//------------------------------------------------------------------------------

int allShapes( matrix &M, int i1, int i2, int n, matrix &U, matrix &N, matrix &L )
{
   int B;

   if ( i2 == i1 )                               // Single row
   {
      vector<int> sumLeft( n );
      sumLeft[0] = M[i1][0];
      for ( int j2 = 1; j2 < n; j2++ )
      {
         sumLeft[j2] = sumLeft[j2-1] + M[i1][j2];
         for ( int j1 = 0; j1 < j2; j1++ )
         {
            U[j1][j2] = N[j1][j2] = sumLeft[j2] - ( j1 > 0 ? sumLeft[j1-1] : 0 );
            L[j1][j2] = M[i1][j1] + M[i2][j2];
         }
      }
      B = LARGE_NEGATIVE;    // No B's from a single row.
   }

   else                                          // Split into upper and lower layers

   {
      // *** Divide and "conquer" (not quite, sadly - always do O(n^2) work) ***
      // Workspace - potentially slow to set up multiple times
      matrix U1, U2, N1, N2, L1, L2;
      U1 = U2 = N1 = N2 = L1 = L2 = matrix( n, vector<int>( n ) );

      int im = i1 + ( i2 - i1 ) / 2;             
      int B1 = allShapes( M, i1    , im, n, U1, N1, L1 );
      int B2 = allShapes( M, im + 1, i2, n, U2, N2, L2 );
      B = max( B1, B2 );
  
      // *** Merge ***
      for ( int j2 = 1; j2 < n; j2++ )
      {
         for ( int j1 = 0; j1 < j2; j1++ )
         {
            B = max( B, N1[j1][j2] + U2[j1][j2] );
            U[j1][j2] = max( U1[j1][j2], U2[j1][j2] + L1[j1][j2] );
            N[j1][j2] = max( N2[j1][j2], N1[j1][j2] + L2[j1][j2] );
            L[j1][j2] = L1[j1][j2] + L2[j1][j2];
         }
      }
   }

   return B;
}

//------------------------------------------------------------------------------

int maxPerimeterSum( matrix &M )
{
   int m = M.size(), n = M[0].size();

   // Transpose if necessary to work with smaller maximum width (==> less memory needed)
   if ( m < n )
   {
      M = transpose( M );
      int t = m;   m = n;  n = t;
   }

   // Main arrays
   matrix( n, vector<int>( n ) );
   matrix U, N, L;   U = N = L = matrix( n, vector<int>( n ) );

   // Call master routine
   return allShapes( M, 0, m - 1, n, U, N, L );
}

//------------------------------------------------------------------------------

int bruteForce( const vector<vector<int>> &M )
{
   int m = M.size(), n = M[0].size();
   vector<vector<int>> sumLeft( m, vector<int>( n, 0 ) ), sumAbove( m, vector<int>( n, 0 ) );
   for ( int i = 0; i < m; i++ )
   {
      sumLeft[i][0] = M[i][0];
      for ( int j = 1; j < n; j++ ) sumLeft[i][j] = sumLeft[i][j-1] + M[i][j];
   }
   for ( int j = 0; j < n; j++ )
   {
      sumAbove[0][j] = M[0][j];
      for ( int i = 1; i < m; i++ ) sumAbove[i][j] = sumAbove[i-1][j] + M[i][j];
   }

   int best = M[0][0] + M[0][1] + M[1][0] + M[1][1];
   for ( int i1 = 0; i1 < m - 1; i1++ )
   {
      for ( int i2 = i1 + 1; i2 < m; i2++ )
      {
         for ( int j1 = 0; j1 < n - 1; j1++ )
         {
            for ( int j2 = j1 + 1; j2 < n; j2++ )
            {
               int candidate = sumLeft[i1][j2] + sumLeft[i2][j2];
               if ( j1 > 0 ) candidate -= ( sumLeft[i1][j1-1] + sumLeft[i2][j1-1] );
               if ( i2 - 1 > i1 ) candidate += ( sumAbove[i2-1][j1] - sumAbove[i1][j1] + sumAbove[i2-1][j2] - sumAbove[i1][j2] );
               if ( candidate > best ) best = candidate;
            }
         }
      }
   }
   return best;
}

//------------------------------------------------------------------------------

void check( matrix &M )
{
   int m = M.size(), n = M[0].size();

   cout << "Case: m = " << m << "    n = " << n << "\n\n";
// if ( m <= 5 && n <= 5 ) cout << "Matrix:\n" << M << "\n\n";
   cout << "Main solver:           " << maxPerimeterSum( M ) << '\n';          // <==== O(N^3) solution
   cout << "Not-quite brute force: " << bruteForce     ( M ) << '\n';          // <==== O(N^4) solution
   cout << "\n=======\n\n";
}

//------------------------------------------------------------------------------

int main()
{
   matrix M;

// Example 1 (first example of Motomotes)
   M = { {  2, -8, -9, -2,  8 },
         {  6,  0,  2,  7,  4 },
         { -6, -8, -4, -4,  1 },
         {  0, -8, -1,  4,  4 } };
   check( M );

// Example 2 (second example of Motomotes)
   M = { {   9,  -2,  -1,   3 },
         { -10,  -5,   1,  -4 },
         {   1,  -1,  12,  -2 },
         {   3,   8,   0,  -1 },
         {   2,   2,  -1,   2 } };
   check( M );

// Examples of random matrices
   const int lower = -100, higher = 100;
   srand( time( 0 ) );
   vector<pair<int,int>> tests = { { 2, 2 }, { 2, 4 }, { 4, 2 }, { 10, 10 }, { 100, 50 }, { 50, 100 }, { 500, 500 } };
   for ( pair<int,int> pr : tests )
   {
      M = fillRand( pr.first, pr.second, lower, higher );
      check( M );
   }
}

Output:

Case: m = 4    n = 5

Main solver:           22
Not-quite brute force: 22

=======

Case: m = 5    n = 4

Main solver:           23
Not-quite brute force: 23

=======

Case: m = 2    n = 2

Main solver:           -64
Not-quite brute force: -64

=======

Case: m = 2    n = 4

Main solver:           129
Not-quite brute force: 129

=======

Case: m = 4    n = 2

Main solver:           16
Not-quite brute force: 16

=======

Case: m = 10    n = 10

Main solver:           1059
Not-quite brute force: 1059

=======

Case: m = 100    n = 50

Main solver:           3448
Not-quite brute force: 3448

=======

Case: m = 50    n = 100

Main solver:           4457
Not-quite brute force: 4457

=======

Case: m = 500    n = 500

Main solver:           11369
Not-quite brute force: 11369

=======

score 0 · Answer 5 · answered May 25 '23 at 17:54

The Naive Solution for this problem is to check every possible rectangle in the given 2D array. This solution requires 6 nested loops –

4 for start and end coordinate of the 2 axis O(n4)

and 2 for the summation of the sub-matrix O(n2).

The overall time complexity of this solution would be O(n6).

#include <bits/stdc++.h>
using namespace std;

#define ROW 4
#define COL 5


int find(int* arr, int* start, int* finish, int n)
{
    // initialize sum, maxSum and
    int sum = 0, maxSum = INT_MIN, i;

    // Just some initial value to check
    // for all negative values case
    *finish = -1;

    // local variable
    int local_start = 0;

    for (i = 0; i < n; ++i)
    {
        sum += arr[i];
        if (sum < 0)
        {
            sum = 0;
            local_start = i + 1;
        }
        else if (sum > maxSum)
        {
            maxSum = sum;
            *start = local_start;
            *finish = i;
        }
    }

    // There is at-least one
    // non-negative number
    if (*finish != -1)
        return maxSum;

    // Special Case: When all numbers
    // in arr[] are negative
    maxSum = arr[0];
    *start = *finish = 0;

    // Find the maximum element in array
    for (i = 1; i < n; i++)
    {
        if (arr[i] > maxSum)
        {
            maxSum = arr[i];
            *start = *finish = i;
        }
    }
    return maxSum;
}

// The main function that finds
// maximum sum rectangle in M[][]
void findMaxSum(int M[][COL])
{
    // Variables to store the final output
    int maxSum = INT_MIN,
                finalLeft,
                finalRight,
                finalTop,
                finalBottom;

    int left, right, i;
    int temp[ROW], sum, start, finish;

    // Set the left column
    for (left = 0; left < COL; ++left) {
        // Initialize all elements of temp as 0
        memset(temp, 0, sizeof(temp));

        // Set the right column for the left
        // column set by outer loop
        for (right = left; right < COL; ++right) {

            // Calculate sum between current left
            // and right for every row 'i'
            for (i = 0; i < ROW; ++i)
                temp[i] += M[i][right];

            
            sum = find(temp, &start, &finish, ROW);

            // Compare sum with maximum sum so far.
            // If the sum is more, then update maxSum and
            // other output values
            if (sum > maxSum) {
                maxSum = sum;
                finalLeft = left;
                finalRight = right;
                finalTop = start;
                finalBottom = finish;
            }
        }
    }

    // Print final values
    cout << "(Top, Left) ("
        << finalTop << ", "
        << finalLeft
        << ")" << endl;
    cout << "(Bottom, Right) ("
        << finalBottom << ", "
        << finalRight << ")" << endl;
    cout << "Max sum is: " << maxSum << endl;
}

// Driver Code
int main()
{
    int M[ROW][COL] = { { 1, 2, -1, -4, -20 },
                        { -8, -3, 4, 2, 1 },
                        { 3, 8, 10, 1, 3 },
                        { -4, -1, 1, 7, -6 } };

    // Function call
    findMaxSum(M);

    return 0;
}

*The overall time complexity of this solution would be O(* n *⁶)* or O((mn)³), using *m* and *n* from the problem statement - how does this answer the implied question *How does one give the maximum border sum significantly faster than* O((*mn*)²) ?* (avoiding TLE) ? — greybeard, May 26 '23 at 01:56

Find rectangle with the largest sum of integers that lies on its border in C++

5 Answers5