0

I would like to find the largest sums that can be formed by taking exactly one value from each row of a matrix. I already solved it by brute force, which is O(N^5) for 5 rows of N values. Now I would like to reduce the complexity, with dynamic programming or some other approach.

For example:

Matrix:

  100   5   4   3  1

  90   80  70  60  50

  70   69  65  20  10

  60   20  10   5   1

  50   45  15   6   1

Solution for 5 sets:

  1. 100 + 90 + 70 + 60 + 50 = 370

  2. 100 + 90 + 69 + 60 + 50 = 369

  3. 100 + 90 + 70 + 60 + 45 = 365

  4. 100 + 90 + 65 + 60 + 50 = 365

  5. 100 + 90 + 69 + 60 + 45 = 364

Sum: 1833

Example of computing the sums with brute force:

  // Enumerate every combination of one value per row (5 rows, 5 nested loops).
  for(int i=0; i<matrix[0].size(); i++) {
    for(int j=0; j<matrix[1].size(); j++) {
      for(int k=0; k<matrix[2].size(); k++) {
        for(int l=0; l<matrix[3].size(); l++) {
          for(int x=0; x<matrix[4].size(); x++) {
            sum.push_back(matrix[0][i] + matrix[1][j] + matrix[2][k] + matrix[3][l] + matrix[4][x]);
          }
        }
      }
    }
  }

  // mySort is a comparator defined elsewhere (presumably descending order).
  sort(sum.begin(), sum.end(), mySort);
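
For cross-checking faster approaches, the same brute force can be sketched compactly in Python. This is only a reference check, not a faster method: it still enumerates every combination. The names `matrix`, `k`, and `top` are chosen here for illustration.

```python
from itertools import product
import heapq

matrix = [
    [100, 5, 4, 3, 1],
    [90, 80, 70, 60, 50],
    [70, 69, 65, 20, 10],
    [60, 20, 10, 5, 1],
    [50, 45, 15, 6, 1],
]
k = 5

# product(*matrix) yields every way to take one value from each row;
# nlargest keeps only the k highest sums, in descending order.
top = heapq.nlargest(k, (sum(combo) for combo in product(*matrix)))
print(top, sum(top))  # [370, 369, 365, 365, 364] 1833
```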

Thanks!

Ynjxsjmh

3 Answers

1

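You can solve it in O(k*log k) time with Dijkstra's algorithm (treating the number of rows as a constant). A node in the graph is a list of 5 column indexes, one per row of the matrix, which together select one number from each row. This assumes each row is sorted in descending order, as in the example.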

For example in the matrix

100 5  4  3  1
90  80 70 60 50
70  69 65 20 10
60  20 10 5  1
50  45 15 6  1

the node [0, 0, 2, 0, 1] represents the numbers [100, 90, 65, 60, 45]

The initial node is [0, 0, 0, 0, 0]. Every node has up to 5 outgoing edges increasing 1 of the 5 indexes by 1, and the distance between nodes is the absolute difference in the sums of the indexed numbers.

So for that matrix the edges from the node [0, 0, 2, 0, 1] lead:

  • to [1, 0, 2, 0, 1] with distance 100 - 5 = 95
  • to [0, 1, 2, 0, 1] with distance 90 - 80 = 10
  • to [0, 0, 3, 0, 1] with distance 65 - 20 = 45
  • to [0, 0, 2, 1, 1] with distance 60 - 20 = 40
  • to [0, 0, 2, 0, 2] with distance 45 - 15 = 30
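
As a small illustration, the edges listed above can be generated as follows. This is a sketch under the assumption that every row is sorted in descending order; the function name `outgoing_edges` is made up for this example.

```python
# Rows are assumed to be sorted in descending order, as in the example matrix.
M = [
    [100, 5, 4, 3, 1],
    [90, 80, 70, 60, 50],
    [70, 69, 65, 20, 10],
    [60, 20, 10, 5, 1],
    [50, 45, 15, 6, 1],
]

def outgoing_edges(matrix, node):
    """Return (neighbor, distance) pairs for a node given as a tuple of column indexes."""
    edges = []
    for row, col in enumerate(node):
        if col + 1 < len(matrix[row]):  # cannot step past the last column
            neighbor = node[:row] + (col + 1,) + node[row + 1:]
            distance = matrix[row][col] - matrix[row][col + 1]
            edges.append((neighbor, distance))
    return edges

for neighbor, distance in outgoing_edges(M, (0, 0, 2, 0, 1)):
    print(list(neighbor), distance)
```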

With this setup you can use Dijkstra's algorithm to find k - 1 closest nodes to the initial node.
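
One possible way to write this search in Python uses `heapq` as the priority queue. It is a sketch, not a tuned implementation. Because a node's distance from the initial node equals the best sum minus that node's sum, ordering the queue by sum (highest first) is equivalent to ordering it by Dijkstra distance. The function name `top_k_sums` and the `seen` set are choices made for this example.

```python
import heapq

def top_k_sums(matrix, k):
    """Return the k largest sums that take exactly one value from each row."""
    rows = [sorted(row, reverse=True) for row in matrix]  # descending rows
    start = tuple(0 for _ in rows)
    best = sum(row[0] for row in rows)
    heap = [(-best, start)]  # heapq is a min-heap, so store negated sums
    seen = {start}           # nodes that were already pushed
    sums = []
    while heap and len(sums) < k:
        neg_total, node = heapq.heappop(heap)
        sums.append(-neg_total)
        # Push every neighbor that moves one row one column to the right.
        for r, c in enumerate(node):
            if c + 1 < len(rows[r]):
                neighbor = node[:r] + (c + 1,) + node[r + 1:]
                if neighbor not in seen:
                    seen.add(neighbor)
                    total = -neg_total - rows[r][c] + rows[r][c + 1]
                    heapq.heappush(heap, (-total, neighbor))
    return sums

M = [
    [100, 5, 4, 3, 1],
    [90, 80, 70, 60, 50],
    [70, 69, 65, 20, 10],
    [60, 20, 10, 5, 1],
    [50, 45, 15, 6, 1],
]
print(top_k_sums(M, 5))  # [370, 369, 365, 365, 364]
```

Only the nodes that are popped, plus their direct neighbors, are ever created, so the full set of combinations is never generated.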

Kolmar
  • Thanks a lot, but I have a question: what is the complexity of generating all the graph combinations? Is it less than O(n^5)? As I see it, this approach costs n^5 to generate the graph plus k*log k. – Jessicadomingues Mar 26 '21 at 03:27
  • @Jessicadomingues What do you mean by "generate"? Do you need to spend n^5 to generate that matrix? The algorithm in the answer is given a matrix M and a number k. It doesn't generate all combinations, just traverses 5*k edges and visits k nodes. – Kolmar Mar 26 '21 at 09:20
  • First of all, thank you very much for your attention. I tried to create the graph. The initial step is a list with `0 0 0 0 0`, right? For the next step I need to look at the cost of `00001, 00010, 00100, 01000`. The next node is `00100` because its cost is 1. For the step after that I need to look at `00001, 00010, 01000, 10000, 00200`, and I get `00001`. After that I need all the previous nodes plus 11000, 10100, 10010, 10001, 00002, etc. Sorry, but I'm having difficulties with the graph modeling. – Jessicadomingues Mar 26 '21 at 15:09
  • @Jessicadomingues See the latest answer by Matthias Fripp. He basically implements this idea and has provided the code. The only difference is that he uses a linear check to find the next node instead of a priority queue. So his solution is O(k^2), but if you use a priority queue it will be O(k log k) instead. – Kolmar Mar 26 '21 at 15:26
  • I'm trying to understand the code. I have more experience with C/C++, but thanks again for all the answers. – Jessicadomingues Mar 26 '21 at 16:31
0

If you want just the maximum sum, then sum the maximum value of each row. That is,

M = [[100, 5, 4, 3, 1],
 [90, 80, 70, 60, 50],
 [70, 69, 65, 20, 10],
 [60, 20, 10, 5, 1],
 [50, 45, 15, 6, 1]]

sum(max(row) for row in M)

Edit

It is not necessary to use dynamic programming, etc.
There is a simple rule: at each step, move to the next number in the row where the difference between the current number and the next number is smallest.

Here is some code using numpy.

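import numpy as np

M = np.array(M)
M = -np.sort(-M, axis=1)  # sort each row in descending order
k = 3

answer = []
rows = np.arange(M.shape[0])
ind = np.zeros(M.shape[0], dtype=int)  # current column index in each row
for _ in range(k):
    answer.append(int(M[rows, ind].sum()))
    # cost of moving one column to the right in each row
    # (assumes no row has reached its last column yet)
    diff = M[rows, ind] - M[rows, ind + 1]
    ind[np.argmin(diff)] += 1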

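The result is [370, 369, 365]. Note that this greedy rule follows a single path of index vectors, so for larger k it can skip sums: with k = 5 on this matrix it returns 360 as the fourth value, while the fourth-largest sum is 365.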

Gilseung Ahn
0

Update: I previously used a greedy algorithm, which doesn't work for this problem. Here is a more general solution.

Suppose we've already found the combinations with the top m highest sums. The next highest combination (number m+1) must be 1 step away from one of these, where a step is defined as shifting focus one column to the right in one of the rows of the matrix. (Any combination that is more than one step away from all of the top m combinations cannot be the m+1 highest, because you can convert it to a higher one that is not in the top m by undoing one of those steps, i.e., moving back toward one of the existing combinations.)

For m = 1, we know that the "m highest combinations" just means the combination made by taking the first element of each row of the matrix (assuming each row is sorted from highest to lowest). So then we can work out from there:

  1. Create a set of candidate combinations to consider for the next highest position. This will initially hold only the highest possible combination (first column of the matrix).

  2. Identify the candidate with the highest sum and move that to the results.

  3. Find all the combinations that are 1 step away from the one that was just added to the results, and add them to the set of candidate combinations. At most n of these will be added each round, where n is the number of rows in the matrix. Some may be duplicates of previously identified candidates, which should be ignored.

  4. Go back to step 2. Repeat until there are 5 results.

Here is some Python code that does this:

m = [
    [100, 5, 4, 3, 1],
    [90, 80, 70, 60, 50],
    [70, 69, 65, 20, 10],
    [60, 20, 10, 5, 1],
    [50, 45, 15, 6, 1]
]
n_cols = len(m[0]) # matrix width

# helper function to calculate the sum for any combination,
# where a "combination" is a list of column indexes for each row
score = lambda combo: sum(m[r][c] for r, c in enumerate(combo))

# define candidate set, initially with single highest combination
# (this set could also store the score for each combination
# to avoid calculating it repeatedly)
candidates = {tuple(0 for row in m)}
results = set()

# get 5 highest-scoring combinations
for i in range(5):
    result = max(candidates, key=score)
    results.add(result)
    candidates.remove(result)  # don't test it again
    # find combinations one step away from latest result
    # and add them to the candidates set
    for j, c in enumerate(result):
        if c+1 >= n_cols:
            continue  # don't step past edge of matrix
        combo = result[:j] + (c+1,) + result[j+1:]
        if combo not in results:
            candidates.add(combo)  # drops dups

# convert from column indexes to actual values
final = [
    [m[r][c] for r, c in enumerate(combo)]
    for combo in results
]
final.sort(key=sum, reverse=True)
print(final)
# [
#     [100, 90, 70, 60, 50],
#     [100, 90, 69, 60, 50],
#     [100, 90, 70, 60, 45],
#     [100, 90, 65, 60, 50],
#     [100, 90, 69, 60, 45],
# ]
# (the two combinations that sum to 365 may appear in either order)
Matthias Fripp
  • Thanks @Matthias, but the answer is not correct. The expected answer: [[100, 90, 70, 60, 50], [100, 90, 69, 60, 50], [100, 90, 65, 60, 50], [100, 90, 70, 60, 45], [100, 90, 69, 60, 45]] – Jessicadomingues Mar 26 '21 at 03:51
  • @Jessicadomingues good point. I've revised the algorithm to consider one step away from all previously found combinations, instead of just one step away from the last found combination. I think this has O(n) complexity, where n is the number of rows in the matrix. – Matthias Fripp Mar 26 '21 at 08:28
  • Thank you. I'm trying to understand everything you did. I don't know Python very well (I program in C/C++), but thank you very much for trying to help me. – Jessicadomingues Mar 26 '21 at 16:51
  • The idea is that you incrementally build a set of k highest-value combinations, working from highest down. A combination is a vector with a column number for each row in the main matrix. The value of a combination is the sum of the cells it references. At each step, you add the next highest combination to the set. To find this, you only consider the combinations that are one step away from the ones that are already in the result set. To make this faster, I keep a set of all those neighbors, and whenever I add a new combination to the result set, I also add its neighbors to the candidate set. – Matthias Fripp Mar 26 '21 at 18:47
  • Also, in Python, `enumerate(collection)` gives you pairs of values `i` and `val`, where `val` is a value from the collection and `i` is its index (position) within the collection. And the expression `r[:j] + (x,) + r[j+1:]` concatenates three tuples to make a new one. These are the first `j` elements of `r`, then the value `x`, then the elements of `r` from position `j+1` to the end. So this just makes a new version of the tuple `r`, with the value in position `j` replaced by `x`. You can think of a tuple as being similar to a vector, array or list. – Matthias Fripp Mar 26 '21 at 19:01
  • I took a look at what it takes to implement an [unordered set of tuples in C++](https://stackoverflow.com/questions/7110301/generic-hash-for-tuples-in-unordered-map-unordered-set) and panicked and fled. But I could write a more 'C-like' version of this code if that would help. – Matthias Fripp Mar 26 '21 at 19:48
  • First of all, thank you very much, you are very kind!! Please, if you don't mind, I would be very grateful. – Jessicadomingues Mar 26 '21 at 20:33
  • Are you using any libraries in C/C++, to implement vectors, matrices, unordered sets, tuples, etc.? That would help me know if you can work with high-level objects or if you need a very basic solution with only generic arrays and lots of loops. – Matthias Fripp Mar 26 '21 at 22:33
  • I'm studying the code. Its execution time was longer than the brute force with 5 loops. In the brute force I check all available combinations once, sort them, and take k solutions. – Jessicadomingues Mar 27 '21 at 00:01
  • That is surprising. The brute force method has to generate and evaluate 3,125 combinations, but this algorithm only creates 25 combinations and evaluates 75 scores at most. If you are using C for the brute force approach and Python for this one, that could explain it. But this algorithm should win if you go to many more rows or make k much bigger. – Matthias Fripp Mar 27 '21 at 00:09
  • When I increase the number of combinations to 60, brute force is better. I think this method is better for the initial cases. – Jessicadomingues Mar 27 '21 at 00:29
  • I put the brute force code in the problem description. – Jessicadomingues Mar 27 '21 at 00:41
  • The facts that the rows may be wide and k may be high are important for solving this. I think there may be a solution where you pick values for v, w, x, y, z that are the number of cells you will use from each row, then you find the sum of all combinations of those leftmost cells, which is pretty fast. This might somehow work as a linear program with 5 stages, with steps v, w, x, y and z. The problem is that this gives the sum for v*w*x*y*z combos, which won't generally equal k. But maybe you can partially increment one. But I don't have time to think about it more right now. – Matthias Fripp Mar 27 '21 at 20:14
  • @Jessicadomingues, I'm still surprised mine is running much slower than yours for large n. Or maybe it's just for large k? This should have O(k^2) complexity. It could be improved to O(k log k) by also adding the candidates to a heapq (as a tuple with their scores) then pulling the best candidate from there. But that wouldn't be a huge improvement in your test. – Matthias Fripp Mar 29 '21 at 08:35