1

I am currently stuck with the classic longest increasing subsequence problem, but there is a slight twist to it. Instead of just finding the longest increasing subsequence, I need to find the largest sum of all increasing subsequences that are of length k.

I have the following pseudo code implemented:

input = [4,13,5,14] k = 2
n = size of input
opt = array of size n which stores the highest increasing subsequence sum up to this index
counts = array of size n which stores the amount of values in the subsequence up to this index
highestSum = -1
FOR i in range(0, n)
   high = new data object(value = 0, sum = 0, count = 0)
   FOR j in range(i-1, 0, -1)
     IF high.sum < opt[j] AND opt[j] < opt[i] AND counts[j] < k
        high.value = input[j]
        high.sum = opt[j]
        high.count = counts[j]
   opt[i] = high.sum + input[i]
   counts[i] = high.count + 1
   IF counts[i] == k
     highestSum = higher value between (highestSum, opt[i])
return highestSum

This dynamic programming approach works in most cases, but for the list I outlined above it does not return the optimal subsequence sum. The optimal subsequence sum with length 2 should be 27 (13-14), but 18 is returned (4-14). This is due to the opt and counts array looking like this:

k = 2
input:   0 4 13 5 14
opt:     0 4 17 9 18
counts:  0 1 2  2 2 

Due to 13 already having a subsequence of 4-13, and thus its count value (2) is no longer less than k, 14 is unable to accept 13 as a correct subsequence due to its count value.

Are there any suggestions as to what I can change?

גלעד ברקן
  • 23,602
  • 3
  • 25
  • 61
Malice
  • 181
  • 2
  • 13

2 Answers2

2

You'll need k+1 sorted data structures, one for each possible length of subsequence currently found.

Each structure contains, by the last entry in an optimal subsequence, the current sum. That is, we only care about a subsequence that can lead to the best possible solution. (Technical note. Of those that can lead to the best solution, pick the one whose positions are lexicographically first.) Which will be sorted by increasing last entry, and decreasing sum.

In pseudocode it works like this.

initialize optimal[0..k]
optimal[0][min(sequence) - 1] = 0 # empty set.

for entry in sequence:
    for i in k..1:
        entry_prev = biggest < entry in optimal[i-1]
        if entry_prev is not None:
            this_sum = optimal[i-1][entry_prev] + entry
            entry_smaller = biggest <= entry in optimal[i-1]
            if entry_smaller is None or optimal[i][entry_smaller] < this_sum:
                delete (e, v) from optimal[i] where entry <= e and ​v <= this_sum
               ​ insert (entry, this_sum) into optimal[i]
return optimal[k][largest entry in optimal[k]]

But you need this kind of 2-d structure to keep track of what might happen from here.

The total memory needed is O(k n) and running time will be O(k n log(n)).

It is possible to also reconstruct the optimal subsequence, but that requires a more complex data structure.

btilly
  • 43,296
  • 3
  • 59
  • 88
  • Thank you, this answer is extremely helpful – Malice Oct 29 '21 at 20:41
  • (Unrelated to this answer, which I upvoted -- your answer [here](https://math.stackexchange.com/a/2146054/75712) was mentioned in the comments [here](https://stackoverflow.com/questions/69780830/number-of-restricted-arrangements) -- someone said there that they didn't have enough rep to contact you.) – גלעד ברקן Oct 31 '21 at 12:13
0

Here is a working solution in C++ that runs in O(logn * n * k) time with O(n*k) space. I think you can not make it faster but let me know if you find a faster solution. This is a modification of the solution for from https://stackoverflow.com/questions/16402854/number-of-increasing-subsequences-of-length-k. The key difference here is that we keep track of the maximum sum for each subsequences of different legths instead of accumulating the number of subsequences and we are iterating from the back of the array (since for increasing subsequences that have length larger than k the best k-length subarray will be at the end). An other trick is that we use the array sums to map index + length combinations to maximum sums.

maxSumIncreasingKLenSeqDP function is the simple dynamic programming solution with O(n * n * k) time complexity.

#include <iostream>
#include <algorithm>
#include <unordered_map>
#include <limits.h>

using namespace std;

#include <random>

int maxSumIncreasingKLenSeq(int arr[], size_t n, int k){
    // inverse compression: assign N-1, N-2, ... , 1 to smallest, ..., largest
    size_t N = 1;
    size_t compArr[n];
    {
        for(size_t i = 0; i<n; ++i)
            compArr[i] = arr[i];
        // descending order
        sort(compArr, compArr + n, greater<int>());
        unordered_map<int, size_t> compMap;
        for(int val : compArr){
            if(compMap.find(val) == compMap.end()){
                compMap[val] = N;
                ++N;
            }
        }
        for(size_t i = 0; i<n; ++i)
            compArr[i] = compMap[arr[i]];
    }
    int sums[n * (k - 1) + n]; // key is combined from index and length by n * (length - 1) + index
    for(size_t i = 0; i < n * (k - 1) + n; ++i)
        sums[i] = -1;
    for(size_t i = 0; i < n; ++i)
        sums[i] = arr[i]; // i, 1
    
    int BIT[N];
    for(size_t len = 2; len <= k; ++len){
        for(size_t i = 0; i<N; ++i)
            BIT[i] = INT_MIN;
        for(size_t i = 0; i < len - 1; ++i)
            sums[n * (len - 1) + i] = INT_MIN;
        for(int i = n - len; i >= 0; --i){
            int val = sums[n * (len - 2) + i + 1]; // i + 1, len - 1
            int idx = compArr[i + 1];
            while(idx <= N){
                BIT[idx] = max(val, BIT[idx]);
                idx += (idx & (-idx));
            }
            // it does this:
            //BIT[compArr[i + 1]] = sums[n * (len - 2) + i + 1];
            
            idx = compArr[i] - 1;
            int maxSum = INT_MIN;
            while(idx > 0){
                maxSum = max(BIT[idx], maxSum);
                idx -= (idx & (-idx));
            }
            sums[n * (len - 1) + i] = maxSum;
            
            // it does this:
            //for(int j = 0; j < compArr[i]; ++j)
            //  sums[n * (len - 1) + i] = max(sums[n * (len - 1) + i], BIT[j]);
            if(sums[n * (len - 1) + i] > INT_MIN)
                sums[n * (len - 1) + i] += arr[i];
        }
    }
    
    int maxSum = INT_MIN;
    for(int i = n - k; i >= 0; --i)
        maxSum = max(maxSum, sums[n * (k - 1) + i]); // i, k
    return maxSum;
}

int maxSumIncreasingKLenSeqDP(int arr[], int n, int k){
    int sums[n * (k - 1) + n]; // key is combined from index and length by n * (length - 1) + index
    for(size_t i = 0; i < n; ++i)
        sums[i] = arr[i]; // i, 1
    for(int i = 2; i <= k; ++i)
        sums[n * (i - 1) + n - 1] = INT_MIN; // n - 1, i
    // moving backward since for increasing subsequences it will be the last k items
    for(int i = n - 2; i >= 0; --i){
        for(size_t len = 2; len <= k; ++len){
            int idx = n * (len - 1) + i; // i, length
            sums[idx] = INT_MIN;
            for(int j = n - 1; j > i; --j){
                if(arr[i] < arr[j])
                    sums[idx] = max(sums[idx], sums[n * (len - 2) + j]); // j, length - 1
            }
            if(sums[idx] > INT_MIN)
                sums[idx] += arr[i];
        }
    }
    int maxSum = INT_MIN;
    for(int i = n - k; i >= 0; --i)
        maxSum = max(maxSum, sums[n * (k - 1) + i]); // i, k
    return maxSum;
}

int main(){
    std::random_device dev;
    std::mt19937 rng(dev());
    std::uniform_int_distribution<std::mt19937::result_type> dist(1,10);
    for(int len = 3; len < 10; ++len){
        for(int i = 0; i < 10000; ++i){
            int arr[100];
            for(int n = 0; n < 100; ++n)
                arr[n] = dist(rng);
            int res = maxSumIncreasingKLenSeqDP(arr, 100, len);
            int fastRes = maxSumIncreasingKLenSeq(arr, 100, len);
            if(res != fastRes)
                cout << "failed" << endl;
            else
                cout << "passed" << endl;
        }
    }
    return 0;
}