Longest Increasing Subsequence optimization

Question

The Longest Increasing Subsequence problem is to find a subsequence of a given sequence in which the subsequence's elements are in sorted order and in which the subsequence is as long as possible.

Here is my O(n^2) approach which is running very slow for long input:

N = int(input())

lst = []
for i in range(N):
    lst.append(input())

count = [1] * N

for i in range(1, len(lst)):
    for j in range(i):
        if(int(lst[j]) < int(lst[i])):
            k = int(count[j]) + 1
            if (k > int(count[i])):
                count[i] = k
count.sort()
print(count[-1])

Can anybody tell me how can it be done in O(n*log n) time?

When I searched for the algorithm, the *first* DuckDuckGo result was [Wikipedia](https://en.wikipedia.org/wiki/Longest_increasing_subsequence), which has an algorithm *written in Python-like language*... — Djizeus, Jul 02 '16 at 14:03
And if you need a video explanation: https://www.youtube.com/watch?v=S9oUiVYEq7E&t=0s — Moon Cheesez, Jul 02 '16 at 14:10
See also: [How to determine the longest increasing subsequence using dynamic programming?](http://stackoverflow.com/questions/2631726/how-to-determine-the-longest-increasing-subsequence-using-dynamic-programming) — Djizeus, Jul 02 '16 at 14:19

Yash Tewari · Accepted Answer · 2016-07-03T04:13:14.140

2

Great resources have been mentioned in the comments; if you still want a piece of python code, I've written and explained it here.

In this algorithm, for all L < N, we keep a track of the values in the input which represent the endpoint of the current longest increasing subsequence of length L.

longest_sequence_values stores these values. For example, longest_sequence_values[3] is a value in the input at which the longest increasing subsequence of length 3 ends.

Note here that longest_sequence_values will always be non-decreasing, which allows us to perform a binary search when we're trying to build a new longest increasing subsequence. This happens because if i < j, then the endpoint of a subsequence of length i cannot be larger than the endpoint of a subsequence of length j.

longest_current_sequence is the length of of the longest subsequence found thus far. We need this value to specify the range of the binary search. It also represents the answer at the end.

from math import ceil
N = int(input())

input_vals = []
for i in range(N):
    input_vals.append(input())

longest_sequence_values = [None] * (N+1)
longest_current_sequence = 0

for i in range(N):
    # binary search starts here
    # this gives us the log(N) factor
    lo = 1
    hi = longest_current_sequence
    while lo <= hi:
        mid = int(ceil((lo+hi)/2))
        if longest_sequence_values[mid] <= input_vals[i]:
            lo = mid + 1
        else:
            hi = mid - 1

    # lo will be length of the longest sequence ending at input_vals[i]
    longest_len_here = lo
    # We have a new lis of length longest_len_here ending at index i
    # Note that before we perform the following substitutions,
    # longest_sequence_values[longest_len_here] >= input_vals[i]
    # This means that the new endpoint of the lis of length longest_len_here
    # is <= to the old endpoint.
    # This point is essential in keeping the result optimal
    longest_sequence_values[longest_len_here] = input_vals[i]

    if longest_len_here > longest_current_sequence:
        longest_current_sequence = longest_len_here

print longest_current_sequence

edited Jul 03 '16 at 04:13

answered Jul 02 '16 at 15:25

Yash Tewari

760
1
6
19

Your code runs perfectly. Here is my code, It's working fine for some inputs, But for some it's not. It has some bug, can you tell me where am I doing it wrong. http://paste.ofcode.org/WFqZYpE2e2av4KwhpfSV74 – Atinesh Jul 03 '16 at 03:22
@Atinesh Can you post some inputs for which it is failing? Ideally, `tailTable` should be dealing with indices in the range `[1, N]` instead of `[0, N]`. This is because the indices represent lengths of size `1..N` and things will be more intuitive that way. Your `binary_search` method looks weak. For some reason the starting range is `[-1, N-1]`. Binary search is used to divide a sorted range into two parts using a property. In this case, that property is 'largest `j < len` such that `tailTable[j] <= key`'. Ensure that this is the case. – Yash Tewari Jul 03 '16 at 04:35
Actually, I tried this code on `Hacker Rank` they have tested it on 9 test cases out of which only 2 are successful. Here is the test case `Input` : http://paste.ofcode.org/MTCs4irbBFNNndgAGf3yEk `Expected Output` : 185 – Atinesh Jul 03 '16 at 05:56
I can't make any useful deductions using a test input of that size. If you can, find a small test case on which your code breaks and post it. I'm pretty sure it's the binary search, though. Change your `tailTable` range from `[0, N-1]` to `[1, N]`, study the binary search that I have used, and you should be good to go. – Yash Tewari Jul 03 '16 at 06:02
@Atinesh I just ran your code on the input you gave and it gave 185, which you said is the expected output... – Yash Tewari Jul 03 '16 at 12:10
I'm getting `189`. Try to run this code http://paste.ofcode.org/rGbxeeXu55sCywapSMTHYN with file https://www.dropbox.com/s/uh0qblvjc4nhxlv/input.txt?dl=0 – Atinesh Jul 03 '16 at 16:20
By the way your implementation is also not correct, getting same error with your implementation also. Try to run your code here https://www.hackerrank.com/challenges/longest-increasing-subsequent – Atinesh Jul 03 '16 at 16:21
'In this challenge you simply have to find the length of the longest **strictly increasing** sub-sequence of the given sequence.' My snippet is not meant to do this. Please read the problem statement carefully. – Yash Tewari Jul 03 '16 at 17:11
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/116306/discussion-between-yash-tewari-and-atinesh). – Yash Tewari Jul 03 '16 at 17:13

Longest Increasing Subsequence optimization

1 Answers1