
I'm trying to solve the Min Avg Two Slice question from Codility.

I came up with the following code:

```python
def solution(S, P, Q):
    weights = {'A': 1, 'C': 2, 'G': 3, 'T': 4}
    retVal = []
    for i in range(0, len(P)):
        if P[i] == Q[i]:
            # Single-character slice: its weight is the answer.
            retVal.append(weights[S[P[i]]])
        else:
            retVal.append(local_min(S, weights, P[i], Q[i]))
    return retVal

minimums = {}
def local_min(S, weights, start, end):
    minVal = weights[S[start]]
    for i in range(start, end + 1):
        val = weights[S[i]]
        if val == 1:
            # 1 is the smallest possible weight, so we can stop early.
            minimums[i] = 1
            return 1
        if val < minVal:
            minVal = val
        minimums[i] = minVal
    return minVal
```

This works fine in terms of correctness; however, it has a time complexity of O(n*m), since I'm re-computing `local_min` every time and not reusing any results.

I'm trying to use the prefix_sums algorithm here to compute local mins or somehow memoize the computed min values using the minimums object and reuse them in subsequent calls to local_min.

The problem I'm running into is this: let's say we have the following array:

[1, 13, 2, 8, 20, 5]

We could compute an array of smallest values seen up to this point as follows:

For range 0, 6 it would simply be:

[1,1,1,1,1,1]

For 1,6 it would be:

[13, 2, 2, 2, 2]

For 2,6:

[2, 2, 2, 2]

And finally, for 3,6 and 4,6:

[8, 8, 8] and [20, 5]

I'm struggling to see how to adapt this approach to compute the smallest value in a given range and reduce the time complexity of my solution.

EDIT:

I came up with the following solution, which achieves 100% on Codility for both correctness and performance, with a time complexity of O(n+m):

```python
def solution(S, P, Q):
    weights = {'A': 1, 'C': 2, 'G': 3, 'T': 4}
    retVal = []
    for i in range(0, len(P)):
        if P[i] == Q[i]:
            retVal.append(weights[S[P[i]]])
        else:
            # Check for the lightest nucleotide present in the slice.
            if 'A' in S[P[i]:Q[i] + 1]:
                retVal.append(1)
            elif 'C' in S[P[i]:Q[i] + 1]:
                retVal.append(2)
            elif 'G' in S[P[i]:Q[i] + 1]:
                retVal.append(3)
            else:
                retVal.append(4)
    return retVal
```

I'm still a bit confused by this, though. My assumption would be that the `in` operation takes O(n), where n is the length of S, so the overall complexity should be O(n * m), where m is the length of P and Q. Could anyone explain why the complexity is in fact O(n + m)?

Alk
  • You want to solve [Range Minimum Query](https://en.wikipedia.org/wiki/Range_minimum_query), as it seems. Use a Sparse Table (O(n log n) construction, O(1) query) or a Segment Tree (O(n) construction, O(log n) query, O(log n) update). If a sparse table is enough, use it, because it's a little faster in practice. – Sandro J Dec 25 '21 at 01:08

2 Answers


The hard part of this problem is proving that there's an easy solution.

In particular, you can prove that you only need to look for slices of length 2 or 3.

Proof: Suppose there is a slice of length >= 4 that has the smallest possible average. We can divide it into two slices: one consisting of the first 2 elements, and one consisting of the remainder. The average of the whole is a weighted average of the two parts' averages, so neither part can have a larger average unless the other has a smaller one; and neither part can have a smaller average, since the whole already has the smallest possible average. Therefore both parts must have the same average as the whole, and those initial 2 elements are a valid result that starts at the same place.

So just iterate through the start positions and test all slices of length 2 or 3, and remember the first one with the smallest average.
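A sketch of that idea (assuming the usual MinAvgTwoSlice convention of returning the leftmost starting index of a minimal-average slice; the function name is my own):

```python
def min_avg_slice_start(A):
    # Only slices of length 2 or 3 need to be checked (per the proof above).
    best_start, best_avg = 0, (A[0] + A[1]) / 2.0
    for i in range(len(A) - 1):
        avg2 = (A[i] + A[i + 1]) / 2.0
        if avg2 < best_avg:
            best_avg, best_start = avg2, i
        if i + 2 < len(A):
            avg3 = (A[i] + A[i + 1] + A[i + 2]) / 3.0
            if avg3 < best_avg:
                best_avg, best_start = avg3, i
    return best_start

print(min_avg_slice_start([4, 2, 2, 5, 1, 5, 8]))  # 1 (slice [2, 2] has average 2)
```

This is a single pass over the array, so it runs in O(n).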

Matt Timmermans

The approach offered by Matt Timmermans is simpler to implement and has better time complexity. The approach below is easier to come up with, but less efficient.

If we want to solve RMQ faster, there are several options:

  • Segment Tree (the best option, in my opinion). O(n) construction, O(log(n)) per min query. Not so easy to implement in Python.
  • Sqrt Decomposition. O(n) construction, O(sqrt(n)) per min query. Very easy to implement.
  • Fenwick Tree. O(n) construction, O(log(n)) per min query. Extremely easy to implement, but quite hard to understand how it works.
  • Sparse Table. O(n*log(n)) construction, O(1) per min query. Quite easy to implement.

All the links contain implementations, some of them in Python.
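For illustration, a minimal Sparse Table sketch (static array, no updates; the function names are my own):

```python
def build_sparse_table(arr):
    # table[j][i] holds min(arr[i : i + 2**j]); O(n log n) time and space.
    n = len(arr)
    table = [arr[:]]
    j = 1
    while (1 << j) <= n:
        prev = table[j - 1]
        table.append([min(prev[i], prev[i + (1 << (j - 1))])
                      for i in range(n - (1 << j) + 1)])
        j += 1
    return table

def query_min(table, lo, hi):
    # Min of arr[lo..hi] (inclusive) in O(1): cover the range with
    # two (possibly overlapping) blocks of length 2**j.
    j = (hi - lo + 1).bit_length() - 1
    return min(table[j][lo], table[j][hi - (1 << j) + 1])

table = build_sparse_table([1, 13, 2, 8, 20, 5])
print(query_min(table, 1, 4))  # 2
print(query_min(table, 3, 4))  # 8
```

The O(1) query works for min (unlike sum) because overlapping the two blocks does not change the result.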

Yevgeniy Kosmak
  • These data structures can answer RMQ quickly, but they are not what the problem is asking for. – Peter Li Dec 25 '21 at 08:47
  • Why do you think so? OP said they were `trying to use the prefix_sums algorithm here to compute local mins`, so I came up with approaches for that. And all of the proposed solutions reduce the time complexity. – Yevgeniy Kosmak Dec 25 '21 at 10:12
  • Fenwick tree was not originally designed for `RMQ`. In practice it's possible to modify it to solve `RMQ`, but it's too complicated and not worth trying. It does solve range sum queries with updates very efficiently, though. – Sandro J Dec 25 '21 at 10:30
  • Well, maybe it's a subjective matter. As for me, [this](https://stackoverflow.com/a/34602284/12914274) approach with a Fenwick tree is extremely easy to implement, and it's even easier than sqrt decomposition. – Yevgeniy Kosmak Dec 25 '21 at 11:14
  • @YevgeniyKosmak If you stick to OP's idea of using prefix sum, then these data structures are indeed the RMQ equivalent of it. But Matt's analysis already pointed out that this problem has a trivial solution that doesn't require complex data structures. So to correct my previous comment: this answer can be used to solve the problem and does address the OP's intention to use prefix-sum, though it is unnecessarily complicated and slower than the optimal solution. – Peter Li Dec 25 '21 at 18:09
  • @PeterLi, you are totally right; I agree with you on all points. However, it doesn't make my answer wrong, it still solves the OP's question. Of course, if I'd thought about it before their comment, you can be sure that I wouldn't have advised this approach. But now what, would it be better to remove my answer? :) – Yevgeniy Kosmak Dec 25 '21 at 18:15
  • @YevgeniyKosmak I certainly don't want your answer removed, because it does give a valid idea. But I think it should be noted that this approach is suboptimal, which is already done. – Peter Li Dec 25 '21 at 18:59
  • @PeterLi that would be better, yeah. Updated. – Yevgeniy Kosmak Dec 25 '21 at 19:10