1

Suppose that S is a string containing only 0 and 1. I would like to count the number of not null substrings of S in which the number of 0s is less than the number of 1s.

Using brute-force method given below we can have an algorithm that solves this problem in O(n^2):

moreOnes(s):
count = 0
n = s.len()

for i = 1 to n
    one = 0
    zero = 0
    for j = i to n
        if s[i] == '1'
            one++
        else
            zero++
        
        if one > zero
            count++

return count

But can we have an algorithm with a better time complexity of O(n*logn) or O(n), and space complexity be anything from O(1) to O(n)?

Adeeb HS
  • 139
  • 7

1 Answers1

2

Consider an array A[i] that contains the number of ones in the range 1..i minus the number of zeros in the range 1..i.

With this array it now takes only O(1) time to compute the number of ones minus the number of zeros in a substring from i..j by computing A[j]-A[i-1].

For a given endpoint j, you want to find all start points i<=j such that A[j]-A[i-1]>0. Equivalently, you want to know how many values of A[i-1] have value less than A[j].

This can be solved with Fenwick trees by:

  1. Loop over j
  2. Add 1 at location A[j] in Fenwick tree
  3. Lookup cumulative value from Fenwick tree in range up to A[j]-1 - this is the number of substrings that end at j and satisfy the desired property of more ones than zeros.

This will take O(n) space for the Fenwick tree and O(nlogn) time as each lookup is O(logn).

(Note that A[j] may go negative, while Fenwick trees typically work with positive data. You can work around this by adding a constant offset of n to A[j] so all entries are positive.)

Peter de Rivaz
  • 33,126
  • 4
  • 46
  • 75
  • This is really helpful. It made me think of [this problem](https://stackoverflow.com/q/67209667), which is to find the substring with the best rate (num_ones / substring_length). Could this idea be adapted for that somehow? – גלעד ברקן Jul 10 '21 at 14:38
  • 1
    Yes: one approach is to binary search for the maximum rate of 1's and use this approach as a subproblem. Suppose we want to know if the average rate is greater than x. We can subtract x from all values and look to see if we can find a subarray of length k or more where the average rate is higher than 0. Having said that, there are better ways. For example, when doing the bisection, Kadane's algorithm would be more efficient than using a Fenwick Tree. There is also an O(n) method to compute the maximum average rate (with minimum length) based on a stack of the convex hull of A. – Peter de Rivaz Jul 10 '21 at 18:45
  • @PeterdeRivaz if we loop over j and look up cumulative value from range 0 to j-1 won't the complexity be O(n^2 logn), n^2 because you are loop through j and then agian looping from 0 to j-1, can logn for the look up? – Adeeb HS Jul 12 '21 at 09:37
  • You can compute the complete A[i] array in O(n) by the recursion A[i] = A[i-1] + (s[j] ? 1: -1) – Peter de Rivaz Jul 12 '21 at 20:23
  • Thank you for answering my comment. I couldn't find online ideas about computing the highest average for a minimum length interval using convex hull. If you have a moment, could you please point to any resource or give a hint about the formulation? – גלעד ברקן Jul 24 '21 at 13:42
  • 1
    @גלעד ברקן There is some code and explanation here for convex hull approach: https://just4once.gitbooks.io/leetcode-notes/content/leetcode/binary-search/644-maximum-average-subarray-ii.html – Peter de Rivaz Jul 24 '21 at 20:26
  • 1
    I looked into the hull approach attributed to section 3 of [Kai-min Chung, Hsueh-I Lu - An Optimal Algorithm for the Maximum-Density Segment Problem. 2008](https://arxiv.org/pdf/cs/0311020.pdf). I guess we have a hull in the sense that we're comparing slopes of `prefix_sum / length`, geometrically, but the algorithm there is quite a bit more nuanced since we have to identify the correct points to be looking at during the iteration. Do you understand their algorithm intuitively just visualising a convex hull? I find that hard to do. If you can spare a moment to hint how, I'd appreciate it. – גלעד ברקן Jul 30 '21 at 00:13
  • Not sure I have much helpful to add: I think the paper that introduced me to this may have been: https://www.sciencedirect.com/science/article/abs/pii/S0020019003002254 but I no longer have access to the pdf so cannot check for sure. – Peter de Rivaz Jul 30 '21 at 09:41
  • Ah, cool, thanks! I'll look at that one. I think the newer one mentions that article and corrects Kim's unfinished outline of the hull updates. – גלעד ברקן Jul 30 '21 at 18:15
  • (In the newer one, they claim they "believe that any correct implementation of Kim’s algorithm requires Ω(n log(wmax − wmin + 1)) time in the worse case.") – גלעד ברקן Jul 30 '21 at 18:23