1

I was going through common child problem on hackerrank. It is nothing but a Longest Common Subsequence problem.

Tried it solving as follows:

def commonChild(s1, s2):
    LCS_table = [[0 for _ in range(len(s2)+1)] 
                 for _ in range(len(s1)+1)
                ]              
    for i in range(1,len(s1)+1):
        for j in range(1,len(s2)+1):            
            if s1[i-1] == s2[j-1]: #Mistake not [i] & [j] but [i-1] & [j-1] Ref: $A1
                LCS_table[i][j] = LCS_table[i-1][j-1] + 1
            else:
                LCS_table[i][j] = max(LCS_table[i-1][j], LCS_table[i][j-1])
    return LCS_table[-1][-1]

Eight test cases passed, but six test cases gave "Time Limit Exceeded" error. I quickly googled online solutions and tried them. But for all of them was able to pass more than half of the test cases, but failing at other with time limit exceeded error:

Sol 2 ref

def commonChild(a, b):
    m = len(a)
    n = len(b)
    prev = [0 for x in range(n + 1)]
    for i in range(1, m + 1):
        curr = [0 for x in range(n + 1)]
        for j in range(1, n + 1):
            curr[j] = max(prev[j], curr[j - 1])
            if a[i - 1] == b[j - 1]:
                curr[j] = max(curr[j], prev[j - 1] + 1)
        prev = curr
    return curr[n]

Sol 3 ref

def commonChild(s1, s2):
    m = [[0]*(len(s2)+1) for _ in range(len(s1)+1)]
    for i,c in enumerate(s1,1):
        for j,d in enumerate(s2,1):
            if c == d:
                m[i][j] = m[i-1][j-1]+1
            else:
                m[i][j] = max(m[i][j-1],m[i-1][j])
                   
    return m[-1][-1]

Are we all missing some idea to make it execute even faster?

Rnj
  • 1,067
  • 1
  • 8
  • 23
  • 1
    I can't access hackerrank, so I don't know what the specific problem is. Are you sure there is no simpler solution than longest common subsequence? hackerrank seems to be telling you that there is a better solution than O(mn). – Frank Yellin Sep 02 '22 at 20:05
  • 1
    @FrankYellin Nah I think hackerrank is telling them that there is a better Python than CPython. (Better at this.) I'd try it but I'm on a phone and their editor is entirely broken. – Kelly Bundy Sep 02 '22 at 20:16
  • @FrankYellin Here is the [screenshot of hackerrank problem description](https://i.postimg.cc/FHyHWvRQ/image.png). And yes, thats exactly what I want to know, if there is any better approach than dynamic programming idea implemented by all above solutions. – Rnj Sep 02 '22 at 20:20
  • One of the test cases (failed 2 out of 14!) has the string of ```5000```~ $#% I prob. will have to switch to C code ... ;-) – Daniel Hao Sep 02 '22 at 20:29
  • 1
    @DanielHao What language did you select? – Kelly Bundy Sep 02 '22 at 20:37
  • @DanielHao And do you fail with Pypy 3 as well? – Kelly Bundy Sep 02 '22 at 20:45

2 Answers2

2

@Rnj - if you switch to PyPy3 - all three versions will pass with flying color. So your logic is all sound, but the CPython just too slow in this particular case unfortunately.

I've tried many times earlier with diff. approaches - until @Kelly reminded me the faster version is also available there. Lesson learned.

Big kudos to @Kelly.

Daniel Hao
  • 4,922
  • 3
  • 10
  • 23
  • 1
    The more interesting challenge of course is to get accepted with CPython :-). Might be hard. I measured their first solution elsewhere with `s1 = 'a' * 5000` and `s2 = 'b' * 5000` and PyPy took about 1 second while CPython took about 16 seconds. – Kelly Bundy Sep 02 '22 at 21:14
  • Althou [this](https://support.hackerrank.com/hc/en-us/articles/1500002392722--Execution-Environment-and-Samples) says CPython has a higher limit (10 seconds vs 4). Might be doable with the same basic algorithm and just some micro-optimizations. – Kelly Bundy Sep 02 '22 at 21:24
  • It's very hard to ace with Cpython - given some unreasonable string data. Love to see one. . – Daniel Hao Sep 02 '22 at 21:25
  • 1
    Could you try submitting [this](https://tio.run/##VU7LDoIwELz7FXMsggnFiyFy8to/IBwMLYGktEuBg/48ttVA3NM8dnaHXktvzfVGbtuk6tDacbTm0Q9asplnmIukPMHPXKDCspJWbHwSs05GM4mm8F6dNzhDK8O8HNXOOrQYDI4A/10LQz6U7yye2NnXLyuwKeUYfK@qklB6VnCBEu4eRE7JXyo89Z@mLCwavAfyfbLQLhUZxLHcROTUsjoDUV94s20f)? (Their retarded editor just doesn't let me type/paste code). Seems about 5 times faster than OP's first code. – Kelly Bundy Sep 02 '22 at 22:09
  • And also a [less optimized](https://tio.run/##VU7BDoMgFLvvK3rUyRLZLosZp135A8NJMJIoILjD9vMMMHFZT21fmz733iZrbnfnY5RqxGCXxZrnpGdZBUoQrnV3QgIHQ98KnDErUyW7uKP1GKANAt1jGS5F20OV4qH2e8dQrQ2FToOMSag5KPgsHR6JFO3qv1aekgQryUGDj3bpC5J/ajgB/4VFYV5tL2/A@wsVMX4B) version that's only slightly slower. – Kelly Bundy Sep 02 '22 at 22:18
  • @KellyBundy *Congrats* - both version you posted passing in ```Python 3```! Is there some tricks...;-) I think you should post the answers with some analysis. – Daniel Hao Sep 02 '22 at 22:25
  • Nice, thanks! Posted now with a little explanation. – Kelly Bundy Sep 02 '22 at 22:46
  • (sorry for being late to reply) I always saw PyPy in hackerrank, but never cared to find whats that! Now this is holy grail!! Am wondering what kinds of solutions will it execute that CPython fails at? Will it also run recursive dynamic programming implementations which fails with CPython due "maximum recursion depth exceeded error" something like [this](https://stackoverflow.com/a/73347777/13782227) especially because it has something called [stackless feature](https://doc.pypy.org/en/latest/stackless.html) (though I dont know what it is)? – Rnj Sep 03 '22 at 22:48
2

Here's a version that gets accepted with Python 3, i.e., doesn't need PyPy:

def commonChild(s1, s2):
    L = [0] * len(s2)
    for c in s1:
        p = 0
        L = [
            p := (q+1 if c==d else r if p < r else p)
            for d, q, r in zip(s2, [0]+L, L)
        ]
    return L[-1]

It's about five times faster than your solutions in my own testing.

Main tricks are to avoid max(), avoid reading by index (zipping instead) and avoid writing by index (use list comp instead).

My L is the latest row of the LCS_table, and my variables p, q and r mean LCS_table[i][j-1], LCS_table[i-1][j-1] and LCS_table[i-1][j].

Thanks to @Daniel for testing this at HackerRank (I can't, their editor is utterly unusable for me on my phone).


For some more clarity, it's:

enter image description here (where "p+1" means "next p-value", the one computed in the ... of p := (...) and then assigned to p)

To help you visualize what how above zipping works:

>>> L = [1,2,3,4,5,6]
>>> s2 = ['a','b','c','d','e','f']
>>> zip(s2,[0]+L,L)
[('a', 0, 1), ('b', 1, 2), ('c', 2, 3), ('d', 3, 4), ('e', 4, 5), ('f', 5, 6)]

Also whats := in list comprehension?

Kelly Bundy
  • 23,480
  • 7
  • 29
  • 65
  • (sorry for being late to reply) This is super clever! But, now am wondering if this is the only way to make it work in CPython? Or is there another algorithmic optimisation or completely different algorithm to solve it. Also, if I recall it correct, I also tried to copy paste run C solution on some websites and they failed too at some test cases. So was guessing if indeed there is some other way to solve this. – Rnj Sep 03 '22 at 22:53
  • @Rnj [Wikipedia](https://en.m.wikipedia.org/wiki/Longest_common_subsequence_problem) has some alternatives/improvements, and some optimizations could happen to be better for their particular test cases. Top-down with memorization might get lucky and not need most of the matrix, or before the algorithm you could filter out characters that don't appear on both strings. The latter is something I saw in someone's accepted solution afterwards. You could look through those accepted CPython solutions as well. All that said, I think this is the "standard" algorithm and intended to be used there. – Kelly Bundy Sep 03 '22 at 23:03
  • "You could look through those accepted CPython solutions as well." - can you please tell where to find these accepted solutions? I just checked the [discussion thread of the same problem on hackerrank](https://www.hackerrank.com/challenges/common-child/forum) and I couldnt find any cpython solution there. All python solutions there say they get some issues (including time limit exceeded) – Rnj Sep 04 '22 at 11:07
  • @Rnj In the [leaderboard](https://www.hackerrank.com/challenges/common-child/leaderboard). – Kelly Bundy Sep 04 '22 at 15:17