Longest Common Prefix from list elements in Python

Question

I have a list as below:

strs = ["flowers", "flow", "flight"]

Now, I want to find the longest prefix of the elements from the list. If there is no match then it should return "". I am trying to use the 'Divide and Conquer' rule for solving the problem. Below is my code:

strs = ["flowers", "flow", "flight"]

firstHalf = ""
secondHalf = ""


def longestCommonPrefix(strs) -> str:
    minValue = min(len(i) for i in strs)
    length = len(strs)
    middle_index = length // 2
    firstHalf = strs[:middle_index]
    secondHalf = strs[middle_index:]
    minSecondHalfValue = min(len(i) for i in secondHalf)
    matchingString=[] #Creating a stack to append the matching characters
    for i in range(minSecondHalfValue):
        secondHalf[0][i] == secondHalf[1][i]
    return secondHalf


print(longestCommonPrefix(strs))

I was able to find the mid and divide the list into two parts. Now I am trying to use the second half and get the longest prefix but am unable to do so. I have had created a stack where I would be adding the continuous matching characters and then I would use it to compare with the firstHalf but how can I compare the get the continuous matching characters from start?

Expected output:

"fl"

Just a suggestion would also help. I can give it a try.

Do you want the longest *prefix* or any arbitrary *substring*? Those are two different problems, and if you're looking for the common prefix specifically you're way overcomplicating this. :) — Samwise, Dec 10 '21 at 16:16
I think this is answered in: [https://stackoverflow.com/questions/6718196/determine-prefix-from-a-set-of-similar-strings](https://stackoverflow.com/questions/6718196/determine-prefix-from-a-set-of-similar-strings) — Nick, Dec 10 '21 at 16:21
I think this is adressed in: [https://stackoverflow.com/questions/6718196/determine-prefix-from-a-set-of-similar-strings](https://stackoverflow.com/questions/6718196/determine-prefix-from-a-set-of-similar-strings) — Nick, Dec 10 '21 at 16:23
I want to find the longest common prefix and NOT any arbitrary substring — whatsinthename, Dec 10 '21 at 16:30

score 2 · Accepted Answer · answered Dec 10 '21 at 16:20

2

No matter what, you need to look at each character from each string in turn (until you find a set of corresponding characters that doesn't match), so there's no benefit to splitting the list up. Just iterate through and break when the common prefix stops being common:

def common_prefix(strs) -> str:
    prefix = ""
    for chars in zip(*strs):
        if len(set(chars)) > 1:
            break
        prefix += chars[0]
    return prefix


print(common_prefix(["flowers", "flow", "flight"]))  # fl

answered Dec 10 '21 at 16:20

Samwise

68,105
3
30
44

Same solution as a one-liner: `''.join(c[0] for c in itertools.takewhile(lambda x: len(set(x)) == 1, zip(*strs)))` – Marat Dec 10 '21 at 16:26
As I mentioned in the question, I want to solve it using `'Divide and Conquer'` rule only since I am learning. Could you help me to achieve that way? – whatsinthename Dec 10 '21 at 16:31
1

The idea behind "divide and conquer" is generally to make a big problem easier by dividing it into smaller problems. What specific part of the problem are you hoping to simplify by dividing it up? – Samwise Dec 10 '21 at 16:36
It also implies that you use some sort of multithreading to make it effective – nikeros Dec 10 '21 at 16:57
Understood.Thanks @Samwise – whatsinthename Dec 10 '21 at 17:09
From what I understood, the time complexity would be max `n` from your solution, and space complexity would be `O(1)`. Is it correct? – whatsinthename Dec 13 '21 at 07:34

nikeros · Answer 2 · 2021-12-12T06:17:56.270

Even if this problem has already found its solution, I would like to post my approach (I considered the problem interesting, so started playing around with it). So, your divide-and-conquer solution would involve a very big task split in many smaller subtasks, whose solutions get processed by other small tasks and so, until you get to the final solution. The typical example is a sum of numbers (let's take 1 to 8), which can be done sequentially (1 + 2 = 3, then 3 + 3 = 6, then 6 + 4 = 10... until the end) or splitting the problem (1 + 2 = 3, 3 + 4 = 7, 5 + 6 = 11, 7 + 8 = 15, then 3 + 7 = 10 and 11 + 15 = 26...). The second approach has the clear advantage that it can be parallelized - increasing the time performance dramatically in the right set up - reason why this goes generally hand in hand with topics like multithreading.

So my approach:

import math

def run(lst):
    if len(lst) > 1:
        lst_split = [lst[2 * (i-1) : min(len(lst) + 1, 2 * i)] for i in range(1, math.ceil(len(lst)/2.0) + 1)]
        lst = [Processor().process(*x) for x in lst_split]
        if any([len(x) == 0 for x in lst]):
            return ''
        return run(lst)
    else:
        return lst[0]
    

class Processor:

   def process(self, w1, w2 = None):
       if w2 != None:
           zipped = list(zip(w1, w2))
           for i, (x, y) in enumerate(zipped):
               if x != y:
                   return w1[:i]
               if i + 1 == len(zipped):
                   return w1[:i+1]
       else:
           return w1
       return ''

lst = ["flowers", "flow", "flight", "flask", "flock"]

print(run(lst))

OUTPUT

fl

If you look at the run method, the passed lst gets split in couples, which then get processed (this is where you could start multiple threads, but let's not focus on that). The resulting list gets reprocessed until the end.

An interesting aspect of this problem is: if, after a pass, you get one empty match (two words with no common start), you can stop the reduction, given that you know the solution already! Hence the introduction of

if any([len(x) == 0 for x in lst]):
    return ''

I don't think the functools.reduce offers the possibility of stopping the iteration in case a specific condition is met.

Out of curiosity: another solution could take advantage of regex:

import re

pattern = re.compile("(\w+)\w* \\1\w*")

def find(x, y):
    v = pattern.findall(f'{x} {y}')
    return v[0] if len(v) else ''


reduce(find, lst)

OUTPUT

'fl'

hpchavaz · Answer 3 · 2021-12-10T17:15:17.047

Sort of "divide and conquer" :

solve for 2 strings
solve for the other strings

def common_prefix2_(s1: str, s2: str)-> str:
    if not s1 or not s2: return "" 
    for i, z in enumerate(zip(s1,s2)):
        if z[0] != z[1]:
            break
    else:
        i += 1
    return s1[:i]

from functools import reduce
def common_prefix(l:list):
    return reduce(common_prefix2_, l[1:], l[0]) if len(l) else ''

Tests

for l in [["flowers", "flow", "flight"],
          ["flowers", "flow", ""],
          ["flowers", "flow"],
          ["flowers", "xxx"],
          ["flowers" ],
          []]:
    print(f"{l if l else '[]'}: '{common_prefix(l)}'")

# output
['flowers', 'flow', 'flight']: 'fl'
['flowers', 'flow', '']: ''
['flowers', 'flow']: 'flow'
['flowers', 'xxx']: ''
['flowers']: 'flowers'
[]: ''

Longest Common Prefix from list elements in Python

3 Answers3