22

Given: a list of lists, such as [[3,2,1], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]

Todo: Find the longest common prefix of all sublists.

Exists: In another thread "Common elements between two lists not using sets in Python", it is suggested to use "Counter", which is available above python 2.7. However our current project was written in python 2.6, so "Counter" is not used.

I currently code it like this:

l = [[3,2,1], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]
newl = l[0]
if len(l)>1:
    for li in l[1:]:
    newl = [x for x in newl if x in li]

But I find it not very pythonic, is there a better way of coding?

Thanks!

New edit: Sorry to mention: in my case, the shared elements of the lists in 'l' have the same order and alway start from the 0th item. So you wont have cases like [[1,2,5,6],[2,1,7]]

Community
  • 1
  • 1
lukmac
  • 4,617
  • 8
  • 33
  • 34
  • 2
    What is the expected output for `[[1, 2, 3], [3, 2, 1]]`? – Sven Marnach Jun 29 '12 at 14:04
  • 2
    `Counter` doesn't preserve order anyway. What are the common elements between `[3, 2, 1]` and `[4, 3, 2, 1]`? `[]` or `[3, 2, 1]`? I'm asking -- does position matter as well as order? If position doesn't matter, then it's the [Longest common substring problem](https://en.wikipedia.org/wiki/Longest_common_substring_problem) which you can find the answer to elsewhere on this site – agf Jun 29 '12 at 14:04
  • what is expected output of [[1,2,3],[1,4,2,3]] – Luka Rahne Jun 29 '12 at 14:09
  • Please see my New edit in the original post. – lukmac Jun 29 '12 at 14:22
  • @lukmac: I tried to clarify the title and the post. Please check if this is what you want. – Sven Marnach Jun 29 '12 at 14:28
  • 1
    Yes, your edit fits it better, thanks – lukmac Jun 29 '12 at 15:22
  • Just to be clear, should `[[1, 2, 3], [1, 8, 2, 3]]` generate output (a) `[1]` or (b) `[1, 2, 3]`? Because your example code returns (b), but your title suggests that you want to return (a). – senderle Jul 01 '12 at 21:29

5 Answers5

64

os.path.commonprefix() works well for lists :)

>>> x = [[3,2,1], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]
>>> import os
>>> os.path.commonprefix(x)
[3, 2, 1]
jfs
  • 399,953
  • 195
  • 994
  • 1,670
user3246749
  • 641
  • 1
  • 5
  • 2
18

I am not sure how pythonic it is

from itertools import takewhile,izip

x = [[3,2,1], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]

def allsame(x):
    return len(set(x)) == 1

r = [i[0] for i in takewhile(allsame ,izip(*x))]
Luka Rahne
  • 10,336
  • 3
  • 34
  • 56
  • Sorry for the bother, but i'm having a similar problem. I have to find the longest common prefix on a (single) list, but the prefix isn't exactly going to be on **every** element on the list. Could you guide me a little in order to make it work, or should I post a new question? – oScarDiAnno Nov 01 '17 at 19:22
  • Also, I wanted to try this code, but i'm also using Python 3.6.2. I saw that now **`izip`** isn't in *itertools* anymore, because now there is a main function called **`zip`** that does so. I tried replacing it but now I get `TypeError: unhashable type: 'list'` **on** `return len(set(x)) == 1`. Do you know how to adapt it to work in Python 3? – oScarDiAnno Nov 01 '17 at 19:25
  • 1
    About python3 I think that you can use build-in zip function, but I do not know sure if your code is same as one posted in this answer, because I do not get such error. About your first question I do not understand it well and you can post new question on SO. – Luka Rahne Nov 01 '17 at 19:32
  • [`more_itertools`](https://more-itertools.readthedocs.io/en/stable/_modules/more_itertools/more.html#longest_common_prefix) now implements this – pylang Jul 26 '23 at 10:07
2

Given your example code, you seem to want a version of reduce(set.intersection, map(set, l)) that preserves the initial order of the first list.

This requires algorithmic improvements, not stylistic improvements; "pythonic" code alone won't do you any good here. Think about the situation that must hold for all values that occur in every list:

Given a list of lists, a value occurs in every list if and only if it occurs in nlist lists, where nlist is the total number of lists.

If we can guarantee that each value occurs only once in every list, then the above can be rephrased:

Given a list of lists of unique items, a value occurs in every list if and only if it occurs nlist times total.

We can use sets to guarantee that the items in our lists are unique, so we can combine this latter principle with a simple counting strategy:

>>> l = [[3,2,1], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]
>>> count = {}
>>> for i in itertools.chain.from_iterable(map(set, l)):
...     count[i] = count.get(i, 0) + 1
...     

Now all we have to do is filter the original list:

>>> [i for i in l[0] if count[i] == len(l)]
[3, 2, 1]
senderle
  • 145,869
  • 36
  • 209
  • 233
  • Well, it seems the goalposts have moved again. +1 for a good answer to what seemed to be the question. :) – Sven Marnach Jun 29 '12 at 14:29
  • @SvenMarnach, my thinking is that "the shared elements of the lists in 'l' have the same order and alway start from the 0th item" doesn't disallow input like `[[1, 2, 3], [1, 44, 2, 3]]`. So I don't _think_ this is a longest common prefix problem. But time will tell. – senderle Jun 29 '12 at 14:40
  • My crystal ball seems to have worked fine – the OP confirmed theyy are looking for the longest common prefix. – Sven Marnach Jun 29 '12 at 15:25
2

Here's an alternative way using itertools:

>>> import itertools
>>> L = [[3,2,1,4], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]
>>> common_prefix = []
>>> for i in itertools.izip(*L):
...    if i.count(i[0]) == len(i):
...       common_prefix.append(i[0])
...    else:
...       break
... 
>>> common_prefix
[3, 2, 1]

Not sure how "pythonic" it might be considered though.

jterrace
  • 64,866
  • 22
  • 157
  • 202
0

It is inefficient as it doesn't early-out as soon as a mismatch is found, but its tidy:

([i for i,(j,k) in enumerate(zip(a,b)) if j!=k] or [0])[0]
Will
  • 73,905
  • 40
  • 169
  • 246