2

I have a list of single and multi-word phrases:

terms = ['Electronic rock', 'Alternative rock', 'Indie pop']

I want to detect that terms[0] and terms[1] share the word rock. Is there a Pythonic way to do this, instead of using a ton of for-loops, temporary lists, and split(' ')?

Basically, I'm trying to detect a half-equality of phrases.

Artur Sapek

4 Answers

6

You can use a dictionary to remember which words appear in which terms:

from collections import defaultdict

terms = ['Electronic rock', 'Alternative rock', 'Indie pop']
d = defaultdict(list)
for term in terms:
    for word in term.split():
        d[word].append(term)

# Report only the words that occur in more than one term (Python 3 syntax).
for k, v in d.items():
    if len(v) > 1:
        print(k, v)

Output:

rock ['Electronic rock', 'Alternative rock']

See it working online: ideone
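
If only the shared words are needed, and not the terms that contain them, a collections.Counter could stand in for the defaultdict. This is a minimal sketch, not part of the answer above; the names counts and shared are illustrative, and each word is counted once per term so that a repeated word inside a single term is not reported.

```python
from collections import Counter

terms = ['Electronic rock', 'Alternative rock', 'Indie pop']

# Count each word once per term (set() drops repeats within a term),
# so a count above 1 means the word occurs in more than one term.
counts = Counter(word for term in terms for word in set(term.split()))
shared = [word for word, n in counts.items() if n > 1]
print(shared)  # ['rock']
```

The trade-off is that the Counter version loses the mapping from word back to terms, which the defaultdict version keeps.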

Mark Byers
  • haha I was halfway through typing almost the exact same thing... Maybe a 2.7+/3+ guy will show us a terser Counter example? – Kenan Banks Nov 05 '11 at 23:59
1

This is terribly inefficient for simple list elements like these, but for longer strings you could use itertools.combinations to generate every pair of strings and then difflib to compare the two strings in each pair. If you're just dealing with two- or three-word phrases, this solution is not for you.
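
A rough sketch of that idea, assuming difflib.SequenceMatcher as the comparison and an arbitrary cutoff of 0.5; the helper name similar_pairs and the threshold value are hypothetical and would need tuning for real data:

```python
from difflib import SequenceMatcher
from itertools import combinations

terms = ['Electronic rock', 'Alternative rock', 'Indie pop']

def similar_pairs(phrases, threshold=0.5):
    """Return (a, b, ratio) for each pair whose similarity ratio meets threshold."""
    result = []
    # combinations(phrases, 2) yields every unordered pair exactly once.
    for a, b in combinations(phrases, 2):
        ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if ratio >= threshold:
            result.append((a, b, ratio))
    return result

for a, b, ratio in similar_pairs(terms):
    print('%s | %s: %.2f' % (a, b, ratio))
```

Note that this compares character sequences, not whole words, so it measures general similarity and does not tell you which word is shared.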

jedwards
1

See "How to find list intersection?"; I think an answer can be derived from it. Your question doesn't say what result you want, so it would help to show the expected output.

Here is a snippet that may give you a hint. (Without split, I don't think it would be clear to read.)

a=terms[0].split()
b=terms[1].split()
list(set(a) & set(b))
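
To apply the same intersection to every pair of terms and not only the first two, one possible sketch is below. The names word_sets and shared are illustrative, and exact, case-sensitive word matching is assumed.

```python
from itertools import combinations

terms = ['Electronic rock', 'Alternative rock', 'Indie pop']

# Split each term once, then intersect the word sets of every pair.
word_sets = {term: set(term.split()) for term in terms}
shared = {}
for a, b in combinations(terms, 2):
    common = word_sets[a] & word_sets[b]
    if common:  # keep only the pairs that have at least one word in common
        shared[(a, b)] = common

print(shared)  # {('Electronic rock', 'Alternative rock'): {'rock'}}
```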
Daniel YC Lin
1

Some variations on the answer of @MarkByers:

>>> from collections import defaultdict
>>>
>>> terms = [
...     'Electronic rock', 'Alternative rock', 'Indie pop',
...     'baa baa black sheep',
...     'Blackpool rock', # definition of "equality"?
...     'Rock of ages',
...     ]
>>>
>>> def process1():
...     d = defaultdict(list)
...     for term in terms:
...         for word in term.split():
...             d[word].append(term)
...     for k, v in d.items():
...         if len(v) > 1:
...             print(k, v)
...
>>> def process2():
...     d = defaultdict(set)
...     for term in terms:
...         for word in term.split():
...             d[word.lower()].add(term)
...     for k, v in d.items():
...         if len(v) > 1:
...             print(k, sorted(v))
...
>>> process1()
rock ['Electronic rock', 'Alternative rock', 'Blackpool rock']
baa ['baa baa black sheep', 'baa baa black sheep']
>>> process2()
rock ['Alternative rock', 'Blackpool rock', 'Electronic rock', 'Rock of ages']
>>>
John Machin