Python List of tuples to Dictionary having value as list of all similar item in list of tuples

Question

My data is like this:

movies = [
    "movie 1",
    "movie 2",
    "movie 3",
    "movie 4",
    "movie 5",
    "movie 6",
    "movie 7",
    "movie 8",
    "movie 9",
    "movie 10",
    "movie 11",
    "movie 12",
    "movie 13",
    "movie 14",
    "movie 15",
]

list_of_tuples = [
    ("movie 1", "movie 3"),
    ("movie 3", "movie 6"),
    ("movie 6", "movie 9"),
    ("movie 9", "movie 12"),
    ("movie 12", "movie 15"),
    ("movie 2", "movie 4"),
    ("movie 4", "movie 7"),
    ("movie 8", "movie 10"),
    ("movie 10", "movie 5"),
    ("movie 14", "movie 13"),
    ("movie 11", "movie 13"),
]

Output should be like this:

result_dict = {'movie 1' : ['movie 1' , 'movie 3', 'movie 6', 'movie 9', 'movie 12', 'movie 15'],
               'movie 2' : ['movie 2', 'movie 4', 'movie 7'],
               'movie 3' : ['movie 1' , 'movie 3', 'movie 6', 'movie 9', 'movie 12', 'movie 15'],
                ....}

Here elements in tuples are same so 'movie 1' is similar to 'movie 3' and 'movie 3' is similar to 'movie 6' and 'movie 6' is to 'movie 9' and 'movie 9' to 'movie 12' and 'movie 12' to ' movie 15'.

I want to get a dictionary which has all the similar items as values.

I have tried like this, but I am not getting result:

result_dict = {movie : list() for movie in movies}

for tup in list_of_tuples:
  mov1, mov2 = tup

  result_dict[mov1].append(mov2)
  result_dict[mov2].append(mov1)

  for x in result_dict[mov2]:
    if x not in result_dict[mov1]:
    result_dict[mov1].append(x)
  
  for x in result_dict[mov1]:
    if x not in result_dict[mov2]:
      result_dict[mov2].append(x )

Please help me transform this with minimum time complexity.

Thanks in advance.

Thanks to @James Lin for helping to get this result, I am posting below how the code looks.


relationships = []
relationship = set()
for tuple_data in list_of_tuples:
    tuple_data = set(tuple_data)
    if tuple_data.intersection(relationship):
       relationship |= tuple_data
    else:
       # broken link
       relationship = set()
       relationship |= tuple_data
       relationships.append(relationship)

for idx in range(len(relationships)):
  relationships[idx] = list(relationships[idx])



result_dict = {movie : list() for movie in movies}

for key in result_dict.keys():
  for item in relationships:
    if key in item:
      result_dict[key] = item

and Output is:

{'movie 1': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3'], 'movie 2': ['movie 7', 'movie 4', 'movie 2'], 'movie 3': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3'], 'movie 4': ['movie 7', 'movie 4', 'movie 2'], 'movie 5': ['movie 10', 'movie 5', 'movie 8'], 'movie 6': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3'], 'movie 7': ['movie 7', 'movie 4', 'movie 2'], 'movie 8': ['movie 10', 'movie 5', 'movie 8'], 'movie 9': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3'], 'movie 10': ['movie 10', 'movie 5', 'movie 8'], 'movie 11': ['movie 14', 'movie 11', 'movie 13'], 'movie 12': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3'], 'movie 13': ['movie 14', 'movie 11', 'movie 13'], 'movie 14': ['movie 14', 'movie 11', 'movie 13'], 'movie 15': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3']}

Please help me in understanding the complexity of this whole process. It would be also great to get it optimized.

Thanks

Stackoverflow isn't a code-writing service. What have you tried so far? — Pynchia, Aug 12 '20 at 03:52
I’m voting to close this question because it looks like OP has made no attempt in solving the problem. — sushanth, Aug 12 '20 at 03:58
@Pynchia I have posted my attempt in the question. Please help me to reopen this question. — neel, Aug 12 '20 at 06:17
@Sushanth I have posted my attempt in the question. Please help me to reopen this question. — neel, Aug 12 '20 at 06:17

James Lin · Answer 1 · 2020-08-12T20:36:34.217

1

Assuming your relationships are ordered top down, your description is not exactly clear, I am going to give a try to give you some hint:

You need to loop through the list_of_tuples to build the relationships between each element

relationships = []
relationship = set()
for tuple_data in list_of_tuples:
    tuple_data = set(tuple_data)
    if tuple_data.intersection(relationship):
       relationship |= tuple_data
    else:
       # broken link
       relationship = tuple_data
       relationships.append(relationship)

print(relationships)

This will print out:

[{'movie 15', 'movie 12', 'movie 6', 'movie 9', 'movie 3', 'movie 1'}, {'movie 2', 'movie 7', 'movie 4'}, {'movie 8', 'movie 5', 'movie 10'}, {'movie 11', 'movie 14', 'movie 13'}]

From this list you will be able generate your desired dictionary.

UPDATE: use set() to solve movie 11 relate to movie 13

UPDATE: you can first try to profile your code, eg. _ldap.get_option(_ldap.OPT_API_INFO) is slow after upgrading to MacOS Mojave

edited Aug 12 '20 at 20:36

answered Aug 12 '20 at 03:53

James Lin

25,028
36
133
233

This is a great help for me. Thanks @James Lin – neel Aug 12 '20 at 06:25
Movie 11 and 14 are related since 13 is common between them – neel Aug 12 '20 at 07:18
1

I have updated my answer to solve movie 11 and 13 problem – James Lin Aug 12 '20 at 09:29
I worked for me but and I got my result_dict but I think I am messing with time complexity here I am updating my question with final result. It would be great if you could direct me on time complexity. – neel Aug 12 '20 at 10:56
updated my answer, see if profiling your code helps to understand which part is time consuming, also noticed I updated my code in the else block to remove unnecessary line. – James Lin Aug 12 '20 at 20:39
how to handle if the list_of_tuples is not sorted eg: ```[("movie 1", "movie 2"), ("movie 3", "movie 4"), ("movie 4", "movie 1") ]``` – neel Aug 13 '20 at 09:56
You will have to add in extra steps to reconnect the broken relationships. Not that hard. – James Lin Aug 13 '20 at 10:41

Joe Ferndz · Answer 2 · 2020-08-12T05:41:01.617

You can use defaultdict to get this done.

from collections import defaultdict

list_of_tuples = [
    ("movie 1", "movie 3"),
    ("movie 3", "movie 6"),
    ("movie 6", "movie 9"),
    ("movie 9", "movie 12"),
    ("movie 12", "movie 15"),
    ("movie 2", "movie 4"),
    ("movie 4", "movie 7"),
    ("movie 8", "movie 10"),
    ("movie 10", "movie 5"),
    ("movie 14", "movie 13"),
    ("movie 11", "movie 13"),
]

result_dict = defaultdict(list)

for k ,v in list_of_tuples:

    #for value in the tuple, find out if this is already part of
    #the existing dictionary. If yes, get the key so you can
    #append to the key else start a new key item

    a = ''.join([x for x, y in result_dict.items() for z in y if z == k])

    #if found, above list comprehension will result in 1 element

    if a = '' : #if not found, then create a new list for key
        result_dict[k].append(v)

    else: # value is part of a key list, so append value to key list 
        result_dict[a].append(v)

result_dict = dict(result_dict)
print (result_dict)

The output of the above code is:

{'movie 1': ['movie 3', 'movie 6', 'movie 9', 'movie 12', 'movie 15'], 'movie 2': ['movie 4', 'movie 7'], 'movie 8': ['movie 10', 'movie 5'], 'movie 14': ['movie 13'], 'movie 11': ['movie 13']}

Is this what you are looking for

You'r output doesnot match with OP's expected output. – sushanth Aug 12 '20 at 03:57 — sushanth, Aug 12 '20 at 03:57

score 0 · Answer 3 · answered Aug 12 '20 at 04:16

You can always call dict(list_of_tuples) to get a corresponding dictionary for these tuples.

I don't know if this is THE most efficient timewise, but I get what you're trying to get in ~ O(n) with the following code:

from collections import defaultdict

movie_dict = dict(list_of_tuples)

index = defaultdict(list)
for key, value in movie_dict.items():
    index[key] += [value]
    index[value] += [key]

The output is:

defaultdict(list,
            {'movie 1': ['movie 3'],
             'movie 3': ['movie 1', 'movie 6'],
             'movie 6': ['movie 3', 'movie 9'],
             'movie 9': ['movie 6', 'movie 12'],
             'movie 12': ['movie 9', 'movie 15'],
             'movie 15': ['movie 12'],
             'movie 2': ['movie 4'],
             'movie 4': ['movie 2', 'movie 7'],
             'movie 7': ['movie 4'],
             'movie 8': ['movie 10'],
             'movie 10': ['movie 8', 'movie 5'],
             'movie 5': ['movie 10'],
             'movie 14': ['movie 13'],
             'movie 13': ['movie 14', 'movie 11'],
             'movie 11': ['movie 13']})

ETA: This gives you an index by movie of what it's similar to. If you want the movie equivalence classes, so to speak, you'll need to do some set operations. I'll add more info shortly.

Python List of tuples to Dictionary having value as list of all similar item in list of tuples

3 Answers3