My data looks like this:
movies = [
    "movie 1",
    "movie 2",
    "movie 3",
    "movie 4",
    "movie 5",
    "movie 6",
    "movie 7",
    "movie 8",
    "movie 9",
    "movie 10",
    "movie 11",
    "movie 12",
    "movie 13",
    "movie 14",
    "movie 15",
]
list_of_tuples = [
    ("movie 1", "movie 3"),
    ("movie 3", "movie 6"),
    ("movie 6", "movie 9"),
    ("movie 9", "movie 12"),
    ("movie 12", "movie 15"),
    ("movie 2", "movie 4"),
    ("movie 4", "movie 7"),
    ("movie 8", "movie 10"),
    ("movie 10", "movie 5"),
    ("movie 14", "movie 13"),
    ("movie 11", "movie 13"),
]
The output should look like this:
result_dict = {
    'movie 1': ['movie 1', 'movie 3', 'movie 6', 'movie 9', 'movie 12', 'movie 15'],
    'movie 2': ['movie 2', 'movie 4', 'movie 7'],
    'movie 3': ['movie 1', 'movie 3', 'movie 6', 'movie 9', 'movie 12', 'movie 15'],
    ....
}
Here the two elements of each tuple are similar to each other, and similarity carries over along a chain: 'movie 1' is similar to 'movie 3', 'movie 3' to 'movie 6', 'movie 6' to 'movie 9', 'movie 9' to 'movie 12', and 'movie 12' to 'movie 15', so all six belong to the same group.
I want a dictionary that maps each movie to the list of all movies in its group, including the movie itself.
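To make the grouping described above concrete, here is a minimal sketch that treats each tuple as an undirected link and walks the links breadth-first from one movie. It uses only a small subset of the pairs, and `neighbours` and `similar_group` are hypothetical names chosen for illustration.

```python
from collections import defaultdict, deque

# Small subset of the pairs from the question, for illustration only.
pairs = [
    ("movie 1", "movie 3"),
    ("movie 3", "movie 6"),
    ("movie 2", "movie 4"),
]

# Build an undirected adjacency map: each pair links both movies to each other.
neighbours = defaultdict(set)
for a, b in pairs:
    neighbours[a].add(b)
    neighbours[b].add(a)


def similar_group(start):
    """Return every movie reachable from `start` through chains of pairs."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        # .get avoids creating empty entries for movies that have no pairs.
        for other in neighbours.get(current, ()):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return sorted(seen)


print(similar_group("movie 1"))  # ['movie 1', 'movie 3', 'movie 6']
print(similar_group("movie 4"))  # ['movie 2', 'movie 4']
```

'movie 1' and 'movie 6' never appear in the same tuple, but they end up in the same group because both are linked to 'movie 3'.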
I tried the following, but it does not give the expected result:
result_dict = {movie: list() for movie in movies}
for tup in list_of_tuples:
    mov1, mov2 = tup
    result_dict[mov1].append(mov2)
    result_dict[mov2].append(mov1)
    for x in result_dict[mov2]:
        if x not in result_dict[mov1]:
            result_dict[mov1].append(x)
    for x in result_dict[mov1]:
        if x not in result_dict[mov2]:
            result_dict[mov2].append(x)
Please help me do this transformation with the lowest possible time complexity.
Thanks in advance.
Thanks to @James Lin for helping me get this result. I am posting the resulting code below.
relationships = []
relationship = set()
# Group consecutive tuples that share a movie with the current group.
for tuple_data in list_of_tuples:
    tuple_data = set(tuple_data)
    if tuple_data.intersection(relationship):
        relationship |= tuple_data
    else:
        # broken link: start a new group
        relationship = set()
        relationship |= tuple_data
        relationships.append(relationship)
# Convert each group from a set to a list.
for idx in range(len(relationships)):
    relationships[idx] = list(relationships[idx])
# Map every movie to the group that contains it.
result_dict = {movie: list() for movie in movies}
for key in result_dict.keys():
    for item in relationships:
        if key in item:
            result_dict[key] = item
The output is:
{'movie 1': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3'],
 'movie 2': ['movie 7', 'movie 4', 'movie 2'],
 'movie 3': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3'],
 'movie 4': ['movie 7', 'movie 4', 'movie 2'],
 'movie 5': ['movie 10', 'movie 5', 'movie 8'],
 'movie 6': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3'],
 'movie 7': ['movie 7', 'movie 4', 'movie 2'],
 'movie 8': ['movie 10', 'movie 5', 'movie 8'],
 'movie 9': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3'],
 'movie 10': ['movie 10', 'movie 5', 'movie 8'],
 'movie 11': ['movie 14', 'movie 11', 'movie 13'],
 'movie 12': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3'],
 'movie 13': ['movie 14', 'movie 11', 'movie 13'],
 'movie 14': ['movie 14', 'movie 11', 'movie 13'],
 'movie 15': ['movie 1', 'movie 15', 'movie 12', 'movie 9', 'movie 6', 'movie 3']}
Please help me understand the time complexity of this whole process. It would also be great to get it optimized.
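A minimal sketch of one way this could be optimized, assuming the goal is simply to group movies that are connected through any chain of pairs. It uses a disjoint-set (union-find) structure, which is a different technique from the loop above and not the code from @James Lin; `group_similar`, `find`, and the sample data are hypothetical names chosen for illustration. For comparison, the final loop above tests every movie against every group with a list membership check, which looks roughly quadratic in the number of movies in the worst case, and the first loop only merges a pair into the most recent group, so it depends on linked pairs being adjacent in list_of_tuples.

```python
from collections import defaultdict


def group_similar(movies, pairs):
    """Map every movie to the list of movies in its group (illustrative sketch)."""
    parent = {movie: movie for movie in movies}
    size = {movie: 1 for movie in movies}

    def find(movie):
        # Walk up to the root, halving the path along the way.
        while parent[movie] != movie:
            parent[movie] = parent[parent[movie]]
            movie = parent[movie]
        return movie

    # Merge the groups of each pair; the order of the pairs does not matter.
    for a, b in pairs:
        for movie in (a, b):
            # Assumption: movies that appear only in pairs are still included.
            parent.setdefault(movie, movie)
            size.setdefault(movie, 1)
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Union by size: attach the smaller tree under the larger one.
            if size[root_a] < size[root_b]:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a
            size[root_a] += size[root_b]

    # Collect the members of each root, in the order the movies were given.
    groups = defaultdict(list)
    for movie in parent:
        groups[find(movie)].append(movie)
    # Movies in the same group share one list object.
    return {movie: groups[find(movie)] for movie in parent}


movies = ["movie 1", "movie 2", "movie 3", "movie 4", "movie 5"]
# Linked pairs are deliberately not adjacent here.
pairs = [("movie 1", "movie 3"), ("movie 2", "movie 4"), ("movie 3", "movie 5")]
result = group_similar(movies, pairs)
print(result["movie 5"])  # ['movie 1', 'movie 3', 'movie 5']
print(result["movie 4"])  # ['movie 2', 'movie 4']
```

With union by size and path halving, the cost should be close to linear in the number of movies plus the number of pairs. Because movies in one group share the same list object, copy a list before mutating it.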
Thanks