I have a .csv file with a bunch of movie titles. One of the columns is the title's listing on Netflix, and each entry is a list, so a single title can have multiple listings. It looks like the spreadsheet below:

[Image: screenshot of the spreadsheet]

I have managed to extract the individual columns that I need, and so now my main trouble is how to simplify the list of genres so that repeats aren't added.

I have the following code:

def getUniqueGenres(genrelist):
    uniqueGenres = []
    for titles in genrelist:
        for listing in titles:
            if listing not in uniqueGenres:
                uniqueGenres.append(listing)
    return uniqueGenres

I want the code to be able to go through each listing per title and see if it is a unique genre. If it is, it will store it, otherwise it just ignores it.

CLARIFICATION: The listed_in data column is a list where each element is itself a list, so the unique values have to take that nesting into account. I tried doing something like this:

uniqueGenres = []
for titles in netflixgenre:
    for listing in titles:
        if not listing in uniqueGenres:
            uniqueGenres.append(listing)

But that did not give me the desired output either.

2 Answers

1

This is actually a duplicate question from here and here. Nonetheless, the general consensus is the same as Shaid's answer:

Just casting to a set and back to a list is the easiest way to do it. This works because a set stores each value only once, so adding a duplicate has no effect.

# sorted() returns a new list; list.sort() sorts in place and returns None
unique_list = sorted(set(my_list))
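
For the nested case in the question, the inner lists have to be flattened first, because lists are not hashable and cannot be put into a set directly. A minimal sketch, assuming each element of the input is itself a list of genre strings (the function name `unique_genres` and the sample data are made up for illustration):

```python
def unique_genres(genre_lists):
    """Return the sorted unique genres found across all inner lists."""
    # The set comprehension flattens the nesting and drops duplicates in one pass.
    return sorted({genre for genres in genre_lists for genre in genres})


# Hypothetical sample data, not taken from the actual spreadsheet
netflix_genre = [["Dramas", "Comedies"], ["Comedies", "Documentaries"], ["Dramas"]]
print(unique_genres(netflix_genre))  # ['Comedies', 'Documentaries', 'Dramas']
```

The result comes back sorted alphabetically rather than in the order the genres first appear.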
0

You can do it simply by using a set. Here is an example:

mylist = ['a', 'b', 'a', 'a', 'c', 'd', 'e']
unique_list = list(set(mylist))
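
One caveat: a set does not keep the original order of the items. If the order matters, something like the following should work. This is only a sketch, and it assumes Python 3.7 or later, where dicts are guaranteed to keep insertion order (the name `unique_in_order` is just for illustration):

```python
mylist = ['a', 'b', 'a', 'a', 'c', 'd', 'e']

# dict keys are unique and keep insertion order, so the first
# occurrence of each item decides its position in the result.
unique_in_order = list(dict.fromkeys(mylist))
print(unique_in_order)  # ['a', 'b', 'c', 'd', 'e']
```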

For a list inside a list:

import itertools
mylist = [['a', 'b', 'a', 'a', 'c', 'd', 'e'], ['c','a','d']]
# combine multiple lists into a single list using itertools
final_list = list(itertools.chain.from_iterable(mylist))
unique_list = list(set(final_list))
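
If the nested loop still gives odd results (for example single characters instead of genres), it may be that each cell read from the .csv is actually a comma-separated string rather than a real list. That is only a guess, since the raw data is not shown. A sketch under that assumption, where the function name and the sample cells are hypothetical:

```python
def unique_genres_from_cells(cells):
    """Collect unique genres from cells holding comma-separated strings."""
    unique = set()
    for cell in cells:
        # Split each cell on commas and trim surrounding whitespace.
        for genre in cell.split(","):
            genre = genre.strip()
            if genre:  # skip empty pieces, e.g. from blank cells
                unique.add(genre)
    return sorted(unique)


# Hypothetical sample cells, not taken from the actual spreadsheet
cells = ["Dramas, International Movies", "Comedies, Dramas", ""]
print(unique_genres_from_cells(cells))
# ['Comedies', 'Dramas', 'International Movies']
```

Iterating over a string yields its characters one at a time, which is why splitting first matters in this case.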