I have a large table imported from a .csv file with more than 50,000 rows.
I am working with pandas and NumPy. The table is a film database, and I would like to add a new conditional column.
One of the columns is genres, a single string containing several different genres. I want to create a new column called "Drama_yes_or_no" based on a condition on that column: if the string contains "Drama", write 1.
I am trying the code below, but I get this error: ("argument of type 'float' is not iterable", u'occurred at index 424')
def dram_genres(passenger):
    original_title, genres = passenger
    #if genres.find('Drama') != -1:
    if "Drama" in genres:
        return 'Drama'
    else:
        return 'Not Drama'
# add a new column to the DataFrame indicating whether the film is a drama
IMDb_data['Drama_or_not'] = IMDb_data[['original_title', 'genres']].apply(dram_genres, axis=1)
IMDb_data[['original_title', 'genres', 'budget','vote_average','Drama_or_not']].head(7)
Could you help me, please?
Thanks in advance.
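
A note on the likely cause, assuming some rows have an empty genres cell: pandas reads empty cells as NaN, which is a float, so the test "Drama" in genres fails on those rows (the row at index 424 would be one of them). Below is a minimal sketch of one way around this, not a definitive fix. The small DataFrame named films is made up to stand in for IMDb_data, and the "|" separator between genres is only an assumption.

```python
import pandas as pd

# Hypothetical stand-in for IMDb_data; the real table comes from the .csv file.
# The "|" separator is an assumption and does not affect the substring check.
films = pd.DataFrame({
    "original_title": ["Film A", "Film B", "Film C"],
    "genres": ["Drama|Romance", float("nan"), "Action|Comedy"],
})

# A missing genres value is a float NaN, so `"Drama" in genres` raises TypeError.
# str.contains with na=False treats missing values as "no match" instead.
is_drama = films["genres"].str.contains("Drama", na=False, regex=False)

# Text labels, as in the original function.
films["Drama_or_not"] = is_drama.map({True: "Drama", False: "Not Drama"})

# 1/0 flag, as described for "Drama_yes_or_no".
films["Drama_yes_or_no"] = is_drama.astype(int)
```

This works on the whole column at once, so it avoids apply with axis=1, which tends to be slow on tables with tens of thousands of rows. If the row-wise function is preferred, checking isinstance(genres, str) before the "in" test should avoid the same error.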