For a single watchlist
You can use a selected isin
with np.where
by selecting which series you want to check for and by using a different watchlist for each series. For your dataframe df:
Episode Number Rating Series
0 4 Days Out 2.90 9.1 Breaking Bad (2008)
1 Buyout 5.60 9.0 Breaking Bad (2008)
2 Pilot 1.10 9.0 Breaking Bad (2008)
3 Dog Fight 1.12 9.0 Suits (2011)
4 We're Done 4.70 9.0 Suits (2011)
5 Privilege 5.60 8.9 Suits (2011)
6 Pilot 1.10 8.9 Suits (2011)
and watchlist
:
[1.1, 4.7, 2.9]
Assume that watchlist is just for Breaking Bad. Use np.where
to apply function only to rows that match Breaking Bad (2008)
and then use isin
to see if the value in the column Rating
is in your watchlist
:
df['Breaking Bad Watched'] = df['Number'][np.where(df['Series'] == "Breaking Bad (2008)")[0]].isin(watchlist)
Gives:
Episode Number Rating Series Breaking Bad Watched
0 4 Days Out 2.90 9.1 Breaking Bad (2008) True
1 Buyout 5.60 9.0 Breaking Bad (2008) False
2 Pilot 1.10 9.0 Breaking Bad (2008) True
3 Dog Fight 1.12 9.0 Suits (2011) NaN
4 We're Done 4.70 9.0 Suits (2011) NaN
5 Privilege 5.60 8.9 Suits (2011) NaN
6 Pilot 1.10 8.9 Suits (2011) NaN
Then use map
to convert from true
/ false
to yes
/ no
:
d = {True: 'Yes', False: 'No'}
df['Breaking Bad Watched'] = df['Breaking Bad Watched'].map(d)
Episode Number Rating Series Breaking Bad Watched
0 4 Days Out 2.90 9.1 Breaking Bad (2008) Yes
1 Buyout 5.60 9.0 Breaking Bad (2008) No
2 Pilot 1.10 9.0 Breaking Bad (2008) Yes
3 Dog Fight 1.12 9.0 Suits (2011) NaN
4 We're Done 4.70 9.0 Suits (2011) NaN
5 Privilege 5.60 8.9 Suits (2011) NaN
6 Pilot 1.10 8.9 Suits (2011) NaN
------------------------ For A dictionary of Watchlists --------------------
If you have a dictionary of watchlists where the series and episode number is specified separately:
watchlist = {'Breaking Bad (2008)': [1.1, 4.7, 2.9], 'Suits (2011)': [4.7, 5.6]}
You can interate over it as follows:
# Save name of new columns into new_col_list
new_col_list = []
for series, wlist in watchlist.iteritems():
# Save names of new columns into new_col_list
new_col_list.append('{} Watched'.format(series))
# Do calculation
print series, wlist
df['{} Watched'.format(series)] = df['Number'][np.where(df['Series'] == series)[0]].isin(wlist)
This gives you:
Episode Number Rating Series \
0 4 Days Out 2.90 9.1 Breaking Bad (2008)
1 Buyout 5.60 9.0 Breaking Bad (2008)
2 Pilot 1.10 9.0 Breaking Bad (2008)
3 Dog Fight 1.12 9.0 Suits (2011)
4 We're Done 4.70 9.0 Suits (2011)
5 Privilege 5.60 8.9 Suits (2011)
6 Pilot 1.10 8.9 Suits (2011)
Breaking Bad (2008) Watched Suits (2011) Watched
0 True NaN
1 False NaN
2 True NaN
3 NaN False
4 NaN True
5 NaN True
6 NaN False
new_col_list = ['Breaking Bad (2008) Watched', 'Suits (2011) Watched']
[1]If have only a few names then manually write them: Then use pd.concatenate
to concatenate the two watch columns, and drop those columns:
df['Watched'] = pd.concat([df['Breaking Bad (2008) Watched'].dropna(), df['Suits (2011) Watched'].dropna()])
# Remove old Columns
df.drop(['Breaking Bad (2008) Watched','Suits (2011) Watched'], axis=1, inplace=True)
[2] If have a list of columns names then add list of names to pd.concat
using a simple list comprehension, iterating over column names in new_col_list
:
df['Watched'] = pd.concat([df['{}'.format(i)].dropna() for i in new_col_list])
# Remove old Name Columns
df.drop(new_col_list, axis=1, inplace=True)
# Convert True False to Yes No
d = {True: 'Yes', False: 'No'}
df['Watched'] = df['Watched'].map(d)
# Final Output:
df:
Episode Number Rating Series Watched
0 4 Days Out 2.90 9.1 Breaking Bad (2008) Yes
1 Buyout 5.60 9.0 Breaking Bad (2008) No
2 Pilot 1.10 9.0 Breaking Bad (2008) Yes
3 Dog Fight 1.12 9.0 Suits (2011) No
4 We're Done 4.70 9.0 Suits (2011) Yes
5 Privilege 5.60 8.9 Suits (2011) Yes
6 Pilot 1.10 8.9 Suits (2011) No
Sources
Source for isin
:
[1] How to check if a value is in the list in selection from pandas data frame?
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.isin.html
Source for concat
:
[2] https://stackoverflow.com/a/10972557/2254228
Source for map
:
[3] Convert Pandas series containing string to boolean