I have this dataframe:
In [6]: import pandas as pd
In [7]: import numpy as np
In [8]: df = pd.DataFrame(data = np.nan,
...: columns = ['A', 'B', 'C', 'D', 'E'],
...: index = ['A', 'B', 'C', 'D', 'E'])
...:
...: df['list_of_codes'] = [['A' , 'B'],
...: ['A', 'B', 'E'],
...: ['C', 'D'],
...: ['B', 'D'],
...: ['E']]
...:
...: df
Out[8]:
A B C D E list_of_codes
A NaN NaN NaN NaN NaN [A, B]
B NaN NaN NaN NaN NaN [A, B, E]
C NaN NaN NaN NaN NaN [C, D]
D NaN NaN NaN NaN NaN [B, D]
E NaN NaN NaN NaN NaN [E]
Now I want to insert a 1 wherever both the index label and the column name appear in that row's list in df['list_of_codes'], and a 0 everywhere else. The result would look like this:
A B C D E list_of_codes
A 1 1 0 0 0 [A, B]
B 1 1 0 0 1 [A, B, E]
C 0 0 1 1 0 [C, D]
D 0 1 0 1 0 [B, D]
E 0 0 0 0 1 [E]
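To pin down the rule itself, here is a minimal sketch that builds the 0/1 block with a plain nested comprehension. It is not the apply/lambda approach the question asks about, and the names `labels` and `flags` are introduced here purely for illustration.

```python
import numpy as np
import pandas as pd

labels = ['A', 'B', 'C', 'D', 'E']  # hypothetical helper name
df = pd.DataFrame(data=np.nan, columns=labels, index=labels)
df['list_of_codes'] = [['A', 'B'],
                       ['A', 'B', 'E'],
                       ['C', 'D'],
                       ['B', 'D'],
                       ['E']]

# 1 when both the row label and the column name are in that row's list, else 0.
flags = pd.DataFrame(
    [[int(idx in codes and col in codes) for col in labels]
     for idx, codes in df['list_of_codes'].items()],
    index=df.index,
    columns=labels,
)
print(flags)
```

Every cell is decided by two ordinary `in` checks on a Python list, so no Series is ever used in a boolean context.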
I have tried something like this:
df.apply(lambda x: 1 if x[:-1] in (x[-1]) else 0, axis=1, result_type='broadcast')
but get the error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
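The same error can be reproduced outside of apply, which may help narrow it down. This is a small sketch, assuming that `x[:-1]` is a Series and `x[-1]` is a plain list; `row`, `codes` and `message` are made-up names.

```python
import pandas as pd

row = pd.Series([float('nan'), float('nan')], index=['A', 'B'])  # stands in for x[:-1]
codes = ['A', 'B']                                               # stands in for x[-1]

message = None
try:
    # `in` compares the whole Series with each list element, which yields a
    # boolean Series; Python then asks for its truth value and pandas refuses.
    row in codes
except ValueError as exc:
    message = str(exc)

print(message)
```

So the membership test is applied to the row as a whole rather than to each column name in turn.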
I don't fully understand this error, so I then tried:
df.apply(lambda x: 1 if x[:-1].any() in (x[-1]) else 0, axis=1, result_type='broadcast')
This runs but does not give me the desired result. Instead it returns:
A B C D E list_of_codes
A 0 0 0 0 0 0
B 0 0 0 0 0 0
C 0 0 0 0 0 0
D 0 0 0 0 0 0
E 0 0 0 0 0 0
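One possible reading of the all-zero output, sketched under the assumption that the row values are all NaN as in the frame above: `.any()` collapses the whole row to a single value, so the membership test never looks at individual column names. The names `row`, `codes`, `collapsed` and `found` are illustrative only.

```python
import numpy as np
import pandas as pd

row = pd.Series([np.nan] * 5, index=['A', 'B', 'C', 'D', 'E'])  # stands in for x[:-1]
codes = ['A', 'B']                                              # stands in for x[-1]

# .any() skips NaN by default, so an all-NaN row reduces to a single False.
collapsed = bool(row.any())

# The lambda then checks whether that one value is in the list of codes.
found = collapsed in codes
print(collapsed, found)
```

With one scalar per row, result_type='broadcast' has only that single value to spread across every column, including list_of_codes.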
Can someone help me understand what I need in my df.apply() call and lambda function in order to broadcast the 1s the way I am trying to? Thanks in advance!