My data looks like the table below. I am trying to add a column, output, computed from these values.
    a_id b_received c_consumed
0    sam       soap        oil
1    sam        oil        NaN
2    sam      brush       soap
3  harry        oil      shoes
4  harry      shoes        oil
5  alice       beer       eggs
6  alice      brush      brush
7  alice       eggs        NaN
The code for producing the dataset is
import pandas as pd

# Note: split() yields the string 'NaN' here, not a real missing value.
df = pd.DataFrame({'a_id': 'sam sam sam harry harry alice alice alice'.split(),
                   'b_received': 'soap oil brush oil shoes beer brush eggs'.split(),
                   'c_consumed': 'oil NaN soap shoes oil eggs brush NaN'.split()})
I want a new column called output, which should look like this:
    a_id b_received c_consumed output
0    sam       soap        oil      1
1    sam        oil        NaN      1
2    sam      brush       soap      0
3  harry        oil      shoes      1
4  harry      shoes        oil      1
5  alice       beer       eggs      0
6  alice      brush      brush      1
7  alice       eggs        NaN      1
So the logic is: sam received soap, oil and brush, so look through the values in the c_consumed column for sam's rows only. Soap and oil were consumed by sam, so the output for those rows is 1, but since sam didn't consume brush, the output for that row is 0.
Similarly, harry received oil and shoes, so look for oil and shoes among harry's values in the consumed column. Both were consumed, so the output is 1 for both rows.
To make it clearer: the output value belongs to the product in the first column (received), and it is 1 only if that product appears among the same person's values in the second column (consumed).
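To spell the rule out for a single person, here is a small hand-worked sketch for sam only. The names sam_received, sam_consumed and sam_output are made up for illustration, and the values are copied by hand from the table rather than read from df.

```python
# Hand-copied values for sam's three rows (illustrative names, not from df).
sam_received = ['soap', 'oil', 'brush']
sam_consumed = {'oil', 'NaN', 'soap'}  # 'NaN' is a plain string in my data

# 1 if sam consumed the product he received, otherwise 0.
sam_output = [int(item in sam_consumed) for item in sam_received]
print(sam_output)  # [1, 1, 0]
```

The brush row is 0 because brush is not in sam's own consumed values, even though it shows up elsewhere in the column.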
I tried using this code
a = []
for i in range(len(df.b_received)):
    if any(df.c_consumed == df.b_received[i]):
        a.append(1)
    else:
        a.append(0)
df['output'] = a
This gives me the output
    a_id b_received c_consumed output
0    sam       soap        oil      1
1    sam        oil        NaN      1
2    sam      brush       soap      1
3  harry        oil      shoes      1
4  harry      shoes        oil      1
5  alice       beer       eggs      0
6  alice      brush      brush      1
7  alice       eggs        NaN      1
The problem is row 2: sam didn't consume brush, so the output should be 0, but it comes out as 1 because brush was consumed by a different person (alice). I need to make sure that doesn't happen. The output needs to be specific to each person's consumption.
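One direction that might work is to restrict the lookup to each person's own rows. The following is only a rough sketch, not a tested solution for larger data: it assumes the string 'NaN' never appears as a received product, and consumed_by_person is a name I made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'a_id': 'sam sam sam harry harry alice alice alice'.split(),
    'b_received': 'soap oil brush oil shoes beer brush eggs'.split(),
    'c_consumed': 'oil NaN soap shoes oil eggs brush NaN'.split(),
})

# One set of consumed products per person, indexed by a_id.
consumed_by_person = df.groupby('a_id')['c_consumed'].agg(set)

# For each row, check the received product against that person's set only.
df['output'] = [
    int(item in consumed_by_person[person])
    for person, item in zip(df['a_id'], df['b_received'])
]
print(df)
```

With this, alice's brush no longer counts for sam, because each row is only compared against the set built from rows with the same a_id.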
I know this is confusing, so if I have not made myself clear, please ask in the comments and I will answer.