I'm trying to shorten my code, using a lambda if possible. bp is the name of my DataFrame.
My data looks like this:
user  label
1     b
2     b
3     c
I expect to have
user  label  Y
1     b      1
2     b      1
3     c      0
Here is my code:
counts = bp['Label'].value_counts()

def score_to_numeric(x):
    if counts['b'] > counts['s']:
        if x == 'b':
            return 1
        else:
            return 0
    else:
        if x == 'b':
            return 0
        else:
            return 1

bp['Y'] = bp['Label'].apply(score_to_numeric)  # apply above function to convert data
The function converts the categorical values 'b' and 's' in the column named 'Label' into numeric values, 0 or 1. The line counts = bp['Label'].value_counts() counts how many times 'b' and 's' each occur in 'Label'. Then, in score_to_numeric, if 'b' occurs more often than 's', 'b' gets the value 1 and 's' gets 0 in a new column called 'Y'; otherwise the assignment is reversed.
I would like to shorten my code to 3-4 lines at most. I think a lambda might do this, but I'm not familiar enough with lambdas.
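For reference, here is a rough sketch of the shape I have in mind, not a tested solution for my real data. The sample DataFrame and the name b_wins are made up for illustration, and it assumes both 'b' and 's' actually occur in 'Label' (otherwise counts['s'] raises a KeyError, just as in my original code).

```python
import pandas as pd

# Hypothetical sample data standing in for the real bp.
bp = pd.DataFrame({"user": [1, 2, 3, 4, 5],
                   "Label": ["b", "b", "s", "b", "s"]})

counts = bp["Label"].value_counts()
# True when 'b' is strictly more frequent than 's' (b_wins is an illustrative name).
b_wins = bool(counts["b"] > counts["s"])
# If b_wins, 'b' maps to 1 and everything else to 0; otherwise the mapping is flipped.
bp["Y"] = bp["Label"].apply(lambda x: int((x == "b") == b_wins))

# Same result without apply, comparing the whole column at once.
y_vectorized = ((bp["Label"] == "b") == b_wins).astype(int)
```

The lambda compares "is this value 'b'?" with "does 'b' win?", which should reproduce all four branches of score_to_numeric in a single expression.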