I am encoding categorical variables in my dataframe. I found a nice Pythonic way to do this with lambda expressions. For instance, the following line of code replaces the gender categories "male" and "female" (encoded as strings) with the values 1 and 0, respectively.
train_frame['Sex'] = train_frame['Sex'].apply(lambda x: 1 if x == 'male' else 0)
Now my question is: can I also do this for more than two categories (that is, with more than one if in the expression)?
I am trying to do this for the port where passengers embarked on a ship, which I want to represent with an integer. (Some background info: S = Southampton, C = Cherbourg, Q = Queenstown.)
I tried to do something like this, but it does not work:
#Southampton = 0, Cherbourg = 1, Queenstown = 2
train_frame['Embarked'] = train_frame['Embarked'].apply(lambda x: 0 if x =='S', 1 if x=='C' else 2 )
Can somebody explain to me whether it is possible to use lambda expressions with multiple if statements? And, slightly off-topic: is there a more Pythonic way to encode categoricals in a dataframe?
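For reference, here is a minimal sketch of what both approaches could look like. It assumes a small stand-in for train_frame (the four sample rows are made up for illustration). A conditional expression can be chained as long as every if has its own else, and pandas Series.map accepts a dictionary, which may read more clearly once there are more than two categories.

```python
import pandas as pd

# Hypothetical stand-in for train_frame, holding only the column of interest.
train_frame = pd.DataFrame({"Embarked": ["S", "C", "Q", "S"]})

# Chained conditional expression: each "if" must be paired with an "else".
# Southampton = 0, Cherbourg = 1, anything else (here Queenstown) = 2
encoded_lambda = train_frame["Embarked"].apply(
    lambda x: 0 if x == "S" else (1 if x == "C" else 2)
)

# Dictionary lookup with Series.map; values missing from the dict become NaN.
embarked_codes = {"S": 0, "C": 1, "Q": 2}
encoded_map = train_frame["Embarked"].map(embarked_codes)

print(encoded_lambda.tolist())
print(encoded_map.tolist())
```

Note that the two versions treat unexpected values differently: the lambda sends everything that is not 'S' or 'C' to 2, while map leaves values without a dictionary entry as NaN.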