import pandas as pd
dataset = pd.read_csv('./file.csv')
dataset.head()

This gives:

    age sex     smoker  married  region    price
0   39  female  yes     no       us        250000
1   28  male    no      no       us        400000
2   23  male    no      yes      europe    389000
3   17  male    no      no       asia      230000
4   43  male    no      yes      asia      243800

I want to replace the yes/no values in the smoker column with 1/0, but leave the yes/no values in the married column unchanged. I would like to use the pandas replace function.

I tried the following, but it changes the yes/no values in every column (both smoker and married):

dataset = dataset.replace(to_replace='yes', value='1')
dataset = dataset.replace(to_replace='no', value='0')

    age sex     smoker  married region     price
0   39  female  1       0       us         250000
1   28  male    0       0       us         400000
2   23  male    0       1       europe     389000
3   17  male    0       0       asia       230000
4   43  male    0       1       asia       243800

How can I ensure that only the yes/no values from the smoker column get changed, preferably using Pandas' replace function?

wiwa1978

1 Answer


Did you try calling replace on just the smoker column and assigning the result back?

dataset['smoker'] = dataset['smoker'].replace({'yes': 1, 'no': 0})
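
If you would rather stay with DataFrame.replace, a nested dict should also restrict the replacement to one column. Here is a minimal sketch, assuming the data looks like the rows in the question; the inline DataFrame is a stand-in for ./file.csv and the name `converted` is just for illustration:

```python
import pandas as pd

# Stand-in for ./file.csv, built from the five rows shown in the question
dataset = pd.DataFrame({
    'age': [39, 28, 23, 17, 43],
    'sex': ['female', 'male', 'male', 'male', 'male'],
    'smoker': ['yes', 'no', 'no', 'no', 'no'],
    'married': ['no', 'no', 'yes', 'no', 'yes'],
    'region': ['us', 'us', 'europe', 'asia', 'asia'],
    'price': [250000, 400000, 389000, 230000, 243800],
})

# Nested dict: the outer key names the column, the inner dict maps
# old value -> new value. Columns not named here are left alone.
converted = dataset.replace({'smoker': {'yes': 1, 'no': 0}})

print(converted[['smoker', 'married']])
```

Both forms leave married untouched. The nested dict is handy when several columns each need their own mapping, since you can list them all in one call.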
Bushmaster