7

I am new to Python and stuck on this. My CSV file contains:

Sr,Gender
1,Male
2,Male
3,Female

Now I want to convert the Gender values to binary so that the file looks something like:

Sr,Gender
1,1
2,1
3,0

So, I imported the CSV file as data and ran this code:

data["Gender_new"]=1
data["Gender_new"][data["Gender"]=="Male"]=0
data["Gender_new"]=data["Gender_new"].astype(float)

But I got the error ValueError: could not convert string 'Male' to float:

What am I doing wrong and how can I make this work?

Thanks

Burhan Khalid
Akshat Bhardwaj
  • `data['Gender'] = (data['Gender'] == 'Male').astype(int)` – DJK Jun 25 '18 at 04:23
  • As I'm not able to post an answer, the code below would be a solution to your question: `from sklearn.preprocessing import LabelEncoder; le = LabelEncoder(); df['Gender_new'] = le.fit_transform(df['Gender'])` – Ikbel Nov 11 '19 at 10:16

2 Answers

13

Try this:

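import pandas as pd

# Pass the path directly; read_csv opens and closes the file itself.
data = pd.read_csv("your.csv")

# The keys must match the values in the CSV exactly, including capitalisation.
gender = {'Male': 1, 'Female': 0}

data['Gender'] = data['Gender'].map(gender)
print(data)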

Or

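data['Gender'] = data['Gender'].astype(object)  # allow ints alongside strings
data.loc[data['Gender'] == 'Male', 'Gender'] = 1
data.loc[data['Gender'] == 'Female', 'Gender'] = 0
data['Gender'] = data['Gender'].astype(int)  # object -> integer column
print(data)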
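
The dictionary lookup depends on the keys matching the file's capitalisation exactly. If that is not guaranteed, here is a minimal sketch that normalises the text before mapping. The in-memory CSV and the names `csv_text`, `mapping` and `normalised` are made up for illustration; they are not from the question.

```python
import io
import pandas as pd

# Hypothetical in-memory CSV with inconsistent capitalisation and a stray space.
csv_text = "Sr,Gender\n1,Male\n2,male \n3,Female\n"
data = pd.read_csv(io.StringIO(csv_text))

mapping = {"male": 1, "female": 0}

# Strip surrounding whitespace and lowercase before looking values up.
normalised = data["Gender"].str.strip().str.lower()
data["Gender"] = normalised.map(mapping)

# map() yields NaN for values missing from the mapping, so check before casting.
if data["Gender"].isna().any():
    raise ValueError("unmapped Gender values found")
data["Gender"] = data["Gender"].astype(int)
print(data)
```

Failing loudly on unmapped values is a design choice here; you may prefer to keep the NaN entries and deal with them later.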
Nidhin Sajeev
3

You can do the conversion as you load the file:

import pandas
data = pandas.read_csv('yourfile.csv', converters={'Gender': lambda x: int(x == 'Male')})

The converters argument takes a dictionary whose keys are column names (or indices) and whose values are functions that are called on each item in that column. Each function must return the converted value.
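
To try this without a file on disk, here is a small sketch that feeds the sample data from the question through `io.StringIO`. The function name `gender_to_int` and the variable `csv_text` are my own choices for illustration.

```python
import io
import pandas as pd

# The sample data from the question, held in memory instead of a file.
csv_text = "Sr,Gender\n1,Male\n2,Male\n3,Female\n"

def gender_to_int(value):
    # A converter receives each cell of its column as the raw string from the file.
    return int(value == "Male")

data = pd.read_csv(io.StringIO(csv_text), converters={"Gender": gender_to_int})
print(data)
```

A named function does the same job as the lambda; it is just easier to extend if you need more than one comparison.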

The other way is to convert the column once you have the dataframe, as @DJK pointed out in their comment:

data['Gender'] = (data['Gender'] == 'Male').astype(int)
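
Broken into steps, the comparison approach looks roughly like this. The dataframe is built inline from the question's sample values, and the intermediate name `is_male` is only there for illustration.

```python
import pandas as pd

# Sample data from the question, built directly instead of read from a CSV.
data = pd.DataFrame({"Sr": [1, 2, 3], "Gender": ["Male", "Male", "Female"]})

# Comparing the column with a string gives a boolean Series (True where it matches).
is_male = data["Gender"] == "Male"

# Casting booleans to int turns True into 1 and False into 0.
data["Gender"] = is_male.astype(int)
print(data)
```

Keep in mind that any value other than exactly 'Male' ends up as 0 with this approach, including misspellings or differently capitalised entries.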
Burhan Khalid