
I'm working with pandas and want to generate dummy variables in place on my DataFrame, but get_dummies() always generates two columns for a binary field. How can I keep it from splitting the field into two columns?

For example, data = pd.get_dummies(data, columns=['gender']) generates two columns in place of the gender field: gender_male and gender_female, where a 1 marks the records for which that column's value applies.
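To make the behavior concrete, here is a minimal sketch that reproduces it. The sample frame and its name column are made up for illustration, and dtype=int is passed only so the indicators show as 0/1 regardless of pandas version.

```python
import pandas as pd

# Hypothetical sample data; the real frame has other columns as well.
data = pd.DataFrame({
    "name": ["Ann", "Bob", "Cy"],
    "gender": ["female", "male", "male"],
})

# The gender column is replaced by one indicator column per distinct value.
# dtype=int forces 0/1 integers (newer pandas versions default to booleans).
dummies = pd.get_dummies(data, columns=["gender"], dtype=int)

print(dummies.columns.tolist())
# ['name', 'gender_female', 'gender_male']
```

For a binary field the two indicator columns always sum to 1 on every row, which is the redundancy in question.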

This seems redundant, since either column fully determines the other, but I'm not sure whether it matters.

What I would like to know is how to force, or coerce, the get_dummies() function to generate a single column where 1 == 'Male' and 0 == 'Female'.

What would be the most common or recommended way to do this?
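For reference, a sketch of the result being asked for, under the assumption that the column holds exactly the strings 'Male' and 'Female'. It shows two possible routes, not necessarily the recommended ones: the drop_first parameter of get_dummies(), and a plain comparison that bypasses get_dummies() entirely. The sample data and the variable names single and encoded are illustrative only.

```python
import pandas as pd

# Hypothetical sample data, assuming only 'Male' and 'Female' occur.
data = pd.DataFrame({"gender": ["Male", "Female", "Male"]})

# Route A: drop_first=True drops the first category in sorted order
# ('Female'), leaving a single gender_Male column where 1 means 'Male'.
single = pd.get_dummies(data, columns=["gender"], drop_first=True, dtype=int)

# Route B: encode explicitly, keeping the original column name.
# This makes the meaning of 1 and 0 independent of category ordering.
encoded = data.assign(gender=(data["gender"] == "Male").astype(int))

print(single)
print(encoded)
```

Route A relies on 'Female' sorting before 'Male'; with other labels the dropped category could differ, so Route B may be the safer choice when the meaning of 1 has to be fixed.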

Chris Rutherford
