
After running the gender_guesser.detector package, I got a new 'gender' column in df, summarized below. I want to change 'mostly_female' to 'female', and change 'mostly_male' and 'andy' to 'male'. I wrote the code below, but it raises an error. How can I fix it? Thanks a lot!

unknown          1125
male              321
female            225
mostly_male        29
mostly_female      26
andy               15

import random
import numpy as np

for index, g in df.iterrows():
    if g == 'mostly_female':
        df.loc[index, 'gender'] = 'female'
    elif g == 'mostly_male':
        df.loc[index, 'gender'] = 'male'
    elif g == 'andy':
        df.loc[index, 'gender'] = 'male'
    elif g == 'unknown':
        df.loc[index, 'gender'] = np.random.choice(['female', 'male'], size=1)
    else:
        print('error')

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

In addition, are there any suggestions on how to change "unknown" to "male" or "female" based on the first name?

I really need to change each "unknown" to male/female separately, but I don't know how to handle 1130 observations. There are so many names here ... 'Cyrenna', 'Dacks', 'Daella', 'Daella', 'Daemon', 'Daeron', 'Daeron', 'Dafyn', 'Dagon', 'Dake', 'Danwell', 'Daughter', 'Delena', 'Dickon', 'Donel', 'Harren', 'Harrold', 'Harwyn', 'Hoarfrost', 'Hoke', 'Hot', 'Hother', 'Humfrey', 'Humfrey', 'Jaremy', 'Jeor', 'Jeyne', 'Jeyne', 'Donnel', 'Jeyne', 'Jeyne', 'Jeyne', 'Jhaqo', 'Jhiqui', 'Aegon', 'Aegon', 'Aerion', 'Aladale', 'Alester', 'Bannen', 'Belandra', 'Belwas', 'Benjen', 'Benjen', 'Beric', 'Black', 'Bore'

J Lin
  • Don’t you want `g["gender"] = ?` – Saddy Jan 25 '20 at 18:41
  • Reading the docs should be your first reflex when using a new library. – AMC Jan 25 '20 at 19:13
  • Does this answer your question? [Pandas conditional creation of a series/dataframe column](https://stackoverflow.com/questions/19913659/pandas-conditional-creation-of-a-series-dataframe-column) – AMC Jan 25 '20 at 19:14
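
For what it's worth, the ValueError most likely comes from the fact that `df.iterrows()` yields `(index, row)` pairs where the row is a whole Series, so `g == 'mostly_female'` compares every field at once and `if` cannot reduce the result to a single True/False. A minimal sketch of the loop with the scalar pulled out first is below; the small `df` is a hypothetical stand-in for the real data, and the 'unknown' branch is left out to keep it short.

```python
import pandas as pd

# Hypothetical sample standing in for the real DataFrame from the question.
df = pd.DataFrame(
    {"gender": ["mostly_female", "mostly_male", "andy", "male", "female"]}
)

for index, row in df.iterrows():
    g = row["gender"]  # a single string, not the whole row Series
    if g == "mostly_female":
        df.loc[index, "gender"] = "female"
    elif g in ("mostly_male", "andy"):
        df.loc[index, "gender"] = "male"
    # 'male' and 'female' need no change, so there is no else branch

print(df["gender"].tolist())
```

This keeps the original row-by-row structure only to show where the error comes from; a vectorized replacement is usually shorter and faster on a column this size.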

1 Answer


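You could use the `map` method, passing a replacement value for each key. Note that `map` returns NaN for any value missing from the dictionary, so 'male' and 'female' are mapped to themselves here.

import numpy as np

df['gender'] = df['gender'].map({
      'male': 'male',      # keep existing labels; unmapped values would become NaN
      'female': 'female',
      'mostly_female': 'female',
      'mostly_male': 'male',
      'andy': 'male',
      # evaluated once, so every 'unknown' row gets the same randomly chosen label
      'unknown': np.random.choice(['female', 'male'])
})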
Mihai Alexandru-Ionut
  • Probably worth noting that `'unknown'` will get evaluated once so if changing all unknowns to either male/female is what's desired this is fine... If each individual unknown should be changed to male/female separately then you'd need to take a different approach. – Jon Clements Jan 25 '20 at 18:58
  • @JonClements You're right, and I think that's what OP wanted. – AMC Jan 25 '20 at 21:02
  • I really need to change each "unknown" to male/female separately, but I don't know how to handle 1130 observations. There are so many names here ... 'Cyrenna', 'Dacks', 'Daella', 'Daella', 'Daemon', 'Daeron', 'Daeron', 'Dafyn', 'Dagon', 'Dake', 'Danwell', 'Daughter', 'Delena', 'Dickon', 'Donel', 'Harren', 'Harrold', 'Harwyn', 'Hoarfrost', 'Hoke', 'Hot', 'Hother', 'Humfrey', 'Humfrey', 'Jaremy', 'Jeor', 'Jeyne', 'Jeyne', 'Donnel', 'Jeyne', 'Jeyne', 'Jeyne', 'Jhaqo', 'Jhiqui', 'Aegon', 'Aegon', 'Aerion', 'Aladale', 'Alester', 'Bannen', 'Belandra', 'Belwas', 'Benjen', 'Benjen', 'Beric', 'Black', 'Bore' – J Lin Jan 25 '20 at 21:09
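
If each 'unknown' should get its own random label, as discussed in the comments above, one possible approach is to select those rows with a boolean mask and draw one value per row. This is only a sketch: the small `df` is a hypothetical stand-in for the real data, the seed is arbitrary, and a random draw does not use the first name at all, so it does not infer anything about the actual person.

```python
import numpy as np
import pandas as pd

# Hypothetical sample standing in for the real DataFrame from the question.
df = pd.DataFrame({"gender": ["unknown", "male", "unknown", "female", "unknown"]})

# Seeded generator so the result is reproducible; the seed value is arbitrary.
rng = np.random.default_rng(seed=0)

# Boolean mask that is True only on the rows still labelled 'unknown'.
mask = df["gender"] == "unknown"

# Draw one independent label per masked row and assign them all at once.
df.loc[mask, "gender"] = rng.choice(["female", "male"], size=mask.sum())

print(df["gender"].value_counts())
```

Rows that already hold 'male' or 'female' are left untouched because the assignment only targets the masked rows.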