
I keep getting the error message SyntaxError: invalid syntax and I would like to know (1) why this is and (2) how to fix my function so that it does what I want.

I have a pandas dataframe that looks like this:

import pandas as pd

d = {'Relationship': ['Male', 'Female','Spouse','Spouse','Male','Spouse','Male','Male','Male','Spouse','Female'], 'Sex': ['Male', 'Female','Female','Male','Male','Female','Male','Male','Male','Female','Female']}
df = pd.DataFrame(data=d)
df

Relationship    Sex
Male            Male
Female          Female
Spouse          Female
Spouse          Male
Male            Male
Spouse          Female
Male            Male
Male            Male
Male            Male
Spouse          Female
Female          Female

And what I want is for each instance of Spouse to be filled in with the opposite sex listed in df['Sex']. So the df should look like this:

df

Relationship    Sex
Male            Male
Female          Female
Male            Female
Female          Male
Male            Male
Male            Female
Male            Male
Male            Male
Male            Male
Male            Female
Female          Female

This is the function I've written:

def typex(column):
    if column['Relationship']!='Spouse' & column['Sex']! ='Female':
        return 'Male'
    elif column['Relationship']!='Spouse' & column['Sex']! ='Male':
        return 'Female'

df.loc[:,'Relationship'] = df.apply(typex, axis=1)
JAG2024

1 Answer


I suggest using numpy.select for a vectorized solution:

import numpy as np

m1 = (df['Relationship']=='Spouse') & (df['Sex']=='Female')
m2 = (df['Relationship']=='Spouse') & (df['Sex']=='Male')

# rows matching neither mask keep their original Relationship value
df['new'] = np.select([m1, m2], ['Male','Female'], default=df['Relationship'])
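As a self-contained check against the sample data from the question, here is a minimal sketch of the numpy.select approach. It assumes the goal is to flip only the Spouse rows and leave every other row unchanged, so the masks test `== 'Spouse'` and `default` is the original column. Both choices are read off the expected table in the question rather than stated there, and the `is_spouse` name is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Relationship': ['Male', 'Female', 'Spouse', 'Spouse', 'Male', 'Spouse',
                     'Male', 'Male', 'Male', 'Spouse', 'Female'],
    'Sex': ['Male', 'Female', 'Female', 'Male', 'Male', 'Female',
            'Male', 'Male', 'Male', 'Female', 'Female'],
})

# Boolean masks; each comparison is parenthesized because & binds
# more tightly than == in Python.
is_spouse = df['Relationship'] == 'Spouse'   # illustrative helper name
m1 = is_spouse & (df['Sex'] == 'Female')     # spouse of a female -> 'Male'
m2 = is_spouse & (df['Sex'] == 'Male')       # spouse of a male   -> 'Female'

# Assumption: rows that match neither mask keep their original value.
df['new'] = np.select([m1, m2], ['Male', 'Female'], default=df['Relationship'])

print(df)
```

If the result looks right, assigning it back with `df['Relationship'] = df['new']` should give the table the question asks for.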

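If you want to keep your own function, change & to and, because apply with axis=1 passes one row at a time and the comparisons work on scalars. Also write the operators without a space: ! = is what raises SyntaxError: invalid syntax.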

def typex(column):
    if (column['Relationship']=='Spouse') and (column['Sex']=='Female'):
        return 'Male'
    elif (column['Relationship']=='Spouse') and (column['Sex']=='Male'):
        return 'Female'
    # any other row keeps its original Relationship value
    return column['Relationship']

df['new'] = df.apply(typex, axis=1)
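To make the two error messages in this thread concrete, here is a small sketch that reproduces them outside pandas. The `row` dict is a hypothetical stand-in for one row passed by `apply(..., axis=1)`. The names `syntax_error`, `type_error` and `ok` exist only for illustration.

```python
# 1) "! =" with a space is not an operator, so Python rejects the source
#    before running anything. compile() lets us observe that safely.
syntax_error = None
try:
    compile("x ! = 'Female'", "<example>", "eval")
except SyntaxError as exc:
    syntax_error = exc

# Hypothetical stand-in for a single DataFrame row.
row = {'Relationship': 'Spouse', 'Sex': 'Female'}

# 2) & binds more tightly than !=, so without parentheses Python first
#    evaluates 'Spouse' & row['Sex'], i.e. str & str, which raises TypeError.
type_error = None
try:
    row['Relationship'] != 'Spouse' & row['Sex'] != 'Female'
except TypeError as exc:
    type_error = exc

# 3) With scalars, "and" (or parenthesized comparisons) works as intended.
ok = (row['Relationship'] == 'Spouse') and (row['Sex'] == 'Female')

print(type(syntax_error).__name__)
print(type_error)
print(ok)
```

The second case is the same TypeError quoted in the comments below, which is why either parentheses around each comparison or `and` in place of `&` makes it go away.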
jezrael
  • I get the error message `TypeError: ("unsupported operand type(s) for &: 'str' and 'str'", 'occurred at index 0')` when I use your second solution. Any ideas why? – JAG2024 Aug 21 '18 at 11:31
  • @JAG2024 - Maybe `()` are necessary, like in the first solution. – jezrael Aug 21 '18 at 11:32
  • I edited your function to one that worked for me. Perhaps there's a more elegant way... – JAG2024 Aug 21 '18 at 11:36