I'm trying to add a new column to a pandas DataFrame that takes one of 3 different values depending on a comparison between two other columns in the DataFrame. In Excel, or with any other array, I would use a straightforward IF statement.

My current code is the following:

df[df[column1] > df[column2], "New Column"] = "A"

df[df[column1] < df[column2], "New Column"] = "B"

df[df[column1] == df[column2], "New Column"] = "C"

This gives the error:

TypeError: unhashable type: 'Series'

column1 and column2 are both populated with integers.

I understand that a Series is unhashable, whereas a tuple is hashable (from "TypeError : Unhashable type").

Do I need to convert column1 and column2 of the DataFrame into tuples?

I have tried the following method

df['New Column'] = df['column1'].apply(lambda df["column1"]: 'A' if (df["column1"] > df["column2"]) else 'B')

and get an invalid syntax error.

bmgrice

1 Answer

Use np.where.

import numpy as np

df['column3'] = np.where(df['column1'] > df['column2'], "true", "false")
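To see the two-way case run end to end, here is a minimal sketch. The sample values are made up for illustration; only the column names come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({"column1": [5, 2, 7], "column2": [3, 4, 7]})

# np.where(condition, value_if_true, value_if_false) is evaluated element-wise,
# so every row gets its own result.
df["column3"] = np.where(df["column1"] > df["column2"], "true", "false")

print(df)
```

Note that the row where both columns are equal falls into the "false" branch, which is why a third outcome needs a second condition.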

If you want more conditions, you can nest np.where calls. This works much like nested IF statements in Excel.


df['column3'] = np.where(df['column1'] > df['column2'], 'a', np.where(df['column1'] < df['column2'], 'b', 'c'))

This reads: if column1 is greater than column2, return 'a'; otherwise evaluate the next condition and return 'b' if column1 is less than column2; otherwise return 'c'. With integer columns, 'c' is only reached when neither comparison is true, which means the two values are equal.
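If the nesting gets hard to read, two alternatives may be worth considering. The sketch below uses made-up sample data and a hypothetical helper column name ("Via loc"); it shows np.select, which takes a list of conditions and a matching list of values, and boolean-mask assignment through .loc, which is probably the closest working form of the code in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({"column1": [5, 2, 7], "column2": [3, 4, 7]})

# Option 1: np.select picks, for each row, the choice belonging to the first
# condition that is True, and falls back to `default` when none is.
conditions = [
    df["column1"] > df["column2"],
    df["column1"] < df["column2"],
]
choices = ["A", "B"]
df["New Column"] = np.select(conditions, choices, default="C")

# Option 2: boolean-mask assignment with .loc[row_mask, column_label].
# "Via loc" is an illustrative column name, not one from the question.
df["Via loc"] = "C"  # start with the "equal" case everywhere
df.loc[df["column1"] > df["column2"], "Via loc"] = "A"
df.loc[df["column1"] < df["column2"], "Via loc"] = "B"

print(df)
```

Both options give the same labels as the nested np.where on this data. The .loc form takes the row mask and the column label as separate arguments, whereas plain df[...] treats whatever is inside the brackets as a single key.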

anarchy