I cannot figure out how to compare two columns and, if one column is greater than or equal to the other, write '1' to a new column. If the condition is not met, I would like Python to do nothing.

The data set for testing is here:

import pandas as pd

data = [[12,10],[15,10],[8,5],[4,5],[15,'NA'],[5,'NA'],[10,10], [9,10]]
df = pd.DataFrame(data, columns = ['Score', 'Benchmark'])

   Score Benchmark
0     12        10
1     15        10
2      8         5
3      4         5
4     15        NA
5      5        NA
6     10        10
7      9        10

The desired output is:

desired_output_data = [[12,10, 1],[15,10,1],[8,5,1],[4,5],[15,'NA'],[5,'NA'],[10,10,1], [9,10]]
desired_output_df = pd.DataFrame(desired_output_data, columns = ['Score', 'Benchmark', 'MetBench'])

   Score Benchmark  MetBench
0     12        10       1.0
1     15        10       1.0
2      8         5       1.0
3      4         5       NaN
4     15        NA       NaN
5      5        NA       NaN
6     10        10       1.0
7      9        10       NaN

I tried doing something like this:

if df['Score'] >= df['Benchmark']:
    df['MetBench'] = 1

I am new to programming in general so any guidance would be greatly appreciated. Thank you!

eyllanesc
Jake

1 Answer

You can use ge and map:

df.Score.ge(df.Benchmark).map({True: 1, False:np.nan})

Or rely on the implicit mapping from False to np.nan: pandas uses the dict.get method to apply the mapping, and its default for a missing key is None (thanks @piRSquared):

df.Score.ge(df.Benchmark).map({True: 1})

Or simply use Series.where:

df.Score.ge(df.Benchmark).where(lambda s: s)

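The map versions output the following (with where, the kept values may display as True rather than 1.0, depending on the pandas version):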

0    1.0
1    1.0
2    1.0
3    NaN
4    NaN
5    NaN
6    1.0
7    NaN
dtype: float64
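
The implicit mapping mentioned above can be checked on a small Boolean series. This is only a minimal sketch; the names `mask` and `mapped` are made up for illustration and are not part of the question's data.

```python
import pandas as pd

# Hypothetical Boolean series standing in for df.Score.ge(df.Benchmark)
mask = pd.Series([True, False, True])

# Only True has an entry in the dict; values without an entry become NaN
mapped = mask.map({True: 1})

print(mapped)
```

Because NaN is a float, the resulting series holds floats, which is why the 1 values are shown as 1.0.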

Make sure to run

df['Benchmark'] = pd.to_numeric(df['Benchmark'], errors='coerce')

first, because 'NA' is stored as a string and must be converted to the numeric value np.nan before it can be compared with other numbers.
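
Putting the pieces together, the whole thing might look like the sketch below. It reuses the test data from the question and assumes the result should be stored in a new MetBench column, as in the desired output.

```python
import pandas as pd

# Test data from the question; 'NA' is a plain string here
data = [[12, 10], [15, 10], [8, 5], [4, 5], [15, 'NA'], [5, 'NA'], [10, 10], [9, 10]]
df = pd.DataFrame(data, columns=['Score', 'Benchmark'])

# Convert Benchmark to numbers first; anything unparseable becomes NaN
df['Benchmark'] = pd.to_numeric(df['Benchmark'], errors='coerce')

# Element-wise Score >= Benchmark, then map True to 1 and leave the rest as NaN
df['MetBench'] = df['Score'].ge(df['Benchmark']).map({True: 1})

print(df)
```

Comparisons against NaN evaluate to False, so the rows with a missing benchmark end up as NaN in MetBench, just like the rows where the score is below the benchmark.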

rafaelc
  • `df.Score.ge(df.Benchmark).map({True: 1})` The way pandas uses map is that it converts a dictionary to its callable `dict.get`, which returns `None` when a key doesn't exist. – piRSquared May 13 '19 at 17:40
  • @piRSquared Living and learning ;} Will add – rafaelc May 13 '19 at 17:42