6

I need to compute a column where the value is the result of a vectorized operation over other columns:

df["new_col"] = df["col1"] - min(0,df["col2"])

It turns out, however, that I cannot use the built-in min this way. What is the right way to get the minimum of zero and each value of a pandas column?

Mohamed Ali JAMAOUI
  • 1
    tom's answer looks good. On a column you could also do `.map( lambda x: min(x,0) )` to apply the standard python `min` to each cell, but `np.minimum` is probably going to be the fastest way. – JohnE Aug 14 '15 at 13:04

3 Answers

8

You can use numpy.minimum to compute the element-wise minimum of two arrays (or of a scalar and an array):

import numpy as np
df["new_col"] = df["col1"] - np.minimum(0,df["col2"])
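As a small illustration (the toy data below is an assumption, not taken from the question): the built-in `min` fails because it tries to treat the comparison between `0` and the whole column as a single True/False value, whereas `np.minimum` compares element by element.

```python
import numpy as np
import pandas as pd

# Toy data, assumed for illustration only
df = pd.DataFrame({"col1": [2, 3], "col2": [-1, 1]})

# The built-in min raises ValueError: the truth value of a Series is ambiguous
try:
    min(0, df["col2"])
    builtin_min_failed = False
except ValueError:
    builtin_min_failed = True

# np.minimum takes the minimum of 0 and each value of col2
df["new_col"] = df["col1"] - np.minimum(0, df["col2"])
print(df)
```

Here `new_col` is `[3, 3]`: the first row computes 2 - (-1) and the second 3 - 0.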
tmdavison
1

You could use a mask and a temporary column, ignoring the 'min' function entirely:

magicnumber = 0
tempcol = df['col2'].copy()  # copy so that df['col2'] itself is not modified
mask = tempcol < magicnumber
tempcol[~mask] = magicnumber  # cap every value that is not below magicnumber
df['col1'] - tempcol

Or you can use a lambda function:

magicnumber = 0
# use the built-in min here: the second positional argument of np.min is an axis
df['col1'] - df['col2'].apply(lambda x: min(magicnumber, x))

Or you can apply over two columns:

df['magicnumber'] = 0
df['col1'] - df[['col2', 'magicnumber']].apply(np.min, axis=1)
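The `apply` variants above call Python code once per element or per row, so they may be slow on large frames. As a hedged alternative sketch (not part of the original answer, and using assumed toy data), pandas' own `Series.clip` with `upper=0` caps each value at zero, which is the same as taking the minimum with 0:

```python
import pandas as pd

# Toy data, assumed for illustration only
df = pd.DataFrame({"col1": [2, 3, 5], "col2": [-1, 1, -4]})

# clip(upper=0) replaces every value above 0 with 0, i.e. min(0, x) per element
capped = df["col2"].clip(upper=0)
result = df["col1"] - capped
print(result.tolist())  # [3, 3, 9]
```

This avoids both the temporary `magicnumber` column and the per-row function call.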
firelynx
  • @ajcr Yeah, I got the question wrong. It's too hot today. I updated my answer with a solution to the actual problem. – firelynx Aug 14 '15 at 12:59
0

I think that the other answers aren't what you meant. They take the minimum value in df['col2'] and compare it to 0 (and thus always return the same value), while you wanted the minimum between each value in col2 and 0:

import pandas as pd

df = pd.DataFrame(data={'a': [2, 3], 'b': [-1, 1]})

# list() is needed on Python 3, where map returns an iterator
df['new_col'] = list(map(lambda a, b: a - min(0, b), df['a'], df['b']))

print(df)

>>    a  b  new_col
   0  2 -1        3
   1  3  1        3
DeepSpace
  • 2
    `np.minimum` _does_ return the element-wise minimum of 0 and the values in `col2`; it does not "always return the same value" – tmdavison Aug 14 '15 at 12:32
  • (`np.minimum` does exactly this, as tom said). Also, a very minor point: your code differs from your output DataFrame. – Alex Riley Aug 14 '15 at 12:41