
I have a pandas DataFrame and want to do a calculation based on an existing column. However, the apply function is not working for some reason.

It's something like this, let's say:

df = pd.DataFrame({'Age': age, 'Input': input})

and the Input column is something like [1.10001, 1.49999, 1.60001]. Now I want to add a new column to the DataFrame that does the following:

  • Add 0.0001 to each element in the column
  • Multiply each value by 10
  • Convert each value of the new column to int
StupidWolf
Anonymosaurus

2 Answers


Use Series.add, Series.mul and Series.astype:

import pandas as pd

# input is a Python builtin, so it is better not to use it as a variable name
inp = [1.10001, 1.49999, 1.60001]
age = [10, 20, 30]
df = pd.DataFrame({'Age': age, 'Input': inp})
df['new'] = df['Input'].add(0.0001).mul(10).astype(int)
print(df)
   Age    Input  new
0   10  1.10001   11
1   20  1.49999   15
2   30  1.60001   16
jezrael
  • What if the input has None or non-numeric data, which causes the add method to throw an error? – furkanayd Nov 29 '19 at 10:07
  • @furkanayd - then the first step is to convert the column to numeric; see [this](https://stackoverflow.com/questions/15891038/change-data-type-of-columns-in-pandas) – jezrael Nov 29 '19 at 10:09
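To make the point from the comments concrete, here is a minimal sketch of one way to convert the column to numeric before doing the arithmetic. It is not taken from the linked answer: the sample data with None and 'abc' is invented for illustration, and the choice of pd.to_numeric with errors='coerce', np.trunc, and the nullable Int64 dtype is an assumption about what fits this case.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one missing value and one non-numeric string
df = pd.DataFrame({'Age': [10, 20, 30, 40],
                   'Input': [1.10001, None, 'abc', 1.60001]})

# Anything that cannot be parsed as a number becomes NaN instead of raising
numeric = pd.to_numeric(df['Input'], errors='coerce')

# Same arithmetic as in the answer; np.trunc drops the fractional part
# (as int() would), and the nullable Int64 dtype keeps missing rows as <NA>
df['new'] = np.trunc(numeric.add(0.0001).mul(10)).astype('Int64')
print(df)
```

Rows that could not be converted end up as missing values in the new column rather than stopping the whole calculation. Whether that is the right behaviour depends on the data; dropping or filling those rows first may be preferable.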

You could make a simple function and then apply it by row.

def f(row):
    # the column is named 'Input' (capital I) in the DataFrame
    return int((row['Input'] + 0.0001) * 10)

df['new'] = df.apply(f, axis=1)
alec_djinn
  • It is bad practice to use a [loop](https://stackoverflow.com/questions/16476924/how-to-iterate-over-rows-in-a-dataframe-in-pandas) when a vectorized solution exists (apply is a loop under the hood). – jezrael Nov 29 '19 at 10:24
  • I thought apply was vectorized as well. I am not using iterrows() – alec_djinn Nov 29 '19 at 10:29
  • Yep, something new for you: apply is a loop too. It is better than iterrows, but still best avoided when a vectorized solution exists, as it does here. – jezrael Nov 29 '19 at 10:30
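The difference discussed in these comments can be checked with a rough timing sketch like the one below. The repeated sample data, the function names vectorized and row_wise, and the use of timeit are all assumptions made for illustration; actual timings depend on the machine and the pandas version, so no figures are quoted here.

```python
import timeit

import pandas as pd

# Hypothetical larger frame: the three sample values repeated 10,000 times
df = pd.DataFrame({'Input': [1.10001, 1.49999, 1.60001] * 10_000})

def vectorized():
    # Operates on the whole column at once
    return df['Input'].add(0.0001).mul(10).astype(int)

def row_wise():
    # Calls a Python function once per row
    return df.apply(lambda row: int((row['Input'] + 0.0001) * 10), axis=1)

# Both approaches should produce the same values
same = bool((vectorized() == row_wise()).all())

t_vec = timeit.timeit(vectorized, number=3)
t_row = timeit.timeit(row_wise, number=3)
print(f"same result: {same}")
print(f"vectorized: {t_vec:.4f}s, apply(axis=1): {t_row:.4f}s")
```

Both versions return the same values, so the choice between them comes down to speed and readability rather than correctness.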