Pandas new column with calculation based on other existing column

Question

I have a pandas DataFrame and want to do a calculation based on an existing column. However, the apply function is not working for some reason.

It's something like this, let's say:

df = pd.DataFrame({'Age': age, 'Input': input})

and the Input column is something like [1.10001, 1.49999, 1.60001]. Now I want to add a new column to the DataFrame that does the following:

Add 0.0001 to each element in column
Multiply each value by 10
Transform each value of new column to int

What is not working exactly? How do you use df.apply()? – alec_djinn Nov 29 '19 at 10:17

jezrael · Accepted Answer · 2019-11-29T10:10:13.413

2

Use Series.add, Series.mul and Series.astype:

import pandas as pd

# input is a Python builtin, so better not to use it as a variable name
inp = [1.10001, 1.49999, 1.60001]
age = [10, 20, 30]
df = pd.DataFrame({'Age': age, 'Input': inp})
# add 0.0001, multiply by 10, then truncate to int
df['new'] = df['Input'].add(0.0001).mul(10).astype(int)
print(df)
   Age    Input  new
0   10  1.10001   11
1   20  1.49999   15
2   30  1.60001   16

edited Nov 29 '19 at 10:10

answered Nov 29 '19 at 10:04

jezrael


What if Input has None or non-numeric data, which causes the add method to throw an error? – furkanayd Nov 29 '19 at 10:07

@furkanayd - then the first step is to convert the column to numeric - check [this](https://stackoverflow.com/questions/15891038/change-data-type-of-columns-in-pandas) – jezrael Nov 29 '19 at 10:09
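
To illustrate the conversion step from the comment above, here is a minimal sketch. It is not from the original answer: the sample values (including None and 'abc') are made up, and using pd.to_numeric with errors='coerce' plus the nullable 'Int64' dtype is just one possible way to handle it.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one None and one non-numeric string mixed in
inp = [1.10001, None, 'abc', 1.60001]
df = pd.DataFrame({'Age': [10, 20, 30, 40], 'Input': inp})

# Anything that cannot be parsed as a number becomes NaN
numeric = pd.to_numeric(df['Input'], errors='coerce')

# astype(int) raises on NaN, so truncate first and then use the
# nullable integer dtype, which keeps missing values as <NA>
result = np.trunc(numeric.add(0.0001).mul(10)).astype('Int64')
df['new'] = result
print(df)
```

Rows that could not be converted end up as <NA> in the new column instead of raising an error. Whether that is the right behaviour, or whether such rows should be dropped or filled, depends on the data.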

score 1 · Answer 2 · answered Nov 29 '19 at 10:15


You could make a simple function and then apply it by row.

def f(row):
    # the column is named 'Input' (capitalized) in the question's DataFrame
    return int((row['Input'] + 0.0001) * 10)

df['new'] = df.apply(f, axis=1)

alec_djinn


It is bad practice to use a [loop](https://stackoverflow.com/questions/16476924/how-to-iterate-over-rows-in-a-dataframe-in-pandas) here, because a vectorized solution exists (apply is a loop under the hood). – jezrael Nov 29 '19 at 10:24
I thought apply was vectorized as well. I am not using iterrows() – alec_djinn Nov 29 '19 at 10:29
Yep, something new for you: apply is a loop too. It is better than iterrows, but still best avoided if a vectorized solution exists, like here. – jezrael Nov 29 '19 at 10:30
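
A rough way to check the point made in these comments is to run both approaches on a larger frame and time them. This is only a sketch: the repeated sample data and the use of timeit are assumptions for illustration, and the actual timings will vary by machine and pandas version.

```python
import timeit
import pandas as pd

# Hypothetical larger frame built by repeating the question's sample values
inp = [1.10001, 1.49999, 1.60001] * 1000
df = pd.DataFrame({'Age': range(len(inp)), 'Input': inp})

def f(row):
    # row-wise version, called once per row by apply
    return int((row['Input'] + 0.0001) * 10)

def vectorized_calc(frame):
    # whole-column version, operating on the Series at once
    return frame['Input'].add(0.0001).mul(10).astype(int)

via_apply = df.apply(f, axis=1)
vectorized = vectorized_calc(df)

# Both approaches should produce the same values
same = via_apply.tolist() == vectorized.tolist()
print('same result:', same)

# Time a few repetitions of each approach
t_apply = timeit.timeit(lambda: df.apply(f, axis=1), number=3)
t_vec = timeit.timeit(lambda: vectorized_calc(df), number=3)
print(f'apply: {t_apply:.4f}s  vectorized: {t_vec:.4f}s')
```

Both versions give the same column, so the choice between them comes down to speed and readability.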
