I tried to iterate over each row in a DataFrame, call calculate_distance_from_SnS(row) on it, and assign the returned value (a number) back to the row. The value is not saved in the target column. What am I missing?

For example, given this DataFrame:

A   B   C
1   10  0
2   12  0

I want to compute C = A + B for each row and end up with:

A   B   C
1   10  11
2   12  14

I did this:


def calculate_distance_from_SnS(row):
    # uses the DataFrame row and two of its columns to calculate the distance
    ...



for i,row in customers.iterrows():
    row['dist_from_sns'] = calculate_distance_from_SnS(row)
Daniel

1 Answer

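Assign to the original DataFrame, not to the row Series produced by the loop. Depending on the dtypes, iterrows yields a copy of each row, so writing to it may have no effect on the DataFrame: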

for i,row in customers.iterrows():
    customers.loc[i, 'dist_from_sns'] = calculate_distance_from_SnS(row)
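
To make the difference concrete, here is a minimal sketch on a toy frame. The data, the `name` column, and `toy_distance` (a hypothetical stand-in for calculate_distance_from_SnS that simply adds two columns) are assumptions for illustration, not taken from the question. The frame mixes strings and floats, so each row from iterrows is a copy.

```python
import pandas as pd

# Hypothetical toy data; the real `customers` frame is not shown in the question.
customers = pd.DataFrame({
    "name": ["a", "b"],
    "lat": [1.0, 2.0],
    "long": [10.0, 12.0],
    "dist_from_sns": [0.0, 0.0],
})

def toy_distance(row):
    # Placeholder for calculate_distance_from_SnS: just adds the two columns.
    return row["lat"] + row["long"]

# Writing to the row Series: with mixed dtypes the row is a copy,
# so the DataFrame keeps its original values.
for i, row in customers.iterrows():
    row["dist_from_sns"] = toy_distance(row)
after_row_assignment = customers["dist_from_sns"].tolist()

# Writing through .loc on the DataFrame itself: the values are stored.
for i, row in customers.iterrows():
    customers.loc[i, "dist_from_sns"] = toy_distance(row)
after_loc_assignment = customers["dist_from_sns"].tolist()

print(after_row_assignment)
print(after_loc_assignment)
```

Here `i` is the index label that iterrows returns, which is why `.loc` (label-based) is the matching accessor.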

If possible, though, it is better to use DataFrame.apply with axis=1, which calls the function once per row:

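# Assumes calculate_distance_from_SnS accepts the two column values; if it takes
# the whole row, use customers.apply(calculate_distance_from_SnS, axis=1) instead.
f = lambda x: calculate_distance_from_SnS(x['lat'], x['long'])
customers['dist_from_sns'] = customers.apply(f, axis=1)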
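
As a rough illustration of how apply with axis=1 passes each row to the function, the sketch below shows both calling styles side by side. The data and the two helper functions (`dist_from_row`, `dist_from_values`) are hypothetical placeholders that just add the coordinates; the real distance calculation is not shown in the question.

```python
import pandas as pd

# Hypothetical toy data standing in for the real `customers` frame.
customers = pd.DataFrame({"lat": [1.0, 2.0], "long": [10.0, 12.0]})

def dist_from_row(row):
    # Hypothetical: receives the whole row as a Series.
    return row["lat"] + row["long"]

def dist_from_values(lat, long):
    # Hypothetical: receives the two column values separately.
    return lat + long

# axis=1 hands each row to the function as a Series indexed by column name.
by_row = customers.apply(dist_from_row, axis=1)

# A lambda can unpack the row when the function expects separate values.
by_values = customers.apply(
    lambda x: dist_from_values(x["lat"], x["long"]), axis=1
)

customers["dist_from_sns"] = by_row
print(customers)
```

Both calls return a Series aligned with the DataFrame's index, so the result can be assigned directly as a new column.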
jezrael
  • Your first solution worked! But I didn't understand the second one, customers['cols'].apply(calculate_distance_from_SnS). The columns I'm using are "lat" and "lng", so should it be customers['dist_from_sns'] = customers['lat','long'].apply(calculate_distance_from_SnS), or am I wrong? – Daniel Oct 21 '19 at 11:45
  • @Daniel - I added a solution using apply. `iterrows` works, but if an alternative exists it is best to avoid it; see [this](https://stackoverflow.com/questions/24870953/does-pandas-iterrows-have-performance-issues/24871316#24871316) – jezrael Oct 21 '19 at 11:55
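
The two points raised in these comments can be sketched as follows. Selecting several columns takes a list of names (double brackets), since customers['lat','long'] is treated as a single tuple key and raises a KeyError on ordinary columns. And when the calculation can be written as column arithmetic, no row loop is needed at all. The data and `toy_distance` are hypothetical; whether the real distance function can be vectorized this way is an assumption.

```python
import pandas as pd

# Hypothetical toy data; "long" is used as the column name, as in the answer.
customers = pd.DataFrame({
    "name": ["a", "b"],
    "lat": [1.0, 2.0],
    "long": [10.0, 12.0],
})

def toy_distance(row):
    # Placeholder for calculate_distance_from_SnS: adds the two coordinates.
    return row["lat"] + row["long"]

# A list of names selects a two-column DataFrame; axis=1 is still required
# so that the function receives rows rather than whole columns.
customers["dist_apply"] = customers[["lat", "long"]].apply(toy_distance, axis=1)

# A tuple key is looked up as one column label and is not found.
try:
    customers["lat", "long"]
    tuple_key_failed = False
except KeyError:
    tuple_key_failed = True

# Vectorized alternative: operate on whole columns, with no per-row calls.
customers["dist_vectorized"] = customers["lat"] + customers["long"]
print(customers)
```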