0

I have a pandas DataFrame with columns Longitude and Latitude, and I'd like to compute X and Y from them. The utm package has a function, from_latlon, that does this: it takes a latitude and a longitude and returns a tuple whose first two elements are X (easting) and Y (northing). Here's what I do:

    import utm

    def get_X(row):
        return utm.from_latlon(row['Latitude'], row['Longitude'])[0]

    def get_Y(row):
        return utm.from_latlon(row['Latitude'], row['Longitude'])[1]

    df['X'] = df.apply(get_X, axis=1)
    df['Y'] = df.apply(get_Y, axis=1)

I'd like to define a single function get_XY that calls from_latlon only once per row, to save time. I looked at several related questions but could not find a way to create two columns with one apply call. Thanks.

tawab_shakeel
ahoosh

2 Answers

6

You can return a list from your function:

    import pandas

    d = pandas.DataFrame({
        "A": [1, 2, 3, 4, 5],
        "B": [8, 88, 0, -8, -88]
    })

    def foo(row):
        return [row["A"] + row["B"], row["A"] - row["B"]]

    >>> d.apply(foo, axis=1)
        A   B
    0   9  -7
    1  90 -86
    2   3   3
    3  -4  12
    4 -83  93
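
Note that the output above reflects the pandas version in use at the time. In pandas 0.23 and later, a list returned from the function is by default kept as a single Series of lists; passing `result_type="expand"` asks `apply` to spread the list across columns instead. A minimal sketch, where the column names `sum` and `difference` are my own choice and not part of the original answer:

```python
import pandas as pd

d = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "B": [8, 88, 0, -8, -88],
})

def foo(row):
    # Return two values per row as a plain list.
    return [row["A"] + row["B"], row["A"] - row["B"]]

# result_type="expand" turns each returned list into columns 0 and 1.
expanded = d.apply(foo, axis=1, result_type="expand")

# Illustrative names only; the expanded columns are otherwise labelled 0 and 1.
expanded.columns = ["sum", "difference"]
```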

You can also return a Series. This lets you specify the column names of the return value:

    def foo(row):
        return pandas.Series({"X": row["A"] + row["B"], "Y": row["A"] - row["B"]})

    >>> d.apply(foo, axis=1)
        X   Y
    0   9  -7
    1  90 -86
    2   3   3
    3  -4  12
    4 -83  93
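
Applied to the question's case, the Series approach might look like the sketch below. To keep it runnable without the utm package, `to_xy` is a hypothetical stand-in for `utm.from_latlon`: it returns a 4-tuple of the same shape (x, y, zone number, zone letter) but uses made-up arithmetic, not a real projection. The direct assignment to two new columns assumes a reasonably recent pandas version.

```python
import pandas as pd

def to_xy(lat, lon):
    # Hypothetical stand-in for utm.from_latlon; the arithmetic is invented
    # and only mimics the (x, y, zone_number, zone_letter) return shape.
    return lon * 1000.0, lat * 1000.0, 0, "N"

def get_XY(row):
    # Call the conversion once per row and keep only the first two values.
    x, y, _, _ = to_xy(row["Latitude"], row["Longitude"])
    return pd.Series({"X": x, "Y": y})

df = pd.DataFrame({"Latitude": [10.0, 20.0], "Longitude": [1.0, 2.0]})

# apply returns a two-column DataFrame, assigned to two new columns at once.
df[["X", "Y"]] = df.apply(get_XY, axis=1)
```

With the real package installed, replacing `to_xy` with `utm.from_latlon` should be the only change needed.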
BrenBarn
  • Based on your first solution, I use `temp = d.apply(foo, axis=1)` and then do `d['sum'] = [item[0] for item in temp]` and `d['subtract'] = [item[1] for item in temp]`. Is there a better way to do it? If I do `d[['sum','subtract']] = d.apply(foo, axis=1)` I get an error. I guess at this point it's a matter of getting the result back into the original data frame. Your second solution unfortunately doesn't work for me because of my specific function. Thanks. – ahoosh May 17 '16 at 19:47
  • @bikhaab: There's no simple way to assign multiple columns into a DataFrame at once, so that's kind of a separate issue. You could use `concat` or `merge` to join the result DataFrame with your original one. See [this question](http://stackoverflow.com/questions/20829748/pandas-assigning-multiple-new-columns-simultaneously) for some related ideas. – BrenBarn May 17 '16 at 20:40
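
The `concat` suggestion in the comment above can be sketched as follows, assuming the function returns a Series so that `apply` yields a DataFrame sharing the original index. The column names `sum` and `subtract` are taken from the earlier comment, and the data is a shortened version of the answer's example:

```python
import pandas as pd

d = pd.DataFrame({"A": [1, 2, 3], "B": [8, 88, 0]})

def foo(row):
    # Named results become the column names of the applied output.
    return pd.Series({"sum": row["A"] + row["B"],
                      "subtract": row["A"] - row["B"]})

result = d.apply(foo, axis=1)

# axis=1 places the new columns next to the original ones, aligned on index.
combined = pd.concat([d, result], axis=1)
```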
1

I merged a couple of the answers from a similar thread and now have a generic multi-column in, multi-column out template I use in Jupyter/pandas:

    # A plain function doesn't know about rows or columns; it just does its job.
    def my_func(arg1, arg2):
        return arg1 + arg2, arg1 - arg2  # return multiple values as a tuple

    df['sum'], df['difference'] = zip(*df.apply(lambda x: my_func(x['first'], x['second']), axis=1))
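
For a self-contained run of this template, here is a sketch with made-up data. The column names `first` and `second` come from the line above; the values are invented for illustration:

```python
import pandas as pd

def my_func(arg1, arg2):
    # Returns a (sum, difference) tuple for one pair of inputs.
    return arg1 + arg2, arg1 - arg2

df = pd.DataFrame({"first": [10, 20, 30], "second": [1, 2, 3]})

# apply yields one tuple per row; zip(*...) regroups them into one
# sequence per output column.
pairs = df.apply(lambda row: my_func(row["first"], row["second"]), axis=1)
df["sum"], df["difference"] = zip(*pairs)
```

The function is called once per row, which is what the question asks for. Note that the unpacking on the last line fails on an empty DataFrame, since `zip` then produces nothing to unpack.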
RufusVS