239

I want to apply a function with arguments to a series in python pandas:

x = my_series.apply(my_function, more_arguments_1)
y = my_series.apply(my_function, more_arguments_2)
...

The documentation describes support for an apply method, but it doesn't accept any arguments. Is there a different method that accepts arguments? Alternatively, am I missing a simple workaround?

Update (October 2017): Since this question was originally asked, pandas apply() has been updated to handle positional and keyword arguments. The documentation link above now reflects that and shows how to include either type of argument.

JohnE
Abe
  • 3
    Why not just use `functools.partial`, or `starmap`? – Joel Cornett Aug 29 '12 at 16:54
  • 1
    See [`DataFrame.apply` docs](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.apply.html) and [`Series.apply` docs](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.apply.html) – Martin Thoma Aug 16 '18 at 12:43

7 Answers

269

Newer versions of pandas do allow you to pass extra arguments (see the new documentation). So now you can do:

my_series.apply(your_function, args=(2,3,4), extra_kw=1)

The positional arguments are added after the element of the series.
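As a minimal sketch of the call above (the function `scale_and_shift` and its parameter names are made up for illustration, not taken from the pandas documentation), the Series element arrives as the first argument, the `args` tuple follows it, and any extra keyword is forwarded by name:

```python
import pandas as pd

# Hypothetical function: `value` is the Series element,
# `factor` is supplied through args, `offset` through a keyword.
def scale_and_shift(value, factor, offset=0):
    return value * factor + offset

my_series = pd.Series([1, 2, 3])

# Each call is effectively scale_and_shift(element, 10, offset=5)
result = my_series.apply(scale_and_shift, args=(10,), offset=5)
print(result.tolist())  # [15, 25, 35]
```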


For older versions of pandas:

The documentation explains this clearly. The apply method accepts a python function which should have a single parameter. If you want to pass more parameters you should use functools.partial as suggested by Joel Cornett in his comment.

An example:

>>> import functools
>>> import operator
>>> add_3 = functools.partial(operator.add,3)
>>> add_3(2)
5
>>> add_3(7)
10

You can also pass keyword arguments using partial.

Another way would be to create a lambda:

my_series.apply((lambda x: your_func(a,b,c,d,...,x)))

But I think using partial is better.
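To make the comparison concrete, here is a small sketch that applies both approaches to a Series. The function `power_then_add` is a hypothetical stand-in for `your_func`, with the Series element as its last parameter, as in the lambda above:

```python
import functools
import pandas as pd

# Hypothetical stand-in for your_func(a, b, ..., x): the element x comes last.
def power_then_add(exponent, addend, x):
    return x ** exponent + addend

my_series = pd.Series([1, 2, 3])

# partial pre-fills the leading positional arguments; apply supplies x.
with_partial = my_series.apply(functools.partial(power_then_add, 2, 1))

# The lambda spells out the same call explicitly.
with_lambda = my_series.apply(lambda x: power_then_add(2, 1, x))

print(with_partial.tolist())  # [2, 5, 10]
print(with_lambda.tolist())   # [2, 5, 10]
```

Both forms work on old and new pandas versions, since apply only ever sees a one-parameter callable.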

Bakuriu
  • 14
    For a DataFrame, the apply method accepts an `args` argument, a tuple holding additional positional arguments, and **kwds for named ones. I created an issue to have this for Series.apply() as well: https://github.com/pydata/pandas/issues/1829 – Wouter Overmeire Aug 30 '12 at 20:11
  • 31
    Feature has been implemented, will be in upcoming pandas release – Wes McKinney Sep 09 '12 at 00:23
  • 4
    This is a nice answer but the first 2/3 of it is really obsolete now. IMO, this answer could be nicely updated by just being a link to the new documentation plus a brief example of how to use with position and/or keyword args. Just FWIW and not a criticism of the original answer, just would benefit from an update IMO, especially as it is a frequently read answer. – JohnE Oct 15 '17 at 14:59
  • @watsonic The documentation has since been updated and clicking on the old links leads to current documentation which now answers the question very well. – JohnE Oct 16 '17 at 16:49
  • 8
    Note: If you are passing a single string argument, for example `'abc'`, then `args=('abc')` will be evaluated as three arguments `('a', 'b', 'c')`. To avoid this, you must pass a tuple containing the string, and to do that, include a trailing comma: `args=('abc',)` – Rocky K Jun 20 '20 at 12:22
  • @RockyK That's just how Python syntax works: `('abc')` is just the string `'abc'`. The *comma* is the thing that defines tuples in Python syntax; the parentheses are just for grouping. – Bakuriu Jun 21 '20 at 16:46
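The tuple point raised in the comments can be checked in plain Python. The snippet below is only an illustrative sketch; the variable names and the `add_suffix` helper are made up for the example:

```python
import pandas as pd

grouped = ('abc')      # parentheses only group: this is the string 'abc'
one_tuple = ('abc',)   # the trailing comma makes a tuple of length 1

print(type(grouped).__name__)    # str
print(type(one_tuple).__name__)  # tuple
print(tuple('abc'))              # ('a', 'b', 'c'): iterates over the characters
print(tuple(['abc']))            # ('abc',)

# Hypothetical helper taking one extra positional argument.
def add_suffix(value, suffix):
    return value + suffix

s = pd.Series(['x', 'y'])
suffixed = s.apply(add_suffix, args=('abc',))
print(suffixed.tolist())  # ['xabc', 'yabc']
```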
137

Steps:

  1. Create a dataframe
  2. Create a function
  3. Use the named arguments of the function in the apply statement.

Example

import pandas as pd

x = pd.DataFrame([1, 2, 3, 4])

def add(i1, i2):
    return i1 + i2

x.apply(add, i2=9)

The outcome of this example is that each number in the dataframe will be added to the number 9.

    0
0  10
1  11
2  12
3  13

Explanation:

The "add" function has two parameters: i1 and i2. The first parameter receives the data from the data frame (DataFrame.apply passes each column as a Series, so the addition is applied to the whole column at once), and the second is whatever we pass to the "apply" function. In this case, we are passing "9" to the apply function using the keyword argument "i2".
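A short sketch of what `add` actually receives may help here. The `received` list is an addition for illustration only; it records the type of the first argument on each call. With `DataFrame.apply` the function is called with a whole column, while selecting the column first and using `Series.apply` calls it once per element:

```python
import pandas as pd

x = pd.DataFrame([1, 2, 3, 4])
received = []  # illustration only: type names of the first argument

def add(i1, i2):
    received.append(type(i1).__name__)
    return i1 + i2

whole_frame = x.apply(add, i2=9)     # add is called with each column
per_element = x[0].apply(add, i2=9)  # add is called with each element of column 0

print(received[0])              # Series
print(whole_frame[0].tolist())  # [10, 11, 12, 13]
print(per_element.tolist())     # [10, 11, 12, 13]
```

The results agree here because `+` works the same way on a Series and on a scalar; a function that is not vectorized would need the per-element form.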

FistOfFury
  • 2
    Exactly what I was looking for. Notably, this does not require creating a custom function just to handle a Series (or df). Perfect! – Connor May 24 '19 at 17:39
  • 2
    The only remaining question is: How to pass a keyword argument to the first arg in add (i1) and iterate with i2? – Connor May 24 '19 at 17:43
  • 1
    I think this is the best answer – crypdick Oct 28 '19 at 22:35
  • Seconding the comment by @Connor, how would one deal with 2 positional arguments when the first one must be specified? – timmey Jan 05 '21 at 15:46
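One possible way to handle the case raised in these comments, sketched with a hypothetical non-commutative `subtract` function so that the argument order is visible: since apply always passes the element as the first positional argument, the first parameter can be fixed by wrapping the function in a lambda or a `functools.partial`.

```python
import functools
import pandas as pd

# Hypothetical function; subtraction makes the argument order visible.
def subtract(i1, i2):
    return i1 - i2

s = pd.Series([1, 2, 3, 4])

# Default behaviour: the element goes to i1, the keyword fills i2.
element_first = s.apply(subtract, i2=10)

# Fix i1 and let the element fill i2 instead.
fixed_with_lambda = s.apply(lambda v: subtract(i1=10, i2=v))
fixed_with_partial = s.apply(functools.partial(subtract, 10))

print(element_first.tolist())       # [-9, -8, -7, -6]
print(fixed_with_lambda.tolist())   # [9, 8, 7, 6]
print(fixed_with_partial.tolist())  # [9, 8, 7, 6]
```

Note that `functools.partial(subtract, i1=10)` would not work here: apply still passes the element positionally, so `i1` would receive two values and Python raises a TypeError.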
53
Series.apply(func, convert_dtype=True, args=(), **kwds)

args : tuple

x = my_series.apply(my_function, args = (arg1,))
Alejandro Alcalde
dani_g
  • 12
    Thanks! Can you explain why args = (arg1,) needs a comma after the first argument? – DrMisha May 05 '15 at 18:19
  • 24
    @MishaTeplitskiy, you need the comma in order for Python to understand the parentheses' contents to be a tuple of length 1. – prooffreader May 18 '15 at 21:10
  • 5
    What about putting in args for the `func`. So if I wish to apply `pd.Series.mean(axis=1)` how do I put in the `axis=1`? – Little Bobby Tables Apr 07 '16 at 10:57
  • 1
    As a side note, you can also add a keyword argument without using the `args` parameter (e.g.: x = my_series.apply(my_function, keyword_arg=arg1), where `keyword_arg` is among the input parameters of my_function) – lev Apr 08 '16 at 08:15
  • 1
    this response is too short and doesn't explain anything – FistOfFury Apr 17 '17 at 22:08
  • 1
    @DrMisha in my opinion it is *significantly* more explicit to write `args = tuple(arg1)` instead of `args=(arg1,)` – Connor May 24 '19 at 17:41
  • @prooffreader The question that follows then is why we need a tuple in the first place? – Lobstw Sep 16 '19 at 16:10
  • Because functions can take more than one positional argument. Behind the scenes it's expanding *args, which is tuple expansion. So instead of adding in all the extra code to convert lists to tuples and convert single values to tuples of size 1, they just require tuples to be consistent. It's a design choice. – prooffreader Sep 17 '19 at 17:15
  • @Connor, No, `tuple(arg1)` is not equivalent to `(arg1,)`! If `arg1=='abc'`, then `tuple(arg1)` becomes `('a', 'b', 'c')`. What you want is `tuple([arg1])`, which is equal to `('abc',)` – wisbucky Aug 26 '22 at 18:29
  • 1
    @wisbucky ah right, good catch. Also python is dumb – Connor Aug 27 '22 at 02:03
41

You can pass any number of arguments to the function that apply is calling through either unnamed arguments, passed as a tuple to the args parameter, or through other keyword arguments internally captured as a dictionary by the kwds parameter.

For instance, let's build a function that returns True for values between 3 and 6, and False otherwise.

s = pd.Series(np.random.randint(0,10, 10))
s

0    5
1    3
2    1
3    1
4    6
5    0
6    3
7    4
8    9
9    6
dtype: int64

s.apply(lambda x: x >= 3 and x <= 6)

0     True
1     True
2    False
3    False
4     True
5    False
6     True
7     True
8    False
9     True
dtype: bool

This anonymous function isn't very flexible. Let's create a normal function with two arguments to control the min and max values we want in our Series.

def between(x, low, high):
    return x >= low and x <= high

We can replicate the output of the first function by passing unnamed arguments to args:

s.apply(between, args=(3,6))

Or we can use the named arguments

s.apply(between, low=3, high=6)

Or even a combination of both

s.apply(between, args=(3,), high=6)
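The three call styles can be checked against each other with a fixed Series (the values below are the ones printed earlier in this answer, hard-coded so the run is reproducible). As a side note, and assuming only an inclusive range check is needed, pandas also ships a vectorized `Series.between` that avoids apply altogether:

```python
import pandas as pd

def between(x, low, high):
    return low <= x <= high

# Same values as the random Series shown above, fixed for reproducibility.
s = pd.Series([5, 3, 1, 1, 6, 0, 3, 4, 9, 6])

by_position = s.apply(between, args=(3, 6))
by_keyword = s.apply(between, low=3, high=6)
mixed = s.apply(between, args=(3,), high=6)

# Built-in vectorized alternative (inclusive on both ends by default).
vectorized = s.between(3, 6)

print(by_position.tolist())
# [True, True, False, False, True, False, True, True, False, True]
```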
NelsonGon
Ted Petrou
3
# sample dataframe
import pandas as pd

df1 = pd.DataFrame({'a': [3, 4, 7], 'b': [4, 2, 2]})

# my function
def add_some(p, q, r):
    return p + q + r

df2 = df1[["a", "b"]].apply(add_some, args=(3, 2))
print(df2)

    a  b
0   8  9
1   9  7
2  12  7

mneumann
1

You just need to add a comma after the argument so that args is a tuple; the function then receives the whole list as a single argument. An example is given below. The same procedure works with a set.

import pandas as pd

df = {"name": [2, 3, 4, 6],
      "age": [4, 10, 30, 20]}

print("Before")
df = pd.DataFrame(df)

print(df)

def fun(a, b):
    for c in b:
        a +=c
    return a

listt = set([3,4,5])

print("After")
new = df.apply(fun, args = (listt,))
print(new)

Result

Faisal shahzad
0

Most of this is covered in the other answers, but one point is easy to miss: you need to add a comma after the argument in the args tuple. See the example below:

df['some_column'].apply(function_name, args=(arg1,))  # The trailing comma is necessary.
YoungSheldon