I have read the docs for DataFrame.apply:

DataFrame.apply(func, axis=0, broadcast=False, raw=False, reduce=None, args=(), **kwds)
Applies function along input axis of DataFrame.

So how can I apply a function to a specific column?

In [1]: import pandas as pd
In [2]: data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
In [3]: df = pd.DataFrame(data)
In [4]: df
Out[4]: 
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9
In [5]: def addOne(v):
...:        v += 1
...:        return v
...: 
In [6]: df.apply(addOne, axis=1)
Out[6]: 
   A  B   C
0  2  5   8
1  3  6   9
2  4  7  10

I want to apply addOne to every value in df['A'] only, not to all columns. How can I do that with DataFrame.apply?

Thanks for help!

GoingMyWay
  • Avoid using `apply` as much as possible. If you're not sure you need to use it, you probably don't. I recommend taking a look at [When should I ever want to use pandas apply() in my code?](https://stackoverflow.com/q/54432583/4909087). – cs95 Jan 30 '19 at 10:22
  • @coldspeed That is nice, good question and answers in depth. – GoingMyWay Jan 30 '19 at 11:34

4 Answers

The answer is:

df['A'] = df['A'].map(addOne)

It is also worth learning the difference between map, applymap, and apply.
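As a rough illustration of that difference, here is a minimal sketch using the DataFrame from the question. The name `add_one` is a non-mutating stand-in for `addOne`, introduced here for illustration. `applymap` is described only in a comment, because newer pandas releases rename it to `DataFrame.map`.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

def add_one(v):
    # Stand-in for addOne from the question.
    return v + 1

# Series.map: calls add_one once per element of one column.
mapped = df['A'].map(add_one)

# Series.apply: for a plain Python function like this one, the result matches map.
applied = df['A'].apply(add_one)

# DataFrame.apply: calls add_one once per column (axis=0 by default),
# passing a whole Series each time, so every column changes.
whole_frame = df.apply(add_one)

# DataFrame.applymap (not called here) is element-wise over the whole frame.
```

For a single column, `map` or `Series.apply` is therefore the more direct tool; `DataFrame.apply` operates on entire rows or columns.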

If you insist on using apply, you could try the following:

def addOne(v):
    v['A'] += 1
    return v

# apply returns a new DataFrame, so assign the result to keep it
df = df.apply(addOne, axis=1)
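The row-wise approach can also be written so that the function does not modify the row object pandas passes in. The following is a sketch under that assumption; `add_one_to_a` is a hypothetical name, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

def add_one_to_a(row):
    # Work on a copy so the row handed in by pandas is left untouched.
    row = row.copy()
    row['A'] += 1
    return row

# axis=1 passes each row as a Series; the returned rows form a new DataFrame.
result = df.apply(add_one_to_a, axis=1)
```

Row-wise `apply` calls the function once per row, so for large frames a column-level operation is usually the cheaper choice.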
su79eu7k

One simple way would be:

df['A'] = df['A'].apply(lambda x: x+1)
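For simple arithmetic like this, plain column arithmetic gives the same result without calling a Python function per element, which is in line with the comment on the question about avoiding `apply`. A minimal sketch with the question's data:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Vectorized addition on column A only; B and C are not touched.
df['A'] = df['A'] + 1
```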
gustafbstrom
Felix Feng
  • I did your suggestion by doing: df['A'] = df['A'].apply(lambda x: datetime.fromtimestamp(float(x)/1000.)) and I got: "A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead. " Any suggestions? – Catarina Nogueira Apr 10 '20 at 11:42
  • @Catarina Nogueira Try adding .copy() at the very end e.g. apply(...).copy() – Nosey Jul 02 '20 at 14:40
  • I don't think this is a good solution. You're mutating the DataFrame while iterating over it. I would first make a copy of the DataFrame. See here: https://pandas.pydata.org/docs/user_guide/gotchas.html#gotchas-udf-mutation – Paul Jul 04 '23 at 13:06
  • @Paul Good suggestion. Making a copy before applying a UDF can avoid some unexpected behavior. – Felix Feng Jul 05 '23 at 02:55
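The "copy of a slice" message quoted in the comments typically appears when the frame being assigned to was itself derived from another DataFrame. One way to sidestep it, sketched below, is to take an explicit copy at the point where the subset is created. The filter `df['B'] > 4` and the name `subset` are hypothetical and only serve to produce such a derived frame.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Hypothetical scenario: `subset` is a filtered view of a larger frame.
# The explicit .copy() makes it an independent DataFrame.
subset = df[df['B'] > 4].copy()

# Assigning to a column of the copy changes `subset` only.
subset['A'] = subset['A'].apply(lambda x: x + 1)
```

With the copy in place, the original `df` keeps its values and the assignment is unambiguous.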
For anyone else looking for a solution that allows for piping:

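identity = lambda x: x
add_one = lambda x: x + 1  # assumed helper, equivalent to addOne in the question
add_two = lambda x: x + 2  # assumed helper used in the examples below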

def transform_columns(df, mapper):
    return df.transform(
        {
            **{
                column: identity
                for column in df.columns
            },
            **mapper
        }
    )

# you can monkey-patch it on the pandas DataFrame (but don't have to, see below)
pd.DataFrame.transform_columns = transform_columns

(
    pd.DataFrame(data)
    .rename(columns={'A': 'A1'})   # just to demonstrate the motivation
    .transform_columns({'A1': add_one})
)

This also allows transforming several columns at once:

pd.DataFrame(data).transform_columns({
    'A': add_one,
    'B': add_two,
})

And if you do not want to monkey-patch DataFrame, you can always use it with pipe:

pd.DataFrame(data).pipe(transform_columns, {'A': add_one})

It would be great if this were natively supported by pandas though.

The snippets above are CC0.
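For reference, here is a self-contained sketch of the `pipe` variant that runs end to end. The helpers `add_one` and `add_two` are not defined in the answer, so the definitions below are assumptions chosen to match the question's `addOne`.

```python
import pandas as pd

data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}

def identity(v):
    return v

def add_one(v):
    # Assumed helper, equivalent to addOne in the question.
    return v + 1

def add_two(v):
    # Assumed helper for the two-column example.
    return v + 2

def transform_columns(df, mapper):
    # Map every column to identity, then let `mapper` override selected columns.
    return df.transform({**{column: identity for column in df.columns}, **mapper})

# No monkey-patching: pass the helper through DataFrame.pipe.
result = pd.DataFrame(data).pipe(transform_columns, {'A': add_one, 'B': add_two})
```

Columns that are not named in the mapping pass through unchanged because they are mapped to `identity`.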

krassowski

You can use .apply() with a lambda function to solve this kind of problem.

Suppose your DataFrame looks like this:

A | B | C
----------
1 | 4 | 7
2 | 5 | 8
3 | 6 | 9

The function which you want to apply:

def addOne(v):
    v += 1
    return v

If you write your code like this:

df['A'] = df.apply(lambda x: addOne(x.A), axis=1)

You will get:

A | B | C
----------
2 | 4 | 7
3 | 5 | 8
4 | 6 | 9
Tejas Shah