pandas DataFrame, how to apply function to a specific column?

Question

I have read the docs for DataFrame.apply:

DataFrame.apply(func, axis=0, broadcast=False, raw=False, reduce=None, args=(), **kwds)
Applies function along input axis of DataFrame.

So, how can I apply a function to a specific column?

In [1]: import pandas as pd
In [2]: data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
In [3]: df = pd.DataFrame(data)
In [4]: df
Out[4]: 
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9
In [5]: def addOne(v):
...:        v += 1
...:        return v
...: 
In [6]: df.apply(addOne, axis=1)
Out[6]: 
   A  B   C
0  2  5   8
1  3  6   9
2  4  7  10

I want to apply addOne to every value in df['A'] only, not to all columns. How can I do that with DataFrame.apply?

Thanks for the help!

Avoid using `apply` as much as possible. If you're not sure you need to use it, you probably don't. I recommend taking a look at [When should I ever want to use pandas apply() in my code?](https://stackoverflow.com/q/54432583/4909087). — cs95, Jan 30 '19 at 10:22
@coldspeed That is nice, good question and answers in depth. — GoingMyWay, Jan 30 '19 at 11:34
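
As a minimal sketch of what avoiding apply can look like for this particular question, assuming addOne only ever adds 1, plain column arithmetic gives the same result for column A:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Vectorized arithmetic on one column; no apply and no Python-level loop
df['A'] = df['A'] + 1
```

This only covers functions that can be written as column arithmetic; an arbitrary Python function still needs map or apply.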

su79eu7k · Accepted Answer · 2019-11-21T12:45:05.560

60

The answer is:

df['A'] = df['A'].map(addOne)

It is also worth learning the difference between map, applymap, and apply.
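
A rough sketch of how the three differ, reusing the question's df and a non-mutating addOne. The hasattr fallback is an assumption about the installed version: DataFrame.applymap was renamed to DataFrame.map in pandas 2.1.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

def addOne(v):
    return v + 1

# Series.map: called once per element of a single column
only_a = df.copy()
only_a['A'] = only_a['A'].map(addOne)

# DataFrame.apply: called once per column (axis=0) or per row (axis=1),
# so the function receives a whole Series each time
every_column = df.apply(addOne)

# DataFrame.applymap (DataFrame.map from pandas 2.1): called once per cell
elementwise = df.map if hasattr(df, 'map') else df.applymap
every_cell = elementwise(addOne)
```

Here every_column and every_cell hold the same values because adding 1 works equally on a Series and on a scalar; only the Series.map version leaves B and C untouched.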

If you insist on using apply, you could try the following:

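def addOne(v):
    v = v.copy()  # work on a copy so the row passed in by apply is not mutated
    v['A'] += 1
    return v

# returns a new DataFrame; assign it back to df to keep the result
df.apply(addOne, axis=1)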

edited Nov 21 '19 at 12:45

answered Mar 25 '16 at 03:14 by su79eu7k


Can we apply the same function to both A and B at once? – dondapati Aug 25 '18 at 03:39

@dondapati Sure, you can simply add v['B'] += 1 inside the addOne function. With axis=1, pandas apply passes each row to the function as v. – su79eu7k Aug 25 '18 at 03:46
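
An alternative sketch for the two-column case, assuming addOne is the plain version that returns v + 1: select both columns and let apply run the function on each of them, with no row-wise loop.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

def addOne(v):
    return v + 1

# Select both columns, apply the function column by column, and assign back
df[['A', 'B']] = df[['A', 'B']].apply(addOne)
```

Column C is not part of the selection, so it keeps its original values.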

score 19 · Answer 2 · edited Feb 06 '19 at 09:56

One simple way would be:

df['A'] = df['A'].apply(lambda x: x+1)

edited Feb 06 '19 at 09:56 by gustafbstrom

answered Feb 20 '17 at 08:22 by Felix Feng

I tried your suggestion with df['A'] = df['A'].apply(lambda x: datetime.fromtimestamp(float(x)/1000.)) and got: "A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead." Any suggestions? – Catarina Nogueira Apr 10 '20 at 11:42

@Catarina Nogueira Try adding .copy() at the very end e.g. apply(...).copy() – Nosey Jul 02 '20 at 14:40
I don't think this is a good solution. You're mutating the DataFrame while iterating over it. I would first make a copy of the DataFrame. See here: https://pandas.pydata.org/docs/user_guide/gotchas.html#gotchas-udf-mutation – Paul Jul 04 '23 at 13:06

@Paul Good suggestion. Making a copy before running a UDF can avoid some unexpected behavior. – Felix Feng Jul 05 '23 at 02:55
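
The warning quoted in the comments above typically shows up when the DataFrame being assigned to was itself produced by slicing or filtering another DataFrame. A minimal sketch of one common remedy, assuming that is the cause (the filter below is hypothetical and only there to create a subset): take an explicit copy when the subset is created, then assign to it.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Hypothetical filter for illustration; .copy() makes subset an independent DataFrame
subset = df[df['A'] > 1].copy()

# Assigning to the copy changes subset only and leaves df untouched
subset['A'] = subset['A'].apply(lambda x: x + 1)
```

Because subset is an independent copy, pandas has no ambiguity about which object the assignment should change.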

score 4 · Answer 3 · answered Mar 12 '21 at 12:19

For anyone else looking for a solution that allows for piping:

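identity = lambda x: x

# add_one and add_two are used below but never defined in this answer;
# they are assumed here to be simple element-wise helpers
def add_one(v):
    return v + 1

def add_two(v):
    return v + 2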

def transform_columns(df, mapper):
    return df.transform(
        {
            **{
                column: identity
                for column in df.columns
            },
            **mapper
        }
    )

# you can monkey-patch it on the pandas DataFrame (but don't have to, see below)
pd.DataFrame.transform_columns = transform_columns

(
    pd.DataFrame(data)
    .rename(columns={'A': 'A1'})   # just to demonstrate the motivation
    .transform_columns({'A1': add_one})
)

This also lets you transform several columns at once:

pd.DataFrame(data).transform_columns({
    'A': add_one,
    'B': add_two,
})

And if you do not want to monkey-patch DataFrame, you can always use it with pipe:

pd.DataFrame(data).pipe(transform_columns, {'A': add_one})

It would be great if this were natively supported by pandas, though.

The snippets above are CC0.
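
For comparison, a sketch of a built-in option that also chains: DataFrame.assign accepts a callable that receives the intermediate DataFrame. This is a different technique from transform_columns above, and add_one is again an assumed helper.

```python
import pandas as pd

data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}

def add_one(v):  # assumed helper, not defined in the original answer
    return v + 1

result = (
    pd.DataFrame(data)
    .rename(columns={'A': 'A1'})
    # the lambda receives the renamed frame, so 'A1' already exists here
    .assign(A1=lambda d: d['A1'].map(add_one))
)
```

The trade-off is that assign takes column names as keyword arguments, so names that are not valid Python identifiers have to be passed by unpacking a dict.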

score 2 · Answer 4 · answered Sep 25 '20 at 12:02

You can use .apply() with a lambda function to solve this kind of problem.

Suppose your DataFrame looks like this:

A | B | C
----------
1 | 4 | 7
2 | 5 | 8
3 | 6 | 9

The function which you want to apply:

def addOne(v):
    v += 1
    return v

If you write your code like this:

df['A'] = df.apply(lambda x: addOne(x.A), axis=1)

You will get:

A | B | C
----------
2 | 4 | 7
3 | 5 | 8
4 | 6 | 9
