Applying a lambda to a DataFrame, but only up to a certain number of rows

Question

My DataFrame looks like this:

import pandas as pd

a = {'A': {0: 40.1, 1: 40.1, 2: 40.1, 3: 45.45, 4: 41.6, 5: 39.6},
     'B': {0: 41.0, 1: 43.6, 2: 41.65, 3: 47.7, 4: 46.0, 5: 42.95},
     'C': {0: 826.0, 1: 835.0, 2: 815.0, 3: 169.5, 4: 170.0, 5: 165.5},
     'D': {0: 889.0, 1: 837.0, 2: 863.3, 3: 178.8, 4: 172.9, 5: 170.0}}

a = pd.DataFrame(a)

#a
       A      B      C      D
0  40.10  41.00  826.0  889.0
1  40.10  43.60  835.0  837.0
2  40.10  41.65  815.0  863.3
3  45.45  47.70  169.5  178.8
4  41.60  46.00  170.0  172.9
5  39.60  42.95  165.5  170.0

I want to divide columns C and D by 5, but only up to the 2nd index.

With the help of this, I came up with:

a.apply(lambda x: x/5 if 'C' in x.name or 'D' in x.name else x)

As you might expect, it is applied to the whole column.

Is there any way to apply it only up to the 2nd index and keep the result in place?

You mean up to and including index 2? – pault Apr 25 '18 at 15:40
Yep, up to and including the 2nd index – Naveen Apr 25 '18 at 15:51

jezrael · Answer 1 · 2018-04-25T15:26:28.973

With the default index, use loc to select the rows and columns (a loc slice includes its end label):

a.loc[:2, ['C','D']] /= 5

Detail:

print (a.loc[:2, ['C','D']])
       C      D
0  826.0  889.0
1  835.0  837.0
2  815.0  863.3
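
As a minimal sketch of why `:2` is enough with loc, the snippet below rebuilds the question's frame and compares a label slice with the matching positional slice. The variable names `by_label` and `by_position` are illustrative only.

```python
import pandas as pd

a = pd.DataFrame({
    "A": [40.1, 40.1, 40.1, 45.45, 41.6, 39.6],
    "B": [41.0, 43.6, 41.65, 47.7, 46.0, 42.95],
    "C": [826.0, 835.0, 815.0, 169.5, 170.0, 165.5],
    "D": [889.0, 837.0, 863.3, 178.8, 172.9, 170.0],
})

# .loc slices by label and includes the end label, so :2 selects rows 0, 1 and 2.
by_label = a.loc[:2, ["C", "D"]]

# .iloc slices by position and excludes the end position, so :3 is needed for the same rows.
by_position = a.iloc[:3, a.columns.get_indexer(["C", "D"])]

print(by_label.equals(by_position))  # True
```

This equivalence only holds because the index is the default 0..5 range, where labels and positions coincide.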

A general solution for any index (e.g. a DatetimeIndex) is to use get_indexer to convert the column names to positions and iloc to select:

a.iloc[:3, a.columns.get_indexer(['C','D'])] /= 5
print (a)
       A      B      C       D
0  40.10  41.00  165.2  177.80
1  40.10  43.60  167.0  167.40
2  40.10  41.65  163.0  172.66
3  45.45  47.70  169.5  178.80
4  41.60  46.00  170.0  172.90
5  39.60  42.95  165.5  170.00

Detail:

print (a.iloc[:3, a.columns.get_indexer(['C','D'])])
       C      D
0  826.0  889.0
1  835.0  837.0
2  815.0  863.3
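
To illustrate the DatetimeIndex case mentioned above, here is a small sketch. The dates and the frame name `df` are made up for the example; only the C and D values come from the question.

```python
import pandas as pd

# Hypothetical frame: the question's C and D values, indexed by dates instead of 0..5.
idx = pd.date_range("2018-04-20", periods=6, freq="D")
df = pd.DataFrame({
    "C": [826.0, 835.0, 815.0, 169.5, 170.0, 165.5],
    "D": [889.0, 837.0, 863.3, 178.8, 172.9, 170.0],
}, index=idx)

# Translate the column names into integer positions for iloc.
cols = df.columns.get_indexer(["C", "D"])

# The first three rows by position, whatever their labels are.
df.iloc[:3, cols] /= 5

print(df)
```

One thing to watch for: get_indexer returns -1 for a name that is not in the columns, and iloc would treat -1 as "last column", so it may be worth checking the result for -1 before using it.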

Maybe I misunderstood but isn't OP asking for `a.iloc[:2][["C", "D"]] /= 5`? *"I want to divide columns C and D by 5 but only upto 2nd index"* – pault Apr 25 '18 at 15:20
Yep, only up to the 2nd index – Naveen Apr 25 '18 at 15:21

pault · Accepted Answer · 2018-04-25T15:40:03.673

IIUC, to divide only columns C and D up to (and including) index 2, you can do:

a.iloc[:3][["C", "D"]] /= 5

Which results in:

       A      B      C       D
0  40.10  41.00  165.2  177.80
1  40.10  43.60  167.0  167.40
2  40.10  41.65  163.0  172.66
3  45.45  47.70  169.5  178.80
4  41.60  46.00  170.0  172.90
5  39.60  42.95  165.5  170.00
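
Note that `a.iloc[:3][["C", "D"]]` is chained indexing: depending on the pandas version it may raise a SettingWithCopyWarning or leave `a` unchanged, and with Copy-on-Write enabled it does not modify `a`. A hedged alternative that keeps the positional row slice but performs the selection in a single indexing call, assuming the index labels are unique, could look like this:

```python
import pandas as pd

a = pd.DataFrame({
    "A": [40.1, 40.1, 40.1, 45.45, 41.6, 39.6],
    "B": [41.0, 43.6, 41.65, 47.7, 46.0, 42.95],
    "C": [826.0, 835.0, 815.0, 169.5, 170.0, 165.5],
    "D": [889.0, 837.0, 863.3, 178.8, 172.9, 170.0],
})

# Take the first three row labels by position, then select and assign
# in one .loc call so the write goes to `a` itself (assumes unique labels).
a.loc[a.index[:3], ["C", "D"]] /= 5

print(a)
```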

The above method is faster than using apply, but here is how to modify your existing code to get the same result:

a.iloc[:3] = a.iloc[:3].apply(lambda x: x/5 if x.name in {"C", "D"} else x)

The difference is that this runs the apply only on a slice of the DataFrame, and assigns the output back to the same slice.
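
As a quick sanity check, the sketch below runs the apply-on-a-slice approach and a vectorized division on separate copies of the question's frame and confirms they agree. The names `original`, `by_apply` and `vectorized` are illustrative.

```python
import pandas as pd

original = pd.DataFrame({
    "A": [40.1, 40.1, 40.1, 45.45, 41.6, 39.6],
    "B": [41.0, 43.6, 41.65, 47.7, 46.0, 42.95],
    "C": [826.0, 835.0, 815.0, 169.5, 170.0, 165.5],
    "D": [889.0, 837.0, 863.3, 178.8, 172.9, 170.0],
})

# Approach 1: apply on the first three rows, assigned back to the same slice.
by_apply = original.copy()
by_apply.iloc[:3] = by_apply.iloc[:3].apply(
    lambda col: col / 5 if col.name in {"C", "D"} else col
)

# Approach 2: vectorized division on the same rows and columns.
vectorized = original.copy()
vectorized.loc[:2, ["C", "D"]] /= 5

# Raises an AssertionError if the two results differ.
pd.testing.assert_frame_equal(by_apply, vectorized)
```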

Notice that we slice [:3] because the end index is not included in the slice. See more on understanding Python's slice notation.
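
The same rule can be seen on a plain list standing in for the row positions (the list name `labels` is just for illustration):

```python
labels = [0, 1, 2, 3, 4, 5]

# The stop value of a slice is excluded.
print(labels[:3])  # [0, 1, 2]
print(labels[:2])  # [0, 1]
```

So a positional slice of [:2] would stop before index 2, which is why [:3] is used here.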

Also, you don't have to check both conditions individually: you can use x.name in {..} to check whether x.name is contained in a set. Using a set to test membership is generally faster than using a list (see Python Sets vs Lists), although with only two names the difference is negligible.
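
There is also a correctness angle: for string column names, `'C' in x.name` is a substring test, while `x.name in {"C", "D"}` is an exact match. A small sketch, using a hypothetical extra column called "Cost" that is not in the question's data:

```python
import pandas as pd

# Hypothetical frame with an extra column whose name merely contains "C".
df = pd.DataFrame({"C": [10.0], "D": [20.0], "Cost": [30.0]})

# Substring test, as in the question's lambda.
substring = df.apply(lambda x: x / 5 if "C" in x.name or "D" in x.name else x)

# Exact-name test against a set.
exact = df.apply(lambda x: x / 5 if x.name in {"C", "D"} else x)

print(substring["Cost"].iloc[0])  # 6.0, "Cost" was divided as well
print(exact["Cost"].iloc[0])      # 30.0, left untouched
```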
