My DataFrame looks like this:

import pandas as pd

a = {'A': {0: 40.1, 1: 40.1, 2: 40.1, 3: 45.45, 4: 41.6, 5: 39.6},
     'B': {0: 41.0, 1: 43.6, 2: 41.65, 3: 47.7, 4: 46.0, 5: 42.95},
     'C': {0: 826.0, 1: 835.0, 2: 815.0, 3: 169.5, 4: 170.0, 5: 165.5},
     'D': {0: 889.0, 1: 837.0, 2: 863.3, 3: 178.8, 4: 172.9, 5: 170.0}}

a = pd.DataFrame(a)

#a
       A      B      C      D
0  40.10  41.00  826.0  889.0
1  40.10  43.60  835.0  837.0
2  40.10  41.65  815.0  863.3
3  45.45  47.70  169.5  178.8
4  41.60  46.00  170.0  172.9
5  39.60  42.95  165.5  170.0

I want to divide columns C and D by 5, but only up to index 2 (inclusive).

With the help of this, I came up with:

a.apply(lambda x: x/5 if 'C' in x.name or 'D' in x.name else x)

As you might expect, this applies to the whole column.

Is there a way to apply it only up to index 2 and modify the DataFrame in place?

Naveen

2 Answers

With the default integer index, select with loc. It slices by label, so the end label 2 is included:

a.loc[:2, ['C','D']] /= 5

Detail:

print (a.loc[:2, ['C','D']])
       C      D
0  826.0  889.0
1  835.0  837.0
2  815.0  863.3
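
As a quick sanity check, here is a minimal sketch of the loc approach on a shortened copy of the question's data. The name df, the trimmed column set, and the before/changed_rows helpers are only for illustration, and it assumes the default RangeIndex, where the label slice :2 includes row 2.

```python
import pandas as pd

# Shortened version of the question's frame (illustrative only)
df = pd.DataFrame({
    "A": [40.1, 40.1, 40.1, 45.45, 41.6, 39.6],
    "C": [826.0, 835.0, 815.0, 169.5, 170.0, 165.5],
    "D": [889.0, 837.0, 863.3, 178.8, 172.9, 170.0],
})
before = df.copy()

# Label-based slice: rows 0, 1 and 2 (end label included), columns C and D only
df.loc[:2, ["C", "D"]] /= 5

# Collect the index labels whose C value actually changed
changed_rows = df.index[df["C"] != before["C"]].tolist()
print(changed_rows)
```

Only rows 0 to 2 should show up in changed_rows, and column A should be untouched.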

For a general solution that works with any index (e.g. a DatetimeIndex), use get_indexer to turn the column names into integer positions, then select by position with iloc:

a.iloc[:3, a.columns.get_indexer(['C','D'])] /= 5
print (a)
       A      B      C       D
0  40.10  41.00  165.2  177.80
1  40.10  43.60  167.0  167.40
2  40.10  41.65  163.0  172.66
3  45.45  47.70  169.5  178.80
4  41.60  46.00  170.0  172.90
5  39.60  42.95  165.5  170.00

Detail:

print (a.iloc[:3, a.columns.get_indexer(['C','D'])])
       C      D
0  826.0  889.0
1  835.0  837.0
2  815.0  863.3
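
To illustrate why the positional version is the more general one, here is a small sketch with a DatetimeIndex. The dates and the names df, idx and col_pos are made up for the example; the question's data has a plain integer index.

```python
import pandas as pd

# Hypothetical dates; any non-integer index would do
idx = pd.date_range("2021-01-01", periods=6, freq="D")
df = pd.DataFrame({
    "A": [40.1, 40.1, 40.1, 45.45, 41.6, 39.6],
    "C": [826.0, 835.0, 815.0, 169.5, 170.0, 165.5],
    "D": [889.0, 837.0, 863.3, 178.8, 172.9, 170.0],
}, index=idx)

# Integer positions of the columns to change
col_pos = df.columns.get_indexer(["C", "D"])

# Purely positional: first three rows, whatever their labels are
df.iloc[:3, col_pos] /= 5
print(df)
```

Here a.loc[:2, ...] would not be usable, since 2 is not a label in a DatetimeIndex, while iloc[:3] still means "the first three rows".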
jezrael

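IIUC, to divide only columns C and D up to (and including) index 2, you can do:

a.iloc[:3][["C", "D"]] /= 5

Note that this is chained indexing, so whether a itself is modified depends on the pandas version: it can raise SettingWithCopyWarning, and with Copy-on-Write enabled it leaves a unchanged. A single indexing step such as a.loc[:2, ["C", "D"]] /= 5 avoids the problem. When the assignment does take effect, the result is: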

       A      B      C       D
0  40.10  41.00  165.2  177.80
1  40.10  43.60  167.0  167.40
2  40.10  41.65  163.0  172.66
3  45.45  47.70  169.5  178.80
4  41.60  46.00  170.0  172.90
5  39.60  42.95  165.5  170.00
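
If you are unsure how your pandas version treats the chained form, one way to check is the sketch below. It runs the chained assignment and a single-step loc assignment on separate copies and reports whether the chained one changed anything. The names chained, single_step and chained_took_effect are hypothetical, and no particular outcome is assumed for the chained case.

```python
import warnings
import pandas as pd

df = pd.DataFrame({
    "A": [40.1, 40.1, 40.1, 45.45],
    "C": [826.0, 835.0, 815.0, 169.5],
    "D": [889.0, 837.0, 863.3, 178.8],
})

# Chained indexing: may or may not write through to `chained`
chained = df.copy()
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence chained-assignment warnings for this check
    chained.iloc[:3][["C", "D"]] /= 5
chained_took_effect = bool(chained.loc[0, "C"] != df.loc[0, "C"])

# Single indexing step: writes to `single_step` directly
single_step = df.copy()
single_step.loc[:2, ["C", "D"]] /= 5

print("chained form modified the frame:", chained_took_effect)
print(single_step)
```

Whatever the first print reports, the single-step version is the one to rely on.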

The above method is faster than using apply, but here is how to modify your existing code to get the same result:

a.iloc[:3] = a.iloc[:3].apply(lambda x: x/5 if x.name in {"C", "D"} else x)

The difference is that this runs the apply only on a slice of the DataFrame, and assigns the output back to the same slice.

Notice that we slice [:3] because the end index is not included in the slice. More on understanding python's slice notation.
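
A small sketch of the endpoint rules, using made-up data (letters and s are illustrative names): plain Python slices and iloc exclude the end position, while loc slices by label and includes the end label.

```python
import pandas as pd

letters = ["a", "b", "c", "d", "e"]
first_three = letters[:3]      # positions 0, 1, 2; position 3 is excluded

s = pd.Series([10, 20, 30, 40, 50])
by_position = s.iloc[:3]       # positions 0, 1, 2 (end excluded)
by_label = s.loc[:2]           # labels 0, 1, 2 (end included)

# With the default integer index both selections pick the same rows
same = by_position.equals(by_label)
print(first_three, len(by_position), len(by_label), same)
```

This is why iloc[:3] and loc[:2] select the same rows on a default index.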

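Also, you don't have to check both conditions individually: you can use x.name in {...} to test whether x.name is contained in a set. Unlike 'C' in x.name, which is a substring test, this matches the column name exactly. Testing membership in a set is generally faster than in a list (see Python Sets vs Lists), although with only two elements the difference is negligible.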

pault