
Consider the following pandas dataframe:

import pandas as pd

df = pd.DataFrame({'t': [1, 2, 3], 'x1': [4, 5, 6], 'x2': [7, 8, 9]})

>>> print(df)
   t  x1  x2
0  1   4   7
1  2   5   8
2  3   6   9

I would like to apply a function (say, multiplying by 2) to the columns whose names contain the character 'x'.

This can be done by:

df.filter(regex='x').apply(lambda c: 2*c)

but not in place. My solution is:

tmp = df.filter(regex='x')
tmp = tmp.apply(lambda c: 2*c)
tmp['t'] = df['t']
df = tmp

which has the added problem of changing the order of the columns. Is there a better way?

Ashish Ranjan
rhz
  • I just up voted your question... you now have enough rep to vote yourself. Feel free to up vote the answer you accepted. – piRSquared Apr 14 '17 at 01:56

2 Answers


If I understand correctly, you can do something like this (it returns a new DataFrame, so assign the result back to df to keep it):

In [239]: df.apply(lambda x: x*2 if 'x' in x.name else x)
Out[239]:
   t  x1  x2
0  1   8  14
1  2  10  16
2  3  12  18
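To make the point about assignment concrete, here is a minimal sketch using the question's data. The names `doubled`, `original_x1` and `doubled_x1` are illustrative only; the idea is just that `apply` hands back a new DataFrame and leaves the original alone until the name is rebound.

```python
import pandas as pd

df = pd.DataFrame({'t': [1, 2, 3], 'x1': [4, 5, 6], 'x2': [7, 8, 9]})

# apply returns a new DataFrame; df itself is not modified by this call
doubled = df.apply(lambda col: col * 2 if 'x' in col.name else col)

original_x1 = df['x1'].tolist()      # values still as in the question
doubled_x1 = doubled['x1'].tolist()  # doubled values

# rebind the name to keep the result; the column order is unchanged
df = doubled
```

Unlike the filter-based workaround in the question, this keeps 't' as the first column.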

UPDATE (to also rename the matching columns):

In [258]: df.apply(lambda x: x*2 if 'x' in x.name else x) \
            .rename(columns=lambda x: 'ytext_{}_moretext'.format(x[-1]) if 'x' in x else x)
Out[258]:
   t  ytext_1_moretext  ytext_2_moretext
0  1                 8                14
1  2                10                16
2  3                12                18
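The renaming can also be written with a regular-expression substitution, as the comment below suggests. This is only a sketch: `rename_x` is a hypothetical helper, and the pattern assumes the column names have the form 'x' followed by digits. One possible advantage over taking `x[-1]` is that multi-digit suffixes such as 'x10' are carried over whole.

```python
import re
import pandas as pd

df = pd.DataFrame({'t': [1, 2, 3], 'x1': [4, 5, 6], 'x2': [7, 8, 9]})

def rename_x(col):
    # hypothetical helper: 'x<digits>' becomes 'ytext_<digits>_moretext';
    # names that do not match the pattern are returned unchanged
    return re.sub(r'^x(\d+)$', r'ytext_\1_moretext', col)

renamed = df.rename(columns=rename_x)
new_columns = list(renamed.columns)
```

Like `apply`, `rename` returns a new DataFrame here, so the result has to be assigned to a name to be kept.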
MaxU - stand with Ukraine
  • Great. As a twist to the original problem, I also need to rename those columns containing 'x' such that, for example, 'x1' is renamed as 'ytext_1_moretext' and 'x2' is renamed as 'ytext_2_moretext'. I know how to do this using regular expression substitutions and df.rename(columns=lambda col: re.sub(... Would that be the right way, or can this selective column renaming be easily incorporated into your code? – rhz Apr 13 '17 at 22:47
  • Note that neither of the suggestions is really 'in-place' as requested by the OP. An assignment `df = df.apply(...)` is still required. – normanius Sep 18 '17 at 16:23
  • @MaxU how can I apply your method up to a certain number of rows? – Naveen Apr 25 '18 at 12:45
  • @Naveen, please ask a new question, put there a small reproducible data set and your desired data set... – MaxU - stand with Ukraine Apr 25 '18 at 13:31
  • Unlike @piRSquared's answer, this does not present an in-place solution to the problem as asked by the OP. – Julio Cezar Silva Apr 08 '20 at 23:55

Use df.columns.str.contains('x') to get a boolean mask over the columns, then use it with df.loc to select and update them:

df.loc[:, df.columns.str.contains('x')] *= 2
print(df)

   t  x1  x2
0  1   8  14
1  2  10  16
2  3  12  18
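One way to check that this really updates the existing object is to keep a second name bound to the same DataFrame and look at it afterwards. A minimal sketch, where `alias`, `same_object` and `alias_x2` are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({'t': [1, 2, 3], 'x1': [4, 5, 6], 'x2': [7, 8, 9]})
alias = df  # a second name for the same DataFrame object

# boolean mask over the columns: True where the name contains 'x'
mask = df.columns.str.contains('x')
df.loc[:, mask] *= 2

same_object = alias is df        # no new DataFrame was bound to df
alias_x2 = alias['x2'].tolist()  # the change is visible through alias
```

Note that `str.contains` interprets its pattern as a regular expression by default; for a literal substring containing special characters, passing `regex=False` is the safer choice.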

More generally, for an arbitrary function f:

def f(x):
    return 2 * x

m = df.columns.str.contains('x')
df.loc[:, m] = f(df.loc[:, m])
print(df)

   t  x1  x2
0  1   8  14
1  2  10  16
2  3  12  18

Using apply:

m = df.columns.str.contains('x')
df.loc[:, m] = df.loc[:, m].apply(f)
print(df)

   t  x1  x2
0  1   8  14
1  2  10  16
2  3  12  18
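If this pattern is needed in several places, it could be wrapped in a small function. The sketch below is an assumption-laden illustration, not part of pandas: `apply_to_matching` is a hypothetical helper, it matches on a plain substring rather than a regular expression, and it assigns through `frame[cols]` instead of `df.loc`.

```python
import pandas as pd

def apply_to_matching(frame, substring, func):
    """Hypothetical helper: apply func to each column whose name contains
    substring, storing the result back into frame. Returns the column names
    that were changed."""
    cols = [c for c in frame.columns if substring in str(c)]
    frame[cols] = frame[cols].apply(func)
    return cols

df = pd.DataFrame({'t': [1, 2, 3], 'x1': [4, 5, 6], 'x2': [7, 8, 9]})
changed = apply_to_matching(df, 'x', lambda c: 2 * c)
```

Because the helper mutates the DataFrame passed to it, the caller does not need to reassign `df`, and the column order stays as it was.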
piRSquared