I have a DataFrame df with several columns, like this:

       col1      col2
0  0.627521  0.026832
1  0.470450  0.319736
2  0.015760  0.484664
3  0.645810  0.733688
4  0.850554  0.506945

I want to apply a function to each of these columns and add the results as additional columns (similar to this question). The new column names should be the original names plus a suffix that is common to all added columns.

I tried the following (highly simplified case):

import pandas as pd
import numpy as np


def do_and_rename(s, s2):

    news = s + s2
    news.name = s.name + "_change"

    return news

df = pd.DataFrame({'col1': np.random.rand(5), 'col2': np.random.rand(5)})

new_df = pd.concat([df, df.apply(lambda x: do_and_rename(x, df.index))], axis=1)

which gives me

       col1      col2      col1      col2
0  0.627521  0.026832  0.627521  0.026832
1  0.470450  0.319736  1.470450  1.319736
2  0.015760  0.484664  2.015760  2.484664
3  0.645810  0.733688  3.645810  3.733688
4  0.850554  0.506945  4.850554  4.506945

The calculations are correct but the column names are wrong.

My desired output would be

       col1      col2  col1_change  col2_change
0  0.627521  0.026832     0.627521     0.026832
1  0.470450  0.319736     1.470450     1.319736
2  0.015760  0.484664     2.015760     2.484664
3  0.645810  0.733688     3.645810     3.733688
4  0.850554  0.506945     4.850554     4.506945

If I just do

do_and_rename(df['col1'], df.index)

I get

0    0.627521
1    1.470450
2    2.015760
3    3.645810
4    4.850554
Name: col1_change, dtype: float64

with the correct name. How can I use these returned names as column headers?

Cleb
  • @Zero: Sure, added it. – Cleb Oct 05 '17 at 12:20
  • Ideally, why not use `df.join(df.add(df.index.values, axis=0).add_suffix('_change'))`? – Zero Oct 05 '17 at 12:24
  • @Zero: that's always the issue with minimal examples. In my actual case I don't want to do just an addition but apply a more complicated function. – Cleb Oct 05 '17 at 12:49
  • Feel free to add it as a solution; I will be happy to upvote it if it works fine :) – Cleb Oct 05 '17 at 12:51

3 Answers

This works for me:

new_df = pd.concat([df] + [do_and_rename(df[x], df.index) for x in df], axis=1)
print(new_df)
       col1      col2  col1_change  col2_change
0  0.364028  0.694481     0.364028     0.694481
1  0.457195  0.813740     1.457195     1.813740
2  0.286694  0.133999     2.286694     2.133999
3  0.130283  0.398216     3.130283     3.398216
4  0.694586  0.936815     4.694586     4.936815
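
A self-contained sketch of the same idea, for anyone who wants to run it end to end. As far as I can tell, the reason this works where df.apply did not is that pd.concat with axis=1 takes each column label from the Series' name, while df.apply labels its result with the original column names and ignores the name set inside the function. The helper is the one from the question; the final check is only there for illustration.

```python
import numpy as np
import pandas as pd


def do_and_rename(s, s2):
    # Same helper as in the question: add s2 and set the suffixed name.
    news = s + s2
    news.name = f"{s.name}_change"
    return news


df = pd.DataFrame({"col1": np.random.rand(5), "col2": np.random.rand(5)})

# Call the helper once per column; each result is a Series carrying its new name.
changed = [do_and_rename(df[col], df.index) for col in df.columns]

# concat along axis=1 uses those Series names as the new column labels.
new_df = pd.concat([df, *changed], axis=1)

print(new_df.columns.tolist())
```

The original df is left untouched, since pd.concat returns a new DataFrame.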
jezrael

If you don't want to make a new DataFrame, you can just do this:

for col in df:
    df[col + '_change'] = df[col] + df.index
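
If the in-place loop is not what you want, a possible variant builds the new columns in a dict and passes them to df.assign, which returns a copy. This is only a sketch: transform is a hypothetical stand-in for whatever per-column function you actually need, and it simply adds the index here to match the example above.

```python
import numpy as np
import pandas as pd


def transform(s, offsets):
    # Hypothetical placeholder for the real per-column function.
    return s + offsets


df = pd.DataFrame({"col1": np.random.rand(5), "col2": np.random.rand(5)})

# Build all suffixed columns first, keyed by their new names.
new_columns = {
    f"{col}_change": transform(df[col], df.index.to_numpy())
    for col in df.columns
}

# assign returns a new DataFrame and leaves df as it was.
new_df = df.assign(**new_columns)

print(new_df.columns.tolist())
```

Building the dict before assigning also avoids adding columns to df while looping over it.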
Evan Nowak

You could use the pattern df.join(your_func(df, args ...).add_suffix('_change')), where your_func returns your modified DataFrame.

In [1459]: def your_func(df, s):
      ...:     dff = df.add(s, axis=0)
      ...:     return dff
      ...:

In [1460]: df.join(your_func(df, df.index.values).add_suffix('_change'))
Out[1460]:
       col1      col2  col1_change  col2_change
0  0.627521  0.026832     0.627521     0.026832
1  0.470450  0.319736     1.470450     1.319736
2  0.015760  0.484664     2.015760     2.484664
3  0.645810  0.733688     3.645810     3.733688
4  0.850554  0.506945     4.850554     4.506945

In [1461]: df
Out[1461]:
       col1      col2
0  0.627521  0.026832
1  0.470450  0.319736
2  0.015760  0.484664
3  0.645810  0.733688
4  0.850554  0.506945
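
For a function that only works column by column, the same join/add_suffix pattern can be combined with df.apply. The sketch below assumes a hypothetical transform (a cumulative sum plus the offsets) purely as an example of something more involved than a plain addition; extra arguments are passed through the args parameter of apply.

```python
import numpy as np
import pandas as pd


def transform(s, offsets):
    # Hypothetical per-column function: cumulative sum plus row offsets.
    return s.cumsum() + offsets


df = pd.DataFrame({"col1": np.random.rand(5), "col2": np.random.rand(5)})

# apply keeps the original column names, so no renaming is needed inside transform.
changed = df.apply(transform, args=(df.index.to_numpy(),))

# Suffix all result columns at once, then join them to the original frame.
new_df = df.join(changed.add_suffix("_change"))

print(new_df.columns.tolist())
```

Renaming after the fact with add_suffix sidesteps the problem from the question, because the name set on the returned Series is never relied on.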
Zero