Using pandas.DataFrame.apply to look up and replace values with values from a different DataFrame

Question

I have two pandas DataFrames with the same DateTime index.

The first one is J:

            A     B     C
01/01/10    100   400   200
01/02/10    300   200   400
01/03/10    200   100   300

The second one is K:

             100    200    300    400
01/01/10     0.05  -0.42   0.61  -0.12
01/02/10    -0.23   0.11   0.82   0.34
01/03/10    -0.55   0.24  -0.01  -0.73

I would like to use J to reference K and create a third DataFrame L that looks like:

             A      B      C
01/01/10     0.05  -0.12  -0.42
01/02/10     0.82   0.11   0.34
01/03/10     0.24  -0.55  -0.01

To do so, I need to take each value in J and look up the value in K, on the same date, whose column name matches that value.

I tried to do:

L = J.apply( lambda x: K.loc[ x.index, x ], axis='index'  )

but get:

ValueError: If using all scalar values, you must pass an index

Ideally, any NaN values in J would remain as they are and would not be looked up in K. I tried this, without success:

L = J.apply( lambda x: np.nan if np.isnan( x.astype( float ) ) else K.loc[ x.index, x ]  )

Your dataframes J and K have the same amount of rows and columns? — Erfan, Jan 27 '20 at 21:37
They share the same index, so will have the same amount of rows. However, J will have fewer columns than K. — SamC24, Jan 27 '20 at 22:27
Please represent that in your example data. Now they both have the same amount of columns. — Erfan, Jan 27 '20 at 22:38

ansev · Accepted Answer · 2020-01-27T23:44:48.060

Use DataFrame.melt on J and DataFrame.stack on K, map the new values with DataFrame.join, then return the result to its original shape with DataFrame.pivot:

# if necessary, convert K's column labels to integers
# K = K.rename(columns = int)
L = (J.reset_index()
      .melt('index')
      .join(K.stack().rename('new_values'),on = ['index','value'])
      .pivot(index = 'index',
             columns='variable',
             values = 'new_values')
      .rename_axis(columns = None,index = None)
    )
print(L)

Or with DataFrame.lookup (deprecated in pandas 1.2.0 and removed in pandas 2.0):

L = J.reset_index().melt('index')
L['value'] = K.lookup(L['index'],L['value'])
L = L.pivot(index = 'index', columns = 'variable', values = 'value').rename_axis(columns = None,index = None)
print(L)

Output

             A     B     C
01/01/10  0.05 -0.12 -0.42
01/02/10  0.82  0.11  0.34
01/03/10  0.24 -0.55 -0.01
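
Since DataFrame.lookup is not available in pandas 2.0 and later, the same result can probably be reached with plain NumPy indexing. The following is only a minimal sketch and not part of the original answer: the helper name lookup_by_value is hypothetical, the sample frames are rebuilt from the question with the dates read as month/day/year, K's column labels are assumed to be integers, and J and K are assumed to share a unique index. It also leaves NaN entries of J as NaN, which the question asked for.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (dates assumed to be month/day/year).
idx = pd.to_datetime(["2010-01-01", "2010-01-02", "2010-01-03"])
J = pd.DataFrame({"A": [100, 300, 200],
                  "B": [400, 200, 100],
                  "C": [200, 400, 300]}, index=idx)
K = pd.DataFrame([[0.05, -0.42, 0.61, -0.12],
                  [-0.23, 0.11, 0.82, 0.34],
                  [-0.55, 0.24, -0.01, -0.73]],
                 index=idx, columns=[100, 200, 300, 400])  # integer labels assumed


def lookup_by_value(J, K):
    """Hypothetical helper: for each cell of J, take the value of K in the
    same row whose column label equals that cell. NaN cells stay NaN."""
    K = K.reindex(J.index)  # align K's rows to J's (assumes a unique index)
    # Column position in K for every cell of J; -1 marks NaN or unknown labels.
    col_pos = K.columns.get_indexer(J.to_numpy().ravel()).reshape(J.shape)
    row_pos = np.arange(len(J))[:, None]  # one row position per row of J
    picked = K.to_numpy(dtype=float)[row_pos, col_pos]
    # Cells with no matching column picked an arbitrary value; mask them out.
    picked = np.where(col_pos == -1, np.nan, picked)
    return pd.DataFrame(picked, index=J.index, columns=J.columns)


L = lookup_by_value(J, K)
print(L)
```

Because the row and column positions are broadcast against each other, the lookup is pointwise (one value per cell of J) rather than the cross product of labels that K.loc[rows, cols] would return.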

I think apply could be a good option, but I'm not sure. I recommend reading "When should I want to use apply in my code".

Erfan · Answer 2 · 2020-01-27T23:18:55.587


Use DataFrame.apply with DataFrame.lookup for label-based indexing:

# if needed, convert the columns of K to integers
# K.columns = K.columns.astype(int)
L = J.apply(lambda x: K.lookup(x.index, x))

             A     B     C
01/01/10  0.05 -0.12 -0.42
01/02/10  0.82  0.11  0.34
01/03/10  0.24 -0.55 -0.01
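
On pandas 2.0 and later, where DataFrame.lookup has been removed, the same apply pattern can be kept with a small stand-in function. This is a hedged sketch rather than the original method: lookup_labels is a hypothetical name, the sample data is a reduced version of the question's frames with one NaN added, and K's column labels are assumed to be integers. Unlike the old lookup, labels that are NaN or missing from K give NaN instead of raising.

```python
import numpy as np
import pandas as pd


def lookup_labels(frame, row_labels, col_labels):
    """Hypothetical stand-in for DataFrame.lookup: pointwise label-based lookup.

    Labels that are NaN or not present in `frame` yield NaN.
    """
    rows = frame.index.get_indexer(row_labels)    # -1 where the row label is unknown
    cols = frame.columns.get_indexer(col_labels)  # -1 where the column label is unknown
    values = frame.to_numpy(dtype=float)[rows, cols]
    return np.where((rows == -1) | (cols == -1), np.nan, values)


# Reduced sample data based on the question, with one NaN placed in J.
idx = pd.to_datetime(["2010-01-01", "2010-01-02", "2010-01-03"])
J = pd.DataFrame({"A": [100, np.nan, 200],
                  "B": [400, 200, 100]}, index=idx)
K = pd.DataFrame([[0.05, -0.42, 0.61, -0.12],
                  [-0.23, 0.11, 0.82, 0.34],
                  [-0.55, 0.24, -0.01, -0.73]],
                 index=idx, columns=[100, 200, 300, 400])  # integer labels assumed

# Each column of J is looked up against K row by row.
L = J.apply(lambda col: pd.Series(lookup_labels(K, col.index, col), index=col.index))
print(L)
```

Wrapping the result in a Series with the column's own index keeps the dates on the output without relying on how apply reassembles raw arrays.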

edited Jan 27 '20 at 23:18

answered Jan 27 '20 at 23:04
