I realize this question is similar to join or merge with overwrite in pandas, but the accepted answer does not work for me since I want to use the on='keys' from df.join().

I have a DataFrame df which looks like this:

    keys    values
0      0  0.088344
1      0  0.088344
2      0  0.088344
3      0  0.088344
4      0  0.088344
5      1  0.560857
6      1  0.560857
7      1  0.560857
8      2  0.978736
9      2  0.978736
10     2  0.978736
11     2  0.978736
12     2  0.978736
13     2  0.978736
14     2  0.978736

Then I have a Series s (the result of a df.groupby().apply()) indexed by the same keys:

keys
0       0.183328
1       0.239322
2       0.574962
Name: new_values, dtype: float64

Basically, I want to replace the 'values' in df with the values in the Series, matched by keys, so that every block of rows sharing a key gets the same new value. Currently, I do it as follows:

df = df.join(s, on='keys')
df['values'] = df['new_values']
df = df.drop('new_values', axis=1)

The obtained (and desired) result is then:

    keys    values
0      0  0.183328
1      0  0.183328
2      0  0.183328
3      0  0.183328
4      0  0.183328
5      1  0.239322
6      1  0.239322
7      1  0.239322
8      2  0.574962
9      2  0.574962
10     2  0.574962
11     2  0.574962
12     2  0.574962
13     2  0.574962
14     2  0.574962

That is, I add it as a new column, and using on='keys' gives it the correct shape. Then I assign new_values to values and remove the new_values column. This works perfectly; the only problem is that I find it extremely ugly.

Is there a better way to do this?

xyzzyqed

1 Answer

You could try something like:

df = df[df.columns[df.columns!='values']].join(s, on='keys')

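Make sure s is named 'values' instead of 'new_values' (for example, by setting s.name = 'values'), so that the joined column takes the place of the one that was dropped.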

To my knowledge, pandas doesn't have the ability to join with "force overwrite" or "overwrite with warning".
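
For illustration, here is a minimal sketch of two ways this could look. The sample data is made up to mirror the question's layout, and the names joined and mapped are only placeholders. The first variant is the join above with the rename done inline; the second skips the join and uses Series.map to look up each key in the index of s, which may or may not suit your case better.

```python
import pandas as pd

# Made-up sample data mirroring the layout in the question.
df = pd.DataFrame({
    "keys": [0, 0, 1, 1, 2],
    "values": [0.088344, 0.088344, 0.560857, 0.560857, 0.978736],
})
s = pd.Series(
    [0.183328, 0.239322, 0.574962],
    index=pd.Index([0, 1, 2], name="keys"),
    name="new_values",
)

# Variant 1: drop the old column, rename the Series inline, then join on 'keys'.
joined = df.drop(columns="values").join(s.rename("values"), on="keys")

# Variant 2: no join at all; map each key to its entry in s.
mapped = df.assign(values=df["keys"].map(s))

print(joined)
print(mapped)
```

In both variants, a key in df that is missing from the index of s ends up as NaN in 'values', so it may be worth checking for that if the keys are not guaranteed to match.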

thetainted1