Is replace applied row-wise, and can it overwrite a value from the dict twice?

Question

Assume I have the following data set:

lst = ['u', 'v', 'w', 'x', 'y']
lst_rev = list(reversed(lst))
dct = dict(zip(lst, lst_rev))

df = pd.DataFrame({'A':['a', 'b', 'a', 'c', 'a'],
                   'B':lst},
                   dtype='category')

Now I want to replace the values of column B in df using dct.

I know I can do

df.B.map(dct).fillna(df.B)

to get the expected output, but when I tried replace (which seems more straightforward to me), it did not give that result.

The output is shown below:

df.B.replace(dct)
Out[132]: 
0    u
1    v
2    w
3    v
4    u
Name: B, dtype: object

This is different from the output of

df.B.map(dct).fillna(df.B)
Out[133]: 
0    y
1    x
2    w
3    v
4    u
Name: B, dtype: object

I can guess what is happening (see below), but why does it happen?

0    u --> change to y then change to u
1    v --> change to x then change to v
2    w
3    v
4    u

Appreciate your help.

piRSquared · Accepted Answer · 2018-09-25T21:29:18.773

6

It's because replace keeps applying the dictionary, including to values it has already replaced:

df.B.replace({'u': 'v', 'v': 'w', 'w': 'x', 'x': 'y', 'y': 'Hello'})

0    Hello
1    Hello
2    Hello
3    Hello
4    Hello
Name: B, dtype: object

With the given dct, 'u' -> 'y' and then 'y' -> 'u'.
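
One way to picture this is to apply each key/value pair of the dictionary in turn to the whole column. The sketch below is a pure-Python model that happens to reproduce both outputs shown above; it is not pandas' actual implementation, and the helper names `replace_sequentially` and `replace_single_pass` are made up for illustration.

```python
def replace_sequentially(values, mapping):
    """Hypothetical model: apply each old -> new pair in turn to the whole list."""
    out = list(values)
    for old, new in mapping.items():
        # A value produced by an earlier pair can be matched again by a later pair.
        out = [new if v == old else v for v in out]
    return out


def replace_single_pass(values, mapping):
    """Look each original value up exactly once (what map + fillna achieves)."""
    return [mapping.get(v, v) for v in values]


lst = ['u', 'v', 'w', 'x', 'y']
dct = dict(zip(lst, reversed(lst)))

chained = replace_sequentially(lst, dct)   # ['u', 'v', 'w', 'v', 'u']
single = replace_single_pass(lst, dct)     # ['y', 'x', 'w', 'v', 'u']
hello = replace_sequentially(
    lst, {'u': 'v', 'v': 'w', 'w': 'x', 'x': 'y', 'y': 'Hello'}
)                                          # ['Hello'] * 5
```

The sequential model relies on dict insertion order (Python 3.7+), which is why 'u' becomes 'y' early on and is turned back into 'u' by the last pair.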

Is this intended behavior? I couldn't see anything in the docs, or find any issues related to this. – user3483203 Sep 25 '18 at 21:28
If that is the case, then to change three values cyclically, like `a-b b-c c-a`, we can only use map, right? – BENY Sep 25 '18 at 21:29
@user3483203 Not sure tbh. Wen, yes. – piRSquared Sep 25 '18 at 21:30
@piRSquared this is interesting behavior, which means that when we do `replace`, pandas stores the intermediate values somewhere. :-) – BENY Sep 25 '18 at 21:31
It must. Kinda like `f = lambda x: dct.get(x, x); while s != s.map(f): s = s.map(f)` – piRSquared Sep 25 '18 at 21:33
I submitted an issue on their GitHub page; hopefully I get a response. Perhaps it is intended. – user3483203 Sep 25 '18 at 21:37
@user3483203 I am not sure about others; I just became aware of this behavior. – BENY Sep 25 '18 at 21:37
Sir, would you mind if I unaccepted your answer after two days and opened a bounty for this question? – BENY Sep 25 '18 at 21:40

score 5 · Answer 2 · answered Sep 26 '18 at 04:08

This behavior is not intended, and was recognized as a bug.

This is the GitHub issue that first identified the behavior, and it was added to the milestone for pandas 0.24.0. I can confirm that the replacement works as expected in the current version on GitHub.

Here is the PR containing the fix.
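
For code that has to behave the same before and after that fix, one option is to avoid replace when keys and values overlap and do a single lookup per element instead. This is a minimal sketch, assuming a plain object-dtype Series rather than the categorical column from the question; `cycle` and `s` are illustrative names, using the cyclic `a-b b-c c-a` mapping mentioned in the comments.

```python
import pandas as pd

# Cyclic mapping: every value is also a key, so chained replacement would be ambiguous.
cycle = {'a': 'b', 'b': 'c', 'c': 'a'}
s = pd.Series(['a', 'b', 'c', 'a', 'z'])

# Each original value is looked up exactly once; values without a key are kept as-is.
result = s.map(lambda v: cycle.get(v, v))
```

This does the same job as `s.map(cycle).fillna(s)`, but it does not go through NaN for the unmapped values.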
