I need help with the following join in pandas:

My first table has duplicate dates and the second has unique dates. When I merge the two tables on Date, the values from the second table are repeated for every matching row. I want them to appear only on the first match, with NaN for the rest.

Does anyone know how to do this in Python?

Jérôme Richard
bob kevin
  • Welcome to SO. Please see [this guide](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and edit your question accordingly. Don't forget to include **your effort**. – Quang Hoang Mar 10 '21 at 15:40
  • You can do it with Python, of course, just not with pandas. What you're asking for is not how a 'join' works. – Tim Roberts Mar 10 '21 at 23:19

1 Answer

Do a merge() on the Date and x columns:

import pandas as pd

df1 = pd.DataFrame({'Date': ['2-Jul', '2-Jul', '3-Jul'],
                    'x': ['Bob', 'Bob', 'Alice'],
                    'y': [5, 9, 7]})

df2 = pd.DataFrame({'Date': ['2-Jul', '3-Jul'],
                    'x': ['Bob', 'Alice'],
                    'z': [2, 8]})

df3 = pd.merge(df1, df2, on=['Date', 'x'])
print(df3)
#     Date      x  y  z
# 0  2-Jul    Bob  5  2
# 1  2-Jul    Bob  9  2
# 2  3-Jul  Alice  7  8

pandas.DataFrame.duplicated() returns a boolean Series that marks duplicate rows. With keep='first', every duplicate except the first occurrence is marked True. Since 'first' is the default value of keep, you can omit it.

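mask() (available on both DataFrame and Series) replaces values where the condition is True. The default replacement is NaN, which is why the z column becomes float after masking.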
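
To see how these two calls behave in isolation, here is a minimal sketch on a toy frame. The frame `demo` and the variable names are made up for illustration and are not part of the data above.

```python
import pandas as pd

# Hypothetical toy frame, only for illustration.
demo = pd.DataFrame({'Date': ['2-Jul', '2-Jul', '3-Jul'],
                     'z': [2, 2, 8]})

# keep='first' is the default: every repeat after the first occurrence is True.
is_repeat = demo.duplicated(subset=['Date'])
print(is_repeat.tolist())      # [False, True, False]

# keep=False marks every member of a duplicated group, including the first.
in_dup_group = demo.duplicated(subset=['Date'], keep=False)
print(in_dup_group.tolist())   # [True, True, False]

# mask() puts NaN (the default replacement) wherever the condition is True.
masked_z = demo['z'].mask(is_repeat)
print(masked_z)
```

Only the second '2-Jul' row is flagged by the default setting, so only that row's z value is replaced.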

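# Assign the result back instead of using inplace=True on df3['z']:
# the chained in-place call may act on a temporary copy in recent pandas versions.
df3['z'] = df3['z'].mask(df3.duplicated(subset=['Date', 'x'], keep='first'))
print(df3)
#     Date      x  y    z
# 0  2-Jul    Bob  5  2.0
# 1  2-Jul    Bob  9  NaN
# 2  3-Jul  Alice  7  8.0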
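
An alternative that avoids the separate masking step, sketched here under the same toy data: number the repeats of each key in df1 with groupby().cumcount() and include that counter in a left merge. This is a different technique from the merge-then-mask approach above, and the helper column name `occ` is an arbitrary choice for this example.

```python
import pandas as pd

df1 = pd.DataFrame({'Date': ['2-Jul', '2-Jul', '3-Jul'],
                    'x': ['Bob', 'Bob', 'Alice'],
                    'y': [5, 9, 7]})
df2 = pd.DataFrame({'Date': ['2-Jul', '3-Jul'],
                    'x': ['Bob', 'Alice'],
                    'z': [2, 8]})

keys = ['Date', 'x']

# Number each row within its (Date, x) group: 0 for the first occurrence, 1 for the next, ...
left = df1.assign(occ=df1.groupby(keys).cumcount())
# df2 has one row per key, so every row gets occurrence number 0.
right = df2.assign(occ=0)

# The left merge keeps every row of df1; only occurrence 0 finds a partner in df2,
# so later repeats get NaN in z.
result = left.merge(right, on=keys + ['occ'], how='left').drop(columns='occ')
print(result)
```

This assumes df2 really is unique on the key columns. If it is not, the counter on the right side would need to be built with cumcount() as well.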
Ynjxsjmh
  • Thank you so much! You saved me. I appreciate it a lot! You are the best :) – bob kevin Mar 11 '21 at 05:09
  • @bobkevin **If your question is solved**, say thank you by **accepting** the solution that is **best for your needs**. The **accept check** is below the up/down arrow at the top left of the answer. A new solution can be accepted if a better one shows up. If you have a reputation of 15 or greater, you may also vote on the quality of an answer with the up or down arrow. **Leave a comment if a solution doesn't answer the question**. See [What should I do when someone answers my question?](https://stackoverflow.com/help/someone-answers) Thank you. – Ynjxsjmh Mar 11 '21 at 05:31