Hi, I am trying to merge two dataframes inside np.where but I am getting an error. How can I achieve this with df.merge()? What am I doing wrong?

Code:

df3['old_result'] = np.where((df3['present_in_old'] == 'yes'), df3.merge(df1,left_on=(df3['id']), right_on = (df1['id']), how = 'outer')['name'],None)
  • Hi @LearnerBegineer, can you [post your dataframes](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) `df1` and `df3` ? Either the output of a `print` or code to produce them. – Cimbali Jun 19 '21 at 16:16

1 Answer

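  • Generated some sample data, with df1 and df3 of different sizes.
  • Your core issue: you need a left join, not an outer join. An outer join can return more rows than df3 has, so the merged column no longer lines up with the condition passed to np.where.
  • You can also just pass column names as the left_on / right_on parameters.
  • Because the sample data has duplicate column names, I also use the suffixes parameter.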
import numpy as np
import pandas as pd

# Sample data: df1 is the "old" table, df3 is the smaller table being annotated
df1 = pd.DataFrame(
    {"id": range(300), "name": np.random.choice(list("abcdefghijklmnopqrstuvwxyz"), 300)}
)
df3 = pd.DataFrame(
    {
        "id": range(35),
        "name": np.random.choice(list("abcdefghijklmnopqrstuvwxyz"), 35),
        "present_in_old": np.random.choice(["yes", "no", "maybe"], 35),
    }
)


df3["old_result"] = np.where(
    (df3["present_in_old"] == "yes"),
    df3.merge(df1, left_on="id", right_on="id", suffixes=("_left", ""), how="left")["name"],
    None,
)
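
To see why the join type matters, here is a minimal sketch with small, made-up frames (the ids and names below are illustrative assumptions, not your data). It compares the row counts of an outer and a left merge, then applies np.where to the left-merged column. It assumes the ids in df1 are unique; duplicate ids in df1 would add rows even with a left join.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, chosen so that the two frames only partly overlap
df1 = pd.DataFrame({"id": [1, 2, 3, 4], "name": ["a", "b", "c", "d"]})
df3 = pd.DataFrame({"id": [2, 3, 5], "present_in_old": ["yes", "no", "yes"]})

# An outer join keeps unmatched rows from both sides, so it can be longer than df3
outer = df3.merge(df1, on="id", how="outer")

# A left join keeps exactly the rows of df3 (given unique ids in df1)
left = df3.merge(df1, on="id", how="left")

print(len(df3), len(outer), len(left))

# The condition and the merged column now have the same length,
# so np.where can pair them up position by position
df3["old_result"] = np.where(df3["present_in_old"] == "yes", left["name"], None)
print(df3)
```

Note that np.where works on positions, not on index labels, so this relies on the left merge returning rows in the same order as df3.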

Rob Raymond