- generated some sample data with df1 and df3 different sizes.
- your core issue - need a left join not an outer join
- you can also just use column names as left_on / right_on parameters
- due to fact I have duplicate column names in sample data I also uses suffixes parameter
df1 = pd.DataFrame(
{"id": range(300), "name": np.random.choice(list("abcdefghijkjlmnopqrstuvwxyz"), 300)}
)
df3 = pd.DataFrame(
{
"id": range(35),
"name": np.random.choice(list("abcdefghijkjlmnopqrstuvwxyz"), 35),
"present_in_old": np.random.choice(["yes", "no", "maybe"], 35),
}
)
df3["old_result"] = np.where(
(df3["present_in_old"] == "yes"),
df3.merge(df1, left_on="id", right_on="id", suffixes=("_left", ""), how="left")["name"],
None,
)