0

is the merge function the appropriate way to join the Dataframes 1 and 2 to get the Desired Dataframe?

Dataframe 1:

animal category Add-Date
wolf land 24/09/22
eagle sky 24/09/22
robin sky 24/09/22
bear land 24/09/22
cod water 24/09/22
salmon water 24/09/22

Dataframe 2:

category Tier
land 1
sky 2

Desired Dataframe:

animal category Add-Date Tier
wolf land 24/09/22 1
eagle sky 24/09/22 2
robin sky 24/09/22 2
bear land 24/09/22 1

The Desired Dataframe is Dataframe 1 with irrelevant categories removed and the appropriate Tier brought across with the category.

I was trying merge, join, etc. but unsure of the best approach or if I am making an error. Any help much appreciated on the code or method to use.

Andrew
  • 115
  • 1
  • 6
  • Does this answer your question? [Pandas Merging 101](https://stackoverflow.com/questions/53645882/pandas-merging-101) – BigBen Aug 31 '22 at 17:36
  • Also, separately - What if the columns were named differently? E.g. if in Dataframe 1 was 'Location' instead of 'category' and Dataframe 2 was 'Prop_Location' instead of 'category'? – Andrew Aug 31 '22 at 17:44

2 Answers2

1
df_merged = pd.merge(df1, df2, how = 'left', on = 'category')

out:

   animal category  Add-Date  Tier
0    wolf     land  24/09/22   1.0
1   eagle      sky  24/09/22   2.0
2   robin      sky  24/09/22   2.0
3    bear     land  24/09/22   1.0
4     cod    water  24/09/22   NaN
5  salmon    water  24/09/22   NaN

you can drop NaN values if needed by

df_merged = pd.merge(df1, df2, how = 'left', on = 'category').dropna()

out:

  animal category  Add-Date  Tier
0   wolf     land  24/09/22   1.0
1  eagle      sky  24/09/22   2.0
2  robin      sky  24/09/22   2.0
3   bear     land  24/09/22   1.0
NoobVB
  • 989
  • 6
  • 10
1

You can perfom an inner join by using pandas.merge to grab the Tier column from df2.

out = df1.merge(df2, on='category')

display(out)

enter image description here

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Timeless
  • 22,580
  • 4
  • 12
  • 30
  • Thanks, will test this in a moment. What if the columns were named differently? E.g. if in Dataframe 1 was 'Location' instead of 'category' and Dataframe 2 was 'Prop_Location' instead of 'category'? – Andrew Aug 31 '22 at 17:43
  • In that case, you'll need to pass two more arguments, `out = df1.merge(df2, left_on='Location', right_on='Prop_Location')` – Timeless Aug 31 '22 at 19:42