I am new to Python. Could anyone help me with the problem below? I have two data frames (DF1 and DF2):

DF1:
 project_ID  dataID#
 AAA         dataset_01
 BBB         dataset_02
 CCC         dataset_01
 DDD         dataset_02

DF2:
 dataID#     Items
 dataset_01  Apple
 dataset_01  Orange
 dataset_02  banana
 dataset_02  Grape

Each "dataID#" has a list of "Items". I want to create a new data frame (DF3) that lists the "Items" from DF2 for each project_ID in DF1, matched on dataID#. The output should look something like this:

 project_ID  dataID#      Items
 AAA         dataset_01   Apple
 AAA         dataset_01   Orange
 BBB         dataset_02   banana
 BBB         dataset_02   Grape
 CCC         dataset_01   Apple
 CCC         dataset_01   Orange
 DDD         dataset_02   banana
 DDD         dataset_02   Grape

Thank you

awadhesh pathak

1 Answer

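You are looking for a merge operation, specifically a left join (also called a left outer join). It keeps every row of DF1 and, for each one, attaches every DF2 row with the same dataID#, so a project whose dataset has two items appears twice in the result.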

In pandas, you can do it like this (note that this overwrites df1; assign to a new name such as df3 if you want to keep the original):

df1 = df1.merge(df2[["dataID#", "Items"]], on="dataID#", how="left")

Documentation: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html
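For a complete picture, here is a minimal, self-contained sketch that rebuilds the two frames from the question and runs the merge. The frame contents are typed in by hand from the tables in the question, and the name df3 is just the one the question uses for the result; treat this as an illustration rather than the only way to do it.

```python
import pandas as pd

# DF1 from the question: one row per project, each pointing at a dataset.
df1 = pd.DataFrame({
    "project_ID": ["AAA", "BBB", "CCC", "DDD"],
    "dataID#": ["dataset_01", "dataset_02", "dataset_01", "dataset_02"],
})

# DF2 from the question: several items per dataset.
df2 = pd.DataFrame({
    "dataID#": ["dataset_01", "dataset_01", "dataset_02", "dataset_02"],
    "Items": ["Apple", "Orange", "banana", "Grape"],
})

# Left join on the shared key: every df1 row is kept and repeated once
# for each df2 row that has the same dataID#.
df3 = df1.merge(df2, on="dataID#", how="left")

print(df3)
```

Because each dataset has two items, the four projects in df1 become eight rows in df3. If a project pointed at a dataset that is missing from df2, the left join would still keep that project, with NaN in the Items column.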

P.S. Formatting your dataframe in columns would be highly appreciated.

Dustin