0

I have two datasets of different size and I would like to create a new column with values from another dataset if certain conditions are satisfied. Let me illustrate that with an example:

       hour  day month event
df1 =   0     1    1    x1
        0     1    1    x2
        0     1    1    x3
        1     1    1    x4
        2     1    1    x5

So, I have multiple events per hour every day of the year (the dataset contains information for one year). My second dataset contains information about temperature measured every day of the same year every hour.

       hour  day  month  temperature
df2 =   0     1     1       42
        1     1     1       41
        2     1     1       40

I would like to add a new column to the first dataset with the temperature from the second dataset, with repeating values for temperature since we have multiple events per hour. Joining/merging probably won't work since we have different sizes of the dataframes?

I would like to have as result:

       hour  day month event  temperature
df1 =   0     1    1    x1       42
        0     1    1    x2       42
        0     1    1    x3       42
        1     1    1    x4       41
        2     1    1    x5       40

I tried with setting condition df1['hour']==df2['hour'] and df1['day']==df2['day'] and df1['month']==df2['month'] but it didn't work... How can we achieve that? Thank you very much in advance!

Lullaby
  • 143
  • 3

1 Answers1

0

As your first dataframe has more rows than your second one, you can use a left join :

df1.merge(df2,on='hour',how='left')
Ewdlam
  • 875
  • 9
  • 28