I am trying to work with two datasets:

  • 728 rows and 365 columns. The data are average daily temperatures, measured every day of the year.
  • 938 rows and 365 columns. The data are average daily temperatures, measured every day of the year.

Dataset 1 looks like this:

FUA_CODE               01-01-2018   02-01-2018 ...

IT001L1  --> Milano     290.02020    289.1114   ...
IT002L3  --> Roma       281.20203    288.1235   ...
IT003L4  --> Napoli     287.03030    287.3121   ...
...

Dataset 2 looks like this:

URAU_CODE     FUA_CODE                         01-01-2018   02-01-2018 ...

IT001C1       IT001L1 --> Milano                  A             B       ...
IT002C1       IT001L1 --> town outside Milano    ...           ...      ...
IT003C1       IT001L1 --> town2 outside Milano   ...           ...       ...
IT004C1       IT002L3 --> Roma                    C             D
IT005C1       IT002L3 --> town outside Roma      ...           ...
IT006C1       IT002L3 --> town2 outside Roma     ...           ...
IT007C1       IT003L4 --> Napoli                  E             F
IT008C1       IT003L4 --> town outside Napoli    ...           ...
IT009C1       IT003L4 --> town2 outside Napoli   ...           ...
              ...

My task is to merge these two datasets and calculate, for each day, the difference between the temperature of a city (e.g. Milano) in the first dataset and the temperature of the same city in the second dataset.

Ideally, the result should look like this:

FUA_CODE                   01-01-2018        02-01-2018      ...

IT001L1  --> Milano     290.02020  -  A       289.1114 - B   ...
IT002L3  --> Roma       281.20203  -  C       288.1235 - D   ...
IT003L4  --> Napoli     287.03030  -  E       287.3121 - F   ...
...

What functions can I use?

Many thanks

nflore
    Please add data using `dput` or something that we can copy and use. Read about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and [how to give a reproducible example](http://stackoverflow.com/questions/5963269). – Ronak Shah Oct 30 '20 at 13:58

1 Answer

You can join the two data frames first, then use `summarise` to calculate the value.

See here for how to join data frames, and here for how to calculate the value.
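A minimal sketch of the join-then-compute idea, under some loud assumptions: `df1` and `df2` are hypothetical stand-ins for the two datasets, the `CITY` column is a hypothetical identifier present in both, and every temperature in `df2` is made up. It also uses `mutate` for the row-wise subtraction after reshaping to long format, rather than `summarise`.

```r
library(dplyr)
library(tidyr)

# Toy stand-ins for the two datasets (all df2 values are invented).
df1 <- tibble(
  FUA_CODE = c("IT001L1", "IT002L3"),
  CITY = c("Milano", "Roma"),            # hypothetical identifier column
  `01-01-2018` = c(290.02020, 281.20203),
  `02-01-2018` = c(289.1114, 288.1235)
)

df2 <- tibble(
  URAU_CODE = c("IT001C1", "IT002C1", "IT004C1"),
  FUA_CODE = c("IT001L1", "IT001L1", "IT002L3"),
  CITY = c("Milano", "town outside Milano", "Roma"),
  `01-01-2018` = c(289.0, 288.5, 280.0),
  `02-01-2018` = c(288.0, 287.5, 287.0)
)

# Reshape both tables to one row per (city, date).
long1 <- pivot_longer(df1, cols = -c(FUA_CODE, CITY),
                      names_to = "date", values_to = "temp1")
long2 <- pivot_longer(df2, cols = -c(URAU_CODE, FUA_CODE, CITY),
                      names_to = "date", values_to = "temp2")

# Join on code, city and date, subtract, then go back to wide format.
result <- long1 %>%
  inner_join(long2, by = c("FUA_CODE", "CITY", "date")) %>%
  mutate(diff = temp1 - temp2) %>%
  select(FUA_CODE, CITY, date, diff) %>%
  pivot_wider(names_from = date, values_from = diff)
```

Joining on both `FUA_CODE` and `CITY` means the surrounding towns in `df2` find no partner in `df1` and drop out of the inner join, so `result` has one row per city.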

jolii
  • OK, thanks. Lastly, how do I deal with the fact that the second dataset has several rows with the same code? When I use `summarise` I just want to subtract "Milano" from "Milano", not from all the cities that share the same code. – nflore Nov 02 '20 at 14:21
  • I think you should create one more column as an identifier by extracting `IT001L1 --> Milano` from the `fua_code` column. Then you can use `group_by` and `summarise`. – jolii Nov 03 '20 at 07:47
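One possible way to build such an identifier, sketched under the assumption that the `FUA_CODE` column literally contains strings like `IT001L1 --> Milano` (the question does not confirm this). The data frames, the `CITY` column name, the `split_code` helper and all `df2` temperatures are hypothetical.

```r
library(dplyr)
library(tidyr)

# Toy data; assumes FUA_CODE holds "code --> name" as a single string.
df1 <- tibble(
  FUA_CODE = c("IT001L1 --> Milano", "IT002L3 --> Roma"),
  `01-01-2018` = c(290.02020, 281.20203)
)
df2 <- tibble(
  URAU_CODE = c("IT001C1", "IT002C1", "IT004C1", "IT005C1"),
  FUA_CODE = c("IT001L1 --> Milano", "IT001L1 --> town outside Milano",
               "IT002L3 --> Roma", "IT002L3 --> town outside Roma"),
  `01-01-2018` = c(289.0, 288.5, 280.0, 279.5)  # invented values
)

# Hypothetical helper: split the combined column into a code and a city name.
split_code <- function(df) {
  separate(df, FUA_CODE, into = c("FUA_CODE", "CITY"), sep = " --> ")
}

df1_id <- split_code(df1)
df2_id <- split_code(df2)

# Keep only the df2 rows whose code and city name both appear in df1,
# which leaves one row per FUA_CODE.
df2_city <- semi_join(df2_id, df1_id, by = c("FUA_CODE", "CITY"))
```

After this step each code appears once in `df2_city`, so a join on `FUA_CODE` and `CITY` pairs "Milano" only with "Milano".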