0

I would like to lookup values between two data tables with a condition.

In the first table, I would like to create a new column "Return" and lookup the return for each identifier on a specified end date.

Desired output: lookup the return for all identifiers in Table 2 on the end date 28/02/2006.

Hope it is clear. Cheers!

Table 1:

Identifier
AA1
BX2
...

Table 2:

Date (DD/MM/YYYY) Identifier Return
31/01/2006 AA1 -2.0
28/02/2006 AA1 4.0
31/03/2006 AA1 1.0
31/01/2006 BX2 3.0
28/02/2006 BX2 -5.0
31/03/2006 BX2 -8.0
... ... ...

Desired output

Identifier Return 28/02/2006
AA1 4.0
BX2 -5.0

I have searched around for a solution but I have not found an idea how to solve the problem of specifying a condition.

  • library(dplyr); library(tidyr); table1 %>% left_join(table2 %>% filter(Date == "28/02/2006") %>% select(-Date)) should do the trick. Generally it would be much easier to help you if you provided a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610). Adding a MRE and an example of the desired output (in code form, not tables and pictures) makes it much easier for others to find and test an answer to your question. – dario Jan 18 '23 at 15:27
  • Please provide enough code so others can better understand or reproduce the problem. – Community Jan 18 '23 at 15:28
  • 1
    `merge(df1, df2[df2$Date == "28/02/2006", ], by = "Identifier", all.x = TRUE)` – Darren Tsai Jan 18 '23 at 15:28
  • Thanks for some great responses @DarrenTsai and Dario. Apparently both of your replies work when I am using R.studio in the cloud version, but not on the installed version. The merge or left join execute fine, but it simply returns N/A responses for all my identifiers – mads_horsted Jan 19 '23 at 10:28
  • I think your real data is not like what you described in the question. You need to share your data to us by : `dput(head(df2))` – Darren Tsai Jan 19 '23 at 13:14

1 Answers1

0

Are you trying to do something like this?

library(dplyr)

table1 <- data.frame(
  Identifier = c("AA1", "BX2")
)

table2 <- data.frame(
  Date = c("31/01/2006", "28/02/2006", "31/03/2006", "31/01/2006", "28/02/2006", "31/03/2006"),
  Identifier = c("AA1", "AA1", "AA1", "BX2", "BX2", "BX2"),
  Return = c(-2.0, 4.0, 1.0, 3.0 ,-5.0 ,-8.0)
)

table2 %>%
  filter(Date == "28/02/2006") %>%
  right_join(x = table1, y = ., by = "Identifier")
  • The code works great using left_join on the R.studio cloud version, but somehow not on the installed version. The output has the correct format, i.e. rows and columns, but the values that are pulled from table2 are simply N/A on the installed version of R.studio. Any suggestions why this is the case? Thanks for the help! – mads_horsted Jan 19 '23 at 10:32
  • In my example, I used the `right_join()`. not `left_join()`. Have you tried with both ? – Alex Yahiaoui Martinez Jan 19 '23 at 12:29