I have two data.tables in R.

Table A has ID_A, days, and group.

Table B has ID_B, days, group, and value_of_interest.

I'm trying to add a column to A, max_value_of_interest, whose value is the maximum value_of_interest over all rows of B in the same group where days in B is greater than days in that row of A.

I'll try to describe it another way:

Table A:

ID_A    days    group
A1      5       X

I want to add a column to A containing the maximum value_of_interest from B, where the maximum value is chosen from B where B.group=X and B.days > 5 (greater than the value in row A1).

I've found solutions for finding the maximum by group, but I'm having trouble adding the condition that only rows of B with B.days > A.days (within the same group) are considered.

I'm not sure of the best way to approach this. I'd appreciate any help.

Sarah

1 Answer

It might be easiest to loop through the rows of Table A. For each row, select the relevant rows of B, then find the max value.

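library(tidyverse)

set.seed(1)  # make the random example values reproducible

# Example data: max_val starts as a numeric NA so the column has the right type
A <- tibble(ID_A = paste0("A", 1:5),
            days = seq(5, 1, -1),
            group = c("X", "X", "X", "Y", "Y"),
            max_val = NA_real_)
B <- tibble(ID_B = paste0("B", 1:5),
            days = seq(3, 7, 1),
            group = c("X", "X", "X", "Y", "Y"),
            val = runif(5))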

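for (i in seq_len(nrow(A))) {
  # Rows of B in the same group with more days than row i of A
  B_sel <- B %>%
    filter(group == A$group[i], days > A$days[i])
  # Leave max_val as NA when no row of B qualifies
  if (nrow(B_sel) > 0) {
    A$max_val[i] <- max(B_sel$val)
  }
}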

Or, using base R indexing instead of dplyr:

for (i in seq_len(nrow(A))) {
  # Indices of rows of B in the same group with more days than row i of A
  rows <- which(B$group == A$group[i] & B$days > A$days[i])
  if (length(rows) > 0) {
    A$max_val[i] <- max(B$val[rows])
  }
}
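
Since the question mentions data.table, a non-equi join may avoid the explicit loop. The following is only a sketch under assumptions: the tables are rebuilt as data.tables, the column is called val as in the example above (not value_of_interest), and the values in B are made up for illustration.

```r
library(data.table)

# Hypothetical example data with fixed values so the result can be checked by hand
A <- data.table(ID_A = paste0("A", 1:5),
                days = 5:1,
                group = c("X", "X", "X", "Y", "Y"))
B <- data.table(ID_B = paste0("B", 1:5),
                days = 3:7,
                group = c("X", "X", "X", "Y", "Y"),
                val = c(0.2, 0.9, 0.4, 0.7, 0.1))

# Non-equi join: for each row of A, match the rows of B with the same group
# and B.days > A.days, then take the max of val over those matches.
# by = .EACHI evaluates j once per row of A, in A's row order.
# A row of A with no match sees val = NA, so its max is NA.
res <- B[A, on = .(group, days > days),
         .(max_val = max(val)),
         by = .EACHI]

# Attach the result to A by reference
A[, max_val := res$max_val]
```

In the on clause, the left-hand side of each condition refers to B and the right-hand side to A, so days > days reads as B.days > A.days. Note that if val itself contains NA in a matching row, the maximum for that row of A will also be NA unless na.rm = TRUE is added.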
Simon Woodward