I have two dataframes of different sizes:
df1 <- data.frame(Chr = c(1, 1, 2, 3, 4),
                  Start = c(15, 120, 210, 210, 450),
                  End = c(15, 130, 210, 210, 450),
                  Gene = c("gene1", "gene2", "gene3", "gene3", "gene3"),
                  sample_id = c("ss6", "ss7", "ss9", "ss9", "ss10"))
df2 <- data.frame(Chr = c(1, 1, 3),
                  Start = c(10, 100, 200),
                  End = c(50, 200, 250),
                  Gene = c("gene1", "gene2", "gene3"),
                  sample_id = c("ss1", "ss1", "ss1"))
I would like to take each Start from df1 and check whether it falls within the Start-End range of a row in df2, while also making sure that Chr is the same (sample_id does not have to match). If it does, I want to add a column to df1, ideally containing df2$sample_id, or, if that is not possible, YES (with NA for no match). This is similar to the question "Only checking range", but I also need to match Chr.
It is also similar to the question "Check if column value is in between (range) of two other column values", and I assume it should be easier because I don't need to match respective rows.
I have tried:
df1 %>%
  mutate(no_coverage_in = case_when(df2$Start <= Start & df2$End >= Start & Chr == df2$Chr ~ df2$sample_id))
But it complains:
longer object length is not a multiple of shorter object length
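That message appears because df2$Start has 3 elements while Start (from df1) has 5, so R recycles the shorter vector element by element instead of comparing every df1 row against every df2 row. A minimal base R sketch of the intended check follows. It assumes that, when several df2 ranges contain the same Start, taking the first match is acceptable, and it is only one possible approach, not necessarily the most efficient for large data.

```r
df1 <- data.frame(Chr = c(1, 1, 2, 3, 4),
                  Start = c(15, 120, 210, 210, 450),
                  End = c(15, 130, 210, 210, 450),
                  Gene = c("gene1", "gene2", "gene3", "gene3", "gene3"),
                  sample_id = c("ss6", "ss7", "ss9", "ss9", "ss10"))
df2 <- data.frame(Chr = c(1, 1, 3),
                  Start = c(10, 100, 200),
                  End = c(50, 200, 250),
                  Gene = c("gene1", "gene2", "gene3"),
                  sample_id = c("ss1", "ss1", "ss1"))

# For each row of df1, look for df2 rows on the same Chr whose
# Start-End range contains that row's Start.
df1$no_coverage_in <- vapply(seq_len(nrow(df1)), function(i) {
  hit <- df2$Chr == df1$Chr[i] &
    df2$Start <= df1$Start[i] &
    df2$End >= df1$Start[i]
  # Assumption: keep only the first matching df2 row; NA when none match.
  if (any(hit)) as.character(df2$sample_id[which(hit)[1]]) else NA_character_
}, character(1))

print(df1)
```

With the sample data, rows 1, 2 and 4 of df1 get "ss1", and rows 3 and 5 (Chr 2 and Chr 4, which do not occur in df2) get NA.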