I have a question regarding efficiently populating an R dataframe based on data retrieved from another dataframe.
So my input typically looks like:
dfInput <- data.frame(start = c(1,6,17,29), end = c(5,16,28,42), value = c(1,2,3,4))
start end value
1 5 1
6 16 2
17 28 3
29 42 4
I want to find the min and max values in cols 1 and 2 and create a new dataframe with a row for each value in that range:
rangeMin <- min(dfInput$start)
rangeMax <- max(dfInput$end)
dfOutput <- data.frame(index = c(rangeMin:rangeMax), value = 0)
And then populate it with the appropriate "values" from the input dataframe:
for (i in seq(nrow(dfOutput))) {
lookup <- dfOutput[i,"index"]
dfOutput[i, "value"] <- dfInput[which(dfInput$start <= lookup &
dfInput$end >= lookup),"value"]
}
This for-loop achieves what I want to do, but it feels like this is a very convoluted way to do it.
Is there a way that I can do something like:
dfOutput$value <- dfInput[which(dfInput$start <= dfOutput$index &
dfInput$end >= dfOutput$index),"value"]
Or something else to populate the values when instantiating dfOutput.
I feel like this is pretty basic but I'm new to R, so many thanks for any help!