2

I have a tibble with a ton of data in it, but most importantly, I have a column that references a row in a lookup table by number (ex. 1,2,3 etc).

df <- tibble(ref = c(1,1,1,2,5)
             data = c(33,34,35,35,32))

lkup <- tibble(CurveID <- c(1,2,3,4,5)
               Slope <- c(-3.8,-3.5,-3.1,-3.3,-3.3)
               Intercept <- c(40,38,40,38,36)
               Min <- c(25,25,21,21,18)
               Max <- c(36,36,38,37,32))

I need to do a calculation for each row in the original tibble based on the information in the referenced row in the lookup table.

df$result <- df$data - lkup$intercept[lkup$CurveID == df$ref]/lkup$slope[lkup$CurveID == df$ref]

The idea is to access the slope or intercept (etc) value from the correct row of the lookup table based on the number in the data table, and to do this for each data point in the column. But I keep getting an error telling me my data isn't compatible, and that my objects need to be of the same length.

ggorlen
  • 44,755
  • 7
  • 76
  • 106
Gilthoniel
  • 47
  • 5
  • 1
    Might be easier just to do this as a merge/join - e.g.: `merge(df, lkup, by.x="ref", by.y="CurveID")` - all the options in various R approaches outlined here - https://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right – thelatemail Feb 11 '21 at 22:17

3 Answers3

3

You could also do it with match()

df$result <- df$data - lkup$Intercept[match(df$ref, lkup$CurveID)]/lkup$Slope[match(df$ref, lkup$CurveID)]
df$result
# [1] 43.52632 44.52632 45.52632 45.85714 42.90909
DaveArmstrong
  • 18,377
  • 2
  • 13
  • 25
1

You could use the dplyr package to join the tibbles together. If the ref column and CurveID column have the same name then left_join will combine the two tibbles by the matching rows.

library(dplyr)
df <- tibble(CurveID = c(1,1,1,2,5),
             data = c(33,34,35,35,32))

lkup <- tibble(CurveID = c(1,2,3,4,5),
               Slope = c(-3.8,-3.5,-3.1,-3.3,-3.3),
               Intercept = c(40,38,40,38,36),
               Min = c(25,25,21,21,18),
               Max = c(36,36,38,37,32))

df <- df %>% left_join(lkup, by = "CurveID")

Then do the calcuation on each row

df <- df %>% mutate(result = data - (Intercept/Slope)) %>% 
      select(CurveID, data, result)
Tjn25
  • 685
  • 5
  • 18
0

For completeness' sake, here's one way to literally do what OP was trying:

library(slider)
df %>%
  mutate(result = slide_dbl(ref, ~ slice(lkup, .x)$Intercept /
                                   slice(lkup, .x)$Slope))

though since slice goes by row number, this relies on CurveID equalling the row number (we make no reference to CurveID at all). You can write it differently with filter but it ends up being more code.

meedstrom
  • 341
  • 2
  • 7