I have a tibble df and I'm trying to fill the NAs using the lm function. The filling must take the grouping variable t into account.

t = rep(c('a', 'b'), rep(4, 2))
v1 = c(1, 2, NA, 4, 5, NA, 7, 8)
v2 = runif(8)
v3 = rnorm(8)
v4 = rnorm(8)
v5 = c(NA, 5, 3, 1, NA, NA, 10, 0)

df = tibble(t, v1, v2, v3, v4, v5)

To do this quickly I tried a for loop, as shown below:

vars = c('v1', 'v5')

for (n in vars){
  for (i in c('a', 'b')) {
    fit = lm(rlang::sym(n) ~ v3 + v2, df[df$t==i, ])
  df %>%
    mutate(pred = predict(fit,.))%>%
    mutate( rlang::sym(n) = ifelse(is.na(rlang::sym(n) )& t==i, pred, rlang::sym(n)  ))

}}

The main idea is to replace n with v1 and v5, but this code doesn't work. Instead of rlang::sym(n) I also tried get(n) and !!as.name(n), as suggested in "Use dynamic variable names in dplyr". I'm using dplyr version 1.04.

How can I get mutate(), ifelse() and lm() to use a dynamic name as the variable name?

Alien

1 Answer

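You need to wrap the formula in expr() and unquote the symbol, i.e. expr(!!rlang::sym(n) ~ v3 + v2). Next, the name on the left-hand side of your mutate() call has to be evaluated before the right-hand side; this is done with the !! operator together with :=. Note that the code still generates warnings, probably from predict(): for some combinations (e.g. v5 in group b, which has only two non-missing values) the model has more coefficients than observations.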

library(dplyr)  # provides tibble(), mutate() and %>%

t = rep(c('a', 'b'), rep(4, 2))
v1 = c(1, 2, NA, 4, 5, NA, 7, 8)
v2 = runif(8)
v3 = rnorm(8)
v4 = rnorm(8)
v5 = c(NA, 5, 3, 1, NA, NA, 10, 0)

df = tibble(t, v1, v2, v3, v4, v5)

vars = c('v1', 'v5')

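for (n in vars) {
  for (i in c('a', 'b')) {
    # Build the formula with the column named by n on the left-hand side
    fit = lm(rlang::expr(!!rlang::sym(n) ~ v3 + v2), df[df$t == i, ])
    # Assign the result back to df, otherwise the filled values are discarded
    df = df %>%
      mutate(pred = predict(fit, .)) %>%
      # .data[[n]] looks up the column; a bare n would be just the string
      mutate(!!n := ifelse(is.na(.data[[n]]) & t == i, pred, .data[[n]])) %>%
      select(-pred)
  }
}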
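
As a variation on the loop above, here is a minimal sketch that avoids tidy evaluation inside lm() by building the formula from strings with reformulate() and indexing the column with [[ ]]. The helper name fill_by_lm() and the set.seed() call are assumptions added for illustration; they are not part of the question or of any package. Only v1 is filled here, since each group has enough non-missing values of v1 for the three-coefficient model.

```r
library(dplyr)

set.seed(1)  # assumption: seed added only to make the sketch reproducible
df <- tibble(
  t  = rep(c("a", "b"), each = 4),
  v1 = c(1, 2, NA, 4, 5, NA, 7, 8),
  v2 = runif(8),
  v3 = rnorm(8)
)

# fill_by_lm() is a hypothetical helper, not a dplyr or rlang function
fill_by_lm <- function(data, target, group) {
  rows <- data$t == group
  # reformulate() builds the formula target ~ v3 + v2 from character strings
  fit <- lm(reformulate(c("v3", "v2"), response = target), data = data[rows, ])
  pred <- predict(fit, newdata = data)
  # Overwrite only the missing values that belong to this group
  missing <- rows & is.na(data[[target]])
  data[[target]][missing] <- pred[missing]
  data
}

for (g in c("a", "b")) {
  df <- fill_by_lm(df, "v1", g)
}
```

After the loop, df$v1 should contain no NAs, and the values that were already observed are left unchanged. Whether this reads better than the expr()/!! version is a matter of taste; it mainly keeps the model fitting in base R.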
tacoman