I want to write a function that applies tidyr::complete to all non-numeric columns of an R data.frame. Zero should be inserted as the value of the numeric columns in the newly created rows. I understand that this requires a standard-evaluation solution, but so far I have had no success.

Here is what I have thus far:

completeDf <- function(df){

      vars <- names(df)

      chVars <- vars[!(sapply(df, is.numeric))]
      nmVars <- vars[!(vars %in% chVars)]

      quoChVars <- quos(chVars)

      nmList <- vector("list", length(nmVars))
      nmList <- setNames(lapply(nmList, function(x) x <- 0), nmVars)
      quoNmVars <- quos(nmList)

      df <- df %>%
            complete(!!!quoChVars, fill = !!!quoNmVars)
}

Any idea of how to make this work?

Antti

1 Answer

1) rlang/tidyeval Use !!!syms(notnum_names) to splice the non-numeric variable names into complete as arguments. fill is just an ordinary named list, so no rlang/tidyeval computation is needed for it.

library(dplyr)
library(tidyr)
library(rlang)

completeDF <- function(data) {
  is_num <- sapply(data, is.numeric)
  num_names <- names(data)[ is_num ]
  notnum_names <- names(data)[ !is_num ]
  fill <- Map(function(x) 0, num_names)
  data %>% complete(!!!syms(notnum_names), fill = fill)
}

DF <- data.frame(a = c("A", "B", "B"), b = c("a", "a", "b"), c = 1:3) # test data
completeDF(DF)

giving:

# A tibble: 4 x 3
       a      b     c
  <fctr> <fctr> <dbl>
1      A      a     1
2      A      b     0
3      B      a     2
4      B      b     3
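
To see what the !!!syms(...) step is doing, here is a minimal sketch that compares the spliced call with the same call typed out by hand. It assumes the same test data DF as above; the names cols, spliced and by_hand are introduced only for this illustration.

```r
library(tidyr)
library(rlang)

DF <- data.frame(a = c("A", "B", "B"), b = c("a", "a", "b"), c = 1:3)

# syms() turns a character vector into a list of symbols (bare column names).
cols <- syms(c("a", "b"))

# Splicing that list with !!! passes each symbol as a separate argument,
# so this call is intended to match the hand-written one below.
spliced <- complete(DF, !!!cols, fill = list(c = 0))
by_hand <- complete(DF, a, b, fill = list(c = 0))

# Compare the two results.
identical(spliced, by_hand)
```

If the two results compare as identical, the splice is just a programmatic way of writing complete(DF, a, b, ...) when the column names are only known at run time.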

Here is the original code from the question modified to make it work. The changed lines are marked with ## at the end of each.

completeDf <- function(df){

      vars <- names(df)

      chVars <- vars[!(sapply(df, is.numeric))]
      nmVars <- vars[!(vars %in% chVars)]

      symsChVars <- rlang::syms(chVars) ##

      nmList <- vector("list", length(nmVars))
      nmList <- setNames(lapply(nmList, function(x) 0), nmVars) ##
      # quoNmVars <- quos(nmList) ##

      df %>% ##
            complete(!!!symsChVars, fill = nmList) ##
}

completeDf(DF)

2) wrapr An alternative to rlang/tidyeval is the wrapr package.

The code here is the same as in (1) except we use library(wrapr) instead of library(rlang) and the last line of completeDF is replaced with a let statement giving completeDF2.

library(dplyr)
library(tidyr)
library(wrapr)

completeDF2 <- function(data) {
  is_num <- sapply(data, is.numeric)
  num_names <- names(data)[ is_num ]
  notnum_names <- names(data)[ !is_num ]
  fill <- Map(function(x) 0, num_names)
  let(c(NOTNUM = toString(notnum_names)), 
      data %>% complete(NOTNUM, fill = fill),
      strict = FALSE,
      subsMethod = "stringsubs")
}

completeDF2(DF)

Updates: Fixes and improvements. Add wrapr approach.

G. Grothendieck