
I am reading Advanced R by Hadley Wickham, available at http://adv-r.had.co.nz/Functionals.html. He talks about the difference between sapply and vapply. My question is about using vapply instead of sapply, which his example doesn't cover any further.

Here's his code:

df2 <- data.frame(x = 1:10, y = Sys.time() + 1:10)
sapply(df2, class)

This returns

$x
[1] "integer"

$y
[1] "POSIXct" "POSIXt"

However, when I run vapply, I get an error.

vapply(df2, class, character(1))

Error:

Error in vapply(df2, class, character(1)) : values must be length 1,
 but FUN(X[[2]]) result is length 2

I have two questions:

Question 1: When I replace character(1) with character(2), it gives me the following error message:

vapply(df2, class, character(2))
Error in vapply(df2, class, character(2)) : values must be length 2,
 but FUN(X[[1]]) result is length 1

Why does this happen?

Question 2: How do I use vapply instead of sapply?

I am learning R so your answers will help me understand R at a deeper level. I'd appreciate your thoughts.

watchtower

1 Answer


Question 1:

The error with character(2) occurs because class() returns "integer" for column x, which is a character vector of length 1. That rightly fails the consistency check against FUN.VALUE = character(2), which requires every result to be a character vector of length 2.
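To see the mismatch directly, one option is to count how many class names each column reports. This is only a small sketch; class_lengths is a name made up for illustration.

```r
df2 <- data.frame(x = 1:10, y = Sys.time() + 1:10)

# Number of class names reported for each column
class_lengths <- lengths(lapply(df2, class))
class_lengths
#> x y 
#> 1 2
```

Because the lengths differ, no single FUN.VALUE can match both columns: character(1) fails on y and character(2) fails on x.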

Question 2:

vapply() is a safer version of sapply(): it checks that each application of FUN returns a value of the type and length you declared in FUN.VALUE. It is also safer because the shape of its output is predictable. sapply() may return a vector, a matrix, or a list depending on the results, whereas vapply() returns a vector when FUN.VALUE has length 1 and a matrix or array otherwise.
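As a rough illustration of that predictability, here is a sketch using a made-up list m (not part of the question's data). It shows the two output shapes and how the two functions differ on empty input.

```r
# Hypothetical input, used only for illustration
m <- list(a = c(1, 2, 3), b = c(4, 5, 6))

# FUN.VALUE of length 1: the result is a named vector
means <- vapply(m, mean, numeric(1))

# FUN.VALUE of length 2: the result is a matrix with one column per element
ranges <- vapply(m, range, numeric(2))

# On empty input, sapply() gives back an empty list,
# while vapply() still returns the declared type
empty_s <- sapply(list(), class)
empty_v <- vapply(list(), class, character(1))
```

Here means has no dim attribute, ranges is a 2 x 2 matrix, empty_s is a list, and empty_v is a zero-length character vector.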

In the specific example you give, you can't use vapply() with class directly, because class() returns a vector of length 1 for x but length 2 for y. You have to know what output to expect, and vapply() fails if any call to FUN returns something that doesn't match FUN.VALUE.

In this instance, I suppose you could do

df2 <- data.frame(x = 1:10, y = Sys.time() + 1:10)
vapply(df2, FUN = function(x) paste(class(x), collapse = "; "),
       FUN.VALUE = character(1))

which returns

                x                 y 
        "integer" "POSIXct; POSIXt"

but whether that is useful to you or not is a different matter.

Really, using vapply() comes down to knowing what to expect from FUN and wanting to only ever get that output. If you don't know or can't control it, you are probably better off with lapply().
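For the data frame in the question, the lapply() route might look like the sketch below. The names classes and first_class are made up for illustration, and taking only the first class name is just one possible way to get a fixed-length result, not something the question asks for.

```r
df2 <- data.frame(x = 1:10, y = Sys.time() + 1:10)

# lapply() always returns a list, so differing lengths are not a problem
classes <- lapply(df2, class)

# Once each result is reduced to a known length, vapply() applies again
first_class <- vapply(classes, function(cls) cls[1], character(1))
first_class
#>         x         y 
#> "integer" "POSIXct"
```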

Gavin Simpson