Could someone please explain the differences between how apply() and sapply() operate on the columns of a data frame?
For example, when attempting to find the class of each column in a data frame, my first inclination is to use apply on the columns:
> apply(iris, 2, class)
Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species
 "character"  "character"  "character"  "character"  "character"
This is not correct, however, as some of the columns are numeric:
> class(iris$Petal.Length)
[1] "numeric"
A quick search on Google turned up this solution to the problem, which uses sapply instead of apply:
> sapply(iris, class)
Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species
   "numeric"    "numeric"    "numeric"    "numeric"     "factor"
In this case, sapply is implicitly treating iris as a list of its columns, and then applying the function to each entry in that list, e.g.:
> class(as.list(iris)$Petal.Length)
[1] "numeric"
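As far as I can tell, the list behaviour can be checked directly. Below is a minimal sketch, assuming only base R and the built-in iris data set. The variable names are my own, chosen for illustration:

```r
# iris ships with base R (datasets package), so nothing needs to be loaded.

# A data frame is stored as a list whose elements are its columns.
is_list <- is.list(iris)

# lapply() visits each column and returns a plain list of results.
classes_list <- lapply(iris, class)

# sapply() does the same, then simplifies the list to a named character vector.
classes_vec <- sapply(iris, class)

# vapply() is a stricter variant: the expected result type is declared up front.
classes_checked <- vapply(iris, class, character(1))

print(is_list)
print(classes_vec)
```

If that reading is right, sapply never has to squeeze the columns into a single common type, which would explain why each column keeps its own class.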
What I'm still unclear about is why my original attempt using apply didn't work.