apply() to columns of data.frame

Question

I don't quite understand the behavior of apply() in case of data.frame columns:

apply(mtcars, 2, is.numeric)
# mpg  cyl disp   hp drat   wt qsec   vs   am gear carb 
# TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE 

apply(iris,   2, is.numeric)
# Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species 
# FALSE        FALSE        FALSE        FALSE        FALSE

Both tables contain numeric data, so why are the results different?
Moreover, if I add a column to mtcars, it changes the outcome for all columns:

mtcars$colA <- 'charA'
apply(mtcars, 2, is.numeric)
# mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb  colA 
# FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

For my purpose (determining the column types), lapply does the job (lapply(mtcars, is.numeric)), but I would still like to understand what happens with apply(df, 2, myfunc).

Hmm... `apply` coerces to a matrix before doing any calculations. Since a matrix cannot hold more than one class, this is what you are seeing: either all columns are numeric or none are. This also explains why none are numeric in `mtcars` once you add a character variable. If you use `sapply` instead, you will get the real classes of the original df. Try `sapply(iris, is.numeric)` — Sotos, Dec 18 '19 at 12:39
be sure to check this FAQ for some good intuition to take away: https://stackoverflow.com/q/3505701/3576984 — MichaelChirico, Dec 18 '19 at 12:53
@MichaelChirico, just realized I even had that question bookmarked already! shame on me — Vasily A, Dec 18 '19 at 12:54

score 2 · Accepted Answer · edited Jun 20 '20 at 09:12

From the docs of apply,

Details

If X is not an array but an object of a class with a non-null dim value (such as a data frame), apply attempts to coerce it to an array via as.matrix if it is two-dimensional (e.g., a data frame) or via as.array.

Since a matrix can hold only a single type, the result is all-or-nothing: either every column tests as numeric, or, if at least one column cannot be stored as a number (for example, a character or factor column), you get FALSE for all of them. We can compare it with the results of sapply, which checks each column of the original data frame:

apply(iris, 2, is.numeric)
#Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species 
#       FALSE        FALSE        FALSE        FALSE        FALSE 

sapply(iris, is.numeric)
#Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species 
#        TRUE         TRUE         TRUE         TRUE        FALSE
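
To see the coercion directly, here is a minimal sketch that calls as.matrix by hand, which is the step the docs quoted above say apply performs on a data frame. The variable names (m_iris, m_cars, num_cols) are just illustrative, and datasets::mtcars is used so the check runs against the untouched dataset rather than a copy with colA added. vapply is used here as a stricter alternative to sapply; it is not part of the original answer.

```r
# Reproduce by hand the coercion that apply() performs on a data frame.
m_iris <- as.matrix(iris)              # iris has a factor column (Species)
m_cars <- as.matrix(datasets::mtcars)  # original mtcars: all columns numeric

typeof(m_iris)  # "character": one factor column turns the whole matrix into character
typeof(m_cars)  # "double": the matrix stays numeric

# Checking the columns of the data frame itself keeps each column's own type.
# vapply() is like sapply() but requires the result type to be declared.
num_cols <- vapply(iris, is.numeric, logical(1))
num_cols
```

So apply(iris, 2, is.numeric) is really testing columns of a character matrix, while sapply/vapply/lapply iterate over the data frame's columns as they are.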
