
I have a data frame of numerical data and am using apply with median along the columns. I'm getting NA for the median even though there are some non-zero entries in the columns. I ran str(df) to check that every column of the df is integer, and it is. What does it mean when R says the median is NA? Thanks.

v1  v2  v3..... 
1   3   4
0   0   0
.   .   .

Also, I got a bunch of warnings like this: "1: In mean.default(sort(x, partial = half + 0L:1L)[half + ... : argument is not numeric or logical: returning NA"

mbs1
  • http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example It is not clear how you got the error. Based on this example, it works `set.seed(24);df1 <- as.data.frame(matrix(sample(0:5, 10*5, replace=TRUE), ncol=5)); apply(df1, 2, median)` – akrun Apr 18 '15 at 15:11

1 Answer


My solution is trivial, but maybe there are some NAs you did not notice. Try calling apply with na.rm = TRUE as the last argument (it is passed on to median through the ellipsis).

Using the code provided by akrun.

set.seed(24)
df1 <- as.data.frame(matrix(sample(0:5, 10*5, replace=TRUE), ncol=5))
apply(df1, 2, median)

I add an NA:

df1[ 3 , "V2" ] <- NA

and then use sapply (which gives the same result here, because a data frame is a type of list, so sapply works column by column):

sapply(df1, median, na.rm = TRUE)
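
To see the effect of na.rm without depending on the random draw, here is a minimal sketch on a small hand-made data frame. The name `toy` and its values are made up for illustration and are not from the question.

```r
# Hypothetical toy data, not taken from the question
toy <- data.frame(V1 = c(1L, 0L, 4L, 2L),
                  V2 = c(3L, NA, 0L, 5L))

# Without na.rm, any column that contains an NA gives NA as its median
without_na_rm <- sapply(toy, median)

# With na.rm = TRUE, the NA is dropped before the median is computed
with_na_rm <- sapply(toy, median, na.rm = TRUE)

print(without_na_rm)
print(with_na_rm)
```

The same extra argument should work with apply(toy, 2, median, na.rm = TRUE), since apply also forwards additional arguments to the function it calls.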

edit:

Note that str(df1) still reports int for column V2 even though there is an NA at row 3, so an all-integer str() output does not rule out missing values.
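
Since the column type alone will not reveal missing values, one way to look for them is to count NAs per column. This is only a sketch on made-up data; `toy`, `col_classes`, and `na_counts` are hypothetical names used for illustration.

```r
# Hypothetical toy data, not taken from the question
toy <- data.frame(V1 = c(1L, 0L, 4L, 2L),
                  V2 = c(3L, NA, 0L, 5L))

# The column class stays "integer" even though V2 contains an NA
col_classes <- sapply(toy, class)

# Count missing values per column; a non-zero count marks a column
# whose median will be NA unless na.rm = TRUE is used
na_counts <- colSums(is.na(toy))

print(col_classes)
print(na_counts)
```

The class check may be useful for a second reason: if any column turned out to be character or factor, apply would convert the whole data frame to a character matrix, which is one possible way to end up with the "argument is not numeric or logical" warning quoted in the question.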

SabDeM