From searching online and in this group, it seems like this should work:

> mean(r_lab$ozone, na.rm=TRUE)

However, what I get is:

[1] NA
Warning message:
In mean.default(r_lab$ozone, na.rm = TRUE) :
  argument is not numeric or logical: returning NA

This is the contents of that column in the dataset:

> r_lab$Ozone
 [1]  41  36  12  18  NA  28  23  19   8  NA   7  16  11  14
[15]  18  14  34   6  30  11   1  11   4  32  NA  NA  NA  23

I'm sort of flustered.

Zheyuan Li
David
  • What is the result of `class(r_lab$ozone)`? – Rich Scriven May 29 '16 at 17:55
  • Please add a [reproducible example](http://stackoverflow.com/q/5963269/1217536) for people to work with. When I assign your values to a vector & run your code, I get a mean & no error. – gung - Reinstate Monica May 29 '16 at 17:57
  • It could be a factor column. Convert to numeric and it would work. i.e. `mean(as.numeric(as.character(r_lab$ozone)), na.rm=TRUE)` – akrun May 29 '16 at 18:01
  • @ZheyuanLi Yes, that is true, but the warning message usually occurs with factors i.e. `mean(factor(1:5)) [1] NA Warning message: In mean.default(factor(1:5)) : argument is not numeric or logical: returning NA` – akrun May 29 '16 at 18:04
  • "argument is not numeric or logical". That doesn't leave many choices. A factor class is quite likely, IMO. It would be necessary to post the output requested by @RichardScriven – RHertel May 29 '16 at 18:05
  • Class returns integer. I only showed the first 30 rows. There's 153 rows altogether, and more columns. – David May 29 '16 at 18:10
  • @David If you are using `typeof`, it returns `integer` as `factor` is stored internally as `integer` – akrun May 29 '16 at 18:11
  • Please add a reproducible example. There is information in the link above. – gung - Reinstate Monica May 29 '16 at 18:12
  • Wait, you wrote `ozone` in the first one and `Ozone` in the last one. Do you have two columns of the same name with different caps? – Rich Scriven May 29 '16 at 18:12
  • First few rows of all columns > r_lab Ozone Solar.R Wind Temp Month Day 1 41 190 7.4 67 5 1 2 36 118 8.0 72 5 2 3 12 149 12.6 74 5 3 – David May 29 '16 at 18:13
  • Crap. It's case sensitive. `Ozone` worked. `ozone` did not. My apologies, and many thanks. – David May 29 '16 at 18:14
  • Lol. Yeah so you were effectively doing `mean(NULL, na.rm = TRUE)`. Gotta watch out for those caps! – Rich Scriven May 29 '16 at 18:16

2 Answers

Your data is most likely of class character, instead of numeric.

Take a look at these examples:

# Set up some numeric data
x <- c(41, 36, 12, 18, NA, 28, 23, 19,  8, NA,  7, 16, 11, 14, 18, 14, 34,  6, 30, 11,  1, 11,  4, 32, NA, NA, NA, 23)

# Clearly taking the mean on this will work
mean(x, na.rm = TRUE)

[1] 18.13043

However, if your data is of class character, then you get the warning message you report:

y <- as.character(x)
mean(y, na.rm = TRUE)

[1] NA
Warning message:
In mean.default(y, na.rm = TRUE) :
  argument is not numeric or logical: returning NA

So you should convert your data to numeric first, then take the mean:

mean(as.numeric(y), na.rm = TRUE)

[1] 18.13043
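
Since the comments also raise the possibility of a factor column, here is a minimal sketch of how the cases could be told apart before calling `mean()`. The short vectors `x`, `y`, and `f` are made up for illustration and are not the OP's data:

```r
x <- c(41, 36, NA, 18)   # numeric
y <- as.character(x)     # character
f <- factor(y)           # factor, the case suggested in the comments

# Check the class first; mean() only accepts numeric, complex, or logical input
class(x)
class(y)
class(f)
class(NULL)              # what you get from a column name that does not exist

# Pitfall with factors: as.numeric() alone returns the internal level codes
codes  <- as.numeric(f)

# Going through character first recovers the original values
values <- as.numeric(as.character(f))
mean(values, na.rm = TRUE)
```

For a character column `as.numeric()` is enough; for a factor the extra `as.character()` step matters, otherwise the mean is taken over level codes rather than the values.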
Andrie
  • The only problem is that `as.character(x)` displays with double quotes around each entry. This is not what the OP has posted. – RHertel May 29 '16 at 18:14
  • @RHertel Who knows what the OP posted - no reproducible example, so it's all guess work, isn't it. After his last update, it seems (s)he simply made a spelling mistake. Still, this answer is most likely what really happened. – Andrie May 29 '16 at 18:16
I was not aware that R was case-sensitive.

Richard was right, I should have been using `Ozone`, not `ozone`. Thanks to everyone for their help.
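
A minimal sketch of what went wrong, assuming `r_lab` is a copy of R's built-in `airquality` data. The column names and the 153 rows mentioned in the comments suggest this, but it is an assumption:

```r
# Assumption: airquality (shipped with R) stands in for r_lab
r_lab <- airquality

names(r_lab)                  # lists the columns with their exact capitalization
"ozone" %in% names(r_lab)     # FALSE: names are matched case-sensitively
is.null(r_lab$ozone)          # TRUE: `$` silently returns NULL for a missing column

# mean(NULL) is what produces "argument is not numeric or logical: returning NA"
res <- suppressWarnings(mean(r_lab$ozone, na.rm = TRUE))
is.na(res)                    # TRUE

# With the correct capitalization the call works
ozone_mean <- mean(r_lab$Ozone, na.rm = TRUE)
```

Indexing with `r_lab[, "ozone"]` instead stops with an "undefined columns selected" error, which makes this kind of typo harder to miss.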

Sorry, I did not know how to provide reproducible data. What would have been sufficient in this case?
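
One common approach, which I understand the guide linked in the comments also describes, is to post the output of `dput()` for a few rows along with `str()`. A rough sketch, again using `airquality` as a stand-in for `r_lab` (an assumption, as is the helper name `posted`):

```r
# Assumption: airquality stands in for the real r_lab
r_lab <- airquality

# dput() prints R code that recreates the object exactly, so others can paste it
dput(head(r_lab, 3))

# str() shows the real column names and their types in a compact form
str(r_lab)

# Round trip: evaluating the dput() text should give back the same rows
posted  <- paste(capture.output(dput(head(r_lab, 3))), collapse = "\n")
rebuilt <- eval(parse(text = posted))
```

Either output would have shown right away that the column is called `Ozone` and is an integer column.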

David