
I have a vector called yphl. The vector is all 0s and 1s.

When I do:

length(yphl)

the output is 45972.

Then I want to count the 0s and the 1s separately. After looking through Stack Overflow and the R documentation, it's clear to me that the table function should work, but it doesn't. I did:

table(yphl)

The output is 6446 for 0 and 13553 for 1. This obviously does not add up to 45972.

Is there anything else I can try? Is the vector too long to compute? Very confused.

Any help would be greatly appreciated, thanks.

itjcms18
  • Most likely there are missing values in your vector. Read the documentation at `?table` and pay close attention to the various arguments. – joran Sep 20 '14 at 01:45
  • I would say @joran's guess is almost certainly correct, but it's still just a guess. If you want an answer that we can verify to be true, you need to create a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – MrFlick Sep 20 '14 at 01:48

1 Answer


As joran noted in the comments, you probably have NA values in the vector. Here are a couple of examples of how the useNA argument of table works.

> set.seed(1)
> x <- sample(c(0,1,NA), 10, TRUE)
> x
# [1]  0  1  1 NA  0 NA NA  1  1  0

First off, you can find out if the vector contains any NA values with

> anyNA(x)
# [1] TRUE
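
Beyond the yes/no check, it can help to count the missing values directly. A minimal sketch, with the same values as the `x` printed above written out literally so the block does not depend on the random number generator; `n_missing` is just a name introduced here for illustration:

```r
# Same values as the x shown above, typed out by hand.
x <- c(0, 1, 1, NA, 0, NA, NA, 1, 1, 0)

# is.na() returns a logical vector; summing it counts the TRUEs.
n_missing <- sum(is.na(x))
n_missing
# [1] 3

# Non-missing count plus missing count should equal the length.
sum(!is.na(x)) + n_missing == length(x)
# [1] TRUE
```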

From ?table

useNA controls if the table includes counts of NA values: the allowed values correspond to never, only if the count is positive and even for zero counts.

The allowed values are useNA = c("no", "ifany", "always")

Only if the count is positive:

> table(x, useNA = "ifany")
# x
#    0    1 <NA> 
#    3    4    3 
> identical(sum(table(x, useNA = "ifany")), length(x))
# [1] TRUE

Even for zero counts:

> y <- x[!is.na(x)] ## remove NA values from x
> table(y, useNA = "always")
# y
#    0    1 <NA> 
#    3    4    0 
> identical(sum(table(y, useNA = "always")), length(y))
# [1] TRUE

Never:

> table(x, useNA = "no")
# x
# 0 1 
# 3 4
> identical(sum(table(x, useNA = "no")), length(x))
# [1] FALSE 
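
Applied to the question, the same behaviour would account for the mismatch. The sketch below builds a stand-in vector from the counts reported in the question. The real `yphl` is not available, so `yphl_demo` is a hypothetical reconstruction, and it assumes that every element not reported by `table` is `NA`:

```r
# Hypothetical stand-in for yphl: the 0 and 1 counts come from the question,
# and the remaining elements are assumed to be NA.
n_total <- 45972
n_zero  <- 6446
n_one   <- 13553
n_na    <- n_total - n_zero - n_one   # elements table() did not report

yphl_demo <- c(rep(0, n_zero), rep(1, n_one), rep(NA, n_na))

# The default (useNA = "no") drops NA values, so the counts fall short.
sum(table(yphl_demo)) == length(yphl_demo)
# [1] FALSE

# Including the NA count makes the total match the length.
counts <- table(yphl_demo, useNA = "ifany")
sum(counts) == length(yphl_demo)
# [1] TRUE
```

If the real vector behaves the same way, `table(yphl, useNA = "ifany")` should show a third `<NA>` column holding the missing count.
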
Rich Scriven