Counting specific values of binary variable and returning the count only, preferably using dplyr

Question

I need to get the frequency of one specific value of a single dichotomous variable, preferably using dplyr, but I'll gladly accept alternative solutions. It should be as short and straightforward as possible.

Here's the example:

dat<-data.frame(x=c(1,1,0,0,NA,NA))

What I could do is this...

dat %>% group_by(x) %>% summarise(sum(!is.na(x)))

...based on what @akrun suggested in another thread.

The issue with this is that it returns a tibble showing the count for each value x takes:

# A tibble: 3 x 2
      x `sum(!is.na(x))`
  <dbl>            <int>
1    0.                2
2    1.                2
3   NA                 0

What I need is just one number for a specific value of x, say x==1. Adding this condition (x==1) to the dplyr command, however, won't work, as it only returns the same tibble output as above.

Simply put, I need a command that returns the count of x==1 or x==2 and just that. So in this case, the perfect R output would look like this:

[1] 2

I've also tried something like this...

!is.na(dat[,c("x")]==1)

which returns a logical vector that should be TRUE if x==1 and FALSE otherwise. But then I'd still need to count the TRUEs.

Do you need something like `sum(dat$x == 1, na.rm = T)` for example? — AntoniosK, Nov 13 '18 at 16:48
[Counting the number of elements with the values of x in a vector](https://stackoverflow.com/questions/1923273/counting-the-number-of-elements-with-the-values-of-x-in-a-vector) — Henrik, Nov 13 '18 at 16:53
Lol, yes exactly. Somehow I assumed this wouldn't work because I tried `sum(!is.na(...)==1)` which just returns the full length of that variable (except for NAs) and ignores the `==1` command... — Dr. Fabian Habersack, Nov 13 '18 at 16:53
The problem with using `dplyr` for this is that it's made for working with data frames and you don't want a data frame result. So you could do something like `dat %>% summarize(sum(x == 1, na.rm = TRUE)) %>% pull`, but it's simpler to skip `dplyr` altogether and just do what Antonios suggests: `sum(dat$x == 1, na.rm = TRUE)`. (Or maybe you want `sum(dat$x %in% c(1, 2), na.rm = TRUE)`? I can't really tell the specifics.) — Gregor Thomas, Nov 13 '18 at 17:01
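
The base R suggestion from the comments can be sketched as follows, using the example data from the question. The variable names `n_ones` and `n_non_missing` are only illustrative. The last part shows, as far as I can tell, why the `!is.na(...)` attempt counts every non-missing value instead of only the ones.

```r
# Example data from the question
dat <- data.frame(x = c(1, 1, 0, 0, NA, NA))

# Compare first, then sum the TRUEs; na.rm = TRUE drops the NA comparisons
n_ones <- sum(dat$x == 1, na.rm = TRUE)
n_ones
# [1] 2

# Without na.rm the NAs propagate into the result
sum(dat$x == 1)
# [1] NA

# x == 1 is NA only where x itself is NA, so !is.na(x == 1) is TRUE
# for every non-missing value, whether it equals 1 or not
n_non_missing <- sum(!is.na(dat$x == 1))
n_non_missing
# [1] 4
```

Summing a logical vector works because R treats TRUE as 1 and FALSE as 0.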

score 2 · Accepted Answer · answered Nov 13 '18 at 16:53

You can try using `nrow()`:

dat %>% filter(x == 1) %>% nrow()
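
For completeness, here is a self-contained sketch of this approach on the question's data, next to the `summarise()` plus `pull()` variant mentioned in the comments. It assumes dplyr is installed; the names `n_filter`, `n_pull` and the column name `n` are just illustrative.

```r
library(dplyr)

# Example data from the question
dat <- data.frame(x = c(1, 1, 0, 0, NA, NA))

# filter() keeps only rows where the condition is TRUE,
# so rows where x is NA are dropped as well
n_filter <- dat %>% filter(x == 1) %>% nrow()
n_filter
# [1] 2

# Alternative: summarise to a one-row data frame, then pull() out the value
n_pull <- dat %>% summarise(n = sum(x == 1, na.rm = TRUE)) %>% pull(n)
n_pull
# [1] 2
```

Both return a plain number rather than a tibble, which is what the question asks for.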

pooja p

I was about to suggest the same. – Athanasia Mowinckel Nov 13 '18 at 16:55
