I have a data frame with many columns. The first column is the ID, the second is the age group (below 30, 30-40, 40-50, etc.), and the remaining columns are Q1, Q2, Q3, ..., Q50, which hold integer values and NAs. I need to count the non-NA values in each column by age group.

I tried the following, but the `n` in the last line of the code does not work. If I replace `n` with `mean`, it works well.

    library(dplyr)
    df2 <- df %>%
      group_by(age_grp) %>%
      summarise_if(is.integer, n, na.rm = TRUE)

Thanks.

Phil
  • Try `count` instead of `n`. Incidentally, you will find it easier to use the tidyverse if your dataset is tidy. Here, that would mean a tibble with four columns: ID, Agegroup, Question (with values 1 to 50) and Response. Finally, `summarise_if` has been superseded. You should use `across`, available from dplyr v1.0.0. – Limey Feb 19 '21 at 15:00
  • Does this answer your question? [R group by, counting non-NA values](https://stackoverflow.com/questions/41150212/r-group-by-counting-non-na-values) – TarJae Feb 19 '21 at 15:02
  • Thanks much. I tried "count" instead, but it does not work with summarise_if. Can you please advise how I use across and count? Thanks again! – SilverSpringbb Feb 19 '21 at 15:18
  • I also tried this after the `group_by` in dplyr: `summarise_each(funs(sum(!is.na(.))))` and it does not work. – SilverSpringbb Feb 19 '21 at 15:25
  • I also tried this and it does not work either: `summarise(across(where(is.integer), list(n = ~count(.)), na.rm=TRUE))` – SilverSpringbb Feb 19 '21 at 15:49
  • `summarise_each(funs(sum(!is.na(.))))` actually works!! Not sure why it didn't work the first time I ran it. Thanks to you guys!! – SilverSpringbb Feb 19 '21 at 16:19

1 Answer

    library(dplyr)

    # Count the non-NA values in every non-grouping column, per age group
    df_2 <- df %>%
      group_by(age_grp) %>%
      summarise_each(funs(sum(!is.na(.))))
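
Since `summarise_each()` and `funs()` are deprecated in current dplyr, the same count can probably be written with `across()`. The sketch below is only an illustration: the toy `df` and its values are made up, and it assumes the question columns all have names starting with "Q" and that dplyr 1.0.0 or later is installed.

```r
library(dplyr)

# Toy data standing in for the real `df` (values are invented for illustration)
df <- tibble(
  ID      = 1:6,
  age_grp = c("below 30", "below 30", "30-40", "30-40", "30-40", "40-50"),
  Q1      = c(1L, NA, 3L, 4L, NA, 2L),
  Q2      = c(NA, NA, 5L, 1L, 2L, NA)
)

# For each age group, count the non-NA entries in every column whose name starts with "Q".
# !is.na(.x) yields TRUE/FALSE per row; sum() counts the TRUEs.
df_counts <- df %>%
  group_by(age_grp) %>%
  summarise(across(starts_with("Q"), ~ sum(!is.na(.x))), .groups = "drop")

df_counts
```

Selecting with `where(is.integer)` instead of `starts_with("Q")` should also work, but it would pick up an integer `ID` column as well. As for the original attempt, `n()` takes no arguments and counts all rows in the group, NAs included, which is why it cannot simply be swapped in for `mean`.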