
I have a data frame which looks like this:

[image of the data frame]

Here `b` ranges from 1 to 31, and `alpha_1`, `alpha_2` and `alpha_3` can only take the values 0 and 1. Each value of `b` has 1000 observations, so there are 31000 observations in total. I want to group the entire dataset by `b` and count the values in each alpha column only where the value is 1. The end result should have 31 rows (one per unique value of `b`), each holding the number of 1s in each alpha column.

How do I do this in R? I have tried using pipes with dplyr, but nothing seems to work.

Scorpio
  • Sounds like an `aggregate` - `aggregate(. ~ b, data=df, FUN=sum)` – thelatemail Jun 26 '17 at 04:11
  • [Subset columns](https://stackoverflow.com/questions/18587334/subset-data-to-contain-only-columns-whose-names-match-a-condition) and then [sum by group](https://stackoverflow.com/questions/1660124/how-to-sum-a-variable-by-group). – Ronak Shah Jun 26 '17 at 04:27
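
The `aggregate` suggestion in the comments can be tried on a small example. This is only a sketch: the data frame `df` below is a simulated stand-in (3 values of `b` with 4 rows each, rather than 31 with 1000), because the original data is only shown as an image. The column names follow the question.

```r
# Simulated stand-in for the data in the question (an assumption, not the real data).
set.seed(42)
df <- data.frame(
  b       = rep(1:3, each = 4),
  alpha_1 = rbinom(12, 1, 0.5),
  alpha_2 = rbinom(12, 1, 0.5),
  alpha_3 = rbinom(12, 1, 0.5)
)

# Sum every column other than b within each value of b.
# With 0/1 columns, each sum is the number of 1s in that group.
counts <- aggregate(. ~ b, data = df, FUN = sum)
print(counts)
```

Note that `. ~ b` sums every column other than `b`, so if the real data frame has further columns it is probably safer to keep only `b` and the alpha columns first.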

1 Answer


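We can use `group_by` and `summarise_at` from dplyr:

library(dplyr)

# Group by b, then sum every column whose name starts with "alpha".
# Because the alpha columns hold only 0s and 1s, each sum is the count of 1s.
df1 %>%
    group_by(b) %>%
    summarise_at(vars(starts_with("alpha")), sum)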
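
For a self-contained run, here is a sketch on simulated data. The `df1` below is made up (random 0/1 values with the shape described in the question), and the summary is written with `across()`, which dplyr 1.0.0 and later provide as the replacement for the `_at` variants. Treat it as one possible way to write it rather than the only one.

```r
library(dplyr)

# Simulated stand-in for df1 (an assumption): 31 values of b, 1000 rows each.
set.seed(1)
n <- 31 * 1000
df1 <- data.frame(
  b       = rep(1:31, each = 1000),
  alpha_1 = rbinom(n, 1, 0.5),
  alpha_2 = rbinom(n, 1, 0.5),
  alpha_3 = rbinom(n, 1, 0.5)
)

# Same idea as summarise_at(), written with across():
# sum each alpha column within each group of b.
result <- df1 %>%
    group_by(b) %>%
    summarise(across(starts_with("alpha"), sum))

dim(result)  # 31 rows, 4 columns
```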
akrun
  • Thank you for the response. This does seem to return the desired result, but I have a question: how does this command check that the alpha column values are 1, i.e. that we are only counting 1s and not 0s? – Scorpio Jun 26 '17 at 04:17
  • @Scorpio Just check the output of `sum(c(1, 1, 1, 0))`. When we take the sum, 0 + any number = that number, i.e. 0 has no effect. – akrun Jun 26 '17 at 04:19
  • Yup, I must have been overthinking this. It was such a simple solution. Thank you again! – Scorpio Jun 26 '17 at 04:22
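
The point made in the comments, that summing a 0/1 column gives the same number as counting its 1s, can be checked directly. A minimal base R sketch, using the vector from the comment (named `x` here for illustration):

```r
# The 0/1 vector from the comment above.
x <- c(1, 1, 1, 0)

total  <- sum(x)       # adds the values; the 0 contributes nothing
n_ones <- sum(x == 1)  # counts the TRUE entries of the comparison

print(total)   # 3
print(n_ones)  # 3
```

If a column could ever hold values other than 0 and 1, `sum(x == 1)` would still count only the 1s, whereas `sum(x)` would not.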