I'm trying to use dplyr to summarize some data and can't work out how to sum the values in a column within each group. Normally I'd use tally(), but here I want to add up the 1s and 0s in a column rather than count rows, so tally() isn't appropriate.

My data looks something like this:

  subj | child | child_age | older | younger
     1 |     1 |       374 |     0 |       1
     1 |     2 |       465 |     1 |       0
     2 |     1 |       573 |     1 |       0
     2 |     2 |       583 |     1 |       0
     2 |     3 |       172 |     0 |       1

So, I want to create a dataset that shows, for each subj, how many 'older' children and how many 'younger' children they have. This should look something like this:

  subj | n_child | older | younger
     1 |       2 |     1 |       1
     2 |       3 |     2 |       1

This is the code I've used so far:

  child_ages <- data %>%
    group_by(subj) %>%
    mutate(nOlder = sum(older),
           nYounger = sum(younger)) %>%
    ungroup()

I've also tried summarize() in place of mutate(). Both appear to ignore my group_by() call and just give me totals across the whole dataset rather than per subj.
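For reference, here is a minimal reproducible sketch of the per-subject summary described above. The data frame is rebuilt from the example table, and the explicit `dplyr::` prefixes are an assumption: they guard against another attached package (plyr is a common one) masking `summarise()` or `mutate()`, which can make `group_by()` look as if it is being ignored. Whether that is the cause here is only a guess.

```r
library(dplyr)

# Example data rebuilt from the table in the question
data <- data.frame(
  subj      = c(1, 1, 2, 2, 2),
  child     = c(1, 2, 1, 2, 3),
  child_age = c(374, 465, 573, 583, 172),
  older     = c(0, 1, 1, 1, 0),
  younger   = c(1, 0, 0, 0, 1)
)

# One row per subj: number of children, plus the sums of the 0/1 columns.
# The dplyr:: prefixes are a precaution against masked function names.
child_ages <- data %>%
  dplyr::group_by(subj) %>%
  dplyr::summarise(
    n_child = dplyr::n(),
    older   = sum(older),
    younger = sum(younger)
  )

print(child_ages)
```

If this gives one row per subj while the unprefixed version gives a single row of totals, running `conflicts()` should list which function names are being masked.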

Many thanks!

Catherine Laing