Aggregate Columns by multiple conditions

Question

I would like to aggregate this data frame, in which each Family Size has six categories of Hours Worked.

families <- structure(list(`Family Size` = c(2L, 2L, 2L, 2L, 2L, 2L, 2L,13L, 13L, 13L), HoursLess20 = c("1,014", "1,041", "11", "3","1", "2", "1", "0", "0", "0"), Hours2024 = c(7L, 298L, 1L, 0L,0L, 0L, 0L, 0L, 0L, 0L), Hours2529 = c(1L, 34L, 0L, 0L, 0L, 0L,0L, 0L, 0L, 0L), Hours3034 = c(6L, 44L, 1L, 0L, 0L, 0L, 0L, 0L,0L, 0L), Hours3539 = c(4L, 46L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), Hours40plus = c(9L, 128L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L)), .Names = c("Family Size","HoursLess20", "Hours2024", "Hours2529", "Hours3034", "Hours3539","Hours40plus"), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 1977L,1978L, 1979L), class = "data.frame")

How do you want them aggregated? The sum? The average? — G5W, Feb 28 '17 at 17:45

score 1 · Accepted Answer · answered Feb 28 '17 at 17:59

First of all, the values in HoursLess20 are currently stored as strings (because of the commas used as thousands separators). To do any sort of numerical aggregation, you will need to strip the commas and convert the column to numeric.

families$HoursLess20 = as.numeric(gsub(",", "", families$HoursLess20))

Once you have done that, you can use the aggregate function to compute whatever summary you want for each Family Size.

## Sum
aggregate(families[,-1], list(families[,1]), sum)
  Group.1 HoursLess20 Hours2024 Hours2529 Hours3034 Hours3539 Hours40plus
1       2        2073       306        35        51        50         138
2      13           0         0         0         0         0           0

## Average
aggregate(families[,-1], list(families[,1]), mean)
  Group.1 HoursLess20 Hours2024 Hours2529 Hours3034 Hours3539 Hours40plus
1       2    296.1429  43.71429         5  7.285714  7.142857    19.71429
2      13      0.0000   0.00000         0  0.000000  0.000000     0.00000
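The same approach can be sketched on a small stand-in data frame. The values and the column name FamilySize below are made up for illustration and are not the asker's data. Passing a named list as the by argument gives the grouping column a readable name in place of the default Group.1.

```r
# Small stand-in data frame (hypothetical values, not the data from the question)
df <- data.frame(
  FamilySize  = c(2L, 2L, 3L),
  HoursLess20 = c("1,014", "11", "5"),
  Hours2024   = c(7L, 1L, 2L),
  stringsAsFactors = FALSE
)

# Strip the thousands separators, then convert the column to numeric
df$HoursLess20 <- as.numeric(gsub(",", "", df$HoursLess20))

# Naming the list element labels the grouping column "FamilySize"
# instead of the default "Group.1"
totals <- aggregate(df[, -1], by = list(FamilySize = df$FamilySize), FUN = sum)
print(totals)
```

Swapping FUN = sum for FUN = mean gives per-group averages in the same layout.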

I don't know why that happened, it must've happened when I reproduced it for SO, but thanks. — Jeremy R. Johnson, Mar 03 '17 at 15:45
