
Here is some sample code:

dat = data.frame(income = c(100,200,300,400,500,600), 
                 sex = c("M","M","M", "F","F","F"), 
                 num.kid = c(1,2,3,1,2,3))

I want to produce a 2-dimensional table that summarizes key statistics (e.g. mean and var) of the income distribution by sex and num.kid.

For example, table(dat$sex, dat$num.kid) would give me a 2x3 table with sex as rows and num.kid as columns, but the table would be filled with the count of those combinations. How can I bring a third variable (e.g. income) into the table? How can I fill the table with mean or var of income by sex and num.kid? This is almost like filling out an Excel pivot table using R code.

Felix T.
Ruser
    Hi Ruser, could you please include a `data.frame` of the expected results? – Felix T. May 10 '19 at 22:34
    Sounds like you really need something like `dplyr::group_by`/`dplyr::summarize`, base R's `by` or `ave`, or something similar from `data.table`? – r2evans May 10 '19 at 22:57

1 Answer


Here's a sample using your data:

library(dplyr)
dat %>% 
  group_by(sex) %>%  
  summarise(mean = mean(income), 
            var = var(income),
            sd = sd(income))

You can put multiple fields in the group_by statement.
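To make that last point concrete, here is a minimal sketch that groups by both sex and num.kid. It assumes dplyr 1.0.0 or later (for the `.groups` argument); the name `by_cell` is only illustrative. Note that the sample data has a single observation per sex/num.kid combination, so `var` and `sd` come back as `NA` for every group.

```r
library(dplyr)

dat <- data.frame(income = c(100, 200, 300, 400, 500, 600),
                  sex = c("M", "M", "M", "F", "F", "F"),
                  num.kid = c(1, 2, 3, 1, 2, 3))

# One row per sex x num.kid combination (long format, not a 2x3 grid)
by_cell <- dat %>%
  group_by(sex, num.kid) %>%
  summarise(mean = mean(income),
            var = var(income),   # NA here: each group has only one row
            sd = sd(income),     # NA for the same reason
            .groups = "drop")

by_cell
```

The result is in long format, with one row per combination rather than a sex-by-num.kid grid.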

Ryan John
    Ugh, I know you can use the dplyr package, but I was wondering if there is a more basic command to do this. – Ruser May 10 '19 at 23:16
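
For a base R route, one possible sketch uses `tapply`, which returns a matrix laid out like `table(dat$sex, dat$num.kid)` but filled with a statistic of income instead of counts. The names `mean_tab`, `var_tab`, and `agg` are only illustrative. With the sample data each cell holds a single observation, so the variance table is all `NA`.

```r
dat <- data.frame(income = c(100, 200, 300, 400, 500, 600),
                  sex = c("M", "M", "M", "F", "F", "F"),
                  num.kid = c(1, 2, 3, 1, 2, 3))

# 2x3 matrix: sex as rows, num.kid as columns, mean income in each cell
mean_tab <- tapply(dat$income, list(sex = dat$sex, num.kid = dat$num.kid), mean)

# Same layout with the variance; every cell is NA here (one row per cell)
var_tab <- tapply(dat$income, list(sex = dat$sex, num.kid = dat$num.kid), var)

# Long-format alternative: one row per sex x num.kid combination
agg <- aggregate(income ~ sex + num.kid, data = dat, FUN = mean)

mean_tab
```

`tapply` is the closest match to the pivot-table layout described in the question, while `aggregate` gives the same numbers as a data frame.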