Suppose we have a data.frame named `IND` with a column called `dept`. It has 100 rows in total, and `dept` takes 20 distinct values.

I would like to obtain summary statistics for each of the 20 subsets of the data.frame defined by `dept` (5 rows each), working from the main data.frame.

`summary(IND)` gives the summary statistics for the whole dataset, but how do I get them per group?

Vincent Bonhomme
rrsa
  • `by(IND, IND$dept, summary)`? – scoa Aug 26 '15 at 14:23
  • You may also check `?summaryBy` from `library(doBy)` – akrun Aug 26 '15 at 14:24
  • Thank you very much, scoa. That gave the required result! – rrsa Aug 26 '15 at 14:30
  • @akrun it is probably a duplicate, but I don't think this is the right question to point to. The answers over there are only about summarizing *one* variable by groups, whereas the OP wants to summarize *every* variable in the data.frame, by group. The first answer with `tapply` doesn't work here for instance – scoa Aug 26 '15 at 14:33
  • @scoa I reopened. As the OP didn't provide an input example/expected output, it is not that clear. There was also a `summaryBy` option in the link. – akrun Aug 26 '15 at 14:36
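
For illustration, the `by()` suggestion from the comments can be tried on made-up data. This is only a sketch: the data frame below is a hypothetical stand-in for `IND`, and the columns `salary` and `age` are assumptions, since the question does not show the real columns.

```r
# Hypothetical stand-in for IND: 100 rows, 20 departments, 5 rows each.
# The salary and age columns are invented for the example.
set.seed(1)
IND <- data.frame(
  dept   = rep(sprintf("D%02d", 1:20), each = 5),
  salary = round(runif(100, 30000, 90000)),
  age    = sample(22:60, 100, replace = TRUE)
)

# One summary() table per department, covering every column.
per_dept <- by(IND, IND$dept, summary)
print(per_dept)

# Same idea, but returned as a plain named list that is easy to index.
per_dept_list <- lapply(split(IND, IND$dept), summary)
per_dept_list[["D01"]]  # summary of the 5 rows where dept == "D01"
```

`by()` prints all groups at once, while the `split()` plus `lapply()` form may be more convenient when the per-group summaries need to be processed further.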

1 Answer

Something like this, using `dplyr` (load it with `library(dplyr)` first):

mtcars %>% group_by(cyl) %>% summarise_each(funs(sum, mean))

can be adapted to your case as:

IND %>% group_by(dept) %>% summarise_each(funs(sum, mean))
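
Note that `summarise_each()` and `funs()` are deprecated in recent `dplyr` releases, and `sum` will fail on non-numeric columns. A possible modern equivalent, assuming `dplyr` 1.0 or later, uses `across()`. The data frame here is again a hypothetical stand-in for `IND`, with invented `salary` and `age` columns:

```r
library(dplyr)

# Hypothetical stand-in for IND: 100 rows, 20 departments, 5 rows each.
set.seed(1)
IND <- data.frame(
  dept   = rep(sprintf("D%02d", 1:20), each = 5),
  salary = round(runif(100, 30000, 90000)),
  age    = sample(22:60, 100, replace = TRUE)
)

# Sum and mean of every numeric column, computed separately per dept.
# Output columns are named <column>_<function>, e.g. salary_mean.
res <- IND %>%
  group_by(dept) %>%
  summarise(
    across(where(is.numeric), list(sum = sum, mean = mean)),
    .groups = "drop"
  )

res
```

Restricting to `where(is.numeric)` is a design choice for the sketch; other summary functions (for example `min`, `max`, `sd`) can be added to the list in the same way.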
Prradep