Calculate mean of multiple rows using grouping variables

Question

I am trying to calculate an overall mean across multiple classes. The database is currently in long format. I tried grouping first by ID number (grouping variable 1), then by a dummy variable marking the classes I am interested in (stem = 1; grouping variable 2), and then calculating a single GPA mean (i.e., the STEM GPA) from the grades received in those classes (stem = 1).

I have attached an example of the database below. Overall, I am trying to figure out how to calculate the STEM GPA for each student.

See example here

I have tried `library(psych)` with `describeBy(data, dataset$id, dataset$stem)`, but to no avail. Any suggestions?

Easy in base R; try `help('aggregate')` to get you started. – Stephen Henderson Feb 16 '16 at 22:03
In addition to `aggregate`, `?ave` could also be useful. – RHertel Feb 16 '16 at 22:04
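
To make the two suggestions above concrete, here is a minimal sketch in base R. The data frame is made up for illustration: `id` and `stem` are the column names mentioned in the question, while `grade` is an assumed name for the grade column.

```r
# Hypothetical long-format data: one row per student per class.
# `id` and `stem` come from the question; `grade` is an assumed column name.
dataset <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2),
  stem  = c(1, 1, 0, 1, 0, 1),
  grade = c(4.0, 3.0, 2.0, 3.5, 4.0, 2.5)
)

# aggregate(): keep only the STEM rows, then average grade within each id.
# The result has one row per student.
stem_gpa <- aggregate(grade ~ id, data = subset(dataset, stem == 1), FUN = mean)
print(stem_gpa)

# ave(): returns one value per original row (the group mean repeated),
# so it adds a column without collapsing the data.
dataset$gpa_by_id_stem <- ave(dataset$grade, dataset$id, dataset$stem, FUN = mean)
print(dataset)
```

`aggregate` is the better fit if you want one STEM GPA per student; `ave` is useful if you want to keep the long format and attach the group mean to every row.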

score 1 · Accepted Answer · answered Feb 16 '16 at 22:09

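I prefer the dplyr package for these operations. Try e.g. the following, replacing `id` and `grade` with your own grouping column and the column you want to average:

 df %>% group_by(id) %>% summarise(mean_grade = mean(grade))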

For instance, using the mtcars dataset:

 library(dplyr)
 mtcars %>% group_by(cyl) %>% summarise(mean_disp = mean(disp))

This gives the mean of `disp` for each value of the grouping variable `cyl`.
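
Applied to the question, the same pattern could look roughly like the sketch below. The data are invented for illustration; `id` and `stem` are the names used in the question, and `grade` is an assumed name for the column holding the grades.

```r
library(dplyr)

# Hypothetical long-format data: one row per student per class.
# `grade` is an assumed column name; adjust it to match your data.
dataset <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2),
  stem  = c(1, 1, 0, 1, 0, 1),
  grade = c(4.0, 3.0, 2.0, 3.5, 4.0, 2.5)
)

stem_gpa <- dataset %>%
  filter(stem == 1) %>%               # keep only the STEM classes
  group_by(id) %>%                    # one group per student
  summarise(stem_gpa = mean(grade))   # one mean per group

print(stem_gpa)
```

Filtering before grouping means students with no STEM classes simply drop out of the result; if you need them to appear with `NA`, you would have to join the result back onto the full list of ids.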

answered Feb 16 '16 at 22:09

coffeinjunky


Thank you! Can you clarify the meaning of %>% ? Thank you! – Tawk_Tomahawk Feb 16 '16 at 22:43
That is the so-called pipe operator. It takes whatever is on the left and passes it as the first argument to whatever is on the right. For instance, `mtcars %>% group_by(cyl)` should be read "take the dataset `mtcars`, then group it by cylinder count, then...". It is equivalent to the command `group_by(mtcars, cyl)`, since the first argument of `group_by` is a data frame. – coffeinjunky Feb 17 '16 at 08:57
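
As a small illustration of that equivalence, the sketch below writes the `mtcars` example from the answer twice, once with pipes and once as nested function calls. The variable names `piped` and `nested` are only for this example.

```r
library(dplyr)

# Piped form: read left to right as "take mtcars, then group, then summarise".
piped <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_disp = mean(disp))

# Nested form: the same calls, with each left-hand side passed
# explicitly as the first argument of the next function.
nested <- summarise(group_by(mtcars, cyl), mean_disp = mean(disp))

# Compare the two results; they should agree.
print(all.equal(piped, nested))
```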

On a more general note, this problem belongs to the `split-apply-combine` theme. If you google this, you will find many more ways to do the above. Moreover, you seem pretty new to Stack Overflow, and that is OK; I have been there myself. ;) I just want to say that it is generally advisable to post a `minimal reproducible example` (google it) together with the desired output when posting here. – coffeinjunky Feb 17 '16 at 09:04
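
For readers unfamiliar with the term, here is one way the three split-apply-combine steps might be spelled out by hand in base R. The data frame is hypothetical, and `grade` is an assumed column name (`id` and `stem` are from the question).

```r
# Hypothetical long-format data; `grade` is an assumed column name.
dataset <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2),
  stem  = c(1, 1, 0, 1, 0, 1),
  grade = c(4.0, 3.0, 2.0, 3.5, 4.0, 2.5)
)

stem_rows <- subset(dataset, stem == 1)

# Split: a list with one vector of STEM grades per student id.
by_id <- split(stem_rows$grade, stem_rows$id)

# Apply + combine: take the mean of each piece and collect the
# results into a named vector (names are the ids).
stem_gpa <- sapply(by_id, mean)
print(stem_gpa)
```

Functions such as `aggregate` and the dplyr `group_by`/`summarise` pair wrap these same steps in a single call.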
