I am trying to calculate an overall mean across multiple classes. The data is currently in long format. I tried grouping first by ID number (grouping variable 1) and then by a dummy variable marking the classes I am interested in (stem = 1; grouping variable 2), and then calculating a single GPA mean (i.e., the STEM GPA) from the grades received in those classes (stem = 1).

I have attached an example of the database below. Overall, I am trying to figure out how to calculate the STEM GPA for each student.

See example here

I have tried `library(psych)` with `describeBy(data, dataset$id, dataset$stem)`, but to no avail. Any suggestions?

Tawk_Tomahawk

1 Answer

I prefer the dplyr package for these operations. Try, e.g. (with `value` standing in for the column you want to average):

 df %>% group_by(class) %>% summarise(mean_value = mean(value))

For instance, using the mtcars dataset:

 library(dplyr)
 mtcars %>% group_by(cyl) %>% summarise(mean_disp = mean(disp))

will give you the mean of disp for each value of the grouping variable cyl.
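Applied to the question, the same pattern might look like the sketch below. It assumes a long-format data frame with the columns `id` and `stem` mentioned in the question, plus a hypothetical numeric `grade` column; the toy data is invented purely for illustration.

```r
library(dplyr)

# Toy long-format data: one row per student-class pair.
# `grade` is a hypothetical column name; the values are made up.
grades <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2),
  stem  = c(1, 1, 0, 1, 0, 1),
  grade = c(4.0, 3.0, 2.0, 3.5, 4.0, 2.5)
)

stem_means <- grades %>%
  filter(stem == 1) %>%              # keep only the STEM classes
  group_by(id) %>%                   # one group per student
  summarise(stem_gpa = mean(grade))  # mean grade within each student

stem_means
```

Note that a student with no STEM rows would be dropped by `filter()` and so would not appear in the result.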

coffeinjunky
  • Thank you! Can you clarify the meaning of `%>%`? – Tawk_Tomahawk Feb 16 '16 at 22:43
  • That is the so-called pipe operator. It takes whatever is on the left and passes it as the first argument to whatever is on the right. For instance, `mtcars %>% group_by(cyl)` should be read as "take the dataset `mtcars`, then group it by number of cylinders, then ...". It is equivalent to the command `group_by(mtcars, cyl)`, since the first argument of `group_by` is a data frame. – coffeinjunky Feb 17 '16 at 08:57
  • On a more general note, this problem belongs to the `split-apply-combine` theme. If you google this, you will find many more ways to do the above. Moreover, you seem pretty new to Stack Overflow, and that is OK; I have been there myself. ;) I just want to say that it is generally advisable to post a `minimal reproducible example` (google it) together with the desired output when posting here. – coffeinjunky Feb 17 '16 at 09:04
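To illustrate the split-apply-combine idea mentioned in the comments, here is a minimal sketch of three base R ways to compute the same per-group means as the dplyr example, using the built-in `mtcars` data. The object names `agg`, `tap`, and `sac` are chosen only for this example.

```r
# aggregate(): formula interface, returns a data frame with one row per cyl.
agg <- aggregate(disp ~ cyl, data = mtcars, FUN = mean)

# tapply(): returns a named array with one entry per cyl.
tap <- tapply(mtcars$disp, mtcars$cyl, mean)

# Explicit split -> apply -> combine:
# split disp by cyl, take the mean of each piece, combine into a named vector.
sac <- sapply(split(mtcars$disp, mtcars$cyl), mean)

agg
```

All three give the same group means; which one to use is mostly a matter of the output shape you want (data frame versus named vector).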