Calculating mean of each column and reshaping

Question

I have a csv dataset for student of ID 229, where:

StudID            Score            Weight
 229                65               51
 229                45               43
 229                82               79

and I'm going to calculate the mean of score and weight of student, where I want to obtain as below:

Measurements   Mean Value   #Measurements and mean value are new column names

Score          64         #score and weight which used to be column names are now under measurements
Weight         57.67

So what i did so far is as below:

stud_data <- read.csv("student_weight.csv")
stud_mean <- colMeans(stud_data[2:3])    #finding the mean of only score and weight

which gives me the below when i print stud_mean:

Score               Weight
 64                  57.67

Is it possible to format output in a way i would obtain:

Measurements        Mean Value          #Measurements and Mean value are new column names

   Score                64
   Weight               57.67

Maurits Evers · Answer 1 · 2019-04-11T01:55:43.193

1

I'm not entirely sure I understood what you're trying to do. Are you asking how to aggregate multiple columns at once? Or is this about how to reshape your aggregate data?

Regarding the former, in base R you can do

aggregate(cbind(Score, Weight) ~ StudID, df, FUN = mean)
#  StudID Score   Weight
#1    229    64 57.66667

Regarding the latter, you can use stack to reshape

stack(aggregate(cbind(Score, Weight) ~ StudID, df, FUN = mean))
#     values    ind
#1 229.00000 StudID
#2  64.00000  Score
#3  57.66667 Weight

Here I assume your actual data contains data for more than one StudID, so you probably want to aggregate data by StudID.

Sample data

df <- read.table(text =
    "StudID            Score            Weight
 229                65               51
 229                45               43
 229                82               79", header = T)

edited Apr 11 '19 at 01:55

answered Apr 11 '19 at 01:05

Maurits Evers

49,617
4
47
68

would aggregating based on StudID be stack(aggregate(cbind(Score,Weight), by=list(stud_data$StudID), df, FUN=mean)) in this case? – Maxxx Apr 11 '19 at 01:28
@Maxxx Just `stack(aggregate(cbind(Score,Weight) ~ StudID, df, FUN=mean))`, see the second part of my answer where I'm doing exactly that. I'm assuming that `StudID` is a column of your `data.frame`, as per your sample data. – Maurits Evers Apr 11 '19 at 01:54

score 1 · Answer 2 · answered Apr 11 '19 at 01:06

1

This is basically a very short Wide-to-Long problem, and the easiest way is to use tidyr::gather:

data.frame('Score'=64, 'Weight'=57.67) %>%
    tidyr::gather('Measurements', 'Value')

  Measurements Value
1        Score 64.00
2       Weight 57.67

answered Apr 11 '19 at 01:06

divibisan

11,659
11
40
58

score 1 · Answer 3 · answered Apr 11 '19 at 01:07

1

You could use stack after colMeans

stack(colMeans(df[2:3]))
#  values    ind
#1 64.00000  Score
#2 57.66667 Weight

To assign column name, we could use setNames

setNames(stack(colMeans(df[2:3])), c("Mean_Value", "Measurements"))

#    Mean_Value  Measurements
#1     64.00000        Score
#2     57.66667        Weight

answered Apr 11 '19 at 01:07

Ronak Shah

377,200
20
156
213

would it be possible if it's Measurements first followed by Mean Value in the second answer? – Maxxx Apr 11 '19 at 02:47
@Maxxx It's just about rearranging the columns. You could do `setNames(stack(colMeans(df[2:3]))[2:1], c("Measurements", "Mean_Value"))` Or `setNames(stack(colMeans(df[2:3])), c("Mean_Value", "Measurements"))[2:1]` – Ronak Shah Apr 11 '19 at 02:52

Calculating mean of each column and reshaping

3 Answers3

Sample data