I have a data frame df
with more than 500 rows and 11 columns (sample below)
Subject Group RA TA Trial VA1 VA2 VA3 VB1 VB2 VB3
1 CAI 0 1 1 6.289 5.924 7.194 4.058 9.770 -3.479
1 CAI 0 1 2 10.512 6.376 8.058 1.905 8.031 -10.698
1 CAI 0 1 3 6.554 5.046 9.482 1.825 7.152 -9.608
1 CAI 0 2 1 8.235 3.006 8.054 2.816 9.911 -4.125
1 CAI 0 2 2 6.744 2.176 9.751 3.566 12.217 -8.225
2 CAI 45 1 1 5.321 5.781 7.821 4.402 6.420 4.125
2 CAI 45 2 1 7.594 4.472 8.053 5.226 5.417 2.481
2 CAI 45 3 1 5.918 5.771 10.154 4.602 5.727 2.092
2 CAI 45 3 2 6.884 5.292 9.180 4.323 4.449 2.191
I am trying to reduce the data set by getting the average of all the trials per condition. In this case, the "Trial" column indicates the trial number for the condition. So for the condition "Subject = 1 , Group = CAI, RA = 0, TA = 1," there are three trials, and I want to get the averages of VA1
to VB3
for the three trials.
However, number of trials are not consistent per condition.
I wanted to see something like this:
Subject Group RA TA VA1 VA2 VA3 VB1 VB2 VB3
1 CAI 0 1 7.785 5.782 8.244 2.596 8.317 -7.928
1 CAI 0 2 7.490 2.591 8.902 3.191 11.064 -6.175
2 CAI 45 1 5.321 5.781 7.821 4.402 6.420 4.125
2 CAI 45 2 7.594 4.472 8.053 5.226 5.417 2.481
2 CAI 45 3 6.401 5.532 9.667 4.463 5.088 2.142
This is the code I've currently come up with. It seems to work, however, there might be room to improve, and the "Group" column is missing, so I wanted to include it in the output data frame.
attach(df)
factor(Subject)
factor(Group)
factor(RA)
factor(TA)
ndf = c()
SubjList <- list(1,2)
RAList <- list(0, 45, -45)
TAList <- list(1,2,3,4)
for (S in SubjList){
for (R in RAList){
for (T in TAList){
TEMP <- subset(M, (Subject == S & RA == R & TA ==T ))
mVA1 = mean(TEMP$VA1)
mVA2 = mean(TEMP$VA2)
mVA3 = mean(TEMP$VA3)
mVB1 = mean(TEMP$VB1)
mVB2 = mean(TEMP$VB2)
mVB3 = mean(TEMP$VB3)
lst = c(S, R, T, mVA1, mVA2, mVA3, mVB1, mVB2, mVB3)
ndf <- rbind(ndf, lst)
}
}
}
I would love to know what you think, and thank you in advance for your help!