-1

I need to average a large number of columns based on the name in another column. My matrix looks like this (with separate unique row names):

Names       X1  Y1  Z1  X2  Y2  Z2
P.maccus    4   2   2   6   5   3
P.maccus    6   5   3   7   6   5
P.maccus    8   3   2   8   7   3
A.ammophius 3   6   2   7   5   5
P.sabaji    2   5   3   8   4   5
P.sabaji    4   6   3   9   6   5
P.sabaji    5   7   2   8   7   3
P.sabaji    3   5   3   9   5   4

I need to average each row to look like this:

Names       X1  Y1      Z1      X2  Y2  Z2
P.maccus    6   3.33    2.33    7   6   3.66
A.ammophius 3   6       2       7   5   5
P.sabaji    3.5 5.75    2.75    8.5 5.5 4.25

Can anyone help? Thank you!

Karolis Koncevičius
  • 9,417
  • 9
  • 56
  • 89
  • Your question is a simpler form of that. You want to use `groupby` and `summarize` from the `dplyr` package – divibisan Jun 14 '18 at 21:46

2 Answers2

1

This is pretty easy with dplyr. you can do

dd %>% group_by(Names) %>% summarize_all(mean)

tested with the following data

dd<-read.table(text="Names       X1  Y1  Z1  X2  Y2  Z2
P.maccus    4   2   2   6   5   3
P.maccus    6   5   3   7   6   5
P.maccus    8   3   2   8   7   3
A.ammophius 3   6   2   7   5   5
P.sabaji    2   5   3   8   4   5
P.sabaji    4   6   3   9   6   5
P.sabaji    5   7   2   8   7   3
P.sabaji    3   5   3   9   5   4", header=TRUE)
MrFlick
  • 195,160
  • 17
  • 277
  • 295
0

You can use aggregate() for that.

Assuming your data matrix is in variable named df:

aggregate(. ~ Names, data=df, FUN=mean)

        Names  X1       Y1       Z1  X2  Y2       Z2
1 A.ammophius 3.0 6.000000 2.000000 7.0 5.0 5.000000
2    P.maccus 6.0 3.333333 2.333333 7.0 6.0 3.666667
3    P.sabaji 3.5 5.750000 2.750000 8.5 5.5 4.250000
Karolis Koncevičius
  • 9,417
  • 9
  • 56
  • 89