I have a data frame like this:

Col_name  Col_X    Col_Y    Col_Z
BoB         2        3          3    
BoB         3        4          3
Carl        4        5          2
Carl        2        3          3
Eva         5        2          5
Bob         1        1          2

I want to get the mean of each column, grouped by name, so that I end up with this data frame:

Col_name   Col_X    Col_Y      Col_Z
 BOB       2        2.33       2.33
 Carl      3        4          2.5
 Eva       5        2          5

Does anyone know how to do this?

ArunK
user5543269

4 Answers

Here is one approach with dplyr. Since your names differ in case (BoB, Bob), I am not sure how you got your output, so I convert them all to lower case first to get the desired result:

library(dplyr)
df %>%
  mutate(Col_name = tolower(Col_name)) %>%
  group_by(Col_name) %>%
  summarise_each(funs(mean))

Output as follows:

Source: local data frame [3 x 4]

  Col_name Col_X    Col_Y    Col_Z
     <chr> <dbl>    <dbl>    <dbl>
1      bob     2 2.666667 2.666667
2     carl     3 4.000000 2.500000
3      eva     5 2.000000 5.000000
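
For readers on a newer dplyr: summarise_each() and funs() have since been deprecated, so here is a roughly equivalent sketch using across(). It assumes dplyr 1.0.0 or later; the data frame is rebuilt from the question, and the name res is only for illustration.

```r
library(dplyr)

# Data copied from the question (note the mixed-case "BoB" / "Bob")
df <- data.frame(
  Col_name = c("BoB", "BoB", "Carl", "Carl", "Eva", "Bob"),
  Col_X    = c(2, 3, 4, 2, 5, 1),
  Col_Y    = c(3, 4, 5, 3, 2, 1),
  Col_Z    = c(3, 3, 2, 3, 5, 2)
)

res <- df %>%
  mutate(Col_name = tolower(Col_name)) %>%  # normalise case so all "bob" rows share one group
  group_by(Col_name) %>%
  summarise(across(everything(), mean))     # mean of every non-grouping column

res
```

Inside summarise(), everything() skips the grouping column, so only Col_X, Col_Y and Col_Z are averaged.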
Gopala

Use the dplyr package and do the following:

library(dplyr)
df <- data.frame(Col_name = c("Bob", "Bob", "Carl", "Carl"), 
                 Col_X = c(2,3,4,2), Col_Y = c(3,4,5,3))
df %>% group_by(Col_name) %>% summarise_each(funs(mean(.)))

This takes the data (df), groups it by the column Col_name, and then applies the function mean to each remaining column within each distinct group.

Output:

Source: local data frame [2 x 3]

  Col_name Col_X Col_Y
    (fctr) (dbl) (dbl)
1      Bob   2.5   3.5
2     Carl   3.0   4.0
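
If you would rather not load a package, the same group-then-average idea can probably be sketched in base R with aggregate(). This is an alternative to the dplyr call, not what the code above does internally; the name res is only for illustration.

```r
# Same example data as in this answer
df <- data.frame(Col_name = c("Bob", "Bob", "Carl", "Carl"),
                 Col_X = c(2, 3, 4, 2), Col_Y = c(3, 4, 5, 3))

# "." on the left-hand side means "all columns other than Col_name";
# FUN = mean is applied to each of them within every Col_name group
res <- aggregate(. ~ Col_name, data = df, FUN = mean)

res
```

The formula interface drops rows with missing values by default, so results can differ from the dplyr version if the data contains NA.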
Martin Schmelzer

Using dplyr:

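library(dplyr)

# Example data with random values; runif() draws differ on every run
a <- data.frame(Col_name = c(rep("Bob", 2), rep("Carl", 2), "Eva", "Bob"),
                Col_X = runif(6),
                Col_Y = runif(6),
                Col_Z = runif(6))

# Mean of every column within each Col_name group
a %>%
  group_by(Col_name) %>%
  summarise_each(funs(mean))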
Manuel R

With the data.table package you can do:

# creating example data
library(data.table)
dt <- data.table(Col_name = c("Bob", "Bob", "Carl", "Carl"), Col_X = c(2,3,4,2), Col_Y = c(3,4,5,3))

# aggregate
dt[, lapply(.SD, mean), by = Col_name]

which returns:

   Col_name Col_X Col_Y
1:      Bob   2.5   3.5
2:     Carl   3.0   4.0
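
The example above uses consistently cased names. For the data in the question, where "BoB" and "Bob" both appear, a minimal sketch would lower-case the names first and then aggregate. The table is rebuilt from the question, and the name res is only for illustration.

```r
library(data.table)

# Data copied from the question (mixed-case "BoB" / "Bob")
dt <- data.table(
  Col_name = c("BoB", "BoB", "Carl", "Carl", "Eva", "Bob"),
  Col_X    = c(2, 3, 4, 2, 5, 1),
  Col_Y    = c(3, 4, 5, 3, 2, 1),
  Col_Z    = c(3, 3, 2, 3, 5, 2)
)

# Normalise the case in place so all "bob" rows fall into one group
dt[, Col_name := tolower(Col_name)]

# .SD holds the non-grouping columns; take the mean of each per group
res <- dt[, lapply(.SD, mean), by = Col_name]

res
```

Groups are returned in order of first appearance; use keyby = Col_name instead of by if you want them sorted.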
Uwe