Get mean values if a key column value is duplicated with dplyr (R)

Question

This is my data. If the gene column contains a duplicated value (e.g. CASZ1), I would like to collapse those rows into one and take the mean of each Sample column.

Input data

Output data

I googled it and tried a few things, but I am stuck. I apologize if this looks exactly like a homework question.

My code

data %>% group_by(gene) %>% summarise(avg = mean(colnames(data)))  # error...

score 4 · Answer 1 · answered Aug 31 '18 at 08:30

You can use summarise_at together with a regular expression, so that only the columns whose names match your pattern are averaged and every other column is left out:

data %>% group_by(gene) %>% summarise_at(vars(matches("Sample")), mean)

Is that what you're looking for?
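For a quick check, here is a minimal sketch of that call on a small made-up data frame. The gene names and sample values are hypothetical (the original input table is not shown here), and the pattern is anchored with ^ on the assumption that the sample columns all start with "Sample".

```r
library(dplyr)

# Hypothetical data: CASZ1 appears twice, the values are invented for illustration.
data <- data.frame(
  gene    = c("CASZ1", "CASZ1", "TP53"),
  Sample1 = c(1, 3, 5),
  Sample2 = c(2, 6, 10),
  stringsAsFactors = FALSE
)

# One row per gene; each column whose name matches "^Sample" is averaged.
result <- data %>%
  group_by(gene) %>%
  summarise_at(vars(matches("^Sample")), mean)

print(result)
```

Genes that occur only once pass through unchanged, since the mean of a single value is that value.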

Vincent Bonhomme

Thank you. I didn't know that I could use regex in dplyr. Good to know. – jkim Aug 31 '18 at 10:08

score 3 · Accepted Answer · answered Aug 31 '18 at 08:28

You can use summarise_all:

library(dplyr)
data %>% group_by(gene) %>% summarise_all(mean)  # funs() is deprecated since dplyr 0.8.0; pass the function directly
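Note that summarise_all applies the function to every non-grouping column, so it assumes all of them are numeric. In dplyr 1.0.0 and later, across() supersedes the scoped summarise_* verbs; the following is a sketch of the equivalent call, assuming that version is installed. The data frame is hypothetical and the "Sample" prefix is an assumption about the column names.

```r
library(dplyr)

# Hypothetical data: CASZ1 appears twice, the values are invented for illustration.
data <- data.frame(
  gene    = c("CASZ1", "CASZ1", "TP53"),
  Sample1 = c(1, 3, 5),
  Sample2 = c(2, 6, 10),
  stringsAsFactors = FALSE
)

# across() picks the columns to summarise; .groups = "drop" returns an ungrouped result.
result <- data %>%
  group_by(gene) %>%
  summarise(across(starts_with("Sample"), mean), .groups = "drop")

print(result)
```

Selecting the columns explicitly with starts_with() keeps any non-numeric annotation columns out of the mean.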

RLave

Thank you! I should read more about dplyr. – jkim Aug 31 '18 at 10:06
