I am working with a data frame in R, beginning as follows:

   > head(renamed.Mc.Cd.Ni[2:6])
   s_MC13_B2_Cd.Ni s_MC13_B3_Cd.Ni s_MC13_B4_Cd.Ni     GENE_ID
   9.854759       10.216916        9.722329     GENE:JGI_V11_100009        
   7.863938        8.075640        7.894878     GENE:JGI_V11_100009
   9.448034        9.177245        9.053654     GENE:JGI_V11_100036
   9.333245        9.208673        9.159947     GENE:JGI_V11_100036
   9.360540        9.374757        9.273236     GENE:JGI_V11_100036
   8.983222        9.023339        9.112987     GENE:JGI_V11_100044

As you can see, I have three columns, each giving the gene expression values for one of 3 different daphnia under a treatment. The final column identifies the gene expressed. However, there is more than one row for each gene because multiple probes were used. How do I get an average for each gene for each daphnia (columns 1-3)?

For example, for each of the three daphnia (columns 1 to 3), I want an overall average gene expression for each gene shown in column 4. I can't do it manually, as I have over 60,000 gene probe expression values.

Thank you in advance!

A.Carter

1 Answer

Your question appears to be a straightforward mean-by-group problem. There are several examples already on SO, so this is a duplicate; however, here is a tidy approach to solving your problem.

library(dplyr)

df <- read.table(text = "
s_MC13_B2_Cd.Ni s_MC13_B3_Cd.Ni s_MC13_B4_Cd.Ni     GENE_ID
9.854759       10.216916        9.722329     GENE:JGI_V11_100009
7.863938        8.075640        7.894878     GENE:JGI_V11_100009
9.448034        9.177245        9.053654     GENE:JGI_V11_100036
9.333245        9.208673        9.159947     GENE:JGI_V11_100036
9.360540        9.374757        9.273236     GENE:JGI_V11_100036
8.983222        9.023339        9.112987     GENE:JGI_V11_100044
", header = TRUE, stringsAsFactors = FALSE)

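# Group the probe rows by gene, then average every numeric column within each gene.
# In dplyr >= 1.0.0, summarise(across(where(is.numeric), mean)) is the equivalent form.
df %>%
  group_by(GENE_ID) %>%
  summarise_if(is.numeric, mean)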

# # A tibble: 3 x 4
#   GENE_ID             s_MC13_B2_Cd.Ni s_MC13_B3_Cd.Ni s_MC13_B4_Cd.Ni
#   <chr>                         <dbl>           <dbl>           <dbl>
# 1 GENE:JGI_V11_100009            8.86            9.15            8.81
# 2 GENE:JGI_V11_100036            9.38            9.25            9.16
# 3 GENE:JGI_V11_100044            8.98            9.02            9.11
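
If you would rather not depend on dplyr, the same mean-by-group can be sketched in base R with `aggregate()`. This is only a minimal sketch on the sample rows from the question; the name `gene_means` is made up for illustration, and it assumes every column other than `GENE_ID` is numeric.

```r
# Rebuild the sample data from the question so this block runs on its own.
df <- read.table(text = "
s_MC13_B2_Cd.Ni s_MC13_B3_Cd.Ni s_MC13_B4_Cd.Ni     GENE_ID
9.854759       10.216916        9.722329     GENE:JGI_V11_100009
7.863938        8.075640        7.894878     GENE:JGI_V11_100009
9.448034        9.177245        9.053654     GENE:JGI_V11_100036
9.333245        9.208673        9.159947     GENE:JGI_V11_100036
9.360540        9.374757        9.273236     GENE:JGI_V11_100036
8.983222        9.023339        9.112987     GENE:JGI_V11_100044
", header = TRUE, stringsAsFactors = FALSE)

# The formula `. ~ GENE_ID` means: summarise every other column, grouped by GENE_ID.
# `gene_means` is a hypothetical name; assumes all non-GENE_ID columns are numeric.
gene_means <- aggregate(. ~ GENE_ID, data = df, FUN = mean)

print(gene_means)
```

The result is a plain data frame with one row per gene. Note that the formula interface drops rows containing `NA` by default, so on the full data set you may want to check for missing values first.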
Kevin Arseneau