How can I get this average in a dataframe?

Question

I know this question is very basic in R, but I'm new to this! I have a data frame with qPCR data. I want to add a new column with the average ct for each combination of spraying, genotype, and gene. This is my data frame, so you can see what I mean:

> d
   gene sampleCode genotype spraying    ct mean.ct
1     1 1-C1-R1-SA        a  without 31.06   31.06
2     1 1-C1-R2-SA        a  without 30.71   31.06
3     1 1-C1-R3-SA        a  without 31.42   31.06
4     1 1-C1-R1-CA        a     with 31.78   31.98
5     1 1-C1-R2-CA        a     with 32.07   31.98
6     1 1-C1-R3-CA        a     with 32.08   31.98
7     1 2-C2-R1-SA        b  without 32.16   32.16
8     1 2-C2-R2-SA        b  without 32.52   32.16
9     1 2-C2-R3-SA        b  without 31.80   32.16
10    1 2-C2-R1-CA        b     with 32.55   32.28
11    1 2-C2-R2-CA        b     with 32.39   32.28
12    1 2-C2-R3-CA        b     with 31.91   32.28
13    2 1-C1-R1-SA        a  without 31.21   31.58
14    2 1-C1-R2-SA        a  without 31.96   31.58
15    2 1-C1-R3-SA        a  without 31.58   31.58
16    2 1-C1-R1-CA        a     with 32.75   32.75
17    2 1-C1-R2-CA        a     with 32.53   32.75
18    2 1-C1-R3-CA        a     with 32.98   32.75
19    2 2-C2-R1-SA        b  without 31.64   31.64
20    2 2-C2-R2-SA        b  without 32.83   31.64
21    2 2-C2-R3-SA        b  without 30.45   31.64
22    2 2-C2-R1-CA        b     with 31.97   32.43
23    2 2-C2-R2-CA        b     with 32.60   32.43
24    2 2-C2-R3-CA        b     with 32.72   32.43

I made the "mean.ct" column in Excel, but I can't do that for all the rows because I have a lot of data! Does anyone know how I can create this new column in R with simple code? I thought of using "for" and "if", but I can't work out how! Any help would be much appreciated. Thanks!

This and related forms of this question have been answered in [Aggregate / summarize multiple variables per group](https://stackoverflow.com/a/9723446/3508856). Hopefully this helps. — David O, Aug 23 '19 at 19:49
`merge(df,aggregate(mean.ct~gene+genotype+spraying,transform(df,mean.ct=ct),mean))` — Onyambu, Aug 23 '19 at 20:49
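If you would rather stay in base R, `ave()` is another option besides the merge/aggregate one-liner above: it computes a statistic per group and returns a vector as long as the input, so the result can be assigned straight to a new column. A minimal sketch, assuming a data frame `d` rebuilt by hand from the first 12 rows of the question (gene 1 only); the column name `mean_ct` is just an illustrative choice:

```r
# First 12 rows of the question's data (gene 1 only), rebuilt by hand
d <- data.frame(
  gene     = 1,
  genotype = rep(c("a", "b"), each = 6),
  spraying = rep(rep(c("without", "with"), each = 3), times = 2),
  ct       = c(31.06, 30.71, 31.42, 31.78, 32.07, 32.08,
               32.16, 32.52, 31.80, 32.55, 32.39, 31.91)
)

# ave() applies FUN within each combination of the grouping vectors and
# returns one value per original row, so no merge step is needed
d$mean_ct <- ave(d$ct, d$gene, d$genotype, d$spraying, FUN = mean)

print(d)
```

Note that `FUN` has to be passed by name, because every unnamed argument after the first is treated as a grouping variable.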

score 2 · Answer 1 · edited Aug 23 '19 at 20:37

You can use the data.table library. It is very fast for large datasets. Try the following code:

library(data.table)
setDT(d)  # convert the data.frame to a data.table in place; := only works on a data.table
d[, mean_ct := mean(ct), by = list(spraying, genotype, gene)]
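One thing that can trip people up here: `:=` is only defined for data.table objects, so calling it on a plain data.frame raises an error. A self-contained sketch, assuming the same hand-built subset of the question's data (gene 1 only), with the conversion step written out:

```r
library(data.table)

# First 12 rows of the question's data (gene 1 only), as a plain data.frame
d <- data.frame(
  gene     = 1,
  genotype = rep(c("a", "b"), each = 6),
  spraying = rep(rep(c("without", "with"), each = 3), times = 2),
  ct       = c(31.06, 30.71, 31.42, 31.78, 32.07, 32.08,
               32.16, 32.52, 31.80, 32.55, 32.39, 31.91)
)

# Convert to a data.table by reference (no copy, no reassignment)
setDT(d)

# Add the per-group mean as a new column; := modifies d in place
d[, mean_ct := mean(ct), by = .(spraying, genotype, gene)]

print(d)
```

If you want to keep the original data frame untouched, `as.data.table(d)` makes a converted copy instead.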

answered Aug 23 '19 at 19:43 by red_quark; edited Aug 23 '19 at 20:37 by Arturo Sbr

Hmm, it didn't work for me. Maybe something is missing before d[? – Gerardo Aug 24 '19 at 14:18
Have you installed the `data.table` package? You can try the equivalent code: `d <- d[, .(mean_ct = mean(ct)), by = c("spraying", "genotype", "gene")]` – red_quark Aug 24 '19 at 14:35
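Note that the `.()` form in the comment above is not quite the same thing as `:=`: it summarises, returning one row per group, while `:=` keeps every row and adds a column. A small sketch of the difference, assuming a data.table `dt` built by hand from the first 12 rows of the question (the names `dt` and `group_means` are illustrative):

```r
library(data.table)

# First 12 rows of the question's data (gene 1 only)
dt <- data.table(
  gene     = 1,
  genotype = rep(c("a", "b"), each = 6),
  spraying = rep(rep(c("without", "with"), each = 3), times = 2),
  ct       = c(31.06, 30.71, 31.42, 31.78, 32.07, 32.08,
               32.16, 32.52, 31.80, 32.55, 32.39, 31.91)
)

# Summarising form: the three replicates of each group collapse to one row
group_means <- dt[, .(mean_ct = mean(ct)), by = .(spraying, genotype, gene)]
nrow(group_means)  # 4

# Column-adding form: all original rows are kept
dt[, mean_ct := mean(ct), by = .(spraying, genotype, gene)]
nrow(dt)  # 12
```

So for the column asked for in the question, the `:=` form is the one to use; the summary table is handy if you only need the group means themselves.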

score 1 · Accepted Answer · answered Aug 23 '19 at 19:39

This is a simple data transformation. The simplest way for me is to use the tidyverse package.

library(tidyverse)

df <- read.table(text = "gene sampleCode genotype spraying    ct mean.ct
1     1 1-C1-R1-SA        a  without 31.06   31.06
2     1 1-C1-R2-SA        a  without 30.71   31.06
3     1 1-C1-R3-SA        a  without 31.42   31.06
4     1 1-C1-R1-CA        a     with 31.78   31.98
5     1 1-C1-R2-CA        a     with 32.07   31.98
6     1 1-C1-R3-CA        a     with 32.08   31.98
7     1 2-C2-R1-SA        b  without 32.16   32.16
8     1 2-C2-R2-SA        b  without 32.52   32.16
9     1 2-C2-R3-SA        b  without 31.80   32.16
10    1 2-C2-R1-CA        b     with 32.55   32.28
11    1 2-C2-R2-CA        b     with 32.39   32.28
12    1 2-C2-R3-CA        b     with 31.91   32.28
13    2 1-C1-R1-SA        a  without 31.21   31.58
14    2 1-C1-R2-SA        a  without 31.96   31.58
15    2 1-C1-R3-SA        a  without 31.58   31.58
16    2 1-C1-R1-CA        a     with 32.75   32.75
17    2 1-C1-R2-CA        a     with 32.53   32.75
18    2 1-C1-R3-CA        a     with 32.98   32.75
19    2 2-C2-R1-SA        b  without 31.64   31.64
20    2 2-C2-R2-SA        b  without 32.83   31.64
21    2 2-C2-R3-SA        b  without 30.45   31.64
22    2 2-C2-R1-CA        b     with 31.97   32.43
23    2 2-C2-R2-CA        b     with 32.60   32.43
24    2 2-C2-R3-CA        b     with 32.72   32.43")


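df %>%
  group_by(gene, genotype, spraying) %>%
  mutate(mean.ct2 = mean(ct)) %>%  # group mean, repeated on every row of the group
  View()  # opens the result in the data viewer (interactive sessions such as RStudio)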

You get the result by first grouping by your variables and then mutating. I called my column mean.ct2 to check whether I could reproduce your values. You should probably rename it for your project. I hope this helps.
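If you want to keep the result rather than just look at it, something along these lines should work. This is a sketch under a few assumptions: it loads only dplyr, rebuilds the first 12 rows of the question's data by hand, and uses the illustrative names `result` and `mean_ct`:

```r
library(dplyr)

# First 12 rows of the question's data (gene 1 only), rebuilt by hand
df <- data.frame(
  gene     = 1,
  genotype = rep(c("a", "b"), each = 6),
  spraying = rep(rep(c("without", "with"), each = 3), times = 2),
  ct       = c(31.06, 30.71, 31.42, 31.78, 32.07, 32.08,
               32.16, 32.52, 31.80, 32.55, 32.39, 31.91)
)

result <- df %>%
  group_by(gene, genotype, spraying) %>%
  mutate(mean_ct = mean(ct)) %>%  # one mean per group, repeated on each row
  ungroup()                       # drop the grouping so later steps are not grouped

print(result)
```

The `ungroup()` at the end is optional, but it avoids surprises if you later call `mutate()` or `summarise()` on `result` and expect them to work on the whole table.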

Hi Johannes! Your code was very helpful! Thanks so much, it was so simple. I was close to getting it: I had used group_by and transmute before (and I tried mutate too), but something in my code must have been wrong and I couldn't figure out what! — Gerardo, Aug 24 '19 at 14:18
