I am trying to write a loop that tests multiple conditions (cond_A, cond_B and cond_C), each against the same control ('ctrl'). Each condition and the control are measured in triplicate. As output I would like a data frame with the condition names and p-values.

Here is my input:

structure(list(ctrl_1 = 1L, ctrl_2 = 2L, ctrl_3 = 3L, cond_A_1 = 4L, 
    cond_A_2 = 4L, cond_A_3 = 4L, cond_B_1 = 5L, cond_B_2 = 5L, 
    cond_B_3 = 7L, cond_C_1 = 8L, cond_C_2 = 9L, cond_C_3 = 2L), .Names = c("ctrl_1", 
"ctrl_2", "ctrl_3", "cond_A_1", "cond_A_2", "cond_A_3", "cond_B_1", 
"cond_B_2", "cond_B_3", "cond_C_1", "cond_C_2", "cond_C_3"), class = "data.frame", row.names = c(NA, 
-1L))

And the expected output, with hypothetical p-values:

cond_A_pval cond_B_pval cond_C_pval
       0.05         0.9       0.006

Here is my starting point:

pval<-apply(df,1,function(x) {t.test(x[1:3],x[4:6])$p.value})
user2904120

1 Answer

Try the following:

df <- structure(list(ctrl_1 = 1L, ctrl_2 = 2L, ctrl_3 = 3L, cond_A_1 = 4L, 
               cond_A_2 = 4L, cond_A_3 = 4L, cond_B_1 = 5L, cond_B_2 = 5L, 
               cond_B_3 = 7L, cond_C_1 = 8L, cond_C_2 = 9L, cond_C_3 = 2L), 
               .Names = c("ctrl_1", "ctrl_2", "ctrl_3", 
                          "cond_A_1", "cond_A_2", "cond_A_3", 
                          "cond_B_1", "cond_B_2", "cond_B_3", 
                          "cond_C_1", "cond_C_2", "cond_C_3"), 
               class = "data.frame", row.names = c(NA, -1L))

library(tidyr)

# Reshape the data into key-value pairs. 
# It is generally advisable to have data in tidy format. 
df <- gather(df)
# Remove the trailing replicate suffix (_1, _2, etc.) to get the group name.
df$group <- gsub("_\\d+$", "", df$key)

# Now you can loop through the groups. Note that "ctrl" is the first group, so [-1] drops it:
sapply(unique(df$group)[-1], function(x){
  t.test(df[df$group == "ctrl", "value"], df[df$group == x, "value"])$p.value 
})

 cond_A     cond_B     cond_C 
0.07417990 0.01477836 0.17957429 
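
The question asks for a one-row data frame with columns such as cond_A_pval. Below is a minimal base-R sketch of one way to get that shape. It assumes the wide one-row input from the question and that every replicate column is named group_N. The names `wide`, `groups`, `values`, `conds`, `pvals` and `result` are illustrative, not from the original post.

```r
# Wide one-row input from the question (the name `wide` is illustrative).
wide <- data.frame(ctrl_1 = 1L, ctrl_2 = 2L, ctrl_3 = 3L,
                   cond_A_1 = 4L, cond_A_2 = 4L, cond_A_3 = 4L,
                   cond_B_1 = 5L, cond_B_2 = 5L, cond_B_3 = 7L,
                   cond_C_1 = 8L, cond_C_2 = 9L, cond_C_3 = 2L)

# Derive each column's group by dropping the trailing replicate number.
groups <- sub("_\\d+$", "", names(wide))
values <- unlist(wide[1, ])

# Welch t-test of every non-control group against the control replicates.
conds <- setdiff(unique(groups), "ctrl")
pvals <- sapply(conds, function(g) {
  t.test(values[groups == "ctrl"], values[groups == g])$p.value
})

# Turn the named vector into a one-row data frame with *_pval column names.
result <- as.data.frame(t(setNames(pvals, paste0(names(pvals), "_pval"))))
print(result)
```

Selecting the control by name rather than by position means the column order of the input does not matter. For input with several rows, the same body could be wrapped in a function and applied per row.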

See also Looping through t.tests for data frame subsets in r

coffeinjunky
  • works with values in the example, but not in my real data, see below; structure(list(key = c("ctrl_1", "ctrl_2", "ctrl_3", "cond_A_1", "cond_A_2", "cond_A_3", "cond_B_1", "cond_B_2", "cond_B_3", "cond_C_1", "cond_C_2", "cond_C_3"), value = c("13.382", "12.9152", "14.719", "13.3822", "12.9152", " 8.788", " 9.3765", "17.1525", " 6.664", "11.2885", "10.5390", "37.030"), group = c("ctrl", "ctrl", "ctrl", "cond_A", "cond_A", "cond_A", "cond_B", "cond_B", "cond_B", "cond_C", "cond_C", "cond_C")), .Names = c("key", "value", "group" ), row.names = c(NA, -12L), class = "data.frame") – user2904120 Aug 11 '17 at 21:48
  • Including df <- transform(df, value = as.numeric(value)) helped – user2904120 Aug 11 '17 at 21:58
  • That seems to be an issue with the underlying data. In your `structure`, the values are specified as characters (strings). Glad that it works after transforming the data to numeric. – coffeinjunky Aug 11 '17 at 22:03
  • any idea how to get pval "Pr(>F)" from ANOVA in a similar way? with(df, aov(value ~ group)) – user2904120 Aug 11 '17 at 22:21
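
The last comment asks about ANOVA and is not answered in the thread. As a hedged sketch, one common way to pull "Pr(>F)" out of an `aov` fit is to index into the table returned by `summary()`. The data below are the long-format values from the comment above, rebuilt compactly and converted to numeric as discussed there. The names `fit` and `anova_p` are illustrative.

```r
# Long-format data from the comment above; values arrive as character strings.
grp <- c("ctrl", "cond_A", "cond_B", "cond_C")
df <- data.frame(
  key   = paste0(rep(grp, each = 3), "_", 1:3),
  value = c("13.382", "12.9152", "14.719", "13.3822", "12.9152", " 8.788",
            " 9.3765", "17.1525", " 6.664", "11.2885", "10.5390", "37.030"),
  group = rep(grp, each = 3),
  stringsAsFactors = FALSE
)

# Convert to numeric first; as.numeric() ignores the leading spaces.
df$value <- as.numeric(df$value)

# One-way ANOVA across all four groups.
fit <- aov(value ~ group, data = df)

# summary() returns a list whose first element is the ANOVA table;
# its first row is the `group` term, the second the residuals (NA p-value).
anova_p <- summary(fit)[[1]][["Pr(>F)"]][1]
print(anova_p)
```

Note that this gives a single p-value for the overall group effect, not one per condition as in the t-test loop.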