
I know tapply can be used to separate groups and calculate means individually. I was wondering if there is a function to isolate one of these means for separate analysis.

I am comparing my collected data to the class means using a one-sample t-test. Here is a sample of the data I'm using:

#Sample of my data
structure(list(
pKa = c(6.946, 7.1, 6.625, 7.528, 7.102, 6.743,6.936, 6.579, 6.672, 7.27), 

pH = c("pH_6.1", "pH_6.7", "pH_7.3", "pH_8.1", "pH_6.1", "pH_6.7", "pH_7.3", "pH_8.1", "pH_6.1", "pH_6.7"), 

id = c("XAU", "XAU", "XAU", "XAU", "MyData", "MyData", "MyData","MyData", "PQ", "PQ")),
 row.names = c(NA, 10L), class = "data.frame")

I'm trying to extract the mean of "pKa" (response variable) for each "pH" group (explanatory variable), and use each mean in a t-test to compare 'MyData' against the data collected by the whole class. Below is the data I want to compare (MyData vs. the class means at the different pH groups: 6.1, 6.7, 7.3, 8.1):

# The data I collected
Exp2MyData

#pKa     pH     id
#5 7.102 pH_6.1 MyData
#6 6.743 pH_6.7 MyData
#7 6.936 pH_7.3 MyData
#8 6.579 pH_8.1 MyData 

#Means of the class data
E2 <- tapply(Exp2$pKa, Exp2$pH, mean)
E2 
#pH_6.1 pH_6.7 pH_7.3 pH_8.1 
# 7.102  6.743  6.936  6.579

The t-test I am trying to run, to see whether my collected data differ significantly from the class data, is:

t.test(mean of whole class pKa for 'X'pH, pKa value I collected)

This is the code I am currently trying to use:

#Making pH 8.1 into a separate group
Buff_8.1 <- subset(Exp2,pH=="pH_8.1")

#Obtaining the means for pKa @ pH = 8.1
m8.1 <- mean(Buff_8.1$pKa)
m8.1

#t.test
t.test(m8.1, mu= 6.579)

But I get the error "Error in t.test.default(m8.1, mu = 6.579) : not enough 'x' observations"

Any help would be wonderful.

  • Hi Sophie, welcome to SO! Can you please provide more details on the output you need based on your data sample. Also refer to this - https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example, it will help you to improve your question. – Bulat Mar 27 '21 at 17:55
  • So what exactly is the value you are trying to return here? You can get a single group from `E2` with `E2["pH_7.3"]` – MrFlick Mar 27 '21 at 18:00
  • I think you want `t.test(Exp2$pKa[Exp2$pH=="pH_7.3"], mu=6.936)`, but I'm not sure it is valid to test one group against the combined sample mean since the observations are not independent. It would make more sense to use analysis of variance and then multiple comparisons to test all the pairs of means (4*3/2 = 6 comparisons). – dcarlson Mar 27 '21 at 18:53
  • Do you have more than one observation in the groups you want to test? – Dason Mar 27 '21 at 19:06
  • Thank you so much for your answers. I am trying to re-word my question, but I mean comparing the class mean pKa at each pH to my 'MyData' values (e.g. at pH 6.1 my pKa is 7.102), if that makes sense. – Frankie Brook Mar 27 '21 at 19:35

1 Answer


I assume that you want to test whether the mean pKa is significantly different from the mean pKa of a given pH group. In the example, df1$class_mean[1] is the mean of the first pH group, and the pKa values are tested against it. For the second pH group change the index to [2], and so on, or wrap it in a function.

  • We use t_test from the rstatix package because it is pipe-friendly
library(dplyr)
library(rstatix)

# get the group mean by pH group
df1 <- df %>% 
  group_by(pH) %>% 
  summarize(class_mean = mean(pKa))

# one sample t-test against group-pH mean
# df1$class_mean[1] is the mean of the first pH group, [2] the second, etc., up to [4]

stat.test <- df %>% t_test(pKa ~ 1, mu = df1$class_mean[1], detailed = TRUE)
# stat.test <- df %>% t_test(pKa ~ 1, mu = df1$class_mean[2], detailed = TRUE)
# stat.test <- df %>% t_test(pKa ~ 1, mu = df1$class_mean[3], detailed = TRUE)
# stat.test <- df %>% t_test(pKa ~ 1, mu = df1$class_mean[4], detailed = TRUE)
stat.test

# Output:
# A tibble: 1 x 12
  estimate .y.   group1 group2         n statistic     p    df conf.low conf.high method alternative
*    <dbl> <chr> <chr>  <chr>      <int>     <dbl> <dbl> <dbl>    <dbl>     <dbl> <chr>  <chr>      
1     6.95 pKa   1      null model    10     0.448 0.665     9     6.73      7.17 T-test two.sided  
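
Instead of un-commenting one line at a time, the four tests can be wrapped in a loop. The following is only a minimal sketch in base R, assuming the sample data from the question; the names class_means and results are made up for illustration, and base t.test stands in for rstatix's t_test so the block has no package dependencies.

```r
# Sample data from the question
df <- data.frame(
  pKa = c(6.946, 7.1, 6.625, 7.528, 7.102, 6.743, 6.936, 6.579, 6.672, 7.27),
  pH  = c("pH_6.1", "pH_6.7", "pH_7.3", "pH_8.1", "pH_6.1", "pH_6.7",
          "pH_7.3", "pH_8.1", "pH_6.1", "pH_6.7"),
  id  = c("XAU", "XAU", "XAU", "XAU", "MyData", "MyData", "MyData",
          "MyData", "PQ", "PQ")
)

# Mean pKa per pH group, named by group (illustrative name)
class_means <- tapply(df$pKa, df$pH, mean)

# One-sample t-test of all pKa values against each group mean in turn,
# collected into one data frame with a row per pH group
results <- do.call(rbind, lapply(names(class_means), function(g) {
  res <- t.test(df$pKa, mu = class_means[[g]])
  data.frame(
    pH        = g,
    mu        = class_means[[g]],
    statistic = unname(res$statistic),
    p.value   = res$p.value
  )
}))

results
```

The first row corresponds to the rstatix call shown above (mu = df1$class_mean[1]); the remaining rows correspond to the commented-out lines.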

data:

df <- structure(list(
  pKa = c(6.946, 7.1, 6.625, 7.528, 7.102, 6.743,6.936, 6.579, 6.672, 7.27), 
  
  pH = c("pH_6.1", "pH_6.7", "pH_7.3", "pH_8.1", "pH_6.1", "pH_6.7", "pH_7.3", "pH_8.1", "pH_6.1", "pH_6.7"), 
  
  id = c("XAU", "XAU", "XAU", "XAU", "MyData", "MyData", "MyData","MyData", "PQ", "PQ")),
  row.names = c(NA, 10L), class = "data.frame")
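
As for the error in the question: t.test needs the raw observations as its first argument, not their mean. m8.1 is a single number, which gives "not enough 'x' observations". If the goal is to compare the classmates' values at one pH with your own single value, a possible sketch is below. It assumes your own row should be left out of the class sample (my assumption, not something stated in the question), and the names class_6.1 and mine_6.1 are made up for illustration. With only the sample data there are just two classmates per group, so the result is not meaningful until the full class data is used.

```r
# Sample data from the question
df <- data.frame(
  pKa = c(6.946, 7.1, 6.625, 7.528, 7.102, 6.743, 6.936, 6.579, 6.672, 7.27),
  pH  = c("pH_6.1", "pH_6.7", "pH_7.3", "pH_8.1", "pH_6.1", "pH_6.7",
          "pH_7.3", "pH_8.1", "pH_6.1", "pH_6.7"),
  id  = c("XAU", "XAU", "XAU", "XAU", "MyData", "MyData", "MyData",
          "MyData", "PQ", "PQ")
)

# Classmates' raw pKa values at pH 6.1 (own row excluded: an assumption)
class_6.1 <- df$pKa[df$pH == "pH_6.1" & df$id != "MyData"]

# Own single measurement at pH 6.1, used as the reference value mu
mine_6.1 <- df$pKa[df$pH == "pH_6.1" & df$id == "MyData"]

# Passing a length-1 value as x reproduces the error from the question
err <- tryCatch(t.test(mean(class_6.1), mu = mine_6.1),
                error = function(e) conditionMessage(e))
err

# Passing the vector of observations works
res <- t.test(class_6.1, mu = mine_6.1)
res
```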
TarJae