I have a data frame with 3 variables (ID, pre and post) and 50 rows, somewhat like this:

ID<- c("1","2","3","4","5","6","7","8","9","10")
pre<- c("2.56802","2.6686","1.0145","0.2568","2.369","1.2365","0.6989","0.98745","1.09878","2.454658")
post<-c("3.3323","2.66989","1.565656","2.58989","5.96987","3.12145","1.23565","2.74741","2.54101","0.23568")

dfw1<-data.frame(ID,pre,post)

Each pre and post value is the mean of another population. I want to run a two-tailed t-test between corresponding elements of pre and post (pre against post), starting with the first pair, and loop this over all 50 rows. I have tried writing a loop, as shown below:

t<-0
for (i in 1:nrow(dfw$ID)) {
  t[i]<-t.test(dfw$pre,dfw$post,alternative = c("two.sided"), conf.level = 0.95)
  print(t)
}

It returned an error. I want to extract the statistics of each test, such as the df, p-value and t-value, for each row. How do I write this code in R?

rawr
shanky
  • What is the mathematical reasoning behind this? What is the variance? – Roman Luštrik Aug 20 '18 at 16:22
  • This might be useful: https://stackoverflow.com/questions/51920287/r-extracting-p-value-for-each-row-from-t-test/51920870#51920870. You'll need to use your original dataset and not just the means, so `t.test` can calculate the std dev and see how many observations you had in each group. Also, I can see pre vs. post, so maybe you need to use a paired `t.test`? – AntoniosK Aug 20 '18 at 16:24
  • Stack Overflow is a question and answer site, not a code-writing service, please [see here](http://stackoverflow.com/help/how-to-ask) to learn how to write effective questions. – 000andy8484 Aug 20 '18 at 16:28
  • You enter the `pre` and `post` numeric values as characters (in quotation marks); this is the likely cause of the returned error. Also, you shouldn't compute a t-test between two observations, but between two samples. – 000andy8484 Aug 20 '18 at 16:29
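The points raised in the comments can be checked directly. The following is a minimal sketch on a shortened, three-row copy of the question's data (the truncation is only for brevity): it shows that `nrow()` of a single column is `NULL`, which is one likely reason the `for` loop fails, and that the quoted values are stored as character until they are converted.

```r
# Shortened copy of the question's data, values still quoted
ID  <- c("1", "2", "3")
pre <- c("2.56802", "2.6686", "1.0145")
dfw1 <- data.frame(ID, pre, stringsAsFactors = FALSE)

# nrow() of a plain vector is NULL, so 1:nrow(dfw1$ID) cannot be iterated
print(is.null(nrow(dfw1$ID)))

# nrow() of the data frame itself gives the row count to loop over
print(seq_len(nrow(dfw1)))

# The quoted values are character, not numeric
print(class(dfw1$pre))

# Convert before doing any arithmetic or testing
dfw1$pre <- as.numeric(dfw1$pre)
print(class(dfw1$pre))
```

Note also that the loop in the question refers to `dfw` while the data frame is created as `dfw1`.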

1 Answer

This code runs a paired t-test on the two columns and shows that you cannot reject the null hypothesis of zero mean difference at the conventional 5% significance level:

ID<- c("1","2","3","4","5","6","7","8","9","10")
pre<- as.numeric(c("2.56802","2.6686","1.0145","0.2568","2.369","1.2365","0.6989","0.98745","1.09878","2.454658"))
post<-as.numeric(c("3.3323","2.66989","1.565656","2.58989","5.96987","3.12145","1.23565","2.74741","2.54101","0.23568"))
dfw1<-data.frame(ID,pre,post)
t.test(dfw1$pre,dfw1$post,alternative = c("two.sided"), conf.level = 0.95, paired=TRUE)

Output (giving you the df, t-stat and p-value):

Paired t-test

data:  dfw1$pre and dfw1$post
t = -2.1608, df = 9, p-value = 0.05899
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -2.18109315  0.04997355
sample estimates:
mean of the differences 
               -1.06556
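To pull out the individual numbers rather than read them off the printed summary, the components of the object returned by `t.test` can be accessed by name. A minimal sketch using the same data; the name `stats` and its column names are only illustrative:

```r
pre  <- c(2.56802, 2.6686, 1.0145, 0.2568, 2.369,
          1.2365, 0.6989, 0.98745, 1.09878, 2.454658)
post <- c(3.3323, 2.66989, 1.565656, 2.58989, 5.96987,
          3.12145, 1.23565, 2.74741, 2.54101, 0.23568)

res <- t.test(pre, post, alternative = "two.sided",
              conf.level = 0.95, paired = TRUE)

# t.test() returns a list of class "htest"; read its components by name
stats <- data.frame(
  t_value = unname(res$statistic),  # t statistic
  df      = unname(res$parameter),  # degrees of freedom
  p_value = res$p.value,
  ci_low  = res$conf.int[1],        # lower bound of the 95% interval
  ci_high = res$conf.int[2]         # upper bound of the 95% interval
)
print(stats)
```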
000andy8484
  • Thank you for the answer. I had written `row_t_welch(dfw[,c("pre")], dfw[,c("post")])` earlier, and it gave me a result for the entire column. I want results for each row of the data. – shanky Aug 20 '18 at 17:20
  • Since the points are pre and post, I would assume the original question calls for a matched-pairs test. You may want to consider adding `paired = TRUE` to the t.test function. – Dave2e Aug 20 '18 at 17:21
  • @Dave2e: correct, addressed. shanky: do you mean that your vectors are placed as rows instead of columns? Consider that you cannot test for populations' differences point-to-point, as you say. A point has a mean of its own value and a variance of 0: i.e. it will always be statistically different from another point. – 000andy8484 Aug 20 '18 at 17:47
  • @000andy8484, Dave: I have 50 values in a column; my pre value is the mean of the first 6 values and my post value is the mean of the 6th to 10th values. I want to run a t-test to find out where there is a significant change. For this I wrote a couple of loops and built two vectors, one with all the pre values and one with all the post values. I plan to run the t-test with the pre values against the post values. – shanky Aug 20 '18 at 23:09
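As the comments point out, a t-test needs the raw observations in each group, not a single mean per group. One possible way to get a result per row is therefore to test the raw values in the window before each position against the raw values in the window after it. The sketch below is only an illustration under stated assumptions: `x` is made-up random data standing in for the 50 raw values, and the window length `k` is a hypothetical choice, not something given in the question.

```r
set.seed(1)        # made-up data, for illustration only
x <- rnorm(50)     # stands in for the 50 raw values
k <- 5             # hypothetical window length on each side

# Positions with k raw values before them and k raw values from them onwards
idx <- (k + 1):(length(x) - k + 1)

results <- do.call(rbind, lapply(idx, function(i) {
  before <- x[(i - k):(i - 1)]   # raw values in the "pre" window
  after  <- x[i:(i + k - 1)]     # raw values in the "post" window
  # Welch two-sample test (the t.test default, var.equal = FALSE)
  res <- t.test(before, after, alternative = "two.sided", conf.level = 0.95)
  data.frame(position = i,
             t_value  = unname(res$statistic),
             df       = unname(res$parameter),
             p_value  = res$p.value)
}))

print(head(results))
```

Each row of `results` then holds the t-value, df and p-value for one position, which is the per-row output asked for in the question.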