I have a data set such as the following (this is only a subset, so the for loop might not produce valid results on it).

library(dplyr)  # needed for %>%, add_row() and filter()

tableData <- data.frame(Fruits = character(), Ripeness = character(), Mean = numeric()) %>%
  add_row(Fruits = "Apple", Ripeness = "yes", Mean = 5) %>%
  add_row(Fruits = "Apple", Ripeness = "no", Mean = 6) %>%
  add_row(Fruits = "Apple", Ripeness = "yes", Mean = 2) %>%
  add_row(Fruits = "Banana", Ripeness = "yes", Mean = 1) %>%
  add_row(Fruits = "Banana", Ripeness = "yes", Mean = 7) %>%
  add_row(Fruits = "Orange", Ripeness = "no", Mean = 8)

The following for loop is meant to run a t-test of Mean by Ripeness for each fruit and store the p-value.

finalOut <- data.frame(Fruits = character(), Mean = numeric())
fruitLoop <- function(x) {
  fruit <- unique(x$Fruits)
  for(i in 1:length(fruit)){
    df<-filter(x, Fruits==fruit[i])
    fruit[i]
    ripe <- unique(df$Ripeness)
    if(length(ripe)<2) {
      next
    }
    tryResult <- tryCatch(
      {
        t.test(Mean ~ Ripeness, data = df)$p.value
      },
      error=function(cond){
      }
    )
    finalOut[i,] <- c(fruit[i],tryResult)
  }
}  

I want the results of the t-test stored in finalOut, but the loop doesn't seem to write anything into it. How can I achieve this? Again, the data is only a subset and might be insufficient for the for loop to run.

Phil
NickL
  • Add `finalOut` or `return(finalOut)` at the end of your function call to specify what gets returned. – Phil Jun 09 '20 at 18:10
  • I have tried this, but it only returns the finalOut as a print, and does not have it stored into the finalOut data frame. – NickL Jun 09 '20 at 18:12
  • I misread your code, never mind my comment above. – Phil Jun 09 '20 at 18:15

1 Answer

In the example data there aren't enough observations to run t.test, so I created a new example. You don't need a for loop: you can use group_by with summarise to get a t-test of Mean by Ripeness for each fruit.

library(dplyr)

tableData <- tibble(Fruits = sample(c('Apple', 'Banana', 'Orange'), 30, TRUE),
                    Ripeness = sample(c('yes', 'no'), 30, TRUE),
                    Mean = ifelse(Ripeness == 'yes', 1.4 + runif(30), 1.6 + runif(30)))

tableData %>% 
  group_by(Fruits) %>% 
  summarise(t_test_pval = t.test(Mean ~ Ripeness)$p.value)
# # A tibble: 3 x 2
#   Fruits t_test_pval
#   <chr>        <dbl>
# 1 Apple        0.277
# 2 Banana       0.747
# 3 Orange       0.837

Example with tryCatch, which returns NA for any fruit where t.test fails (here Peach, which has a single observation):

tableData <- tibble(Fruits = sample(c('Apple', 'Banana', 'Orange'), 30, TRUE),
                    Ripeness = sample(c('yes', 'no'), 30, TRUE),
                    Mean = ifelse(Ripeness == 'yes', 1.4 + runif(30), 1.6 + runif(30))) %>%
  add_row(Fruits = "Peach", Ripeness = "yes", Mean = 5)

get_t_test_pval <- function(formula){
  tryCatch({t.test(formula)$p.value}, error = function(cond) NA)
}

tableData %>% 
  group_by(Fruits) %>% 
  summarise(t_test_pval = get_t_test_pval(Mean ~ Ripeness))

# # A tibble: 4 x 2
#   Fruits t_test_pval
#   <chr>        <dbl>
# 1 Apple        0.108
# 2 Banana       0.171
# 3 Orange       0.169
# 4 Peach       NA    
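
If a loop is still preferred, the likely reason the original `fruitLoop` never fills `finalOut` is that `finalOut[i,] <- ...` inside a function modifies only a local copy, and the function itself returns nothing useful. A minimal base R sketch of a loop that builds its result and returns it; `safe_pval`, `fruit_loop`, the `PValue` column, and the sample data are hypothetical names and values chosen for illustration:

```r
# Illustrative data (not from the question): Apple has both ripeness levels,
# Banana has only "yes", Orange has a single observation
tableData <- data.frame(
  Fruits = c(rep("Apple", 6), "Banana", "Banana", "Orange"),
  Ripeness = c("yes", "yes", "yes", "no", "no", "no", "yes", "yes", "no"),
  Mean = c(5, 2, 4, 6, 7, 9, 1, 7, 8),
  stringsAsFactors = FALSE
)

# Hypothetical helper: p-value of the t-test, or NA if t.test() raises an error
safe_pval <- function(df) {
  tryCatch(
    t.test(Mean ~ Ripeness, data = df)$p.value,
    error = function(cond) NA_real_
  )
}

# Hypothetical rewrite of the loop: collect the p-values in a local vector
# and return a data frame, instead of assigning into a global object
fruit_loop <- function(x) {
  fruits <- unique(x$Fruits)
  pvals <- rep(NA_real_, length(fruits))
  for (i in seq_along(fruits)) {
    df <- x[x$Fruits == fruits[i], , drop = FALSE]
    pvals[i] <- safe_pval(df)
  }
  data.frame(Fruits = fruits, PValue = pvals, stringsAsFactors = FALSE)
}

# The result is stored by assigning the return value
finalOut <- fruit_loop(tableData)
```

Keeping the p-values in a numeric vector also avoids `c(fruit[i], tryResult)`, which would coerce the p-value to character.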
IceCreamToucan
  • Wow! That's much more concise. With that I could just plug the summarised results into another table right? – NickL Jun 09 '20 at 18:29
  • Yeah, you can add the `t_test_pval` column to another table with `left_join` (or with base R, `merge`). See [this question](https://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right) – IceCreamToucan Jun 09 '20 at 18:30
  • I just tried it and ran into another problem. I had the tryCatch in place because in the real data set there are still various fruits that have only one type of ripeness, or not enough observations, so the t-test won't work on every single fruit. That brings it back to the original question of how to save the results from the for loop. – NickL Jun 09 '20 at 18:36
  • Added an example with `tryCatch` – IceCreamToucan Jun 09 '20 at 18:47
  • Is it also possible to add a column that shows the mean difference (i.e. mean of yes ripeness - mean of no ripeness) for each fruit (with tryCatch as well)? – NickL Jun 10 '20 at 18:14
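
One possible approach to the mean-difference question, sketched under the assumption that the difference should be the mean of `yes` minus the mean of `no`, and that it should be `NA` whenever the t-test itself cannot be run. The helper `get_mean_diff`, the `mean_diff` column, and the sample data are hypothetical:

```r
library(dplyr)

# Illustrative data (not from the thread): Peach has a single observation
tableData <- tibble(
  Fruits = c(rep("Apple", 6), "Peach"),
  Ripeness = c("yes", "yes", "yes", "no", "no", "no", "yes"),
  Mean = c(5, 2, 5, 6, 7, 8, 5)
)

# Hypothetical helper: take the two group means from t.test()'s estimate,
# so the difference is NA in exactly the cases where the test fails
get_mean_diff <- function(value, group) {
  tryCatch({
    est <- t.test(value[group == "yes"], value[group == "no"])$estimate
    unname(est[1] - est[2])  # mean of "yes" minus mean of "no"
  }, error = function(cond) NA_real_)
}

result <- tableData %>%
  group_by(Fruits) %>%
  summarise(mean_diff = get_mean_diff(Mean, Ripeness))

# Apple: 4 - 7 = -3; Peach: NA (not enough observations)
result
```

Passing the two columns as plain vectors, rather than a formula, makes the order of subtraction explicit instead of depending on the alphabetical order of the factor levels.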