How to put the results of the summarise() function into a data frame, using R?

Question

This question follows up on my previous question, "how to put the results of summarise() function into the dataframe in r".

I don't think I conveyed my problem well in the previous question, so I have added more details here.

I made a minimal reproducible example, but my real data is much larger.

a_p_ <-c(0.1, 0.3, 0.03, 0.03)
b_p_ <-c(0.2, 0.003, 0.1, 0.00001)
c_2<-c(1,2,5,23)
c_p_<-c(0.001, 0.002,0.002,0.00001)
results_1<-data.frame(a_p_,b_p_,c_2,c_p_)

a_p_ <-c(0.3, 0.02, 0.43, 0.44)
b_p_ <-c(0.00002, 0.3, 0.8, 0.005)
c_2 <-c(88,4,55,88)
c_p_<-c(0.1, 0.002,0.002,0.1)

results_2<-data.frame(a_p_,b_p_,c_2,c_p_)

So I have two datasets: "results_1" and "results_2". This is just a reproducible example, though. In my real data I have 200 results files (from "results_1" to "results_200").

I then want to create a new data frame (named type1error) with one row of summary values per dataset.

More specifically, I want this to be the first row of my new data frame (type1error):

>   results_1 %>%
+     summarise(across(contains("_p_"), ~ mean(.x > 0.05)))
  a_p_ b_p_ c_p_
1  0.5  0.5    0

and this to be the second row of my data frame (type1error):

> results_2 %>%
+     summarise(across(contains("_p_"), ~ mean(.x > 0.05)))
  a_p_ b_p_ c_p_
1 0.75  0.5  0.5

So what I did is:

# make empty holder

type1error<-as.data.frame(matrix(nrow = 2))

for(i in 1:2){
  # read the data 
  if(i==1){
    results<-results_1
  }
  if(i==2){
    results<-results_2
  }
  

  
  # mean() You can use mean() to get the proportion of TRUE of a logical vector.
  type1error[i,]<-results %>%
    summarise(across(contains("_p_"), ~ mean(.x > 0.05)))
  
  type1error$conditions[i] <- i 
  
}

But I got warning messages like the ones below, and the result is not what I expected (one row of summarise() results per dataset):

Warning messages:
1: In `[<-.data.frame`(`*tmp*`, i, , value = list(a_p_ = 0.5, b_p_ = 0.5,  :
  provided 3 variables to replace 2 variables
2: In `[<-.data.frame`(`*tmp*`, i, , value = list(a_p_ = 0.75, b_p_ = 0.5,  :
  provided 3 variables to replace 2 variables

How can I fix this?

The code below is not for this example dataset but for my real dataset, which generates the same warning.

#FYI, Not reproducible, but the code that I did use for my real, huge,data is as follows:

ncond<-200

#empty holder 

type1error<-as.data.frame(matrix(nrow = ncond))

for(i in 1:ncond){
# read the data 
results <- read.csv(paste0("model_results/results_",i,".csv"))
 

# mean() You can use mean() to get the proportion of TRUE of a logical vector.
type1error[i,]<-results %>%
  summarise(across(contains("_p_"), ~ mean(.x > 0.05)))

type1error$conditions[i] <- i 

}
# one csv file in type 1 error rate 
# fixed
write.csv(type1error,"type1error/type1error.csv")

#and this code chunk did not work well.

I appreciate all the answers on the previous question page!

The answers there all work directly with "results_1" and "results_2", because my reproducible example has only two datasets.

However, in reality I have 200 datasets (from "results_1" to "results_200"),

and I have to produce a new data frame, not a list.

alexrai93 · Answer 1 · 2022-05-12T07:19:40.040

You can use map and bind_rows to work with a list and get a data frame as output.

map (purrr package) takes a list or vector, applies a function to each element, and returns a list. bind_rows (dplyr) then stacks the elements of that list into a single data frame, one row per summarised dataset.

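library(dplyr)  # summarise(), across(), bind_rows()
library(purrr)  # map()

ResultList <- list(results_1, results_2)

# Proportion of values above 0.05 in every column whose name contains "_p_"
sumit <- function(x) {
  summarise(x, across(contains("_p_"), ~ mean(.x > 0.05)))
}

# One single-row data frame per dataset
FinalResult <- map(ResultList, sumit)

# Stack the single-row data frames into one data frame
Type1Error <- bind_rows(FinalResult)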

You can also do it as a one-liner in map: map(ResultList, ~summarise(.x, across(contains("_p_"), ~ mean(.x > 0.05))))
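As a minimal sketch of the same idea on the question's example data, the block below also keeps track of which dataset each row came from. It assumes dplyr and purrr are installed; the helper name summarise_p and the use of `.id = "conditions"` to recreate the question's conditions column are my own choices, not something the question requires.

```r
library(dplyr)
library(purrr)

# Example data copied from the question
results_1 <- data.frame(
  a_p_ = c(0.1, 0.3, 0.03, 0.03),
  b_p_ = c(0.2, 0.003, 0.1, 0.00001),
  c_2  = c(1, 2, 5, 23),
  c_p_ = c(0.001, 0.002, 0.002, 0.00001)
)
results_2 <- data.frame(
  a_p_ = c(0.3, 0.02, 0.43, 0.44),
  b_p_ = c(0.00002, 0.3, 0.8, 0.005),
  c_2  = c(88, 4, 55, 88),
  c_p_ = c(0.1, 0.002, 0.002, 0.1)
)

# Hypothetical helper: proportion of values above 0.05 in each "_p_" column
summarise_p <- function(df) {
  summarise(df, across(contains("_p_"), ~ mean(.x > 0.05)))
}

type1error <- list(results_1, results_2) %>%
  map(summarise_p) %>%
  # .id adds a column holding each element's position in the list
  bind_rows(.id = "conditions") %>%
  mutate(conditions = as.integer(conditions))

print(type1error)
```

The result is an ordinary data frame with one row per dataset, and its rows hold the same values as the two summarise() outputs shown in the question. No empty holder is needed, which sidesteps the column-count mismatch behind the warning.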

To get all of your files into a list, you can use map or lapply.

Edited to include a modified version of the linked solution for reading csv files into a list. It assumes you have a folder called "Data" in your R project directory that contains all the files.

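# List every csv file in the "Data" folder, keeping the folder in each path
# so there is no need to change the working directory with setwd().
# Note: list.files() returns names in alphabetical order, so
# "results_10.csv" is listed before "results_2.csv".
filenames <- list.files("Data", pattern = "\\.csv$", full.names = TRUE)

# Read each file into one element of a list
ResultList <- lapply(filenames, read.csv)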

Solution for reading csv files into a list
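Putting the pieces together for the file-based case, here is a hedged sketch of the whole pipeline. It builds the file paths from the index, following the "results_", i, ".csv" naming in the question, so that row i always corresponds to results_i.csv regardless of how list.files() would sort the names. The temporary folder, the two small files written into it, and the helper name summarise_file are assumptions made only so the example runs on its own; with the real data you would set ncond to 200 and point results_dir at your own folder.

```r
library(dplyr)
library(purrr)

# Assumed setup: a temporary folder standing in for "model_results",
# filled with two small files built from the question's example data.
results_dir <- file.path(tempdir(), "model_results")
dir.create(results_dir, showWarnings = FALSE)

write.csv(
  data.frame(a_p_ = c(0.1, 0.3, 0.03, 0.03),
             b_p_ = c(0.2, 0.003, 0.1, 0.00001),
             c_2  = c(1, 2, 5, 23),
             c_p_ = c(0.001, 0.002, 0.002, 0.00001)),
  file.path(results_dir, "results_1.csv"), row.names = FALSE
)
write.csv(
  data.frame(a_p_ = c(0.3, 0.02, 0.43, 0.44),
             b_p_ = c(0.00002, 0.3, 0.8, 0.005),
             c_2  = c(88, 4, 55, 88),
             c_p_ = c(0.1, 0.002, 0.002, 0.1)),
  file.path(results_dir, "results_2.csv"), row.names = FALSE
)

ncond <- 2  # 200 for the real data

# Build the paths from the index so the order is numeric, not alphabetical
paths <- file.path(results_dir, paste0("results_", seq_len(ncond), ".csv"))

# Hypothetical helper: read one file and summarise its "_p_" columns
summarise_file <- function(path) {
  read.csv(path) %>%
    summarise(across(contains("_p_"), ~ mean(.x > 0.05)))
}

type1error <- paths %>%
  map(summarise_file) %>%
  bind_rows(.id = "conditions") %>%
  mutate(conditions = as.integer(conditions))

# One csv file holding all the type 1 error rates
out_file <- file.path(results_dir, "type1error.csv")
write.csv(type1error, out_file, row.names = FALSE)
```

Reading and summarising inside one function means only a single-row summary per file is kept, rather than all 200 full datasets at once, which may help if the real files are large.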

I have 200 results files.. such as "results_1", "results_2"..."results_200". I cannot type 200 dataset like ```list(results_1, results_2)```. is there any clever way that I can use this code? (I appreciate your answer!!) — yoo, May 12 '22 at 06:03
What format are the files in? You can use lapply or map to read a batch of files into a list. — alexrai93, May 12 '22 at 07:04
