This question follows up on my previous question (how to put the results of summarise() function into the dataframe in r).
I don't think I conveyed my problem well in the previous question, so I have added more details here.
I made a minimal reproducible example, but my real data is much larger:
library(dplyr)  # provides %>%, summarise(), across() and contains()

a_p_ <- c(0.1, 0.3, 0.03, 0.03)
b_p_ <- c(0.2, 0.003, 0.1, 0.00001)
c_2  <- c(1, 2, 5, 23)
c_p_ <- c(0.001, 0.002, 0.002, 0.00001)
results_1 <- data.frame(a_p_, b_p_, c_2, c_p_)
a_p_ <- c(0.3, 0.02, 0.43, 0.44)
b_p_ <- c(0.00002, 0.3, 0.8, 0.005)
c_2  <- c(88, 4, 55, 88)
c_p_ <- c(0.1, 0.002, 0.002, 0.1)
results_2 <- data.frame(a_p_, b_p_, c_2, c_p_)
So I have two datasets: one is "results_1" and the other is "results_2". This is just a reproducible example, though. In my real data I have 200 results files (from "results_1" to "results_200").
I want to create a new data frame (named type1error) that contains one summary row per results file.
More specifically, I want this to be the first row of my new data frame (type1error):
> results_1 %>%
+   summarise(across(contains("_p_"), ~ mean(.x > 0.05)))
  a_p_ b_p_ c_p_
1  0.5  0.5    0
and this to be the second row of my data frame (type1error):
> results_2 %>%
+   summarise(across(contains("_p_"), ~ mean(.x > 0.05)))
  a_p_ b_p_ c_p_
1 0.75  0.5  0.5
So this is what I tried:
# make empty holder
type1error <- as.data.frame(matrix(nrow = 2))
for (i in 1:2) {
  # read the data
  if (i == 1) {
    results <- results_1
  }
  if (i == 2) {
    results <- results_2
  }
  # mean() of a logical vector gives the proportion of TRUE values
  type1error[i, ] <- results %>%
    summarise(across(contains("_p_"), ~ mean(.x > 0.05)))
  type1error$conditions[i] <- i
}
But I got warning messages like the ones below, and the result is not what I expected (one row of summarise() results per dataset):
Warning messages:
1: In `[<-.data.frame`(`*tmp*`, i, , value = list(a_p_ = 0.5, b_p_ = 0.5, :
provided 3 variables to replace 2 variables
2: In `[<-.data.frame`(`*tmp*`, i, , value = list(a_p_ = 0.75, b_p_ = 0.5, :
provided 3 variables to replace 2 variables
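A quick check of the holder may help locate the problem. This is only a small sketch (the name `holder` is made up for illustration): it builds the same placeholder as above and counts its columns before and after the `conditions` column is added, which can then be compared with the three `_p_` columns that summarise() returns.

```r
# Same placeholder as in the loop above, under a hypothetical name
holder <- as.data.frame(matrix(nrow = 2))

ncol(holder)    # 1: matrix(nrow = 2) defaults to a single column
names(holder)   # "V1"

# Mimic the last line of the loop body for i = 1
holder$conditions[1] <- 1

ncol(holder)    # 2: V1 plus the new conditions column
```

So the holder never has the three columns (a_p_, b_p_, c_p_) that each summarised row carries, which seems consistent with the "provided 3 variables to replace 2 variables" message.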
How can I fix this?
The code below is not for this example dataset but for my real dataset, which produces the same warning.
# FYI: not reproducible, but this is the code I used for my real, huge data:
ncond <- 200
# empty holder
type1error <- as.data.frame(matrix(nrow = ncond))
for (i in 1:ncond) {
  # read the data
  results <- read.csv(paste0("model_results/results_", i, ".csv"))
  # mean() of a logical vector gives the proportion of TRUE values
  type1error[i, ] <- results %>%
    summarise(across(contains("_p_"), ~ mean(.x > 0.05)))
  type1error$conditions[i] <- i
}
# write the type 1 error rates to one csv file
# fixed
write.csv(type1error, "type1error/type1error.csv")
# and this code chunk did not work well.
I appreciate all the answers on the previous question page!
The answers there all work only for "results_1" and "results_2", because my reproducible example has only two datasets.
In reality, however, I have 200 datasets (from "results_1" to "results_200"),
and I need the result to be a new data frame, not a list.
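To make the target concrete, here is a minimal sketch of one possible direction, not a confirmed fix. It assumes the files follow the results_<i>.csv naming used above. The temporary folder, the two small CSV files written into it, and `ncond <- 2` are stand-ins so the sketch runs on its own; the helper list `rows` is a made-up name. Each file is summarised to one row, and the rows are then stacked with dplyr's bind_rows() so the end result is a single data frame.

```r
library(dplyr)

# Stand-in setup (assumption): write the two example datasets to a temp folder
dir <- tempfile("model_results_")
dir.create(dir)
results_1 <- data.frame(a_p_ = c(0.1, 0.3, 0.03, 0.03),
                        b_p_ = c(0.2, 0.003, 0.1, 0.00001),
                        c_2  = c(1, 2, 5, 23),
                        c_p_ = c(0.001, 0.002, 0.002, 0.00001))
results_2 <- data.frame(a_p_ = c(0.3, 0.02, 0.43, 0.44),
                        b_p_ = c(0.00002, 0.3, 0.8, 0.005),
                        c_2  = c(88, 4, 55, 88),
                        c_p_ = c(0.1, 0.002, 0.002, 0.1))
write.csv(results_1, file.path(dir, "results_1.csv"), row.names = FALSE)
write.csv(results_2, file.path(dir, "results_2.csv"), row.names = FALSE)

ncond <- 2  # would be 200 for the real data

# One summarised row per file, kept in a list only as an intermediate step
rows <- lapply(seq_len(ncond), function(i) {
  results <- read.csv(file.path(dir, paste0("results_", i, ".csv")))
  results %>%
    summarise(across(contains("_p_"), ~ mean(.x > 0.05))) %>%
    mutate(conditions = i)
})

# Stack the one-row data frames into a single data frame
type1error <- bind_rows(rows)
type1error
```

With the real files, `dir` would point at "model_results" and the stand-in setup would be dropped. The intermediate list disappears after bind_rows(), so what gets written with write.csv() is an ordinary data frame with one row per condition.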