I'm very new to R and have been teaching myself for a few weeks for my degree work. I'm almost done with the statistical analysis I need, but the code is ugly and messy: I repeat a lot of code across several data frames to apply different statistical tests, save results, etc. Now, out of personal interest, I want to write it better, but I'm stuck and really need a push to get the idea. For example, I want to create a function that computes the correlation for each of the data tables I'm using and saves those results as tables, using the input name as part of the output name. That is, if we had the iris data measured in different seasons, e.g. iris_fall, iris_winter, iris_spring and iris_summer, then after applying cor(X) to each one I want to save the results as tables named "mCoriris_fall.txt", "mCoriris_winter.txt", "mCoriris_spring.txt" and "mCoriris_summer.txt" respectively. My code so far, which does not work:

cor_PQ<-function(X) {
  cor_PQ<-cor(X, use="pairwise.complete.obs")
  return(cor_PQ)
}
savecor<-function(t) {
  outputname<-(paste0("mCor",t)) #HOW DO I CALL THE NAME OF THE INPUT? t is cor_PQ result matrix.
  savecor<-write.table(t, file=paste0(outputname,".txt"))
  return(savecor)
}
cor_PQ(Iris_fall)

I expect to get the cor result and save it as a table in my workspace, using the input name as part of the output name. I'm aware that these are two separate functions and that the one that writes the table should go inside the function for cor(X), but I can't work out how. I have been reading a lot but can't fit it all together. Thanks to anyone who can help. Regards.

EVERYTHING UP TO HERE HAS BEEN SOLVED. But after making a list of my 14 data frames to apply cor and other methods, the write.table call overwrites the 14 cor results in one single file. This is my code:

PQ_files<-list.files(path="C:/Users/Sol/Documents/ProyectoTítulo/CalidadAgua/Matrices/Regs",pattern="\\_PQ.txt")

PQ_data<-lapply(PQ_files, read.table)

names(PQ_data)<-gsub("\\_PQ.txt","", PQ_files)

PQ_data

cor_PQ<-function(X) {
  cor_PQ<-cor(X, use="pairwise.complete.obs")
  outputname.txt<-paste0("mCor",deparse(substitute(X)),".txt")
  write.table(cor_PQ, file=outputname.txt)
  outputname.pdf<-paste0("Cor",deparse(substitute(X)),".pdf")
  pdf(outputname.pdf)
  plot(X)
  dev.off()
  return(cor_PQ)
}

for (i in seq_along(PQ_data)){
  Correlaciones<-lapply(PQ_data,cor_PQ)
  }

Correlaciones

In summary: it almost works, except that write.table and plot(X) overwrite the outputs from the 14 data frames in PQ_data, using the names mCor[[i]] and CorX[[i]] respectively. Should I define [i] somehow so that each result gets the right name? Also, when I print Correlaciones at the end, I can see the cor results for the 14 data frames in one single object, but I don't know how to split them correctly. I think I'm almost there. THANKS AGAIN!

1 Answer

You can combine the two functions and use deparse(substitute(X)) to get the name of the input as a string:

cor_PQ <- function(X) {
   cor_PQ<-cor(X, use="pairwise.complete.obs")
   outputname<- paste0("mCor",deparse(substitute(X)), ".txt")
   write.table(cor_PQ, file=outputname)  # write the correlation matrix, not the base function t()
   return(cor_PQ)
}

and then call

cor_PQ(Iris_fall)
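
As a quick illustration of what `deparse(substitute(X))` captures, here is a minimal sketch. The names `name_of` and `my_data` are hypothetical and only serve the example; they are not part of the original code.

```r
# Hypothetical helper: returns the expression passed as X, as a character string.
name_of <- function(X) deparse(substitute(X))

# Hypothetical stand-in for one of the data frames.
my_data <- data.frame(a = 1:3, b = c(2, 4, 6))

# The variable name used in the call, as a string.
name_of(my_data)

# Building an output file name from it, as cor_PQ does.
output_file <- paste0("mCor", name_of(my_data), ".txt")
output_file
```

This only works when the function is called directly with a named variable, as in the call above.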
Ronak Shah
  • Awesome! That's exactly what I was looking for. Thank you, Ronak. Considering I don't really know how all this works... would it be possible to create a list of my data frames to use when running the function, so that I do it only once and not 4 times (iris seasons) or 12 (my work)? Thanks again! – Cristóbal Jaraba Nilo Dec 16 '19 at 02:47
  • @CristóbalJarabaNilo Yes. If there is no pattern in the names of the data frames, you can add them to a list manually and apply the function with `lapply`, like `lapply(list(Iris_fall, Iris_winter, Iris_spring, Iris_summer), cor_PQ)`. Or, if their names follow a pattern, e.g. all of them start with `Iris`, we can do `lapply(mget(ls(pattern = '^Iris')), cor_PQ)` – Ronak Shah Dec 16 '19 at 02:56
  • Well, that didn't give me exactly what I was looking for, but it helped me find the solution... and now I'm stuck on a new problem. To give you the context of my study, I'm using 14 `data.frame`s, to which I want to apply several statistical methods. The issue of adding a `write.table` call inside the `cor` function is solved; then I made a list of my 14 data frames, and I can run the `cor` function on all elements (data frames) of the list. The issue now is that `write.table` is overwriting every `cor` result under a name containing [[i]]. – Cristóbal Jaraba Nilo Dec 16 '19 at 16:19
  • Hi Ronak, thanks again for your help. I just edited the original question to explain my next issue. If I don't use `for`, it works too, but the problem is that `write.table` inside the function for `cor` overwrites the result of every `cor` call applied to each data frame in my list (PQ_data). Thanks! – Cristóbal Jaraba Nilo Dec 17 '19 at 00:10
  • Should I define 'X' in the cor_PQ function as a list? I ask because when I run `cor`, it takes the name of each data frame in my list (PQ_data) as an [i], so it gives me only one output table called mCor[[i]], which is, by the way, overwritten for each of the 14 data frames in my list. Consequently the saved file contains only the values from `cor` applied to the last data frame in my list. – Cristóbal Jaraba Nilo Dec 17 '19 at 00:15
  • I think you should ask a new question with this information, as one post should be focussed on only one question. Thanks. – Ronak Shah Dec 17 '19 at 00:39
  • OK, I understand. I will, then. Thanks again, Ronak. – Cristóbal Jaraba Nilo Dec 17 '19 at 02:05
  • Hi Ronak! Can you check my latest question, please? Thanks! https://stackoverflow.com/questions/59376151/write-table-inside-a-function-applied-to-a-list-of-data-frames-overwrite-outputs – Cristóbal Jaraba Nilo Dec 17 '19 at 16:20
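
The overwriting described in the comments happens because `lapply` calls the function as `FUN(X[[i]])`, so `deparse(substitute(X))` returns the same string `"X[[i]]"` for every element. One possible way around it, sketched below under stated assumptions, is to pass the list names to the function explicitly. The function name `cor_PQ_named`, its `label` and `out_dir` arguments, the use of `tempdir()` and the stand-in data built from `iris` are all hypothetical and not part of the original code.

```r
# Stand-in data (assumption): a named list of numeric data frames,
# playing the role of PQ_data from the question.
PQ_data <- list(
  iris_fall   = iris[1:50, 1:4],
  iris_winter = iris[51:100, 1:4]
)

# Hypothetical variant of cor_PQ that receives the label as an argument
# instead of recovering it with deparse(substitute(X)).
cor_PQ_named <- function(X, label, out_dir = tempdir()) {
  result <- cor(X, use = "pairwise.complete.obs")
  # One file per label, e.g. mCoriris_fall.txt
  write.table(result, file = file.path(out_dir, paste0("mCor", label, ".txt")))
  result
}

# Map() pairs each data frame with its own name, so no file is overwritten.
Correlaciones <- Map(cor_PQ_named, PQ_data, names(PQ_data))

# The result is a named list with one correlation matrix per input,
# which can be accessed individually.
names(Correlaciones)
Correlaciones$iris_fall
```

The sketch writes to a temporary directory only to keep the example self-contained; pass a different `out_dir` to save the files elsewhere. No `for` loop is needed around the `Map` call, since it already visits every element of the list.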