I have many identically structured text files containing experimental data (641*976). First I set the correct working directory and collect all the files in a list. I do this twice, which gives two lists: file.listx, containing my sample data, and file.listy, containing the reference data. I then rearrange the data so that I can run the correlation analysis. The code below shows how I generate the "x" list; the "y" list was generated in exactly the same way from the reference data.
file.listx <- list.files(pattern = "\\.txt$", full.names = TRUE)
datalist = lapply(file.listx, FUN = read.table, header = FALSE, sep = "\t", skip = 2)
cmbn = expand.grid(1:641, 1:977)
flen = length(datalist)
x = lapply(1:nrow(cmbn), function(t, lst, cmbn) {
  # for grid position t, collect the value at (row, column) from every file
  sapply(1:flen, function(i, t1, lst1, cmbn1) {
    lst1[[i]][cmbn1$Var1[t1], cmbn1$Var2[t1]]
  }, t, lst, cmbn)
}, datalist, cmbn)
Now I want to calculate the Pearson correlation between the two lists (see http://www.datasciencemadesimple.com/pearson-function-in-excel/). In the Pearson correlation formula, my "x" corresponds to the sample and my "y" to the reference.
cor(x, y, method = "pearson")
This raises the error that 'x' must be numeric. I do not know how to solve this problem. When I use
x = as.numeric(x)
it seems that the list structure is lost. The following approach does not solve the problem either:
x = as.matrix(x)
How can I convert my list to a numeric type without losing its structure? I want to calculate the Pearson correlation between the two lists.
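For context, here is a minimal sketch of what as.matrix() does to a list, and of one conversion that keeps the shape. It assumes every list element is a numeric vector of the same length. The list `lst` and the names `m_list` and `m_num` are hypothetical stand-ins, not the real data, and `do.call(rbind, ...)` is only one possible option.

```r
# Hypothetical stand-in for the real list: equal-length numeric vectors
lst <- list(c(1, 2, 3), c(4, 5, 6))

# as.matrix() keeps the list type: the result is a matrix of mode "list",
# so it is still not numeric
m_list <- as.matrix(lst)
is.numeric(m_list)   # FALSE

# Binding the elements row-wise gives a genuine numeric matrix,
# with one row per list element
m_num <- do.call(rbind, lst)
is.numeric(m_num)    # TRUE
dim(m_num)           # 2 3
```

In this sketch the list structure survives as the rows of the matrix: row i holds what used to be element i.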
Here is code that generates two dummy lists and reproduces the error:
x = list(4:10, 10:16, 32:38, 100:106) # sample
y = list(10:16, 20:26, 40:46, 110:116) # reference
cor(x, y, method = "pearson")
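One possible reading of the goal is that each pair of corresponding list elements should yield its own coefficient. Here is a minimal sketch under that assumption; the names `r_each` and `r_all` are illustrative only. `mapply()` walks both lists in parallel and passes plain numeric vectors to `cor()`, so the list itself never reaches `cor()`.

```r
x <- list(4:10, 10:16, 32:38, 100:106)     # sample
y <- list(10:16, 20:26, 40:46, 110:116)    # reference

# One Pearson coefficient per pair of corresponding list elements
r_each <- mapply(cor, x, y, MoreArgs = list(method = "pearson"))

# Alternatively, a single coefficient over all values, flattening both lists
r_all <- cor(unlist(x), unlist(y), method = "pearson")
```

In the dummy data every element of y is the matching element of x plus a constant, so each entry of `r_each` is 1. Whether the per-element or the flattened variant is the right one depends on what the correlation is meant to compare.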