I searched a few related topics but could not find this exact question. I have a square correlation matrix whose row and column names are genes. A slice of the matrix is shown below.
                Xelaev15073085m Xelaev15073088m Xelaev15073090m Xelaev15073095m
Xelaev15000002m       0.1250128      -0.6368677       0.3119062       0.3980826
Xelaev15000006m       0.4127414      -0.8805597       0.6435158       0.9629489
Xelaev15000007m       0.4012530      -0.8854113       0.6425895       0.9614517
I have a data frame which has pairs of genes I want to extract from this large matrix.
               V1              V2
1 Xelaev15011657m Xelaev15017932m
2 Xelaev15011587m Xelaev15046612m
3 Xelaev15011594m Xelaev15046616m
4 Xelaev15011597m Xelaev15046617m
5 Xelaev15011603m Xelaev15046624m
6 Xelaev15011654m Xelaev15017928m
I am trying to loop through the data frame and output the matrix cell for each pair, matrix["gene1","gene2"] (for example, the value 0.1250128 for Xelaev15000002m and Xelaev15073085m). Doing this for a single pair is easy, but my attempt at a for loop to do this for the thousands of pairs in the list is failing. In the example below, headedlist is a sample of the data frame above, and FullcorSM is the full correlation matrix.
for(i in headedlist$V1){
data.frame(i, headedlist[i,2], FullcorSM[i,headedlist[i,2]])
}
The loop above was my first attempt; it returns NULL. My second attempt is shown below.
for(i in 1:nrow(stagelist)){
write.table(data.frame(stagelist$V1, stagelist$V2, FullcorSM["stagelist$V1","stagelist$V2"]),
file="sampleout",
sep="\t",quote=F)
}
This returns an out-of-bounds error. Running the second example without the quotes, i.e. FullcorSM[stagelist$V1, stagelist$V2], returns all values of the second column for each value of the first column. That is closer to what I want, but I am still missing some understanding of how R interprets my matrix/data frame syntax, and it is not clear to me what the fix is. Any insight on how to proceed?
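For reference, one approach that may work here is to index the matrix with a two-column character matrix, which returns one cell per row of pairs rather than the full cross product. This is only a sketch: it assumes both columns hold gene names that appear in the dimnames of FullcorSM, and the toy gene names and values below are invented for illustration.

```r
# Toy stand-ins for the real objects; gene names and values are made up.
genes <- c("geneA", "geneB", "geneC")
FullcorSM <- matrix(c( 1.0, 0.2, -0.5,
                       0.2, 1.0,  0.7,
                      -0.5, 0.7,  1.0),
                    nrow = 3, byrow = TRUE,
                    dimnames = list(genes, genes))
stagelist <- data.frame(V1 = c("geneA", "geneB"),
                        V2 = c("geneC", "geneA"),
                        stringsAsFactors = FALSE)

# as.character() guards against factor columns, which cbind() would
# otherwise turn into integer codes instead of names.
pairs <- cbind(as.character(stagelist$V1), as.character(stagelist$V2))

# Keep only pairs whose genes are both present in the matrix, to avoid
# a "subscript out of bounds" error.
ok <- pairs[, 1] %in% rownames(FullcorSM) &
      pairs[, 2] %in% colnames(FullcorSM)

# Indexing with a two-column character matrix picks one element per row,
# i.e. FullcorSM[pairs[k, 1], pairs[k, 2]] for each k.
result <- data.frame(V1  = pairs[ok, 1],
                     V2  = pairs[ok, 2],
                     cor = FullcorSM[pairs[ok, , drop = FALSE]])
```

With the real data, result could then be written once, outside any loop, with write.table(result, file="sampleout", sep="\t", quote=FALSE). As for the two attempts: a for loop in R does not print or collect the value of its body, so the data.frame() built on each pass is discarded, and a quoted "stagelist$V1" is looked up as a literal row name rather than evaluated as a column.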