I'm using a function from the genetics package called LD()
. To simplify what it does, it essentially takes a list of genotypes (A/A, A/C, G/A, etc.) and creates a list of values (D, D', r, etc.). It looks something like this:
a=LD(genotype1,genotype2)
with the results looking like:
Pairwise LD
-----------
D D' Corr
Estimates: 0.1419402 0.8110866 0.6029553
X^2 P-value N
LD Test: 10.90665 0.0009581958 15
I only need values from Corr, so I'd call upon it with a$r
.
I have 2 dataframes and I want to use that function on their cartesian product:
df1
and df2
are the 2 dataframes, with each column (col) represents a list of genotypes.
I'm thinking of using a for loop to fill out a matrix:
df1=data.frame(c("A/A","C/C","A/A"),c("G/G","T/T","T/T"))
df2=data.frame(c("A/T","C/T","C/C"),c("A/A","A/T","G/G"))
q=1 # acts as a counter
n=length(df1$col1) # All lists are the same length
k=length(df2$col2) # These are to set the dimensions of the matrix
r=n*k
m=matrix(data=NA, nrow=r, ncol=3, byrow=TRUE, dimnames=list(NULL, c("c14","c19","Link")))
for(i in (1:n))
{
for(j in (1:k))
{
geno1=genotype(df2)[j] #genotype is a function that must be applied to the
geno2=genotype(df1)[i] #lists before the LD() function can be used
d=LD(geno1,geno2)
m=d$r #I only need the values from this section of the output
ld[q,]=c(names(df1),names(df2),m) #This is supposed to fill out the matrix
#I'm also not sure of how to do that part
q=q+1 #this is so that the above line fills in the next row with each iteration
}
}
When I run this, I get an error:
Error in dim(a1) <- a1.d :
dims [product "some number"] do not match the length of object ["another number"]
I'm expecting a 3 column and many rowed matrix with the first column being the name of the first genotype(column names of df1), the second column being the name of the second genotype (column names of df2), and the third column with the values obtained from the LD()
function
Any advice? Thanks!
UPDATE ANSWER: I managed to get it:
q=1 # acts as a counter
n=length(t1$rs.)
k=length(t2$rs.)
r=n*k
ld=matrix(data=NA, nrow=r, ncol=3, byrow=TRUE, dimnames=list(NULL, c("c14","c19","Link")))
for(i in (1:n))
{
for(j in (1:k))
{
deq=LD(genotype(g1[,i]),genotype(g2[,j]))
m=deq$r
ld[q,]=c(i,j,m)
q=q+1
}
}