1

I'm using a function from the genetics package called LD(). To simplify what it does, it essentially takes a list of genotypes (A/A, A/C, G/A, etc.) and creates a list of values (D, D', r, etc.). It looks something like this:

a=LD(genotype1,genotype2)

with the results looking like:

Pairwise LD
-----------
                   D        D'      Corr
Estimates: 0.1419402 0.8110866 0.6029553

              X^2      P-value  N
LD Test: 10.90665 0.0009581958 15

I only need values from Corr, so I'd call upon it with a$r.

I have 2 dataframes and I want to use that function on their cartesian product:

df1 and df2 are the 2 dataframes, with each column (col) represents a list of genotypes. I'm thinking of using a for loop to fill out a matrix:

df1=data.frame(c("A/A","C/C","A/A"),c("G/G","T/T","T/T"))
df2=data.frame(c("A/T","C/T","C/C"),c("A/A","A/T","G/G"))
q=1 # acts as a counter
n=length(df1$col1) # All lists are the same length
k=length(df2$col2) # These are to set the dimensions of the matrix
r=n*k

m=matrix(data=NA, nrow=r, ncol=3, byrow=TRUE, dimnames=list(NULL, c("c14","c19","Link")))

for(i in (1:n))
{
  for(j in (1:k))
  {
    geno1=genotype(df2)[j] #genotype is a function that must be applied to the
    geno2=genotype(df1)[i] #lists before the LD() function can be used
    d=LD(geno1,geno2)

    m=d$r #I only need the values from this section of the output

    ld[q,]=c(names(df1),names(df2),m) #This is supposed to fill out the matrix
                                      #I'm also not sure of how to do that part
    q=q+1 #this is so that the above line fills in the next row with each iteration
  }
}

When I run this, I get an error:

Error in dim(a1) <- a1.d : 
dims [product "some number"] do not match the length of object ["another number"]

I'm expecting a 3 column and many rowed matrix with the first column being the name of the first genotype(column names of df1), the second column being the name of the second genotype (column names of df2), and the third column with the values obtained from the LD() function

Any advice? Thanks!

UPDATE ANSWER: I managed to get it:

q=1 # acts as a counter
n=length(t1$rs.)
k=length(t2$rs.)
r=n*k

ld=matrix(data=NA, nrow=r, ncol=3, byrow=TRUE, dimnames=list(NULL, c("c14","c19","Link")))

for(i in (1:n))
{
  for(j in (1:k))
  {
    deq=LD(genotype(g1[,i]),genotype(g2[,j]))
    m=deq$r
    ld[q,]=c(i,j,m)
    q=q+1
  }
}
Ben Bolker
  • 211,554
  • 25
  • 370
  • 453
Amomynous
  • 95
  • 1
  • 2
  • 7
  • Advice 1 - provide reproducible example. – Chase Jul 06 '11 at 20:12
  • I unfortunately don't have nor can I find any. – Amomynous Jul 06 '11 at 20:16
  • 1
    You don't have `df1` and `df2` in your R session? How do you know the error that you get if you don't have data to try your code? Can you add `dput(df1)`, `dput(df2)`, and any other data we need to generate the error you got to your question? Or give us a subset of the data...or generate a sufficiently representative example using `sample`, `letters`, `rnorm`, or whatever your data is supposed to look like. – Chase Jul 06 '11 at 20:24
  • Ahh, that's what you meant. I'll add that in. – Amomynous Jul 06 '11 at 20:27
  • 2
    The package announces "THIS PACKAGE IS NOW OBSOLETE." – IRTFM Jul 06 '11 at 20:51
  • @Anon, a few tips on reproducible examples. http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Roman Luštrik Jul 07 '11 at 08:47
  • @Anon Can you post your solution as an answer, then accept it? It would help curate the [r] tag, removing an unanswered question. – Iterator Feb 08 '12 at 17:33

1 Answers1

1

I have difficulties in understanding the first part of your work. Why do you want to use two data.frames? I usually feed a data.frame with one line per individual and one row per marker, and LD calculates all the possible pairwise comparisons. However, let's assume you estimated LD using the LD package (yes, they say it's obsolete, but it's still the best!) You can proceed as follows:

#extract the correlation r from LD results
tc<-LD.object$"r"
#build a three columns matrix with all the pairwise combination of two markers
pwm<-combn(row.names(tc),2)
pwld<-matrix(NA,nrow=ncol(pwm),ncol=3)
pwld[,1:2]<-pwm[1:2,]
#Fill the matrix
for(aaa in 1:nrow(pwld))
{
pwld[aaa,3]<-tc[pwld[aaa,1],pwld[aaa,2]]
}
Fabio Marroni
  • 423
  • 8
  • 19