
How do I intersect multiple samples?

I have 29 lists of concatenated strings that I build from gene name, cc change, and coordinate. Each list has 400-800 entries. I need to build a table showing how many variants are shared between every two lists, for all 812 (29 x 28) combinations. Is there a way to do this in R?

For example, suppose I have 4 lists:

A<-c("TSC22112517","SLC14A143309911","RAD51D33446609","WRN31024638")

B<-c("TSC22112517","SLC14A143309911","RHBDF274474996","WRN31024638")

C<-c("TSC22112517","SLC14A143309911","RAD51D33446609","MEN164575556")

D<-c("FANCM45665468","SLC14A143309911","RAD51D33446609","MEN164575556")

I just need to find how many variants each pair of lists shares. For example,

AB<-length(intersect(A,B))

gives me the number of variants shared by A and B, which is 3. Then I can build a table like the one below showing the number of shared variants for every pair:

    A      B      C      D
A   4      3      3      2
B   3      4      2      1
C   3      2      4      3
D   2      1      3      4

How can I do this for a large number of lists? I have 29 lists, each with about 600 variants.

Nishi
lance
  • can you summarize your needs including some minimal datasets and code to load it, it's quite hard to figure out from an explanation – HubertL Feb 25 '16 at 20:31
  • Could you please be more specific? If you post more details on the format of your inputs, what's exactly the results you need, and the code for your best attempt, you will have far more chances to get a good answer. Thanks – lrnzcig Feb 25 '16 at 20:31
  • Welcome to StackOverflow! Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610). This will make it much easier for others to help you. – Jaap Feb 25 '16 at 20:53
  • So, your main problem is the high number of dimensionality? – coffeinjunky Feb 25 '16 at 21:35

2 Answers


You could try something like this (I do a lot of things in lists):

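# x is your data in list() format; each element is assumed to be a
# data frame whose 2nd column holds the variant IDs
shared<-list()
for (i in seq_along(x)){
  shared[[i]]<-list()
  for (j in seq_along(x)){
    if (i != j){
      # variants of list i that also occur in list j
      shared[[i]][[j]]<-intersect(x[[i]][,2], x[[j]][,2])
    }
  }
}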
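
If the lists are plain character vectors, as in the question, the same idea can be sketched without explicit loops. This is only an illustration under stated assumptions: the named list `lists` and the `A_B`-style pair names are made up for the example, and A's second entry is written as `SLC14A143309911` so that A and B share 3 variants, as the question states. `combn()` enumerates each unordered pair once (6 pairs for 4 lists, 406 for 29 lists), and `intersect()` returns the shared IDs.

```r
# Example lists from the question (A's second entry assumed to be SLC14A1...)
lists <- list(
  A = c("TSC22112517", "SLC14A143309911", "RAD51D33446609", "WRN31024638"),
  B = c("TSC22112517", "SLC14A143309911", "RHBDF274474996", "WRN31024638"),
  C = c("TSC22112517", "SLC14A143309911", "RAD51D33446609", "MEN164575556"),
  D = c("FANCM45665468", "SLC14A143309911", "RAD51D33446609", "MEN164575556")
)

# Every unordered pair of list names, each as a length-2 character vector
pairs <- combn(names(lists), 2, simplify = FALSE)

# For each pair, keep the variant IDs that occur in both lists
shared_ids <- lapply(pairs, function(p) intersect(lists[[p[1]]], lists[[p[2]]]))
names(shared_ids) <- vapply(pairs, paste, character(1), collapse = "_")

shared_ids[["A_B"]]   # the variant IDs shared by A and B
lengths(shared_ids)   # number of shared variants per pair
```

Because `intersect()` treats its arguments as sets, a variant that appears twice in one list is counted once.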
Zafar

So happy I figured it out:

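# x is assumed to already hold the 29 lists, with the variant IDs of list i in x[[i]][[1]]
n <- length(x)
shared <- matrix(0L, nrow=n, ncol=n)
for (i in seq_len(n)){
  for (j in seq_len(n)){
    # number of variants shared by list i and list j
    shared[j,i] <- length(intersect(x[[i]][[1]],x[[j]][[1]]))
  }
}
shared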
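
For comparison, here is a more compact sketch of the same count matrix using nested `sapply()` calls. It assumes the lists are stored as character vectors in a named list (called `lists` here, a name chosen for the example) and uses the four lists from the question, with A's second entry taken to be `SLC14A143309911` so that A and B share 3 variants, as the question states.

```r
# Example lists from the question (A's second entry assumed to be SLC14A1...)
lists <- list(
  A = c("TSC22112517", "SLC14A143309911", "RAD51D33446609", "WRN31024638"),
  B = c("TSC22112517", "SLC14A143309911", "RHBDF274474996", "WRN31024638"),
  C = c("TSC22112517", "SLC14A143309911", "RAD51D33446609", "MEN164575556"),
  D = c("FANCM45665468", "SLC14A143309911", "RAD51D33446609", "MEN164575556")
)

# counts[i, j] = number of variants shared by list i and list j;
# row and column names are taken from names(lists)
counts <- sapply(lists, function(a) {
  sapply(lists, function(b) length(intersect(a, b)))
})

counts
counts["A", "B"]   # 3
```

The diagonal holds the number of unique variants in each list, and the matrix is symmetric, so only the upper or lower triangle is needed for reporting.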
lance