I am not really sure where to start here and I could use some pointers.
I have several character vectors of different lengths, each containing gene names. I want to compare all of them pairwise and get the number of shared genes between each pair of lists (using, for instance, intersect()). I would like to store all the pairwise comparisons in a matrix to make a heatmap.
But I am not sure how to best perform the comparisons and how to store the results. Should I group all the objects into a dataframe first?
I have 24 objects called names_something:
> length(names_G63)
[1] 4518
> head(names_G63)
[1] "SARC_00002" "SARC_00004" "SARC_00005" "SARC_00012" "SARC_00022" "SARC_00025"
> length(names_C28)
[1] 9190
> head(names_C28)
[1] "SARC_00001" "SARC_00002" "SARC_00003" "SARC_00004" "SARC_00005" "SARC_00008"
Each comparison would give a single number, the count of genes shared between the two lists:
> length(intersect(names_G63, names_C28))
[1] 4097
I want to store these numbers as a matrix, like:
      G63   C28  B124
G63     0
C28  4097     0
B124 3000   345     0
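One possible way to get there, as a rough sketch rather than a definitive solution: collect the vectors into a named list (a list handles different lengths, which a data frame does not do cleanly) and apply intersect() to every pair with nested sapply() calls. The three short vectors below are made-up stand-ins for the real gene lists, and gene_sets and shared are hypothetical names chosen for illustration.

```r
# Toy stand-ins for the real vectors (hypothetical data, not the actual gene lists)
names_G63  <- c("SARC_00002", "SARC_00004", "SARC_00005", "SARC_00012")
names_C28  <- c("SARC_00001", "SARC_00002", "SARC_00003", "SARC_00004", "SARC_00005")
names_B124 <- c("SARC_00002", "SARC_00012", "SARC_00030")

# Collect the vectors into a named list; a list copes with different lengths.
# With all 24 objects in the workspace, mget(ls(pattern = "^names_")) should
# gather them in one step (the list names then keep the "names_" prefix).
gene_sets <- list(G63 = names_G63, C28 = names_C28, B124 = names_B124)

# Count the shared genes for every pair of sets; the result is a square,
# symmetric matrix with the list names as row and column names
shared <- sapply(gene_sets, function(a) {
  sapply(gene_sets, function(b) length(intersect(a, b)))
})

# Match the layout above: zero on the diagonal instead of each set's own size
diag(shared) <- 0

print(shared)
```

The full symmetric matrix is kept here because heatmap(shared) expects a complete numeric matrix; if only the lower triangle is wanted for display, shared[upper.tri(shared)] <- NA blanks the upper half.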