I am trying to find common edges between coexpression networks of genes. Here is a toy example:
Dataset 1    Dataset 2    Dataset 3
A:B          A:B          A:B
D:E          NA           D:E
So when intersecting these columns, A:B is an edge to be included because it appears in every dataset, but D:E is not.
My problem is that an edge can be written either way round: either A:B or B:A. I also have A and B as separate columns. So any one data frame will look something like this:
Gene1 Gene2 Edge
A B A:B
or this:
Gene1 Gene2 Edge
B A B:A
This means when trying to intersect you could get something like the following:
Dataset 1    Dataset 2    Dataset 3    Dataset 4    Dataset 5
B:A          A:B          A:B          B:A          A:B
Plain string matching wouldn't work here, because "A:B" and "B:A" would be treated as different edges even though they describe the same relationship.
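To make the mismatch concrete, here is a minimal sketch in plain Python of one possible workaround: rewrite every edge so that its two gene names are in sorted order before intersecting. The helper name `canonical_edge` and the list layout are hypothetical, the toy data is adapted from the example above with one edge deliberately reversed, and `None` stands in for NA. It assumes gene names never contain the ":" separator.

```python
def canonical_edge(edge, sep=":"):
    """Return the edge with its two gene names sorted, so "B:A" becomes "A:B".

    Assumes each edge holds exactly two gene names joined by `sep`.
    """
    gene1, gene2 = edge.split(sep)
    return sep.join(sorted((gene1, gene2)))


# Hypothetical toy data: one list of edges per dataset, None standing in for NA.
datasets = [
    ["B:A", "D:E"],
    ["A:B", None],
    ["A:B", "D:E"],
]

# Build one set of order-independent edges per dataset, skipping missing values.
edge_sets = [
    {canonical_edge(edge) for edge in edges if edge is not None}
    for edges in datasets
]

# An edge is common only if it appears in every dataset.
common_edges = set.intersection(*edge_sets)
print(sorted(common_edges))  # prints ['A:B']
```

The same idea should carry over to a data frame: add a column holding the sorted form of each edge and intersect on that column instead of the original Edge column.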
How can I subset a data frame to find a gene pair regardless of the order of the two genes? I could query either the string "gene1:gene2" or the column of Gene1 names together with the column of Gene2 names.
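For illustration, this is roughly the kind of lookup I mean, sketched in plain Python using the Gene1 and Gene2 columns. The function name `find_pair` and the representation of rows as dictionaries are hypothetical stand-ins for a real data frame. Each pair is compared as an unordered set, so the order of the genes does not matter.

```python
def find_pair(rows, gene_a, gene_b):
    """Return the rows whose (Gene1, Gene2) match the given pair in either order."""
    wanted = frozenset((gene_a, gene_b))
    return [
        row for row in rows
        if frozenset((row["Gene1"], row["Gene2"])) == wanted
    ]


# Hypothetical rows, laid out like the Gene1 / Gene2 / Edge columns above.
rows = [
    {"Gene1": "B", "Gene2": "A", "Edge": "B:A"},
    {"Gene1": "D", "Gene2": "E", "Edge": "D:E"},
]

# Both queries find the same row, although it is stored as B:A.
print(find_pair(rows, "A", "B"))
print(find_pair(rows, "B", "A"))
```

A `frozenset` is used here only because it ignores order; sorting the two names into a tuple would work just as well.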