Intersection using two columns in data-frame

Question

I want to intersect two data-frames based on two columns. For a single column I can do that with the intersect function, but how do I go about it for two columns?

Here is my sample data-frame

head(Region)
          ENSEMBL UP_DOWN
1 ENSG00000000457      UP
2 ENSG00000000460      UP
3 ENSG00000000938      UP
4 ENSG00000000971      UP
5 ENSG00000001084    DOWN
6 ENSG00000001460      UP

The second data-frame

head(gene)
          ENSEMBL UP_DOWN
1 ENSG00000000003    DOWN
2 ENSG00000000938      UP
3 ENSG00000001630    DOWN
4 ENSG00000002822    DOWN
5 ENSG00000004059    DOWN
6 ENSG00000004139    DOWN

So far, this is what I am doing:

c <- as.data.frame(intersect(Region$ENSEMBL,gene$ENSEMBL))

But I lose the information about whether the respective row is "UP" or "DOWN" in either of my data-frames. How do I keep that information as a label?

The options in this link should work as well - https://stackoverflow.com/questions/32917934/how-to-find-common-rows-between-two-dataframe-in-r — Ronak Shah, Jun 02 '21 at 06:38

score 2 · Accepted Answer · answered Jun 02 '21 at 05:57

You could do an inner join:

library(dplyr)

inner_join(Region, gene, by = c('ENSEMBL','UP_DOWN'))

          ENSEMBL UP_DOWN
1 ENSG00000000938      UP

answered Jun 02 '21 at 05:57

Waldi

But would it tell me whether the row is "UP" in both of my data-frames? – PesKchan Jun 02 '21 at 06:03
I checked a few of them and they match, but if a row is "UP" in one data-frame and "DOWN" in the other, it won't appear in the output. – PesKchan Jun 02 '21 at 06:06
You could join by 'ENSEMBL' only: you'll then see the value from both sides. I joined by `c('ENSEMBL','UP_DOWN')` because you asked for an intersect. – Waldi Jun 02 '21 at 06:26
We can use the suffix argument to name the repeated columns: inner_join(Region, gene, by = c('ENSEMBL'), suffix = c('_Region', '_Gene')) – jpdugo17 Jun 02 '21 at 06:32
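
To make the suggestion in the comments above concrete, here is a minimal sketch that joins on ENSEMBL only and keeps the direction from each data-frame. The two data-frames are rebuilt from the six sample rows shown in the question; the column `same_direction` is a hypothetical name added for illustration and does not appear in the original answers.

```r
library(dplyr)

# Sample rows copied from the head() output in the question
Region <- data.frame(
  ENSEMBL = c("ENSG00000000457", "ENSG00000000460", "ENSG00000000938",
              "ENSG00000000971", "ENSG00000001084", "ENSG00000001460"),
  UP_DOWN = c("UP", "UP", "UP", "UP", "DOWN", "UP")
)
gene <- data.frame(
  ENSEMBL = c("ENSG00000000003", "ENSG00000000938", "ENSG00000001630",
              "ENSG00000002822", "ENSG00000004059", "ENSG00000004139"),
  UP_DOWN = c("DOWN", "UP", "DOWN", "DOWN", "DOWN", "DOWN")
)

# Join on the gene ID only, so rows with differing directions are kept too.
# The suffixes label which data-frame each UP_DOWN column came from.
both <- inner_join(Region, gene, by = "ENSEMBL",
                   suffix = c("_Region", "_Gene")) %>%
  # same_direction (hypothetical name) flags whether the two labels agree
  mutate(same_direction = UP_DOWN_Region == UP_DOWN_Gene)

both
```

With these sample rows only ENSG00000000938 is shared, and it is "UP" on both sides. On the full data, filtering on `same_direction` should separate genes that agree from genes that change direction.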

score 2 · Answer 2 · answered Jun 02 '21 at 09:48

A base R option with merge may help

merge(Region, gene)
          ENSEMBL UP_DOWN
1 ENSG00000000938      UP
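
As a sketch of why this works: merge() joins on all column names the two data-frames share when `by` is not given, so the call above matches on both ENSEMBL and UP_DOWN. The example below spells that out and also shows the ENSEMBL-only variant; the data is rebuilt from the sample rows in the question, and the object names `exact` and `by_id` are hypothetical.

```r
# Sample rows copied from the head() output in the question
Region <- data.frame(
  ENSEMBL = c("ENSG00000000457", "ENSG00000000460", "ENSG00000000938",
              "ENSG00000000971", "ENSG00000001084", "ENSG00000001460"),
  UP_DOWN = c("UP", "UP", "UP", "UP", "DOWN", "UP")
)
gene <- data.frame(
  ENSEMBL = c("ENSG00000000003", "ENSG00000000938", "ENSG00000001630",
              "ENSG00000002822", "ENSG00000004059", "ENSG00000004139"),
  UP_DOWN = c("DOWN", "UP", "DOWN", "DOWN", "DOWN", "DOWN")
)

# Explicit form of merge(Region, gene): match on both columns
exact <- merge(Region, gene, by = c("ENSEMBL", "UP_DOWN"))

# Match on the gene ID only and keep both direction labels side by side
by_id <- merge(Region, gene, by = "ENSEMBL",
               suffixes = c("_Region", "_Gene"))

exact
by_id
```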

answered Jun 02 '21 at 09:48

ThomasIsCoding
