0

I have two tables where the 'ENSG' and 'Ensembl Gene ID' columns are primary keys. I'm interested in combining the average raw counts column to the first table based on matching primary keys.

Condition ENSG avg_rep
Untreated ENSG00000281593 8.702268e-06
Treated ENSG00000280987 5.444327e-06
Ensembl gene id average raw counts
ENSG00000281593 3.635470e-04
ENSG00000280987 4.558828e-04

The end result looks something like this:

Condition ENSG avg_rep average raw counts
Untreated ENSG00000281593 8.702268e-06 3.635470e-04
Treated ENSG00000280987 5.444327e-06 4.558828e-04

2 Answers2

1

Using merge:

merge(df1, df2, by.x="ENSG", by.y="Ensemble gene id")

Or, as mentioned by Onyambu, using full_join:

full_join(df1, df2, c('ENSG'='Ensemble gene id'))

Where df1 is your first table, and df2 the second one

1

I think left_join() might do the trick.

library(tidyverse)

data_left <- data.frame(
  condition = c('untreated', 'treated'),
  ensg = c('a123', 'b456'),
  avg_rep = c(1,2)
)

data_right <- data.frame(
  ensg_gene_id = c('a123', 'b456'),
  average_raw_count = c(4,5)
)

result <- data_left %>%
  left_join(
    data_right,
    by = c('ensg' = 'ensg_gene_id')
  )
Papuha
  • 103
  • 6