I have a gene expression data set "rna" with probe IDs, and a reference data set "ref" that maps probe IDs to their corresponding Entrez IDs. I want to match the probe IDs in "rna" against those in "ref" so that I can add the Entrez ID to "rna". In the reference data set, some probes map to multiple Entrez IDs (separated by " /// "), so I also need to duplicate those rows in "rna" so that each row maps to only one Entrez ID while the rest of the row stays the same. The outcome I am looking for is "org_rna". Some Entrez IDs end up duplicated across probes; those can be left as they are. TYIA
rna = data.frame("Org1" = c(1.5, 3.5, 2.4, 3.2, 4.5), "Org2" = c(2.5, 3.5,7, 2.6, 7),
"Org3" = c(3.6,7.2,4,5,6), "Probe" = c("11715100_at", "11715101_s_at",
"11715102_x_at", "11715103_x_at", "11715104_s_at"))
ref = data.frame("Probe Set ID" = c("11715100_at", "11715101_s_at", "11715102_x_at", "11715103_x_at",
"11715104_s_at"), "Entrez" = c("8355", "8355", "340307 /// 441294",
"285501", "8263 /// 474383 /// 474384"))
org_rna = data.frame("Org1" = c(1.5, 3.5, 2.4, 2.4, 3.2, 4.5, 4.5, 4.5), "Org2" = c(2.5, 3.5, 7, 7, 2.6, 7, 7, 7),
"Org3" = c(3.6, 7.2, 4, 4, 5, 6, 6, 6), "Probe" = c("11715100_at", "11715101_s_at",
"11715102_x_at", "11715102_x_at", "11715103_x_at", "11715104_s_at",
"11715104_s_at", "11715104_s_at"), "Entrez" = c("8355", "8355",
"340307", "441294", "285501", "8263", "474383", "474384"))