I have a map table with this structure:
structure(list(REF_ID = structure(1:10, .Label = c("202533_s_at",
"202534_x_at", "202551_s_at", "202552_s_at", "202555_s_at", "202565_s_at",
"202566_s_at", "202580_x_at", "202581_at", "202589_at"), class = "factor"),
GeneSymbol = structure(c(2L, 2L, 1L, 1L, 5L, 6L, 6L, 3L, 4L, 7L), .Label =
c("CRIM1 /// LOC101929500", "DHFR", "FOXM1", "HSPA1A /// HSPA1B", "MYLK",
"SVIL", "TYMS"), class = "factor")), .Names = c("REF_ID", "GeneSymbol"),
class = "data.frame", row.names = c(NA, -10L))
In rows 3, 4 and 9, multiple gene symbols match a single REF_ID (here /// is the delimiter). For example, in row 3 two gene symbols match a single REF_ID.
I want a modified table (keeping all existing mappings) in which each REF_ID is repeated once for every gene symbol it matches.
For row 3, that means two separate rows: one with REF_ID == 202551_s_at and GeneSymbol == CRIM1, and another with REF_ID == 202551_s_at and GeneSymbol == LOC101929500.
Can you help me out, please?
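To make the desired result concrete, here is a minimal sketch in base R of one way the expansion could be done. It assumes the table is stored in a variable called map (a name chosen here for illustration) and that the delimiter is always ///, possibly surrounded by spaces. It is only one possible approach, not necessarily the best one.

```r
# Rebuild the mapping table from the question; `map` is a hypothetical name.
map <- data.frame(
  REF_ID = c("202533_s_at", "202534_x_at", "202551_s_at", "202552_s_at",
             "202555_s_at", "202565_s_at", "202566_s_at", "202580_x_at",
             "202581_at", "202589_at"),
  GeneSymbol = c("DHFR", "DHFR", "CRIM1 /// LOC101929500",
                 "CRIM1 /// LOC101929500", "MYLK", "SVIL", "SVIL",
                 "FOXM1", "HSPA1A /// HSPA1B", "TYMS"),
  stringsAsFactors = FALSE
)

# Split each GeneSymbol entry on "///", dropping any surrounding whitespace.
# as.character() keeps this working if the column is a factor, as in the dput.
symbols <- strsplit(as.character(map$GeneSymbol), "[[:space:]]*///[[:space:]]*")

# Repeat each REF_ID once per symbol it maps to, then flatten the symbol list.
expanded <- data.frame(
  REF_ID = rep(as.character(map$REF_ID), lengths(symbols)),
  GeneSymbol = unlist(symbols),
  stringsAsFactors = FALSE
)

print(expanded)
```

With the sample data, rows 3, 4 and 9 each become two rows, so expanded has 13 rows instead of 10, and rows with a single gene symbol are carried over unchanged.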