Possible Duplicate:
“unpacking” a factor list from a data.frame

I have a data frame in a format like this:

X   geneA,geneB
Y   geneD,geneF
Z   geneH,geneL,geneS

I am trying to find a quick and efficient way to expand it: split the second column on commas and repeat the corresponding value of the first column for each piece, giving something like this:

X   geneA
X   geneB
Y   geneD
Y   geneF
Z   geneH
Z   geneL
Z   geneS

Thanks in advance!

Omar Wagih

1 Answer

Here is a solution using melt.list from the package reshape2:

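library(reshape2)

# Read the example data; V1 holds the key, V2 the comma-separated genes
dat = read.table(header=FALSE, stringsAsFactors=FALSE,
                 text="X   geneA,geneB
                       Y   geneD,geneF
                       Z   geneH,geneL,geneS")

# Split each V2 entry on commas and name the list elements by V1
lst = strsplit(dat$V2, ",")
names(lst) = dat$V1

# melt() on a named list gives one row per element, with the list names in column L1
res = melt(lst)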

res
#   value L1
# 1 geneA  X
# 2 geneB  X
# 3 geneD  Y
# 4 geneF  Y
# 5 geneH  Z
# 6 geneL  Z
# 7 geneS  Z
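
If you would rather not depend on reshape2, the same split-and-repeat idea can be sketched in base R. This is only a minimal sketch, not part of the solution above: the names `parts`, `res2`, `id` and `gene` are made up for illustration, and `lengths()` assumes R 3.2.0 or later.

```r
# Same example data as in the question
dat <- read.table(header = FALSE, stringsAsFactors = FALSE,
                  text = "X   geneA,geneB
Y   geneD,geneF
Z   geneH,geneL,geneS")

# Split the second column on literal commas (one character vector per row)
parts <- strsplit(dat$V2, ",", fixed = TRUE)

# Repeat each key once per piece and flatten the pieces alongside it
# (column names id/gene are illustrative, not from the original post)
res2 <- data.frame(id = rep(dat$V1, lengths(parts)),
                   gene = unlist(parts),
                   stringsAsFactors = FALSE)
```

With the example data, `res2` should have seven rows, one per gene, with the key in the first column rather than the last as in the `melt` output.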
bdemarest