0

I want to transform my data.frame from this form:

Day Fruits
Monday  Apple, Banana, Orange
Tuesday Apple, Banana
Thursday    Orange, Melone
Saturday    Apple
Sunday  Banana

to that form:

Day  Fruit
Monday  Apple
Monday  Banana
Monday  Orange
Tuesday Apple
Tuesday Banana
Thursday    Orange
Thursday    Melone
Saturday    Apple
Sunday  Banana

The part I don't know how to solve is that the fruits column has a list of different items.

This is my working example:

day <- c("Monday", "Tuesday", "Thursday", "Saturday", "Sunday")
fruits <- c("Apple, Banana, Orange", "Apple, Banana", "Orange, Melone", "Apple", "Banana")


df <- data.frame(day, fruits)
df

library(reshape2)
df[, c(2:4)] <- colsplit(df[, 2], ", ", c("fruit1", "fruit2", "fruit3"))
df
melt(df, id.vars="day")

I can split the Fruits column into multiple columns but the thing is that there might be more than just three fruits. And I still don't get the table into long format with melt as I want it to be.

Thanks for help!

Tensibai
  • 15,557
  • 1
  • 37
  • 57
qg7el
  • 87
  • 1
  • 4

1 Answers1

1

You can simply use the package splitstackshape. It does all the work for you. :-) The cSplit function splits concatenated data into separate values. you should mention splitCols = "fruits" which is the column that needs to be split. The comma serves as a delimiter within each column in here. and we specify the desired direction as "long" because we are working colum-wise

library(splitstackshape)

df2 <- cSplit(df, splitCols = "fruits", sep = ", ", direction = "long")

> df2
#        day fruits
#1:   Monday  Apple
#2:   Monday Banana
#3:   Monday Orange
#4:  Tuesday  Apple
#5:  Tuesday Banana
#6: Thursday Orange
#7: Thursday Melone
#8: Saturday  Apple
#9:   Sunday Banana