I want to convert the following data frame, with as little code as possible, so that the date variable stays as a column, each factor level becomes its own column, and the values for each level become that column's data. For example, the column red should contain the values 50, 45 and 55. I have considered filtering, transposing, and melting and casting, but I cannot figure out a way to do it.

Please help! Thank you!

df <- data.frame(factor1 = factor(x=rep(1:3, 3), levels=1:3, labels=c("red", "blue", "green")), 
                 data1 = c(50, 35, 15, 45, 40, 25, 55, 35, 10), 
                 date1 = rep(c(2000, 2004, 2008), each=3))
Spaniel
  • With `dplyr` and `tidyr`, you can try: `df %>% group_by(factor1) %>% mutate(rowid = row_number()) %>% spread(factor1, data1)`. – tmfmnk Aug 08 '19 at 12:58
  • That did work, thank you. What was the point of rowid = row_number()? Is it needed for spread to work? – Spaniel Aug 08 '19 at 13:08
  • Yes, it creates a unique identifier to be able to spread. – tmfmnk Aug 08 '19 at 13:11
  • Last question, if I may: spread picks up on the identifier without it being stated explicitly in the spread arguments? – Spaniel Aug 08 '19 at 13:15
  • I just checked it. For this example, you can just do `df %>% spread(factor1, data1)`. You need a unique identifier only when a combination of values appears more than once :) – tmfmnk Aug 08 '19 at 13:19
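
To make the comment thread above concrete, here is a minimal sketch of both cases. It reuses the `df` from the question; the names `wide_spread`, `wide_pivot`, `dup`, `dup_fails` and `wide_dup` are made up for illustration, and the `pivot_wider()` call assumes tidyr 1.0.0 or later, which is not something the comments mention.

```r
library(dplyr)
library(tidyr)

# Data frame from the question.
df <- data.frame(
  factor1 = factor(rep(1:3, 3), levels = 1:3, labels = c("red", "blue", "green")),
  data1   = c(50, 35, 15, 45, 40, 25, 55, 35, 10),
  date1   = rep(c(2000, 2004, 2008), each = 3)
)

# One column per factor level; date1 stays as the row identifier.
wide_spread <- spread(df, factor1, data1)

# Same reshape with pivot_wider() (assumption: tidyr >= 1.0.0 is installed).
wide_pivot <- pivot_wider(df, names_from = factor1, values_from = data1)

# Hypothetical duplicate: repeat the first row so (2000, red) appears twice.
dup <- rbind(df, df[1, ])

# spread() cannot tell the two (2000, red) rows apart, so the call errors.
dup_fails <- inherits(try(spread(dup, factor1, data1), silent = TRUE), "try-error")

# A per-level row counter makes each (date1, rowid, factor1) combination unique.
wide_dup <- dup %>%
  group_by(factor1) %>%
  mutate(rowid = row_number()) %>%
  ungroup() %>%
  spread(factor1, data1)
```

In the original `df` every (date1, factor1) pair occurs once, so `date1` alone identifies each output row and no `rowid` is needed. In `dup` the extra row gets its own `rowid`, so it lands on a separate output row with `NA` in the blue and green columns.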
