I am starting to use tidyr and dplyr. I have the following data frame:

                            email Assignment   Stage  Grade
1                     foo1@bar.com    course   final  86.28
2                     foo2@bar.com    course   first  68.87
3                     foo3@bar.com    course   resub  38.06
4                     foo3@bar.com    course   final  77.41
...

I would like to restructure this so that the single Grade column becomes three columns, one for each value of Stage (first, resub or final):

                            email Assignment   first  resub  final
1                     foo1@bar.com    course   100.0  100.0  100.0
2                     foo2@bar.com    course   100.0  100.0  100.0
3                     foo3@bar.com    course   100.0  100.0  100.0
4                     foo3@bar.com    course   100.0  100.0  100.0

(The values obviously do not match the input above because the output was cut and pasted.)

I am confused: do I need the separate() function, and if so, how would I use it?

halfer
pitosalas

1 Answer

The spread() function from tidyr should get you the results you need.

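library(tidyr)
library(dplyr)  # provides the %>% pipe

# Rebuild the sample data from the question
email <- c("foo1@bar.com", "foo2@bar.com", "foo3@bar.com", "foo3@bar.com")
Assignment <- rep("course", 4)
Stage <- c("final", "first", "resub", "final")
Grade <- c(86.28, 68.87, 38.06, 77.41)

df <- data.frame(email, Assignment, Stage, Grade, stringsAsFactors = FALSE)

# One column per Stage value, filled from Grade; missing combinations become NA
df <- df %>%
  spread(Stage, Grade)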
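
If you are on tidyr 1.0.0 or later, spread() is superseded by pivot_wider(), so the same reshaping could be sketched as below. This assumes the same four sample rows as in the question; the name `wide` and the final column reordering are just illustrative choices, not something the question requires.

```r
library(tidyr)

# Same sample data as in the question
df <- data.frame(
  email = c("foo1@bar.com", "foo2@bar.com", "foo3@bar.com", "foo3@bar.com"),
  Assignment = rep("course", 4),
  Stage = c("final", "first", "resub", "final"),
  Grade = c(86.28, 68.87, 38.06, 77.41),
  stringsAsFactors = FALSE
)

# Column names come from Stage, cell values from Grade;
# email and Assignment together identify each output row
wide <- pivot_wider(df, names_from = Stage, values_from = Grade)

# Optional: put the stage columns in the order asked for in the question
wide <- wide[, c("email", "Assignment", "first", "resub", "final")]
```

Each email/Assignment pair ends up on one row, and any stage without a grade for that pair is left as NA.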
Matt Jewett