I have a data frame with three columns: one with IDs, one with color IDs, and one with colors. Each ID is associated with two color IDs and two colors. I want to split the color column into two columns, color.1 and color.2, where color.1 holds every color whose color ID is "a" and color.2 holds every color whose color ID is "b". The following code shows my input and desired output:

# Building input dataframe.
df.input <- data.frame(id = c(1, 1, 2, 2, 3, 3),
                       color.id = c("a", "b", "a", "b", "a", "b"),
                       color = c("red", "orange", "green", "blue", "yellow", "purple"))

# Visualizing input dataframe.
df.input

# Building desired output dataframe.
df.output <- data.frame(id = c(1, 2, 3),
                        color.1 = c("red", "green", "yellow"),
                        color.2 = c("orange", "blue", "purple"))

# Visualizing desired output dataframe.
df.output
michaelmccarthy404
Comment: In base R, with `reshape`, use `reshape(df.input, direction="wide", idvar="id", timevar="color.id")`. – lmo Apr 16 '18 at 18:49
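For reference, here is a minimal sketch of how the base R `reshape` call from the comment might be run on the question's data. `reshape` names the new columns `color.a` and `color.b`, so the renaming step is an addition that is not part of the comment, and it assumes the columns come out in the order `id`, `color.a`, `color.b`. The name `df.wide` is only illustrative.

```r
# Input data frame from the question.
df.input <- data.frame(id = c(1, 1, 2, 2, 3, 3),
                       color.id = c("a", "b", "a", "b", "a", "b"),
                       color = c("red", "orange", "green", "blue", "yellow", "purple"))

# One row per id, one color column per value of color.id
# (the new columns are named color.a and color.b).
df.wide <- reshape(df.input, direction = "wide", idvar = "id", timevar = "color.id")

# Assumption: columns are ordered id, color.a, color.b, so rename by position.
names(df.wide) <- c("id", "color.1", "color.2")

# reshape keeps the original row names of the first row per id; reset them.
rownames(df.wide) <- NULL

df.wide
```

This needs no extra packages, which may matter if adding tidyr as a dependency is not an option.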

1 Answer

Use `spread()` from tidyr, which turns each value of color.id into its own column:

library(tidyr)
df.input %>% spread(color.id, color)

  id      a      b
1  1    red orange
2  2  green   blue
3  3 yellow purple
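The output above has columns named `a` and `b` rather than the `color.1` and `color.2` asked for in the question. A minimal sketch of one way to get those names follows. It swaps in `pivot_wider()`, which superseded `spread()` in tidyr 1.0.0, so it assumes tidyr 1.0.0 or later is installed; this is a variation, not the code from the answer. The name `df.wide` is only illustrative.

```r
library(tidyr)

# Input data frame from the question.
df.input <- data.frame(id = c(1, 1, 2, 2, 3, 3),
                       color.id = c("a", "b", "a", "b", "a", "b"),
                       color = c("red", "orange", "green", "blue", "yellow", "purple"))

# Spread the values of color.id into columns a and b, one row per id.
df.wide <- pivot_wider(df.input, names_from = color.id, values_from = color)

# pivot_wider returns a tibble; convert back to a plain data frame.
df.output <- as.data.frame(df.wide)

# Rename by matching the old names, so column order does not matter.
names(df.output)[names(df.output) == "a"] <- "color.1"
names(df.output)[names(df.output) == "b"] <- "color.2"

df.output
```

The same two renaming lines should also work on the result of `spread(color.id, color)` if you prefer to keep the original call.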
Martin Schmelzer