How to split a column of a dataframe into two distinct columns based on the value of a second column in R?

Question

I have a dataframe with three columns: one with IDs, one with color IDs, and one with colors. Each ID is associated with two color IDs and two colors. I want to split the color column into two distinct columns, color.1 and color.2, where color.1 contains every color whose color ID is "a" and color.2 contains every color whose color ID is "b". The following code shows my input and desired output:

# Building input dataframe.
df.input <- data.frame(id = c(1, 1, 2, 2, 3, 3),
                       color.id = c("a", "b", "a", "b", "a", "b"),
                       color = c("red", "orange", "green", "blue", "yellow", "purple"))

# Visualizing input dataframe.
df.input

# Building desired output dataframe.
df.output <- data.frame(id = c(1, 2, 3),
                        color.1 = c("red", "green", "yellow"),
                        color.2 = c("orange", "blue", "purple"))

# Visualizing desired output dataframe.
df.output

In base R, with `reshape`, use `reshape(df.input, direction="wide", idvar="id", timevar="color.id")`. — lmo, Apr 16 '18 at 18:49
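
To spell out the base R suggestion in the comment above, here is a minimal sketch. The `reshape` call is the one from the comment; the renaming step and the `df.wide` name are my own additions, assuming you want the exact column names from the question (`reshape` itself names the new columns `color.a` and `color.b`).

```r
# Input dataframe from the question.
df.input <- data.frame(id = c(1, 1, 2, 2, 3, 3),
                       color.id = c("a", "b", "a", "b", "a", "b"),
                       color = c("red", "orange", "green", "blue", "yellow", "purple"))

# Long to wide: one row per id, one color column per color.id value.
df.wide <- reshape(df.input, direction = "wide", idvar = "id", timevar = "color.id")

# reshape() yields the columns id, color.a, color.b; rename them to match the question.
names(df.wide) <- c("id", "color.1", "color.2")

# Reset the row names, which reshape() carries over from the input rows.
rownames(df.wide) <- NULL

df.wide
```

Renaming by position works here because there are only two color IDs and their order is known; with more levels a lookup by name would be safer.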

score 1 · Accepted Answer · answered Apr 16 '18 at 18:41

Use spread() from tidyr. It creates one column per color.id value, so the new columns are named a and b:

library(tidyr)
df.input %>% spread(color.id, color)

  id      a      b
1  1    red orange
2  2  green   blue
3  3 yellow purple
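
Note that `spread()` has been superseded by `pivot_wider()` since tidyr 1.0.0. A sketch of the same reshaping with the newer function follows, assuming tidyr >= 1.0.0 is installed. The final rename and the `df.wide` name are illustrative additions to get the `color.1` / `color.2` names asked for in the question.

```r
library(tidyr)

# Input dataframe from the question.
df.input <- data.frame(id = c(1, 1, 2, 2, 3, 3),
                       color.id = c("a", "b", "a", "b", "a", "b"),
                       color = c("red", "orange", "green", "blue", "yellow", "purple"))

# Spread the values of color into one column per color.id value (a and b).
df.wide <- pivot_wider(df.input, names_from = color.id, values_from = color)

# Convert to a plain data.frame and rename the new columns to match the question.
df.wide <- as.data.frame(df.wide)
names(df.wide)[names(df.wide) == "a"] <- "color.1"
names(df.wide)[names(df.wide) == "b"] <- "color.2"

df.wide
```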

Martin Schmelzer
