How to expand a data frame factor column into one column per level in R?

Question

The goal I'm trying to achieve is to take a data frame column which is a factor, create a new column for each level and populate the column with the appropriate value for that level from the original data frame.

Here is a sample. In this case, I want to create a new column for each level of the the.name factor column, like so:

Original dataframe:

symbol        the.name          cn    
SYM1          ABC               1
SYM2          ABC               2
SYM1          DEF               3
SYM2          DEF               4

Resulting dataframe:

symbol       ABC       DEF
SYM1         1         3
SYM2         2         4

How can this be done?

EDIT: I have tried to achieve this using an `sapply` loop with `split` by the column and then `rbind`ing the results. However, I have not gotten it to work and chose not to add it to this question, as it would only generate noise. I'm pretty sure that method is not correct and can be considerably improved.

Curious as to why there's a downvote? Looks like a good question but I could be missing something — Señor O, Oct 15 '14 at 20:15
The downvoter probably wanted to highlight that this is a very common question. — ilir, Oct 15 '14 at 20:16
Not the down-voter, but I presume it's because OP didn't show that they tried anything — Rich Scriven, Oct 15 '14 at 20:17
@ilir That was my only thought. It's hard to downvote a question with expected output though :) — Señor O, Oct 15 '14 at 20:22
I'd love some clarification on the downvote too. @ilir: It's a good point my not showing I tried anything (I did try to do this using a double `sapply` loop but threw up a little in my mouth), so I'll add some clarification. Thanks! — Juan Carlos Coto, Oct 15 '14 at 23:54
@ilir Regarding this being a common question, I'd really appreciate it if you could point me to where it has been asked and answered before. Even though the answers here look great, it's always good to have more info :). — Juan Carlos Coto, Oct 16 '14 at 00:03
You could look at [this question](http://stackoverflow.com/questions/9617348/reshape-three-column-data-frame-to-matrix), or [this one](http://stackoverflow.com/questions/5890584/reshape-data-from-long-to-wide-format-r), or even [this](http://stackoverflow.com/questions/22558677/reshape-panel-data-from-long-to-wide). I think knowing they are called long or wide data is key. Look at `melt()` and `dcast()` help from package `reshape2` for some good examples. — ilir, Oct 16 '14 at 08:37

score 6 · Answer 1 · answered Oct 15 '14 at 20:30

Alternatively, the newish tidyr package does this with the "spread" function. Using @ilir's data:

> tidyr::spread(tmp, key = the.name, value = cn)
  symbol ABC DEF
1   SYM1   1   3
2   SYM2   2   4
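
In tidyr 1.0.0 and later, `spread()` is marked as superseded in favour of `pivot_wider()`. A minimal sketch of the same reshape, assuming a recent tidyr is installed; the sample data is rebuilt by hand here as `tmp` so the snippet runs on its own:

```r
library(tidyr)

# Sample data from the question, rebuilt by hand
tmp <- data.frame(
  symbol   = c("SYM1", "SYM2", "SYM1", "SYM2"),
  the.name = c("ABC", "ABC", "DEF", "DEF"),
  cn       = c(1, 2, 3, 4)
)

# One new column per distinct value of the.name, filled from cn
wide <- pivot_wider(tmp, names_from = the.name, values_from = cn)
wide
```

The result is a tibble with columns `symbol`, `ABC` and `DEF`, one row per symbol.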

Gregor Thomas

score 5 · Answer 2 · answered Oct 15 '14 at 20:14

This is a job for dcast from the package reshape2:

> dcast(df, symbol~the.name, value.var="cn")
  symbol ABC DEF
1   SYM1   1   3
2   SYM2   2   4

edited Oct 15 '14 at 20:23

answered Oct 15 '14 at 20:14

Señor O

@BenBolker thanks, been using it so long I forgot it wasn't base. – Señor O Oct 15 '14 at 20:23

score 5 · Accepted Answer · answered Oct 15 '14 at 20:15

This is a reshaping task (from long to wide data). The package reshape2 has some great utilities to do this.

txt="symbol        the.name          cn    
      SYM1          ABC               1
      SYM2          ABC               2
      SYM1          DEF               3
      SYM2          DEF               4"

tmp <- read.table(text=txt, header=TRUE)

library(reshape2)
dcast(tmp, symbol ~ the.name, value.var = "cn")   ## as easy as that
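
If adding a package is not an option, the same long-to-wide reshape can be sketched in base R with `stats::reshape()`. This is only one possible approach; the `sub()` call that strips the `cn.` prefix from the generated column names is an extra step added here to match the output requested in the question:

```r
# Sample data from the question, rebuilt by hand
tmp <- data.frame(
  symbol   = c("SYM1", "SYM2", "SYM1", "SYM2"),
  the.name = c("ABC", "ABC", "DEF", "DEF"),
  cn       = c(1, 2, 3, 4)
)

# idvar identifies the rows, timevar supplies the new column names,
# v.names is the column whose values fill the new columns
wide <- reshape(tmp,
                idvar     = "symbol",
                timevar   = "the.name",
                v.names   = "cn",
                direction = "wide")

# reshape() names the new columns cn.ABC and cn.DEF; drop the prefix
names(wide) <- sub("^cn\\.", "", names(wide))
wide
```

`dcast` is usually shorter to write, but this version has no dependencies beyond a standard R installation.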
