I'm very new to R, and after researching online and consulting my textbooks, I couldn't quite come up with an answer to this question.

I have a census survey dataset, broken down by Congressional district, that I have loaded into R. For my purposes, I need each observation to be one Congressional district, with each demographic data point appearing as a variable. So rather than Alabama's 1st district appearing 100 times, once for each data point in Column E, I want it to appear once, with each of those data points in Column E becoming its own column. I also need a way to apply this to the other 434 districts in the dataset.

Here is a rough schematic of what it looks like:

CD | VARIABLE | DATA |

AL-1 | Black population | 100,000 |

AL-1 | White population | 200,000 |

AL-1 | Married population | 75,000 |

I would like it to look like this:

CD | BLACK POPULATION | WHITE POPULATION | MARRIED POPULATION |

AL-1 | 100,000 | 200,000 | 75,000 |

Any ideas on how to accomplish this, or good learning resources you could point me to?

  • `tidyr::spread`, `data.table::dcast`, `reshape2::dcast`, `stats::reshape`, I could probably go on if I started looking harder. – joran Apr 11 '16 at 21:49

1 Answer

Here is an example using base R's `reshape` function. There are lots of other options out there, as joran points out.

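    # Build a small long-format example: three "type" rows per state
    set.seed(1)  # make the random values reproducible
    DF.long <- data.frame(state = rep(c("A", "B", "C", "D"), each = 3),
                          type = c("XX", "YY", "ZZ"),  # recycled across the 12 rows
                          value = rnorm(12))
    DF.long

    # Pivot to wide: one row per state, one value.* column per type
    DF.wide <- reshape(DF.long, timevar = "type", idvar = "state", direction = "wide")
    DF.wide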
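
For data shaped like the schematic in the question, the same call might look roughly like the sketch below. The data frame name `census_long` and the AL-2 rows are made up for illustration, and `DATA` is assumed to be numeric already (values read in as text such as "100,000" would need converting first).

```r
# Hypothetical long-format data mirroring the question's schematic.
# The AL-1 values come from the question; the AL-2 rows are invented.
census_long <- data.frame(
  CD       = rep(c("AL-1", "AL-2"), each = 3),
  VARIABLE = rep(c("Black population", "White population",
                   "Married population"), times = 2),
  DATA     = c(100000, 200000, 75000, 110000, 190000, 80000),
  stringsAsFactors = FALSE
)

# One row per district, one column per distinct value of VARIABLE
census_wide <- reshape(census_long,
                       timevar   = "VARIABLE",
                       idvar     = "CD",
                       direction = "wide")

# reshape() prefixes the new columns with "DATA."; strip that prefix
names(census_wide) <- sub("^DATA\\.", "", names(census_wide))
rownames(census_wide) <- NULL
census_wide
```

Because `idvar` groups by district, the same call should cover all 435 districts at once, with no need to loop over them.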
gtwebb
  • I see we're not flagging duplicates anymore – alexwhitworth Apr 11 '16 at 22:39
  • It was an excuse to look into this; since I don't work with R all that much, I figured I might as well post it. [Another duplicate](http://stackoverflow.com/questions/9617348/reshape-three-column-data-frame-to-matrix) gives a much more complete answer and has a pretty good write-up. – gtwebb Apr 11 '16 at 22:57