-2

First post on StackOverflow! I am a beginner at R and am taking several online courses to learn it for data science. I hope you can help me rearrange the format of some data so I can use it for time series analysis and visualisation with ggplot2. I tried searching, but it's hard to search well when my terminology is still lacking (I'm not sure if what I'm after is data 'manipulation', 'munging', 'wrangling', 'data-cleaning' or something else).

My data currently looks like this, but I want it to look like this. How do I do that in R? Would some package help me?

Note: I don't really mind if the 'net income' and 'year' columns are switched. Also, I just used Excel to make quick snapshots of the desired data format.

Mind you, this dataset continues on and on for ~2000 rows, so whatever code I use in R needs to work on the full dataset as well.

thanks!

John

  • Put example data into the question; imgur links probably don't stay around for ever. See [mcve]. Meanwhile, investigate `reshape`. –  Mar 25 '17 at 21:45

2 Answers

0

You are going from wide to long format; see the reshape2 package.

library(reshape2)

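# Get a vector of the column names you want to change from wide to long format, e.g.
yearsVector <- c("2006", "2007", "2008")

# variable.name names the new column holding the old column headers;
# value.name names the new column holding the cell values
melt(dat, measure.vars = yearsVector, variable.name = "Year", value.name = "income")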
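To see what the call does end to end, here is a minimal sketch on made-up data. The data frame `dat`, its values, and the `company` and `sub.industry` columns are assumptions for illustration only; the real dataset is not shown in the question.

```r
library(reshape2)

# Hypothetical toy data standing in for the real dataset.
# check.names = FALSE keeps "2006" as a column name instead of "X2006".
dat <- data.frame(
  company      = c("A", "B"),
  sub.industry = c("Retail", "Energy"),
  `2006`       = c(10, 20),
  `2007`       = c(11, 21),
  `2008`       = c(12, 22),
  check.names  = FALSE,
  stringsAsFactors = FALSE
)

yearsVector <- c("2006", "2007", "2008")

# One row per company/year combination instead of one column per year
long <- melt(dat,
             id.vars       = c("company", "sub.industry"),
             measure.vars  = yearsVector,
             variable.name = "Year",
             value.name    = "income")

# melt() stores Year as a factor; convert it to a number for time series plots
long$Year <- as.integer(as.character(long$Year))

print(long)
```

If the data was loaded with `read.csv()` and default settings, the year columns will probably be named `X2006`, `X2007`, and so on, so `yearsVector` would need to match those names (or the file can be read with `check.names = FALSE`).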
caw5cv
0
data_melted <- melt(data, id.vars = c("company", "sub.industry"), measure.vars = c("2006", "2007", "2008", "2009", "2010", "2011", "2012", "2013", "2014", "2015", "2016"), variable.name = "Year", value.name = "income")

This did the trick after installing the reshape2 package and loading it in the R script! Thanks, StackOverflow members dash2 and Cory! Way to go helping out a total novice! Now on to visualising!
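For anyone who would rather not install a package: the comment under the question mentions `reshape`, which may refer to the `reshape()` function that ships with base R. A rough sketch of the same wide-to-long step follows, again on made-up data (the `wide` data frame and its values are assumptions, not the real dataset).

```r
# Hypothetical toy data; check.names = FALSE keeps the year column names as-is
wide <- data.frame(
  company = c("A", "B"),
  `2006`  = c(10, 20),
  `2007`  = c(11, 21),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

long <- reshape(
  wide,
  direction = "long",
  varying   = c("2006", "2007"),  # wide columns to stack
  v.names   = "income",           # name of the stacked value column
  timevar   = "Year",             # name of the new year column
  times     = c(2006, 2007),      # value written to Year for each stacked column
  idvar     = "company"
)

# reshape() builds row names like "A.2006"; drop them for a cleaner table
rownames(long) <- NULL

print(long)
```

Here `Year` comes out numeric because `times` is given as numbers, so no factor conversion is needed before plotting.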