I could not find anything explaining how to reshape a data frame that has a time column, an id column, and a column of variable names, where each variable should end up in its own column.
If only two categories are needed, it's trivial:
How to reshape data from long to wide format?
However, I have:
geo time indic_na value
AT 2014Q1 B11 2556
BE 2014Q1 B11 1506.0
... ... ... ...
AT 2014Q1 B1G 72065.1
and I want:
geo time B11 B1G ...
AT 2014Q1 2556 72065.1 ...
AT 2013Q4 2535.4 ...
... ... ... ... ...
BE 2014Q1 1506.0 86513.0 ...
So I want every unique string in indic_na to become its own column. To get the data:
install.packages("SmarterPoland")
library(zoo)
library(SmarterPoland)
GDP <- getEurostatRCV(kod = "namq_gdp_c")
GDP$time = as.yearqtr(GDP$time)
GDP <- subset(GDP, (s_adj == "SWDA") & (unit == "MIO_EUR") & (time > "1989Q4"))
And then I tried:
testvector <- as.vector(unique(GDP$indic_na))
test <- reshape(data = GDP, direction = "long", idvar = "geo", timevar = "time", varying = testvector)
among many other things for "varying". I get this error message:
Error in guess(varying) :
failed to guess time-varying variables from their names
I feel so close! But somehow I can't tell R that the variable names are in the 3rd column of my data frame. All the examples I find online already have the different variables in separate columns, or have only an id OR a time column plus a column of variables.
Any help would be great!
Easily reproducible data
> dput(head(GDP))
structure(list(geo = structure(c(1L, 3L, 4L, 5L, 6L, 7L), .Names = c("SWDA,MIO_EUR,B11,AT",
"SWDA,MIO_EUR,B11,BE", "SWDA,MIO_EUR,B11,BG", "SWDA,MIO_EUR,B11,CH",
"SWDA,MIO_EUR,B11,CY", "SWDA,MIO_EUR,B11,CZ"), .Label = c("AT",
"BA", "BE", "BG", "CH", "CY", "CZ", "DE", "DK", "EA", "EA12",
"EA17", "EA18", "EE", "EL", "ES", "EU15", "EU27", "EU28", "FI",
"FR", "HR", "HU", "IE", "IS", "IT", "JP", "LT", "LU", "LV", "ME",
"MK", "MT", "NL", "NO", "PL", "PT", "RO", "RS", "SE", "SI", "SK",
"TR", "UK", "US"), class = "factor"), time = structure(c(2014,
2014, 2014, 2014, 2014, 2014), class = "yearqtr"), indic_na = structure(c(1L,
1L, 1L, 1L, 1L, 1L), .Names = c("SWDA,MIO_EUR,B11,AT", "SWDA,MIO_EUR,B11,BE",
"SWDA,MIO_EUR,B11,BG", "SWDA,MIO_EUR,B11,CH", "SWDA,MIO_EUR,B11,CY",
"SWDA,MIO_EUR,B11,CZ"), .Label = c("B11", "B111", "B112", "B1G",
"B1GM", "B1GM_XE", "B1GM_XI", "B1GM_XO", "B2G_B3G", "D1", "D2_M_D3",
"D21_M_D31", "P3", "P3_P5", "P3_S13", "P31_S13", "P31_S14", "P31_S14_S15",
"P31_S15", "P32_S13", "P5", "P51", "P52", "P52_P53", "P53", "P6",
"P7"), class = "factor"), value = c(2556.8, 1506, NA, NA, NA,
3056.1)), .Names = c("geo", "time", "indic_na", "value"), row.names = 7753:7758, class = "data.frame")
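For reference, the target layout described above (one column per unique indic_na value, keyed by geo and time) is a long-to-wide reshape, whereas the attempt above passes direction = "long". The following is only a minimal sketch on a toy data frame built from the four example values in the tables above, using base R's reshape() with both geo and time as idvar. It assumes each geo/time/indic_na combination occurs exactly once and that no other columns vary within a geo/time pair. It is untested against the full Eurostat download. The names long and wide are hypothetical.

```r
# Toy long-format data (values copied from the example tables in the question)
long <- data.frame(
  geo      = c("AT", "BE", "AT", "BE"),
  time     = c("2014Q1", "2014Q1", "2014Q1", "2014Q1"),
  indic_na = c("B11", "B11", "B1G", "B1G"),
  value    = c(2556, 1506.0, 72065.1, 86513.0),
  stringsAsFactors = FALSE
)

# Long -> wide: geo and time together identify a row,
# and each distinct indic_na becomes its own value column
wide <- reshape(long,
                direction = "wide",
                idvar     = c("geo", "time"),
                timevar   = "indic_na",
                v.names   = "value")

# reshape() names the new columns value.B11, value.B1G; strip the prefix
names(wide) <- sub("^value\\.", "", names(wide))

print(wide)
```

On the real data, the same call would presumably take GDP in place of long. Any extra columns that are constant after the subset (such as s_adj and unit, if present) may need to be dropped first.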