I have a .csv table with world bank development data from the years 2016-2020. In the beginning I only used data from 2020, however I had to add years as for some of the variables the data was not available. The .csv table contained countrycodes and series names and the values for each year. So for example
Country.Code | Series.Name |
---|---|
Arg | Gini index |
Arg | Population |
I used the dcast function like this:
ControlM <- dcast(Control, Country.Code ~ Series.Name)
This worked and I had a table with only one row with the Country code and then columns with the values, but it only kept the values for 2020. Then I tried to specify, which columns it should keep by adding:
ControlM <- dcast(
Control,
Country.Code ~ Series.Name,
value.var = c("2016", "2017", "2018", "2019", "2020")
)
However, I get the following Error:
Error in if (!(value.var %in% names(data))) { :
the condition has length > 1
I tried different solutions I found here. For example: Error using dcast with multiple value.var and converted the data frame to a data table first and setting value.var = NULL
. The error message stays the same.
I would be really glad, if someone had a suggestion for me.
Edit: Thank you for your quick response! Here is the ouput for dput(head(Contro)):
structure(list(Country.Name = c("Argentina", "Argentina", "Argentina",
"Argentina", "Argentina", "Armenia"), Country.Code = c("ARG",
"ARG", "ARG", "ARG", "ARG", "ARM"), Series.Name = c("Gini index",
"Trade (% of GDP)", "Population density (people per sq. km of land area)",
"Population, total", "Educational attainment, at least completed post-secondary, population 25+, total (%) (cumulative)",
"Gini index"), Series.Code = c("SI.POV.GINI", "NE.TRD.GNFS.ZS",
"EN.POP.DNST", "SP.POP.TOTL", "SE.SEC.CUAT.PO.ZS", "SI.POV.GINI"
), `2016` = c("42", "26.0938878488799", "15.9281350828921", "43590368",
"..", "32.5"), `2017` = c("41.1", "25.2896011376779", "16.0941907925267",
"44044811", "..", "33.6"), `2018` = c("41.3", "30.7625359549926",
"16.2585100979651", "44494502", "..", "34.4"), `2019` = c("42.9",
"32.6306150458499", "16.4208266190179", "44938712", "..", "29.9"
), `2020` = c("42.3", "30.2197998857878", "16.580892611147",
"45376763", "..", "25.2")), row.names = c(NA, -6L), class = c("data.table",
"data.frame"), .internal.selfref = <pointer: 0x00000193c0bd5930>)
The problem is that I want the series names as column names. For example:
Country.Code | Gini Index |
---|---|
ARG | 50 |
ARM | 50 |
as opposed to the example above. The problem is that the dcast function by default only keeps the data for 2020, but I need the data for 2016-2020. I hope that helps to clarify my problem. Oh and as to which packages I used I installed both weeks ago (data.table and reshape2) and am not sure, which one I used here. I am sorry, still really struggling with R :D