I am using the Synth package to demonstrate the divergence in development between Djibouti and a synthetic model of Djibouti if it hadn't had international intervention.
Despite reading several similar questions and trying the answers offered, I am still struggling with this error:
unit.variable not found as numeric variable in foo
I have tried several different dataprep() strategies and still cannot run the code.
library(dplyr)  # provides %>% and mutate()

# Convert the columns that were read in as character to numeric
ddSMI <- as.data.frame(ddSMI) %>%
  mutate(LifeYrs = as.numeric(LifeYrs),
         PedYrs = as.numeric(PedYrs),
         Health.Index.Total = as.numeric(Health.Index.Total),
         Income.Index.Total = as.numeric(Income.Index.Total),
         SchoolMean = as.numeric(SchoolMean),
         Cno = as.numeric(Cno))
I am trying to produce a synthetic control model and have been using different iterations of this code. Though I have changed the class to numeric successfully, I still get the same error.
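For reference, here is a minimal sketch of the kind of check I mean when I say the conversion worked. It uses a made-up toy data frame (`toy`, with invented values and an invented second country) rather than my real data, and only base R. The idea that `dataprep()` looks at the unit column through `foo[, unit.variable]` is my assumption, not something I have confirmed in the package source.

```r
# Toy stand-in for ddSMI; all values are invented for illustration only.
toy <- data.frame(
  Year    = c(2000, 2001, 2000, 2001),
  Cno     = c("1", "1", "2", "2"),   # stored as character, like my raw columns
  Country = c("Algeria", "Algeria", "Angola", "Angola"),
  LifeYrs = c("69.6", "69.2", "46.5", "47.1"),
  stringsAsFactors = FALSE
)

# Convert the character columns that should be numeric.
toy$Cno     <- as.numeric(toy$Cno)
toy$LifeYrs <- as.numeric(toy$LifeYrs)

# Checks on the object that would be passed as `foo`.
print(class(toy))          # plain data.frame, or still a tibble?
print(sapply(toy, class))  # class of every column
print(mode(toy[, "Cno"]))  # mode of the unit column when extracted with [ , ]
print(names(toy))          # exact column names, including spaces or dots
```

Running the same four checks on ddSMI immediately before the dataprep() call should show whether the object and its column names match what the call refers to.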
Here is the head of my data, as a reprex:
head(ddSMI)
# A tibble: 6 x 8
   Year   Cno Country PedYrs LifeYrs
  <dbl> <dbl> <chr>   <chr>  <chr>
1  2000     1 Algeria 6.31   69.5999999999999…
2  2001     1 Algeria 6.23   69.2
3  2002     1 Algeria 6.28   69.5
4  2003     1 Algeria 6.32   71.0999999999999…
5  2004     1 Algeria 6.36   71.4000000000000…
6  2005     1 Algeria 6.39   71.7
# … with 3 more variables: SchoolMean <chr>,
#   Health Index Total <chr>,
#   Income Index Total <chr>
Please see the code below.
dataprep.out <- dataprep(
  foo = ddSMI,
  predictors = c("LifeYrs", "PedYrs", "Health.Index.Total",
                 "Income.Index.Total", "SchoolMean"),
  predictors.op = "mean",                  # the operator
  time.predictors.prior = 2007:2008,       # the entire time frame from the beginning to the end
  special.predictors = list(
    list("HDI Rank", 2000:2020, "mean"),
    list("LifeYrs", seq(2007, 2008, 2), "mean"),
    list("PedYrs", seq(2007, 2008, 2), "mean"),
    list("Health Index Total", seq(2007, 2008, 2), "mean"),
    list("Income Index Total", seq(2007, 2008, 2), "mean"),
    list("School Mean", seq(2007, 2008, 2), "mean")),
  dependent = "HDI Rank",                  # dv
  unit.variable = "Cno",                   # identifying unit numbers
  unit.names.variable = "Country",         # identifying unit names
  time.variable = "Year",                  # time period
  treatment.identifier = 5,                # the treated case
  controls.identifier = c(2:4, 6:15),      # the control cases; all others except number 5
  time.optimize.ssr = 2007:2008,           # the time period over which to optimize
  time.plot = 2000:2020)                   # the entire time period before/after the treatment
Here is a helpful resource on the Synth package that I used to guide my troubleshooting: "Synth: An R Package for Synthetic Control Methods in Comparative Case Studies"
My data is in the same format, and yet I can't get it to run! I would greatly appreciate it if anyone can crack this.