My dataset is structured as follows:
# A tibble: 20 x 8
   iso3   year  Var1 Var1_imp  Var2 Var2_imp Var1_type Var2_type
   <chr> <dbl> <dbl>    <dbl> <dbl>    <dbl> <chr>     <chr>
 1 ATG    2000    NA      144    NA      277 imputed   imputed
 2 ATG    2001    NA      144    NA      277 imputed   imputed
 3 ATG    2002    NA      144    NA      277 imputed   imputed
 4 ATG    2003    NA      144    NA      277 imputed   imputed
 5 ATG    2004    NA      144    NA      277 imputed   imputed
 6 ATG    2005    NA      144    NA      277 imputed   imputed
 7 ATG    2006    NA      144    NA      277 imputed   imputed
 8 ATG    2007   144      144   277      277 observed  observed
 9 ATG    2008    45       45    NA      301 observed  imputed
10 ATG    2009    NA     71.3    NA      325 imputed   imputed
11 ATG    2010    NA     97.7    NA      349 imputed   imputed
12 ATG    2011    NA      124    NA      373 imputed   imputed
13 ATG    2012    NA     150.    NA      397 imputed   imputed
14 ATG    2013    NA     177.   421      421 imputed   observed
15 ATG    2014    NA      203   434      434 imputed   observed
16 ATG    2015    NA     229.   422      422 imputed   observed
17 ATG    2016    NA     256.   424      424 imputed   observed
18 ATG    2017   282      282   429      429 observed  observed
19 ATG    2018    NA      282   435      435 imputed   observed
20 EGY    2000    NA    38485    NA   146761 imputed   imputed
I am new to R. My database covers 193 countries with data from 2000 to 2018, and I would like to create one line chart per country showing the time series of both Var1_imp and Var2_imp on the same chart. Points should be drawn as filled circles when the value is observed and as unfilled circles when it is imputed (based on Var1_type and Var2_type). Two consecutive points should be joined by a solid line when both are observed, and by a dotted line otherwise.
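To make the line rule concrete, here is a minimal base-R sketch of how I picture it, using four of the Var1 rows for ATG from the tibble above. The names `toy`, `segments`, and `link` are made up for this illustration and are not part of my data.

```r
# Four Var1 rows for ATG taken from the tibble above (2006 to 2009)
toy <- data.frame(
  year  = 2006:2009,
  value = c(144, 144, 45, 71.3),
  type  = c("imputed", "observed", "observed", "imputed")
)

# One row per segment joining two consecutive years
n <- nrow(toy)
segments <- data.frame(
  x    = toy$year[-n],    # start year of the segment
  xend = toy$year[-1],    # end year of the segment
  y    = toy$value[-n],
  yend = toy$value[-1],
  # solid only when BOTH end points are observed, dotted otherwise
  link = ifelse(toy$type[-n] == "observed" & toy$type[-1] == "observed",
                "both observed", "involves imputed")
)

print(segments)
```

Here only the 2007 to 2008 segment joins two observed points, so it is the only one that would get a solid line.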
The main goal is to check, country by country, whether the method used to impute missing data is reasonable, by looking for outliers in the time series.
I have tried the following:
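library(ggplot2)

# Note: each `+` must end a line; a line starting with `+` is not attached to the plot
ggplot(df, aes(x = year, y = Var1_imp, group = Var1_type)) +
  geom_point(size = 2, shape = 21) +  # shape 21 = unfilled circle, shape 19 = filled circle
  geom_line(linetype = "dashed")      # geom_line() alone draws a solid line; linetype = "dashed" draws a dashed one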
I am struggling to work out how to:
1/ produce one chart per country;
2/ show both Var1_imp and Var2_imp on the same chart;
3/ set the point shape conditionally (imputed versus observed, based on Var1_type and Var2_type);
4/ set the line type conditionally (solid between two consecutive observed points, dotted otherwise).
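For what it's worth, one possible way to address these four points together is sketched below. It assumes the dplyr, tidyr, and ggplot2 packages are installed. The toy `df` keeps only the columns that are used; the ATG rows come from the tibble above, while the country code `XXX` and its values are invented. `long`, `link`, and `plot_country()` are names made up for this sketch, so please treat it as a starting point rather than a definitive solution.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Toy stand-in for `df`: ATG rows copied from above, "XXX" is an invented country
df <- data.frame(
  iso3      = rep(c("ATG", "XXX"), each = 4),
  year      = rep(2006:2009, times = 2),
  Var1_imp  = c(144, 144, 45, 71.3, 10, 12, 11, 13),
  Var1_type = c("imputed", "observed", "observed", "imputed",
                "observed", "observed", "imputed", "observed"),
  Var2_imp  = c(277, 277, 301, 325, 20, 22, 24, 26),
  Var2_type = c("imputed", "observed", "imputed", "imputed",
                "observed", "imputed", "imputed", "observed")
)

# Reshape to one row per country, year, and variable, with columns `imp` and `type`
long <- df %>%
  pivot_longer(
    cols      = -c(iso3, year),
    names_to  = c("variable", ".value"),
    names_sep = "_"
  ) %>%
  arrange(iso3, variable, year) %>%
  group_by(iso3, variable) %>%
  mutate(
    next_year = lead(year),   # end point of the segment starting at this row
    next_imp  = lead(imp),
    link      = if_else(type == "observed" & lead(type) == "observed",
                        "both observed", "involves imputed")
  ) %>%
  ungroup()

# Hypothetical helper: one chart for one country, both variables on it
plot_country <- function(data, country_code) {
  d <- filter(data, iso3 == country_code)
  ggplot(d, aes(x = year, y = imp, colour = variable)) +
    geom_segment(
      data = filter(d, !is.na(next_year)),   # the last year starts no segment
      aes(xend = next_year, yend = next_imp, linetype = link)
    ) +
    geom_point(aes(shape = type), size = 2, fill = "white") +
    scale_shape_manual(values = c(observed = 19, imputed = 21)) +
    scale_linetype_manual(values = c("both observed" = "solid",
                                     "involves imputed" = "dotted")) +
    labs(title = country_code, x = "Year", y = "Value (observed or imputed)")
}

# One ggplot object per country, stored in a named list
plots <- lapply(setNames(nm = unique(long$iso3)), plot_country, data = long)
```

`print(plots[["ATG"]])` draws the chart for a single country. Printing each element of `plots` between `pdf()` and `dev.off()` would give one page per country, which may be convenient for checking all 193 countries.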
Thank you very much for your help. I don't think this exercise is easy, and I would learn a lot from your input.