I have panel data containing NA values, and I would like to fill in those NAs with values from another data frame. Say I want to complete the following panel with new.df:
panel <- data.frame("time" = c(rep(2000,5), rep(2001,5)),
                    "var1" = rep(1:5, times=2),
                    "var2" = c(NA,'b','c',NA,'d','a1','b1','c1',NA,'d1'))
new.df <- data.frame("time" = c(2000:2001),
                     "var1" = c(4,4),
                     "var2" = c('e','e'))
I tried different combinations of merge / aggregate / ddplyr etc. The issue is that merge (or merge.data.frame) creates additional columns suffixed with .x and .y even though the column names are identical.
> merge(panel,new.df,by = c("time","var1"), all=T)
   time var1 var2.x var2.y
1  2000    1   <NA>   <NA>
2  2000    2      b   <NA>
3  2000    3      c   <NA>
4  2000    4   <NA>      e
5  2000    5      d   <NA>
6  2001    1     a1   <NA>
7  2001    2     b1   <NA>
8  2001    3     c1   <NA>
9  2001    4   <NA>      e
10 2001    5     d1   <NA>
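The suffixes appear because var2 is not one of the by columns, so merge keeps both copies side by side. As a rough sketch of a workaround (not necessarily the best one for a very large panel), the two copies could be collapsed after the merge. The name merged is only illustrative, and stringsAsFactors = FALSE is an assumption added so that var2 is a character column rather than a factor:

```r
panel <- data.frame(time = c(rep(2000, 5), rep(2001, 5)),
                    var1 = rep(1:5, times = 2),
                    var2 = c(NA, "b", "c", NA, "d", "a1", "b1", "c1", NA, "d1"),
                    stringsAsFactors = FALSE)
new.df <- data.frame(time = 2000:2001,
                     var1 = c(4, 4),
                     var2 = c("e", "e"),
                     stringsAsFactors = FALSE)

# Left join: keep every panel row, attach new.df values where the keys match.
merged <- merge(panel, new.df, by = c("time", "var1"), all.x = TRUE)

# Take the new value only where the panel value is missing.
merged$var2 <- ifelse(is.na(merged$var2.x), merged$var2.y, merged$var2.x)

# Drop the two suffixed helper columns.
merged$var2.x <- NULL
merged$var2.y <- NULL
```

NAs that have no counterpart in new.df stay NA. Note that merge returns a new, re-sorted data frame, which may matter if the original row order has to be preserved.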
I also tried playing with the na.action option, without success: the panel will still be incomplete after merging, and the remaining NAs must stay as they are. (Depending on the formulation, NA treatment will in some cases replace NA with 0 or with NaN.)
I would like to find a way to target the correct indexes in the panel and "insert" new.df$var2 in the right place, knowing that I have a very large panel and that it will remain incomplete at the end.
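To make the kind of index targeting described above concrete, here is a minimal base R sketch using match on a composite key. The helper names key.panel, key.new, idx and fill are made up for illustration, stringsAsFactors = FALSE is an added assumption, and the approach assumes each (time, var1) pair occurs at most once in new.df and that the key values print identically in both data frames:

```r
panel <- data.frame(time = c(rep(2000, 5), rep(2001, 5)),
                    var1 = rep(1:5, times = 2),
                    var2 = c(NA, "b", "c", NA, "d", "a1", "b1", "c1", NA, "d1"),
                    stringsAsFactors = FALSE)
new.df <- data.frame(time = 2000:2001,
                     var1 = c(4, 4),
                     var2 = c("e", "e"),
                     stringsAsFactors = FALSE)

# Build one composite key per row from the two identifying columns.
key.panel <- paste(panel$time, panel$var1, sep = "_")
key.new   <- paste(new.df$time, new.df$var1, sep = "_")

# For each panel row, the position of the matching row in new.df (NA if none).
idx <- match(key.panel, key.new)

# Fill only the cells that are currently NA and have a match in new.df.
fill <- is.na(panel$var2) & !is.na(idx)
panel$var2[fill] <- new.df$var2[idx[fill]]
```

This modifies panel in place, keeps the row order, creates no extra columns, and leaves unmatched NAs (such as time 2000, var1 1) untouched.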
Thanks in advance.