I have a very simple question, but all I can find are very complicated answers that do not do exactly what I need.
The closest I found is this:
Answer by flodel and eddi (data.table)
However, I would additionally like to specify how to treat the NAs in a given column based on the value in a different column.
I have a data.table which has columns with NAs, where fac is a factor variable.
df <- fread(
"A B C fac H I J iso year matchcode
0 1 1 NA 0 1 0 NLD 2009 NLD2009
1 0 0 NA 1 0 1 NLD 2014 NLD2014
0 0 0 B 1 0 0 AUS 2011 AUS2011
1 0 1 B 0 1 0 AUS 2007 AUS2007
0 1 0 NA 0 1 1 USA 2007 USA2007
0 0 1 NA 0 0 1 USA 2011 USA2010
0 1 0 NA 0 0 0 USA 2013 USA2013
1 0 1 A 0 1 0 BLG 2007 BLG2007
0 1 0 A 1 0 1 BEL 2009 BEL2009
1 0 1 A 0 1 0 BEL 2012 BEL2012",
header = TRUE
)
What I would like to do is assign the values D and E to the NAs in column fac based on the values in iso. So when iso == "NLD", the NAs in fac should be replaced by D, and when iso == "USA", the NAs in fac should be replaced by E, leading to the following result.
df <- fread(
"A B C fac H I J iso year matchcode
0 1 1 D 0 1 0 NLD 2009 NLD2009
1 0 0 D 1 0 1 NLD 2014 NLD2014
0 0 0 B 1 0 0 AUS 2011 AUS2011
1 0 1 B 0 1 0 AUS 2007 AUS2007
0 1 0 E 0 1 1 USA 2007 USA2007
0 0 1 E 0 0 1 USA 2011 USA2010
0 1 0 E 0 0 0 USA 2013 USA2013
1 0 1 A 0 1 0 BLG 2007 BLG2007
0 1 0 A 1 0 1 BEL 2009 BEL2009
1 0 1 A 0 1 0 BEL 2012 BEL2012",
header = TRUE
)
EDIT: The fact that fac is a factor variable gave some issues. What worked is the following:
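# Work on character values first; the factor column caused issues with the replacement
df[, fac := as.character(fac)]
df[, fac := ifelse(is.na(fac) & iso == "NLD", "D",
            ifelse(is.na(fac) & iso == "USA", "E", fac))][]
# Convert back to a factor with the full set of levels
df[, fac := factor(fac, levels = c("A", "B", "C", "D", "E", "F", "G"))]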
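For comparison, the nested ifelse could possibly be avoided by updating only the affected rows by reference. This is a minimal sketch, not a definitive solution: it assumes the country column is called iso as in the example data, and the lookup vector fill is a hypothetical helper introduced here for illustration.

```r
library(data.table)

# Toy data with only the two relevant columns from the example above
df <- data.table(
  fac = factor(c(NA, NA, "B", "B", NA, NA, NA, "A", "A", "A")),
  iso = c("NLD", "NLD", "AUS", "AUS", "USA", "USA", "USA", "BLG", "BEL", "BEL")
)

# Hypothetical lookup (an assumption): replacement value per country code
fill <- c(NLD = "D", USA = "E")

# Work on character values so that new labels are not limited to existing factor levels
df[, fac := as.character(fac)]

# Update only the rows where fac is NA and iso has an entry in the lookup
df[is.na(fac) & iso %in% names(fill), fac := unname(fill[iso])]

# Convert back to a factor with the full set of levels
df[, fac := factor(fac, levels = c("A", "B", "C", "D", "E", "F", "G"))]
```

Adding another country would then only require a new entry in fill rather than another level of ifelse nesting.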