Filling NAs with a dataframe merge

Question

I have a data frame c like this

c
             Freq      CTM
000110100111    2       NA
110110100111    1 32.58847
111001011000    2       NA
111111111111    1 25.61041

and a data frame nona_c like this

nona_c
             Freq     CTM
000110100111    2 37.0642
111001011000    2 37.0642

I want to replace the NAs in the CTM column of c with the CTM values of nona_c. The rownames of nona_c (the binary strings) will always exist in c.

The output should be

mergedC
             Freq      CTM
000110100111    2  37.0642
110110100111    1 32.58847
111001011000    2  37.0642
111111111111    1 25.61041

I've been trying merge without success:

mergedC  <- merge(x = c, y = nona_c, by = 0, #rownames
    all.y = TRUE)

As a side note, it's strange to see an object called `c`, particularly given the `?c` function. It may cause issues down the line. - SymbolixAU, Jun 22 '16 at 01:43

score 4 · Accepted Answer · answered Jun 22 '16 at 01:01

A match operation might make this more straightforward:

c$CTM[is.na(c$CTM)] <- nona_c$CTM[match(rownames(c)[is.na(c$CTM)], rownames(nona_c))]

#             Freq      CTM
#000110100111    2 37.06420
#110110100111    1 32.58847
#111001011000    2 37.06420
#111111111111    1 25.61041
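To try this on the question's data, here is a minimal, self-contained sketch of the same match approach. The data frames are rebuilt by hand from the tables in the question, and the helper names na_rows and idx are introduced here only for readability; they are not part of the original answer.

```r
# Rebuild the question's data; the binary strings are the row names.
# (Calling c() still works after creating an object named c, because R
# looks specifically for a function when a name is used in a call.)
c <- data.frame(
  Freq = c(2, 1, 2, 1),
  CTM  = c(NA, 32.58847, NA, 25.61041),
  row.names = c("000110100111", "110110100111",
                "111001011000", "111111111111")
)
nona_c <- data.frame(
  Freq = c(2, 2),
  CTM  = c(37.0642, 37.0642),
  row.names = c("000110100111", "111001011000")
)

# Which rows of c need filling (hypothetical helper name)
na_rows <- is.na(c$CTM)

# For each of those rows, find the position of the same row name in nona_c
idx <- match(rownames(c)[na_rows], rownames(nona_c))

# Copy the matching CTM values into the NA slots only
c$CTM[na_rows] <- nona_c$CTM[idx]
c
```

Computing is.na(c$CTM) once before the assignment makes it explicit that both sides of the assignment refer to the same set of rows.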

akrun · Answer 2 · 2016-06-22T02:22:16.030

We can do this with data.table using an update join on the row-name column. For each row of c that matches a row of nona_c on "rn", the CTM value from nona_c (referred to as i.CTM) is assigned by reference (:=) to CTM in c.

library(data.table)
setDT(c, keep.rownames=TRUE)[]
setDT(nona_c, keep.rownames=TRUE)[]

c[nona_c, CTM := i.CTM , on = "rn"]
c
#             rn Freq      CTM
#1: 000110100111    2 37.06420
#2: 110110100111    1 32.58847
#3: 111001011000    2 37.06420
#4: 111111111111    1 25.61041

NOTE: Row names are not retained by data.table or dplyr, so when converting the data.frame to a data.table we use keep.rownames = TRUE, which stores them in a column named rn.
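For comparison with the merge attempt in the question, a base R merge can also work, though it is more roundabout. In the question's call, all.y = TRUE keeps only the rows that appear in nona_c, so the rows of c that have no match are dropped; all.x = TRUE keeps every row of c instead. The sketch below is one possible way to do it, not the method from either answer; the ".fill" suffix and the intermediate name m are assumptions chosen for illustration, and the data are rebuilt by hand from the question.

```r
# Rebuild the question's data; the binary strings are the row names
c <- data.frame(
  Freq = c(2, 1, 2, 1),
  CTM  = c(NA, 32.58847, NA, 25.61041),
  row.names = c("000110100111", "110110100111",
                "111001011000", "111111111111")
)
nona_c <- data.frame(
  Freq = c(2, 2),
  CTM  = c(37.0642, 37.0642),
  row.names = c("000110100111", "111001011000")
)

# by = 0 joins on row names; all.x = TRUE keeps every row of c.
# Only the CTM column of nona_c is brought in, renamed to CTM.fill.
m <- merge(c, nona_c["CTM"], by = 0, all.x = TRUE,
           suffixes = c("", ".fill"))

# Use the fill value only where the original CTM is NA
m$CTM <- ifelse(is.na(m$CTM), m$CTM.fill, m$CTM)

# merge() moves the row names into a Row.names column; put them back
mergedC <- data.frame(Freq = m$Freq, CTM = m$CTM,
                      row.names = as.character(m$Row.names))
mergedC
```

Note that merge() sorts the result by the join key by default, so the row order matches the original only when the row names are already in sorted order, as they are here.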
