0

I have a data.table which has lots of NAs. I want to squeeze the data.table so that it can be represented in a compact way

dt <- data.table(id=1:5, c('x',NA,'y',NA,'y'), c(NA,NA,NA,NA,'z'), c('y','z','y','x',NA))
dt
>   id V2 V3 V4
>1:  1  x NA  y
>2:  2 NA NA  z
>3:  3  y NA  y
>4:  4 NA NA  x
>5:  5  y  z NA

The output I am looking for is

dt
>   id V2 V3
>1:  1  x y
>2:  2  z NA
>3:  3  y y
>4:  4  x NA
>5:  5  y  z
imsc
  • 7,492
  • 7
  • 47
  • 69
  • Something similar: http://stackoverflow.com/questions/36639599/remove-leading-nas-to-align-data (there is also a `data.table`-answer) But you want to do it for rows. – jogo May 13 '16 at 12:16

2 Answers2

5

We group by "id", unlist the "Subset of Data.table" (.SD), remove the "NA" with na.omit, create a sequence column by 'id', and then dcast from 'long' to 'wide'.

library(data.table)
dcast(dt[, na.omit(unlist(.SD)) , id][, N:= paste0("V", 1:.N), id], id~N, value.var="V1")
#    id V1 V2
#1:  1  x  y
#2:  2  z NA
#3:  3  y  y
#4:  4  x NA
#5:  5  y  z

Or we can use melt (as @Frank suggested)

melt(dt, id = "id", na.rm = TRUE)
akrun
  • 874,273
  • 37
  • 540
  • 662
  • 3
    Nice. In my opinion, they should stop at `dt[, na.omit(unlist(.SD)) , id]` which is roughly equivalent to `melt(dt, id = "id", na.rm = TRUE)` – Frank May 13 '16 at 12:13
-1
dt$v2 <-ifelse(is.na(dt$v2),dt$v4,dt$v2)
dt$v3[1] <- replace(dt$v3[1],is.na(dt$v3[1]),dt$v4[1]) 
dt$v3[3] <- replace(dt$v3[3],is.na(dt$v3[3]),dt$v4[3]) 
dt$v4 <- NULL
Arun kumar mahesh
  • 2,289
  • 2
  • 14
  • 22