I have this dataset:

df <- structure(list(p1 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA), p2 = structure(c(NA, NA, 5L, 6L, 
    NA, 2L, 7L, NA, NA, 4L, NA, 3L, NA, 1L, 1L, 1L, 1L), .Label = c("", 
    "R16", "R29", "R3", "R36", "R40", "R56"), class = "factor"), 
        p3 = structure(c(NA, 1L, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA), .Label = "R33", class = "factor")), .Names = c("p1", 
    "p2", "p3"), class = "data.frame", row.names = c(NA, -17L))

I would like to remove only the "cells" that contain NA, not the whole rows.

I tried this:

na.omit(df)

but this does not work, I guess because it drops every row that contains an NA.

How can I remove NA from cells and not the whole row?

Example output:

p2  p3
R36 R33
R40 
R16 
R56 
R3  
R29 
Sasak

1 Answer

You can't do that while the object is a data frame, because a data frame is a list of columns that must all have the same length. In other words, a data frame is basically a formatted list. Your desired output requires list items of different lengths.
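To make the "formatted list" point concrete, here is a minimal sketch. The data frame `small_df` and its columns are made up for illustration and are not part of the question's data.

```r
# A small hypothetical data frame, used only for illustration.
small_df <- data.frame(a = 1:3, b = c("x", "y", "z"))

is_list <- is.list(small_df)      # a data frame is a list underneath
col_lengths <- lengths(small_df)  # length of each column (list item)

# Columns of lengths 3 and 2 cannot be combined into a data frame,
# so data.frame() raises an error; capture its message instead of stopping.
msg <- tryCatch(
  {
    data.frame(a = 1:3, b = 1:2)
    "no error"
  },
  error = function(e) conditionMessage(e)
)

print(is_list)
print(col_lengths)
print(msg)
```

The captured message should be the same kind of "differing number of rows" error that appears at the end of this answer.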

So first you should convert the data frame into a list and then lapply through the list items:

dfl <- as.list(df)
dfn <- lapply(dfl, function(x) x[!is.na(x)])

And the output is:

> dfn
$p1
logical(0)

$p2
 [1] R36 R40 R16 R56 R3  R29                
Levels:  R16 R29 R3 R36 R40 R56

$p3
[1] R33
Levels: R33
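Note that $p2 still has ten elements, not six: the trailing blanks in the printout are four empty strings (""), which are not NA and so survive the filter. If you want to drop those as well, something along these lines might work. The factor `x` below is a small made-up stand-in for a column like p2, not the original data.

```r
# Hypothetical factor mixing NA, empty strings and real values.
x <- factor(c(NA, "R36", "", "R40", "", NA))

# Removing NA alone keeps the empty strings.
no_na <- x[!is.na(x)]
n_after_na <- length(no_na)

# Also remove empty strings, then drop the now-unused "" level.
cleaned <- droplevels(no_na[no_na != ""])

print(n_after_na)
print(as.character(cleaned))
print(levels(cleaned))
```

droplevels() is only needed if you keep the column as a factor; converting with as.character() first would avoid the leftover level altogether.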

Converting it back into a data frame fails, because the list items now have different lengths:

> as.data.frame(dfn)
    Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
      arguments imply differing number of rows: 0, 10, 1
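If you still need a data frame in the end, one possible workaround is to pad the shorter items with NA so that all of them have the same length again. This is only a sketch under a few assumptions: the values are converted to character, empty strings are treated like NA, and columns that end up empty (p1 here) are dropped. The names `keep`, `padded` and `result` are made up for this example, and the data is rebuilt with data.frame() to hold the same values as the dput() output in the question.

```r
# Rebuild the question's data with the same values as the dput() output.
df <- data.frame(
  p1 = rep(NA, 17),
  p2 = factor(c(NA, NA, "R36", "R40", NA, "R16", "R56", NA, NA,
                "R3", NA, "R29", NA, "", "", "", "")),
  p3 = factor(c(NA, "R33", rep(NA, 15)))
)

# Keep only non-missing, non-empty values in each column, as character.
keep <- lapply(df, function(x) {
  x <- as.character(x)
  x[!is.na(x) & x != ""]
})

# Drop columns that have nothing left.
keep <- keep[lengths(keep) > 0]

# Pad every remaining column with NA up to the longest length.
n <- max(lengths(keep))
padded <- lapply(keep, function(x) c(x, rep(NA_character_, n - length(x))))

result <- as.data.frame(padded, stringsAsFactors = FALSE)
print(result)
```

The values end up packed at the top of each column, so the original row alignment is lost. That seems to be what the example output in the question asks for, with NA in place of the blank cells.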
Serhat Cevikel