Here is a data frame:

df <- data.frame('a' = c('NULL',1,4,5), 'b' = c(5,6,3,'NULL'), 'c' = c(9,'NULL',9,3))

Output:

     a    b    c
1 NULL    5    9
2    1    6 NULL
3    4    3    9
4    5 NULL    3

What I am trying to do is remove all cells with a null value. One method is like this:

df2 <- data.frame('a' = subset(df, !(a == 'NULL'))$a, 
                 'b' = subset(df, !(b == 'NULL'))$b,
                 'c' = subset(df, !(c == 'NULL'))$c)

Output:

  a b c
1 1 5 9
2 4 6 9
3 5 3 3

However, this is inefficient. Is there a way to remove any cell with a null value?

joran
B C
    May be a repeat of this http://stackoverflow.com/questions/34619124/how-to-clean-or-remove-na-values-from-a-dataset-without-remove-the-column-or-row – Pierre L May 16 '16 at 16:28

2 Answers

as.data.frame(sapply(df, function(x) x[x != "NULL"]))
#   a b c
# 1 1 5 9
# 2 4 6 9
# 3 5 3 3

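This is adapted from the linked question, "How to clean or remove NA values from a dataset without remove the column or row" (http://stackoverflow.com/questions/34619124/how-to-clean-or-remove-na-values-from-a-dataset-without-remove-the-column-or-row).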

Since this question is not an exact duplicate, I added an answer here.
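
One caveat worth illustrating: the one-liner above relies on sapply() simplifying its result to a matrix, which only happens when every column keeps the same number of values. The sketch below checks this; df_uneven is a hypothetical variant of the question's data (not from the original post), and stringsAsFactors = FALSE is set explicitly so the behaviour does not depend on the R version.

```r
# Same-count case from the question: every column has exactly one "NULL".
df <- data.frame(a = c("NULL", 1, 4, 5),
                 b = c(5, 6, 3, "NULL"),
                 c = c(9, "NULL", 9, 3),
                 stringsAsFactors = FALSE)
kept <- sapply(df, function(x) x[x != "NULL"])
is.matrix(kept)        # TRUE: three values per column, so sapply() simplifies

# Hypothetical variant: column b contains no "NULL" at all.
df_uneven <- data.frame(a = c("NULL", 1, 4, 5),
                        b = c(5, 6, 3, 7),
                        c = c(9, "NULL", 9, 3),
                        stringsAsFactors = FALSE)
kept_uneven <- sapply(df_uneven, function(x) x[x != "NULL"])
is.list(kept_uneven)   # TRUE: lengths differ, so sapply() returns a list
lengths(kept_uneven)   # a = 3, b = 4, c = 3
```

Calling as.data.frame() on kept_uneven would fail because the columns have different lengths, so in that situation the values need to be padded or truncated first.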

Pierre L

Another solution that gives the same answer, but also works if there is a different number of "NULL" values in each column:

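# Works for the original problem
df <- data.frame('a' = c('NULL',1,4,5), 'b' = c(5,6,3,'NULL'), 'c' = c(9,'NULL',9,3))
res <- as.data.frame(sapply(df, function(x) {
  x <- as.character(x)  # handles factor columns (R < 4.0) and character columns alike
  # Move the kept values to the top and pad the bottom with "NULL"
  c(x[x != "NULL"], rep("NULL", sum(x == "NULL")))
}))
# Keep only the rows that contain no "NULL" in any column
res2 <- res[rowSums(res == "NULL") == 0, ]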

res2
#  a b c
#1 1 5 9
#2 4 6 9
#3 5 3 3

Now for an example with a different number of "NULL" values in each column:

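df <- data.frame('a' = c('NULL',1,4,5,6), 'b' = c("NULL",6,3,'NULL',7), 'c' = c(9,'NULL',"NULL","NULL",10))
res <- as.data.frame(sapply(df, function(x) {
  x <- as.character(x)  # handles factor columns (R < 4.0) and character columns alike
  # Move the kept values to the top and pad the bottom with "NULL"
  c(x[x != "NULL"], rep("NULL", sum(x == "NULL")))
}))
# Keep only the rows that contain no "NULL" in any column
res2 <- res[rowSums(res == "NULL") == 0, ]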

res2
#  a b  c
#1 1 6  9
#2 4 3 10
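
The pad-and-filter approach above ends up keeping as many rows as the shortest cleaned column. As a rough sketch of the same idea, the steps could be wrapped in a helper that truncates instead of padding. Note that drop_null_cells and its null argument are hypothetical names introduced here for illustration, and the as.numeric() conversion assumes every remaining value is numeric, as in the question's data.

```r
# Hypothetical helper, not part of either answer above.
drop_null_cells <- function(df, null = "NULL") {
  # Drop the placeholder string from each column independently
  kept <- lapply(df, function(x) {
    x <- as.character(x)  # works for factor and character columns
    x[x != null]
  })
  # Truncate every column to the shortest remaining length
  n <- min(lengths(kept))
  # Assumption: the surviving values are all numeric
  as.data.frame(lapply(kept, function(x) as.numeric(head(x, n))))
}

df <- data.frame(a = c("NULL", 1, 4, 5, 6),
                 b = c("NULL", 6, 3, "NULL", 7),
                 c = c(9, "NULL", "NULL", "NULL", 10))
out <- drop_null_cells(df)
out
#   a b  c
# 1 1 6  9
# 2 4 3 10
```

Unlike res2, the columns of out are numeric rather than character, which may or may not be what is wanted.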
Mike H.