Remove null value from nested list in R

Question

I have a nested list of data frames. In those data frames I have NA variables (vectors now?). I want to remove those elements.

EDIT: actually I have NULL instead of NA.

df.ls <- list(list(id = NULL, x = 3, works = NULL),
                 list(id = 2, x = 4, works = NULL),
              NULL)

I tried this code, but I don't know how to tell it which level of the list it should operate on.

df.ls[sapply(df.ls, is.null)] <- NULL

Possible duplicate of https://stackoverflow.com/q/26539441/4137985 — Cath, Jul 08 '19 at 12:22
@Cath This solution worked for me. I didn't find it because I searched for NA instead of NULL, which is what I have in my case. — Mar, Jul 08 '19 at 12:27
Whether this is a duplicate depends on whether the OP's data is a list of lists with length-one elements or a list of data.frames. — TimTeaFan, Jul 08 '19 at 12:48

Ronak Shah · Answer 1 · 2019-07-08T12:25:07.923

3

For NULL values we can do

l1 <- lapply(df.ls, function(x) x[lengths(x) > 0])

For NAs we can do

l1 <- lapply(df.ls, function(x) x[!is.na(x)])
l1

#[[1]]
#[[1]]$x
#[1] 3


#[[2]]
#[[2]]$id
#[1] 2

#[[2]]$x
#[1] 4


#[[3]]
#list()
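As a side note on why the two cases need different tests: a small sketch, assuming two toy lists (`with_null` and `with_na` are names made up for this illustration). On a list, `is.na()` is only TRUE for length-one elements that are NA, so it never flags a NULL element, while `lengths()` reports 0 for NULL.

```r
# Hypothetical toy lists, one with NULL elements and one with NA elements
with_null <- list(id = NULL, x = 3, works = NULL)
with_na   <- list(id = NA,   x = 3, works = NA)

# is.na() does not flag NULL elements, only length-one NA elements
na_flags_null <- unname(is.na(with_null))
na_flags_na   <- unname(is.na(with_na))

# lengths() is 0 for NULL elements, so it can be used to find them
null_flags <- unname(lengths(with_null) == 0)

print(na_flags_null)
print(na_flags_na)
print(null_flags)
```

So `x[!is.na(x)]` leaves NULL elements in place, and `x[lengths(x) > 0]` is the test to use for them.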

If you want to remove the empty list, you can do

l1[lengths(l1) > 0]
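Putting the two steps together on the data from the question, as a sketch (the variable names `inner_clean` and `result` are only for illustration): the inner `lapply()` handles the second level, and the final subsetting handles the top level.

```r
# Data from the question
df.ls <- list(list(id = NULL, x = 3, works = NULL),
              list(id = 2, x = 4, works = NULL),
              NULL)

# Step 1: inside each sub-list, keep only elements with length > 0
inner_clean <- lapply(df.ls, function(x) x[lengths(x) > 0])

# Step 2: at the top level, drop entries that are NULL or empty
result <- inner_clean[lengths(inner_clean) > 0]

str(result)
```

Note that `lengths(x) > 0` also drops zero-length vectors such as `character(0)`, not just NULL, which may or may not be what you want.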

edited Jul 08 '19 at 12:25

answered Jul 08 '19 at 12:01


TimTeaFan · Answer 2 · 2019-07-08T12:23:39.440

I am not sure what you are trying to do, since you say you have a list of data.frames but the example you provide is only a list of lists with elements of length one.

Let's assume you have a list of data.frames, which in turn contain vectors of length > 1, and you want to drop all columns that contain only NAs.

df.ls <- list(data.frame(id = c(NA,NA,NA),
                         x = c(NA,3,5),
                         works = c(4,5,NA)),
              data.frame(id = c("a","b","c"),
                         x = c(NA,3,5),
                         works = c(NA,NA,NA)),
              data.frame(id = c("e","d",NA),
                         x = c(NA,3,5),
                         works = c(4,5,NA)))



    [[1]]
      id  x works
    1 NA NA     4
    2 NA  3     5
    3 NA  5    NA

    [[2]]
      id  x works
    1  a NA    NA
    2  b  3    NA
    3  c  5    NA

    [[3]]
        id  x works
    1    e NA     4
    2    d  3     5
    3 <NA>  5    NA

Then this approach will work:

    library(dplyr)
    library(purrr)
    non_empty_col <- function(x) {
        sum(is.na(x)) != length(x)
    }

    map(df.ls, ~ .x %>% select_if(non_empty_col))

This returns your list of data.frames without the columns that contain only NA.

[[1]]
   x works
1 NA     4
2  3     5
3  5    NA

[[2]]
  id  x
1  a NA
2  b  3
3  c  5

[[3]]
    id  x works
1    e NA     4
2    d  3     5
3 <NA>  5    NA
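If you would rather avoid the package dependency, the same idea can be sketched in base R. This is an alternative I am adding for comparison, not part of the approach above; `drop_all_na_cols` is a hypothetical helper name, and the example data is repeated so the snippet runs on its own.

```r
# Example data, repeated from above so the snippet is self-contained
df.ls <- list(
  data.frame(id = c(NA, NA, NA), x = c(NA, 3, 5), works = c(4, 5, NA)),
  data.frame(id = c("a", "b", "c"), x = c(NA, 3, 5), works = c(NA, NA, NA)),
  data.frame(id = c("e", "d", NA), x = c(NA, 3, 5), works = c(4, 5, NA))
)

# Hypothetical helper: keep columns that have at least one non-NA value
drop_all_na_cols <- function(d) {
  keep <- colSums(!is.na(d)) > 0
  d[, keep, drop = FALSE]  # drop = FALSE keeps a data.frame even with one column
}

cleaned <- lapply(df.ls, drop_all_na_cols)
lapply(cleaned, names)
```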

If you, however, prefer your list to have only complete cases in each data.frame (rows with no NAs), then the following code will work.

library(purrr)  # map() comes from purrr, not dplyr
map(df.ls, ~ .x[complete.cases(.x), ])

With my example data, this leaves only row 2 of data.frame 3; the other two data.frames end up with zero rows.
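To check how many rows survive in each data.frame, and optionally drop the ones that end up empty, something like the following base R sketch should do (the names `complete`, `n_rows` and `non_empty` are only for illustration; the data is repeated so the snippet runs on its own).

```r
# Example data, repeated from above so the snippet is self-contained
df.ls <- list(
  data.frame(id = c(NA, NA, NA), x = c(NA, 3, 5), works = c(4, 5, NA)),
  data.frame(id = c("a", "b", "c"), x = c(NA, 3, 5), works = c(NA, NA, NA)),
  data.frame(id = c("e", "d", NA), x = c(NA, 3, 5), works = c(4, 5, NA))
)

# Keep only rows without any NA in each data.frame
complete <- lapply(df.ls, function(d) d[complete.cases(d), , drop = FALSE])

# Number of remaining rows per data.frame
n_rows <- vapply(complete, nrow, integer(1))

# Optionally drop data.frames that have no rows left
non_empty <- complete[n_rows > 0]

print(n_rows)
```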

akrun · Answer 3 · 2019-07-08T12:59:16.210

To remove the NULL elements at both levels with purrr:

library(purrr)
discard(map(df.ls, ~ discard(.x, is.null)), is.null)
#[[1]]
#[[1]]$x
#[1] 3


#[[2]]
#[[2]]$id
#[1] 2

#[[2]]$x
#[1] 4

Or in base R with Filter and is.null

Filter(Negate(is.null), lapply(df.ls, function(x) Filter(Negate(is.null), x)))
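The base R version above handles exactly two levels. If the nesting depth is not known in advance (the "which level" problem from the question), a recursive variant is one option. This is only a sketch under my own assumptions: `drop_nulls` is a hypothetical helper name, and it deliberately leaves data.frames and atomic values untouched.

```r
# Hypothetical helper: remove NULL elements at every level of a nested list
drop_nulls <- function(x) {
  # Leave atomic values, NULL and data.frames as they are
  if (!is.list(x) || is.data.frame(x)) return(x)
  x <- lapply(x, drop_nulls)      # clean the children first
  Filter(Negate(is.null), x)      # then drop NULLs at this level
}

# Data from the question
df.ls <- list(list(id = NULL, x = 3, works = NULL),
              list(id = 2, x = 4, works = NULL),
              NULL)
flat_result <- drop_nulls(df.ls)

# A deeper, made-up example to show the recursion
nested <- list(a = NULL,
               b = list(c = NULL, d = 1, e = list(f = NULL, g = "z")),
               h = 2)
deep_result <- drop_nulls(nested)

str(flat_result)
str(deep_result)
```

Unlike the `lengths(x) > 0` test, this keeps sub-lists that become empty (`list()`); swap in a length check if those should go as well.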

Earlier version before the OP's update

library(purrr)
map(df.ls, ~ .x[!is.na(.x)])
#[[1]]
#[[1]]$x
#[1] 3


#[[2]]
#[[2]]$id
#[1] 2

#[[2]]$x
#[1] 4


#[[3]]
#list()
