I have a nested list of data frames. Some of the elements in those data frames (variables, or rather vectors?) are NA, and I want to remove those elements.

EDIT: actually I have NULL instead of NA.

df.ls <- list(list(id = NULL, x = 3, works = NULL),
              list(id = 2, x = 4, works = NULL),
              NULL)

I tried this code, but I don't know how to tell it which level of the list it should operate on.

df.ls[sapply(df.ls, is.null)] <- NULL
Mar
  • Possible duplicate of https://stackoverflow.com/q/26539441/4137985 – Cath Jul 08 '19 at 12:22
  • @Cath This solution worked for me. I didn't find it because I searched for NA instead of NULL, which is what I have in my case. – Mar Jul 08 '19 at 12:27
  • Whether this is a duplicate depends on whether the OP's data is a list of lists with single-length elements or a list of data.frames. – TimTeaFan Jul 08 '19 at 12:48

3 Answers

For NULL values we can do

l1 <- lapply(df.ls, function(x) x[lengths(x) > 0])

For NAs we can do

l1 <- lapply(df.ls, function(x) x[!is.na(x)])
l1

#[[1]]
#[[1]]$x
#[1] 3


#[[2]]
#[[2]]$id
#[1] 2

#[[2]]$x
#[1] 4


#[[3]]
#list()

If you want to remove the empty list, you can do

l1[lengths(l1) > 0]
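
Put together on the data from the question, the two steps might look like the following. This is only a minimal base R sketch; the name `cleaned` is an illustrative assumption and not part of the answer above.

```r
# Data from the question: NULLs at both nesting levels
df.ls <- list(list(id = NULL, x = 3, works = NULL),
              list(id = 2, x = 4, works = NULL),
              NULL)

# Step 1: inside each sub-list, keep only the elements of non-zero length
l1 <- lapply(df.ls, function(x) x[lengths(x) > 0])

# Step 2: drop top-level entries that are empty or were NULL to begin with
cleaned <- l1[lengths(l1) > 0]

str(cleaned)
```

Note that `lengths(x) > 0` also drops zero-length vectors such as `character(0)`, not just NULL, which may or may not be what you want.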
Ronak Shah

I am not sure what you are trying to do, since you say you have a list of data.frames but the example you provide is only a list of lists with elements of length one.

Let's assume you have a list of data.frames, which in turn contain vectors of length > 1, and you want to drop all columns that contain only NAs.

df.ls <- list(data.frame(id = c(NA,NA,NA),
                         x = c(NA,3,5),
                         works = c(4,5,NA)),
              data.frame(id = c("a","b","c"),
                         x = c(NA,3,5),
                         works = c(NA,NA,NA)),
              data.frame(id = c("e","d",NA),
                         x = c(NA,3,5),
                         works = c(4,5,NA)))



[[1]]
  id  x works
1 NA NA     4
2 NA  3     5
3 NA  5    NA

[[2]]
  id  x works
1  a NA    NA
2  b  3    NA
3  c  5    NA

[[3]]
    id  x works
1    e NA     4
2    d  3     5
3 <NA>  5    NA

Then this approach will work:

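library(dplyr)
library(purrr)

# TRUE if the column has at least one non-NA value
non_empty_col <- function(x) {
  sum(is.na(x)) != length(x)
}

# Keep only the non-empty columns in every data.frame
map(df.ls, ~ .x %>% select_if(non_empty_col))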

This returns your list of data.frames without the columns that contain only NA.

[[1]]
   x works
1 NA     4
2  3     5
3  5    NA

[[2]]
  id  x
1  a NA
2  b  3
3  c  5

[[3]]
    id  x works
1    e NA     4
2    d  3     5
3 <NA>  5    NA

If you, however, prefer your list to have only complete cases in each data.frame (rows with no NAs), then the following code will work.

library(purrr)
map(df.ls, ~ .x[complete.cases(.x), ])

With my example data, this leaves you with only row 2 of data.frame 3; the other two data.frames end up with zero rows.
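
If you would rather avoid the extra packages, both operations can also be sketched in base R. This is an alternative under my own assumptions, not part of the approach above; the names `no_empty_cols` and `complete_rows` are purely illustrative.

```r
# Same example data as above
df.ls <- list(data.frame(id = c(NA, NA, NA), x = c(NA, 3, 5), works = c(4, 5, NA)),
              data.frame(id = c("a", "b", "c"), x = c(NA, 3, 5), works = c(NA, NA, NA)),
              data.frame(id = c("e", "d", NA), x = c(NA, 3, 5), works = c(4, 5, NA)))

# Drop columns that are entirely NA:
# colSums(!is.na(df)) counts the non-NA values per column
no_empty_cols <- lapply(df.ls, function(df) {
  df[, colSums(!is.na(df)) > 0, drop = FALSE]
})

# Keep only the rows that contain no NA at all
complete_rows <- lapply(df.ls, function(df) {
  df[complete.cases(df), , drop = FALSE]
})

lapply(no_empty_cols, names)
sapply(complete_rows, nrow)
```

The `drop = FALSE` keeps each result a data.frame even when only a single column or row survives.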

TimTeaFan

To remove the NULL elements with purrr

library(purrr)
discard(map(df.ls, ~ discard(.x, is.null)), is.null)
#[[1]]
#[[1]]$x
#[1] 3


#[[2]]
#[[2]]$id
#[1] 2

#[[2]]$x
#[1] 4

Or in base R with Filter and is.null

Filter(Negate(is.null), lapply(df.ls, function(x) Filter(Negate(is.null), x)))
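
Both versions above handle exactly two levels. If the nesting depth is not known in advance, a recursive helper is one possible way to avoid having to say which level to clean. The sketch below is an assumption on my part; `drop_nulls` is a hypothetical name, not a function from base R or purrr.

```r
# Hypothetical helper: recursively drop NULLs, and the empty lists
# they leave behind, at any nesting depth.
drop_nulls <- function(x) {
  # Leave atomic values and data.frames untouched
  if (!is.list(x) || is.data.frame(x)) return(x)
  # Clean the children first ...
  x <- lapply(x, drop_nulls)
  # ... then drop everything of length zero (NULL, list(), character(0), ...)
  Filter(function(el) length(el) > 0, x)
}

df.ls <- list(list(id = NULL, x = 3, works = NULL),
              list(id = 2, x = 4, works = NULL),
              NULL)

res <- drop_nulls(df.ls)
str(res)
```

Because the filter tests `length(el) > 0`, it also removes other zero-length elements; use `Negate(is.null)` instead if only NULL should go.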

Earlier version before the OP's update

library(purrr)
map(df.ls, ~ .x[!is.na(.x)])
#[[1]]
#[[1]]$x
#[1] 3


#[[2]]
#[[2]]$id
#[1] 2

#[[2]]$x
#[1] 4


#[[3]]
#list()

akrun